# vision-agent-framework

- **Name:** vision-agent-framework
- **Version:** 1.0.0
- **Home page:** https://github.com/krishna-bajpai/vision-agent
- **Summary:** World-Class Multi-Modal AI Agent Framework with Revolutionary Performance Features
- **Upload time:** 2025-08-31 14:12:30
- **Author / Maintainer:** Krishna Bajpai, Vedanshi Gupta
- **Requires Python:** >=3.9
- **License:** none recorded
- **Requirements:** none recorded
- **Keywords:** ai, machine-learning, computer-vision, agent-framework, multi-modal, face-detection, object-detection, video-processing, fastapi, async, performance-optimization, enterprise, token-recycling, predictive-scaling, cost-prediction, canvas-interface, workflow-automation, differential-privacy
# VisionAgent - Professional Multi-Modal AI Agent Framework

A cutting-edge, production-ready AI agent platform for image, video, and face analytics built with modern Python and state-of-the-art AI models.

## 🚀 Features

### Core Capabilities

- **Face Detection & Recognition** - Advanced face detection, encoding, and recognition with facial landmarks
- **Object Detection** - YOLOv8-powered object detection with real-time inference
- **Video Analysis** - Frame-by-frame video processing with object/face tracking
- **Image Classification** - HuggingFace Transformers integration for image classification
- **Real-time Processing** - WebSocket streaming for live video analytics

### Technical Excellence

- **Modular Architecture** - Easily extendable agent framework
- **GPU Acceleration** - Automatic CUDA detection with CPU fallback
- **Async Processing** - FastAPI with async endpoints for high performance
- **Production Ready** - Docker support, logging, metrics, and error handling
- **Type Safety** - Full type hints and Pydantic models
- **Scalable** - Batch processing and concurrent request handling

## ๐Ÿ—๏ธ Architecture

```txt
vision-sphere/
├── agents/                 # AI Agent implementations
│   ├── base_agent.py      # Abstract base class
│   ├── face_agent.py      # Face detection & recognition
│   ├── object_agent.py    # Object detection (YOLOv8)
│   ├── video_agent.py     # Video analysis & tracking
│   └── classification_agent.py  # Image classification
├── models/                # Downloaded/trained models
├── utils/                 # Common utilities
│   └── helpers.py         # Helper functions
├── server.py              # FastAPI application
├── config.py              # Configuration management
├── cli.py                 # Command-line interface
├── requirements.txt       # Python dependencies
└── Dockerfile             # Container deployment
```
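The layout above centers on `agents/base_agent.py`. A minimal sketch of the contract it implies, with method and field names taken from the "Creating Custom Agents" example later in this README (the real `agents/base_agent.py` may differ in detail):

```python
# Sketch of the BaseAgent contract implied by the project layout.
# Names follow the custom-agent example in this README; treat the
# exact signatures as assumptions, not the framework's actual API.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ProcessingResult:
    success: bool
    data: Dict[str, Any]
    confidence: Optional[float] = None
    inference_time: Optional[float] = None  # milliseconds
    error: Optional[str] = None


class BaseAgent(ABC):
    def __init__(self) -> None:
        self._is_initialized = False

    @abstractmethod
    def initialize(self) -> bool:
        """Load models and resources; return True on success."""

    @abstractmethod
    def process(self, input_data: Any) -> ProcessingResult:
        """Run inference on a single input and wrap the outcome."""
```

Every concrete agent (face, object, video, classification) then only has to supply `initialize` and `process`, which is what keeps the framework modular.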

## ๐Ÿ› ๏ธ Installation

### Prerequisites

- Python 3.11+
- CUDA 11.8+ (optional, for GPU acceleration)
- 8GB+ RAM (16GB+ recommended for video processing)

### Quick Start

1. **Clone and Setup**

   ```bash
   git clone <repository-url>
   cd vision-sphere
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   pip install -r requirements.txt
   ```

2. **Run API Server**

   ```bash
   python server.py
   ```

3. **Access API Documentation**

   - Open [http://localhost:8000/docs](http://localhost:8000/docs) for interactive API docs
   - Or [http://localhost:8000/redoc](http://localhost:8000/redoc) for alternative documentation

### Docker Deployment

```bash
# Build image
docker build -t visionagent .

# Run with GPU support
docker run --gpus all -p 8000:8000 visionagent

# Run CPU-only
docker run -p 8000:8000 visionagent
```

## 📖 Usage

### Command Line Interface

```bash
# Face detection
python cli.py face image.jpg --output results.json

# Object detection  
python cli.py object image.jpg --confidence 0.7 --verbose

# Video analysis
python cli.py video video.mp4 --max-frames 500 --format detailed

# Image classification
python cli.py classify image.jpg --top-k 10 --confidence 0.1

# System information
python cli.py info

# Start server
python cli.py server --host 0.0.0.0 --port 8000
```

### API Endpoints

#### Face Detection

```bash
# Upload file
curl -X POST "http://localhost:8000/face" \
     -F "file=@image.jpg"

# Or use image URL
curl -X POST "http://localhost:8000/face" \
     -H "Content-Type: application/json" \
     -d '{"image_url": "https://example.com/image.jpg"}'
```

#### Object Detection

```bash
curl -X POST "http://localhost:8000/object" \
     -F "file=@image.jpg"
```

#### Video Analysis

```bash
curl -X POST "http://localhost:8000/video" \
     -F "file=@video.mp4"
```

#### Image Classification

```bash
curl -X POST "http://localhost:8000/classify" \
     -F "file=@image.jpg"
```

#### Batch Processing

```bash
curl -X POST "http://localhost:8000/batch/classify" \
     -F "files=@image1.jpg" \
     -F "files=@image2.jpg" \
     -F "files=@image3.jpg"
```
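The same endpoints can be called from Python instead of curl. A sketch using the third-party `requests` library; the endpoint paths come from this README, while `batch_files` and `classify_batch` are hypothetical helper names:

```python
# Python client sketch for the /batch/classify endpoint shown above.
# Requires `pip install requests`; helper names are illustrative.
from typing import Dict, List, Tuple

MultipartField = Tuple[str, Tuple[str, bytes, str]]


def batch_files(images: Dict[str, bytes]) -> List[MultipartField]:
    """Build the repeated `files=` multipart fields the endpoint expects."""
    return [("files", (name, data, "image/jpeg")) for name, data in images.items()]


def classify_batch(images: Dict[str, bytes],
                   url: str = "http://localhost:8000/batch/classify") -> dict:
    """POST a batch of images and return the parsed JSON response."""
    import requests  # third-party; pip install requests

    resp = requests.post(url, files=batch_files(images))
    resp.raise_for_status()
    return resp.json()
```

Passing a list of `("files", ...)` tuples is how `requests` encodes several files under one multipart field name, matching the repeated `-F "files=@..."` flags in the curl example.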

### WebSocket Streaming

```javascript
// Real-time video processing
const ws = new WebSocket('ws://localhost:8000/ws/video');

ws.onopen = function() {
    // Send video frames as binary data
    ws.send(frameData);
};

ws.onmessage = function(event) {
    const result = JSON.parse(event.data);
    console.log('Analysis result:', result);
};
```

## โš™๏ธ Configuration

Create a `config.yaml` file to customize the framework:

```yaml
# Global settings
default_device: "auto"  # auto, cpu, cuda
model_cache_dir: "./models"
temp_dir: "./temp"

# Face Agent
face_agent:
  enabled: true
  model:
    name: "face_recognition"
    confidence_threshold: 0.6
    custom_params:
      face_detection_model: "hog"  # hog, cnn
      num_jitters: 1
      tolerance: 0.6

# Object Agent
object_agent:
  enabled: true
  model:
    name: "yolov8s.pt"
    confidence_threshold: 0.5
    custom_params:
      iou_threshold: 0.45
      max_detections: 100

# Video Agent
video_agent:
  enabled: true
  processing_params:
    frame_skip: 1
    max_frames: 1000
    track_objects: true
    track_faces: true

# Classification Agent
classification_agent:
  enabled: true
  model:
    name: "microsoft/resnet-50"
    custom_params:
      top_k: 5
      threshold: 0.1
      return_features: false

# Server Configuration
server:
  host: "0.0.0.0"
  port: 8000
  workers: 1
  max_file_size_mb: 100
  enable_websocket: true
  rate_limit_per_minute: 60

# Logging
logging:
  level: "INFO"
  file_path: "./logs/visionagent.log"
  max_file_size_mb: 10
  backup_count: 5
```
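A `config.yaml` like the one above typically only needs to specify the keys you want to change, with everything else falling back to built-in defaults. A plain-dict sketch of that recursive merge (the real `config.py` may implement it differently):

```python
# Recursive "file values override defaults" merge, sketched with plain
# dicts. DEFAULTS here is a small illustrative subset of the config.
from copy import deepcopy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Return defaults with overrides applied, recursing into nested dicts."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


DEFAULTS = {"default_device": "auto", "server": {"host": "0.0.0.0", "port": 8000}}
user_cfg = {"server": {"port": 9000}}          # e.g. parsed from config.yaml
cfg = merge_config(DEFAULTS, user_cfg)
# cfg["server"] == {"host": "0.0.0.0", "port": 9000}
```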

### Environment Variables

```bash
# Override configuration with environment variables
export VISIONAGENT_CONFIG=/path/to/config.yaml
export VISIONAGENT_DEVICE=cuda
export VISIONAGENT_HOST=0.0.0.0
export VISIONAGENT_PORT=8000
export VISIONAGENT_LOG_LEVEL=DEBUG
export VISIONAGENT_MODEL_CACHE_DIR=/app/models
```

## 🧩 Extending the Framework

### Creating Custom Agents

```python
from typing import Any

from agents.base_agent import BaseAgent, ProcessingResult

class CustomAgent(BaseAgent):
    def initialize(self) -> bool:
        # Initialize your model here
        self._is_initialized = True
        return True
    
    def process(self, input_data: Any) -> ProcessingResult:
        # Implement your processing logic
        try:
            # Your processing code here
            result_data = {"custom_analysis": "results"}
            
            return ProcessingResult(
                success=True,
                data=result_data,
                confidence=0.95,
                inference_time=50.0
            )
        except Exception as e:
            return ProcessingResult(
                success=False,
                data={},
                error=str(e)
            )
```

## 📊 API Response Format

All endpoints return standardized responses:

```json
{
  "success": true,
  "data": {
    "detections": [...],
    "detection_count": 5,
    "class_summary": {...}
  },
  "inference_time_ms": 45.2,
  "agent_info": {
    "agent_type": "ObjectAgent",
    "device": "cuda",
    "initialized": true
  },
  "timestamp": "2025-08-31T12:00:00.000Z",
  "request_id": "uuid-string"
}
```
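On the client side, this envelope can be unpacked into a typed object. Field names mirror the JSON shown above; the `AgentResponse` helper itself is illustrative and not part of the framework:

```python
# Client-side sketch: parse the standardized response envelope into a
# dataclass. Missing optional fields simply come back as None.
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class AgentResponse:
    success: bool
    data: Dict[str, Any]
    inference_time_ms: Optional[float] = None
    agent_info: Optional[Dict[str, Any]] = None
    timestamp: Optional[str] = None
    request_id: Optional[str] = None

    @classmethod
    def from_json(cls, raw: str) -> "AgentResponse":
        body = json.loads(raw)
        return cls(**{k: body.get(k) for k in cls.__dataclass_fields__})


resp = AgentResponse.from_json('{"success": true, "data": {"detection_count": 5}}')
# resp.success is True; resp.data["detection_count"] == 5
```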

## 🔧 Development

### Setup Development Environment

```bash
# Install development dependencies
pip install -r requirements.txt
pip install pytest pytest-asyncio black flake8 mypy

# Run tests
pytest

# Format code
black .

# Lint code
flake8 .
mypy .
```

### Project Structure Guidelines

- **agents/** - All AI agent implementations inherit from `BaseAgent`
- **models/** - Downloaded model files and weights
- **utils/** - Shared utilities and helper functions
- **server.py** - FastAPI application with all endpoints
- **config.py** - Centralized configuration management
- **cli.py** - Command-line interface for all agents

## 🚀 Production Deployment

### Docker Deployment

```bash
# Build production image
docker build -t visionagent:latest .

# Run with GPU support
docker run --gpus all \
  -p 8000:8000 \
  -v $(pwd)/models:/app/models \
  -v $(pwd)/logs:/app/logs \
  -e VISIONAGENT_LOG_LEVEL=INFO \
  visionagent:latest

# Docker Compose (recommended)
docker-compose up -d
```

### Kubernetes Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: visionagent
spec:
  replicas: 3
  selector:
    matchLabels:
      app: visionagent
  template:
    metadata:
      labels:
        app: visionagent
    spec:
      containers:
      - name: visionagent
        image: visionagent:latest
        ports:
        - containerPort: 8000
        env:
        - name: VISIONAGENT_DEVICE
          value: "cuda"
        resources:
          requests:
            nvidia.com/gpu: 1
          limits:
            nvidia.com/gpu: 1
```

## 📈 Performance Optimization

### GPU Acceleration

- Automatic CUDA detection and device selection
- Batch processing for multiple images
- Memory-efficient model loading

### Scalability Features

- Async FastAPI endpoints
- WebSocket streaming for real-time processing
- Configurable worker processes
- Model caching and lazy loading
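"Model caching and lazy loading" can be sketched as a memoized loader: the first request for a model name pays the load cost, later requests reuse the cached instance. `get_model` and the dict it returns are stand-ins, not the framework's actual loader:

```python
# Lazy, cached model loading sketched with functools.lru_cache.
# LOAD_COUNT just makes the "loaded only once" behavior observable.
from functools import lru_cache

LOAD_COUNT = {"n": 0}


@lru_cache(maxsize=None)
def get_model(name: str) -> dict:
    LOAD_COUNT["n"] += 1              # stands in for an expensive model load
    return {"name": name, "weights": f"{name}.pt"}


get_model("yolov8s")
get_model("yolov8s")                  # cache hit: the loader does not run again
# LOAD_COUNT["n"] == 1
```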

## 🔒 Security Considerations

- File size limits for uploads
- Input validation and sanitization
- Non-root container execution
- Rate limiting support
- CORS configuration

## 🧪 Testing

```bash
# Run all tests
pytest

# Run specific test categories
pytest tests/test_agents.py
pytest tests/test_api.py
pytest tests/test_utils.py

# Run with coverage
pytest --cov=agents --cov=utils --cov-report=html
```

## ๐Ÿ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

## ๐Ÿค Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 🆘 Support

For issues and questions:

- Check the [API documentation](http://localhost:8000/docs)
- Review the [configuration guide](#configuration)
- Check system requirements and GPU setup
- Enable debug logging for detailed error information

## 🎯 Roadmap

- [ ] ONNX model support for cross-platform deployment
- [ ] Advanced video tracking algorithms
- [ ] Real-time face recognition optimization
- [ ] Model quantization for edge deployment
- [ ] Multi-camera support
- [ ] Advanced analytics and reporting
- [ ] Model fine-tuning utilities
- [ ] REST API rate limiting and authentication

            
