nedo-vision-training

| Field | Value |
| --- | --- |
| Name | nedo-vision-training |
| Version | 1.0.0 |
| Summary | A comprehensive training service library for AI models in the Nedo Vision platform |
| Upload time | 2025-08-04 04:14:21 |
| Requires Python | >=3.10 |
| Home page | None |
| Maintainer | None |
| Docs URL | None |
| Author | None |
| License | None |
| Keywords | computer-vision, machine-learning, ai, training, deep-learning, object-detection, neural-networks, pytorch |
| Requirements | No requirements were recorded. |

# Nedo Vision Training Service

A distributed AI model training service for the Nedo Vision platform. It manages training workflows, monitoring, and the model lifecycle for computer vision models built on the RF-DETR architecture.

## Features

- **Configurable Training Service**: Automated training with customizable intervals and parameters
- **gRPC Communication**: Reliable communication with the vision manager and other services
- **Distributed Training**: Support for multi-GPU and distributed training scenarios
- **Real-time Monitoring**: System resource monitoring and training progress tracking
- **Cloud Integration**: AWS S3 integration for model storage and dataset management
- **Message Queue Support**: RabbitMQ integration for task queue management

## Installation

Install the package from PyPI:

```bash
pip install nedo-vision-training
```

For GPU support with CUDA 12.1:

```bash
pip install "nedo-vision-training[gpu]" --extra-index-url https://download.pytorch.org/whl/cu121
```
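
After installing the GPU extra, it can help to confirm that PyTorch actually sees a CUDA device. The following is a minimal sketch, not part of this package: `cuda_summary` is a hypothetical helper name, and it relies only on the standard `torch.cuda.is_available()` and `torch.cuda.device_count()` calls.

```python
import importlib.util


def cuda_summary() -> dict:
    """Report whether PyTorch is importable and whether it can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed at all, so there is nothing more to check.
        return {"torch_installed": False, "cuda_available": False, "device_count": 0}

    import torch  # imported lazily so the check also works without PyTorch

    available = torch.cuda.is_available()
    return {
        "torch_installed": True,
        "cuda_available": available,
        "device_count": torch.cuda.device_count() if available else 0,
    }


if __name__ == "__main__":
    print(cuda_summary())
```

If `cuda_available` is `False` on a machine with an NVIDIA GPU, the installed PyTorch wheel is probably a CPU-only build or does not match the local driver.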

For development with all tools:

```bash
pip install "nedo-vision-training[dev]"
```

## Quick Start

### Using the CLI

After installation, you can use the training service CLI:

```bash
# Show CLI help
nedo-trainer --help

# Start training service with authentication token
nedo-trainer --token YOUR_TOKEN

# Start with custom server configuration
nedo-trainer --token YOUR_TOKEN --server-host custom.server.com --server-port 60000

# Start with custom system usage reporting interval (in seconds)
nedo-trainer --token YOUR_TOKEN --system-usage-interval 60

# Start with custom latency monitoring interval (in seconds)
nedo-trainer --token YOUR_TOKEN --latency-check-interval 15
```

### Configuration Options

The `nedo-trainer` CLI accepts the following options:

- `--token`: Authentication token for secure communication
- `--server-host`: gRPC server host (default: localhost)
- `--server-port`: gRPC server port (default: 50051)
- `--system-usage-interval`: System usage reporting interval in seconds (default: 30)
- `--latency-check-interval`: Latency monitoring interval in seconds (default: 10)
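
To make the flags and defaults above concrete, here is a small `argparse` sketch that mirrors them. It is an illustration only, not the package's actual CLI code: `build_parser` is a hypothetical name, and treating `--token` as required and the intervals as integers are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="nedo-trainer")
    # Assumption: the token is mandatory; the README does not say so explicitly.
    parser.add_argument("--token", required=True, help="Authentication token")
    parser.add_argument("--server-host", default="localhost", help="gRPC server host")
    parser.add_argument("--server-port", type=int, default=50051, help="gRPC server port")
    parser.add_argument("--system-usage-interval", type=int, default=30,
                        help="System usage reporting interval in seconds")
    parser.add_argument("--latency-check-interval", type=int, default=10,
                        help="Latency monitoring interval in seconds")
    return parser


# Only the port is overridden here; every other option falls back to its default.
args = build_parser().parse_args(["--token", "YOUR_TOKEN", "--server-port", "60000"])
print(args.server_host, args.server_port,
      args.system_usage_interval, args.latency_check_interval)
# prints: localhost 60000 30 10
```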

## Architecture

### Core Components

- **TrainingService**: Main service orchestrator for training workflows
- **RFDETRTrainer**: RF-DETR algorithm implementation with PyTorch backend
- **TrainerLogger**: Real-time training progress logging via gRPC
- **ResourceMonitor**: System resource monitoring (GPU, CPU, memory)
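
The interval-based reporting behind a component like ResourceMonitor can be sketched with the standard library alone. This is a hedged illustration, not the package's implementation: `PeriodicReporter` and `sample_usage` are hypothetical names, and the sample is a stand-in (CPU count and a timestamp) for real GPU, CPU, and memory readings.

```python
import os
import threading
import time


def sample_usage() -> dict:
    """Collect a tiny stdlib-only sample (stand-in for real GPU/CPU/memory metrics)."""
    return {"timestamp": time.time(), "cpu_count": os.cpu_count()}


class PeriodicReporter:
    """Call report(sample()) every `interval` seconds until stop() is called."""

    def __init__(self, interval, sample, report):
        self._interval = interval
        self._sample = sample
        self._report = report
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Event.wait returns False on timeout (time to report)
        # and True once stop() has been called (time to exit).
        while not self._stop.wait(self._interval):
            self._report(self._sample())


collected = []
reporter = PeriodicReporter(interval=0.05, sample=sample_usage, report=collected.append)
reporter.start()
time.sleep(0.3)
reporter.stop()
print(f"collected {len(collected)} samples")
```

Waiting on an `Event` instead of calling `time.sleep` lets `stop()` interrupt the loop immediately, which matters when the interval is tens of seconds as in the documented defaults.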

### Dependencies

The service relies on several key technologies:

- **PyTorch**: Deep learning framework with CUDA support
- **RF-DETR**: Roboflow's Real-time Detection Transformer
- **gRPC**: High-performance RPC framework
- **RabbitMQ**: Message queue for distributed task management
- **AWS SDK**: Cloud storage integration
- **NVIDIA ML**: GPU monitoring and management

## Development Setup

## Troubleshooting

### Common Issues

1. **gRPC Connection Timeouts**: Check that `--server-host` and `--server-port` point to a reachable gRPC server
2. **CUDA Out of Memory**: Reduce batch size or use gradient accumulation
3. **Missing Dependencies**: Reinstall with `pip install --upgrade nedo-vision-training`
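
Gradient accumulation, mentioned in item 2, works because the gradient of a mean loss over a full batch equals the average of the gradients over equally sized micro-batches. The sketch below shows this in plain Python for a one-parameter linear model. It is a worked example of the arithmetic only, not PyTorch code and not this package's trainer; whether the service exposes an accumulation setting is not stated here.

```python
def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

# Gradient computed on the full batch of four samples at once.
full_grad = mse_grad(w, xs, ys)

# Same gradient computed two samples at a time, as if memory were limited.
micro_batch_size = 2
num_steps = len(xs) // micro_batch_size
accumulated = 0.0
for i in range(num_steps):
    lo, hi = i * micro_batch_size, (i + 1) * micro_batch_size
    # Scale each micro-batch gradient so the running sum equals the full-batch mean.
    accumulated += mse_grad(w, xs[lo:hi], ys[lo:hi]) / num_steps

print(full_grad, accumulated)
# prints: -22.5 -22.5
```

Only one micro-batch needs to fit in memory at a time, while the optimizer step still sees the gradient of the larger effective batch.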

### Support

For issues and questions:

- Check the logs for detailed error information
- Ensure your token is valid and not expired
- Verify network connectivity to the training manager
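
A quick way to check network connectivity is to try opening a TCP connection to the configured host and port. The sketch below uses only the standard library; `can_connect` is a hypothetical helper, and it confirms TCP reachability only, not that a gRPC service is listening or that the token is accepted.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # localhost:50051 are the documented defaults for --server-host and --server-port.
    print(can_connect("localhost", 50051))
```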

## License

This project is part of the Nedo Vision platform. Please refer to the main project license for usage terms.

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "nedo-vision-training",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "Willy Achmat Fauzi <willy.achmat@gmail.com>",
    "keywords": "computer-vision, machine-learning, ai, training, deep-learning, object-detection, neural-networks, pytorch",
    "author": null,
    "author_email": "Willy Achmat Fauzi <willy.achmat@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/27/e0/d9e0e99c27492a6df12f0269434dce1c98ad26803c2831baf2ac7d86546f/nedo_vision_training-1.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A comprehensive training service library for AI models in the Nedo Vision platform",
    "version": "1.0.0",
    "project_urls": {
        "Bug Reports": "https://gitlab.com/sindika/research/nedo-vision/nedo-vision-training-service/-/issues",
        "Documentation": "https://gitlab.com/sindika/research/nedo-vision/nedo-vision-training-service/-/blob/main/README.md",
        "Homepage": "https://gitlab.com/sindika/research/nedo-vision/nedo-vision-training-service",
        "Repository": "https://gitlab.com/sindika/research/nedo-vision/nedo-vision-training-service"
    },
    "split_keywords": [
        "computer-vision",
        " machine-learning",
        " ai",
        " training",
        " deep-learning",
        " object-detection",
        " neural-networks",
        " pytorch"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a57909e573d98ff2f2e5256631103295a9f2debe33ebd6411f9ae22f651179c8",
                "md5": "03d5ff99fc11939d8007a7c15020da26",
                "sha256": "d62d2008b4480e050ddd9dc68c2a96cdc16af2bd53d962d7707ad784632e708a"
            },
            "downloads": -1,
            "filename": "nedo_vision_training-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "03d5ff99fc11939d8007a7c15020da26",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 62601,
            "upload_time": "2025-08-04T04:14:20",
            "upload_time_iso_8601": "2025-08-04T04:14:20.507090Z",
            "url": "https://files.pythonhosted.org/packages/a5/79/09e573d98ff2f2e5256631103295a9f2debe33ebd6411f9ae22f651179c8/nedo_vision_training-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "27e0d9e0e99c27492a6df12f0269434dce1c98ad26803c2831baf2ac7d86546f",
                "md5": "ba3098b7253ffc9302c273f3cacff411",
                "sha256": "d7d2f6158fb2023aa3bd0b70fd22984972ab1e7cd13f330db4c907a19c087568"
            },
            "downloads": -1,
            "filename": "nedo_vision_training-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "ba3098b7253ffc9302c273f3cacff411",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 45783,
            "upload_time": "2025-08-04T04:14:21",
            "upload_time_iso_8601": "2025-08-04T04:14:21.835729Z",
            "url": "https://files.pythonhosted.org/packages/27/e0/d9e0e99c27492a6df12f0269434dce1c98ad26803c2831baf2ac7d86546f/nedo_vision_training-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-04 04:14:21",
    "github": false,
    "gitlab": true,
    "bitbucket": false,
    "codeberg": false,
    "gitlab_user": "sindika",
    "gitlab_project": "research",
    "lcname": "nedo-vision-training"
}
