ls-ml-toolkit

Name: ls-ml-toolkit
Version: 1.0.2
Home page: https://github.com/bavix/ls-ml-toolkit
Summary: Label Studio ML Toolkit: Convert, Train, Optimize object detection models (CPU only)
Upload time: 2025-10-08 16:59:31
Author: Babichev Maxim
Requires Python: >=3.8
License: MIT
Keywords: label-studio, yolo, object-detection, machine-learning, computer-vision, ml-toolkit, dataset-conversion, model-training, onnx-optimization
Requirements: ultralytics>=8.0.0, onnx>=1.15.0, onnxruntime>=1.16.0, tqdm>=4.65.0, requests>=2.31.0, PyYAML>=6.0.0, boto3>=1.34.0, botocore>=1.34.0, torch>=2.0.0, torchvision>=0.15.0

# LS-ML-Toolkit

[![Python Version](https://img.shields.io/badge/python-3.8%2B-blue.svg)](https://python.org)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![PyPI version](https://badge.fury.io/py/ls-ml-toolkit.svg)](https://badge.fury.io/py/ls-ml-toolkit)

A comprehensive machine learning toolkit for converting Label Studio annotations to YOLO format, training object detection models, and optimizing them for deployment.

## Features

- **Label Studio to YOLO Conversion**: Convert Label Studio JSON exports to YOLO format
- **Image Downloading**: Download images from S3/HTTP sources with progress tracking
- **YOLO Model Training**: Train YOLOv11 models with automatic device detection
- **ONNX Export & Optimization**: Export and optimize models for mobile deployment
- **Cross-Platform GPU Support**: MPS (macOS), CUDA (NVIDIA), ROCm (AMD)
- **Centralized Configuration**: YAML-based configuration with environment variable support
- **Automatic .env Loading**: Seamless integration with .env files for sensitive credentials
- **Environment Variable Substitution**: Support for `${VAR_NAME}` and `${VAR_NAME:-default}` syntax in YAML
- **Flexible Import System**: Works both as a Python module and as standalone scripts
- **Secure Configuration**: Sensitive data in .env, regular settings in YAML
- **Modern CLI Interface**: Beautiful terminal output with progress indicators and status displays
- **Smart NMS Configuration**: Optimized Non-Maximum Suppression settings to reduce warnings
- **Automatic Training Directory Detection**: Finds the latest YOLO training output automatically

## Quick Start

### Installation

```bash
# Install package (includes GPU support for all platforms)
pip install ls-ml-toolkit

# PyTorch automatically detects and uses:
# - macOS: Metal Performance Shaders (MPS)
# - Linux: CUDA/ROCm (if available)
# - Windows: CUDA (if available)
```
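
If you want to confirm which accelerator PyTorch will pick up before training, a quick toolkit-independent check looks like this:

```python
import torch

# Quick check of the accelerators PyTorch can see on this machine.
if torch.cuda.is_available():
    print("CUDA/ROCm device:", torch.cuda.get_device_name(0))
elif getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
    print("Apple MPS backend is available")
else:
    print("No GPU backend detected; training will run on CPU")
```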

### Basic Usage

```bash
# 1. Create .env file with your S3 credentials
cp env.example .env
# Edit .env with your AWS credentials

# 2. Train a model from Label Studio dataset
lsml-train dataset/v0.json --epochs 50 --batch 8 --device auto

# 3. Optimize an ONNX model
lsml-optimize model.onnx

# PyTorch automatically detects your platform and GPU
# All configuration is loaded automatically from .env and ls-ml-toolkit.yaml
```

### Python API

```python
from ls_ml_toolkit import LabelStudioToYOLOConverter, YOLOTrainer

# Convert dataset
converter = LabelStudioToYOLOConverter('dataset_name', 'path/to/labelstudio.json')
converter.process_dataset()

# Train model
trainer = YOLOTrainer('path/to/dataset')
trainer.train_model(epochs=50, device='auto')
```

## Configuration

### Environment Variables (.env)

Create a `.env` file with your sensitive credentials only:

```bash
# S3 Credentials (Sensitive Data)
LS_ML_S3_ACCESS_KEY_ID=your_access_key
LS_ML_S3_SECRET_ACCESS_KEY=your_secret_key

# Optional: Environment-specific settings
LS_ML_S3_DEFAULT_REGION=us-east-1
LS_ML_S3_ENDPOINT=https://custom-s3.example.com
```

**Important**: 
- Only use `.env` for **sensitive data** (API keys, passwords, tokens)
- All other configuration should be in `ls-ml-toolkit.yaml`
- Copy `env.example` to `.env` and configure your credentials
- The toolkit automatically loads these variables and makes them available throughout the application (a minimal stand-in for this loading step is sketched below)
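
For reference, a minimal stand-in for that loading step might look like the sketch below. The toolkit ships its own `env_loader.py`, which may handle quoting, exports, and precedence differently, so treat this as illustrative only:

```python
import os
from pathlib import Path

def load_dotenv(path: str = ".env") -> None:
    """Minimal .env reader: KEY=value lines; blank lines and '#' comments are skipped."""
    env_file = Path(path)
    if not env_file.exists():
        return
    for line in env_file.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Existing environment variables take precedence over .env values.
        os.environ.setdefault(key.strip(), value.strip())
```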

### YAML Configuration (ls-ml-toolkit.yaml)

All regular settings are configured in `ls-ml-toolkit.yaml`. Environment variables are used only for sensitive data:

```yaml
# Dataset Configuration
dataset:
  base_dir: "dataset"
  train_split: 0.8
  val_split: 0.2

# Training Configuration
training:
  epochs: 50
  batch_size: 8
  image_size: 640
  device: "auto"
  
  # NMS (Non-Maximum Suppression) settings
  nms:
    iou_threshold: 0.7  # IoU threshold for NMS (0.0-1.0) - higher = fewer detections
    conf_threshold: 0.25  # Confidence threshold for predictions (0.0-1.0) - higher = fewer detections
    max_det: 300  # Maximum number of detections per image - lower = faster processing

# Model Export Configuration
export:
  model_path: "shared/models/layout_yolo_universal.onnx"
  optimized_model_path: "shared/models/layout_yolo_universal_optimized.onnx"  # Optional
  optimize: true
  optimization_level: "all"

# S3 Configuration (uses .env for sensitive data)
s3:
  access_key_id: "${LS_ML_S3_ACCESS_KEY_ID}"  # From .env file
  secret_access_key: "${LS_ML_S3_SECRET_ACCESS_KEY}"  # From .env file
  region: "${LS_ML_S3_DEFAULT_REGION:-us-east-1}"  # From .env file with default
  endpoint: "${LS_ML_S3_ENDPOINT:-}"  # From .env file (optional)

# Platform-specific settings
platform:
  auto_detect_gpu: true
  force_device: null
  macos:
    device: "mps"
    batch_size: 16
  linux:
    device: "auto"  # PyTorch will auto-detect GPU
    batch_size: 16
```
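
To confirm the file is picked up and the `${...}` placeholders resolve, you can load it from Python using the same `ConfigLoader` call that appears in the troubleshooting section below:

```python
from ls_ml_toolkit.config_loader import ConfigLoader

# Loads ls-ml-toolkit.yaml and resolves ${...} placeholders from the environment / .env.
config = ConfigLoader()
print(config.get_s3_config())  # credentials should come from .env, not from the YAML text
```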

## Platform Support

### macOS
- **MPS Support**: Automatic Metal Performance Shaders detection
- **Installation**: `pip install ls-ml-toolkit`

### Linux
- **CUDA Support**: Automatic NVIDIA GPU detection and configuration
- **ROCm Support**: Automatic AMD GPU detection
- **Installation**: `pip install ls-ml-toolkit`
- **Requirements**: NVIDIA drivers + CUDA toolkit OR ROCm drivers

### Windows
- **CUDA Support**: Automatic NVIDIA GPU detection
- **Installation**: `pip install ls-ml-toolkit`
- **Requirements**: NVIDIA drivers + CUDA toolkit

## Development

### Setup Development Environment

```bash
git clone https://github.com/bavix/ls-ml-toolkit.git
cd ls-ml-toolkit
pip install -e .
pip install -r requirements-dev.txt
```

### Running Tests

```bash
pytest tests/
```

### Building Packages

```bash
# Build package
python -m build

# Install in development mode
pip install -e .
```

## Command Line Tools

- **`lsml-train`**: Train YOLO models from Label Studio datasets
- **`lsml-optimize`**: Optimize ONNX models for deployment

### CLI Features

- **Beautiful Interface**: Modern terminal UI with colors, icons, and progress indicators
- **Status Tracking**: Real-time progress updates during training and optimization
- **Configuration Display**: Shows current settings in a formatted table
- **File Tree Display**: Visual representation of training results and file structure
- **Error Handling**: Clear error messages and troubleshooting guidance

## Examples

### Training with Custom Configuration

```bash
# Method 1: Use .env file (recommended for secrets)
echo "LS_ML_S3_ACCESS_KEY_ID=your_key" >> .env
echo "LS_ML_S3_SECRET_ACCESS_KEY=your_secret" >> .env

# Method 2: Use environment variables
export LS_ML_S3_ACCESS_KEY_ID="your_key"
export LS_ML_S3_SECRET_ACCESS_KEY="your_secret"

# Train with custom settings
lsml-train dataset/v0.json \
  --epochs 100 \
  --batch 16 \
  --device mps \
  --imgsz 640 \
  --optimize \
  --force-download
```

### Using Configuration File

```bash
# Use custom YAML configuration
lsml-train dataset/v0.json --config custom-config.yaml

# Override specific settings via command line
lsml-train dataset/v0.json --epochs 100 --batch 16 --device mps
```

### Advanced Usage Examples

```bash
# Force re-download of existing images
lsml-train dataset/v0.json --force-download

# Train with custom NMS settings (via YAML config)
# Edit ls-ml-toolkit.yaml:
# training:
#   nms:
#     iou_threshold: 0.8
#     conf_threshold: 0.3
#     max_det: 200

# Optimize existing ONNX model
lsml-optimize model.onnx --level extended

# Use custom output path for optimization
lsml-optimize model.onnx --output optimized_model.onnx
```
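
As a toolkit-independent sanity check, ONNX Runtime itself can write an optimized graph to disk. This is only a rough analogue of what `lsml-optimize` does; it uses standard `onnxruntime` APIs, not the toolkit's `optimize_onnx.py`:

```python
import onnxruntime as ort

# Creating a session with optimized_model_filepath set writes the optimized graph to disk.
sess_options = ort.SessionOptions()
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
sess_options.optimized_model_filepath = "model_optimized.onnx"
ort.InferenceSession("model.onnx", sess_options, providers=["CPUExecutionProvider"])
```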

### Quick Setup Guide

```bash
# 1. Clone and install
git clone https://github.com/bavix/ls-ml-toolkit.git
cd ls-ml-toolkit
pip install -e .

# 2. Setup credentials
cp env.example .env
# Edit .env with your AWS credentials

# 3. Train your model
lsml-train your_dataset.json --epochs 50 --batch 8
```

### Environment Variable Substitution

The YAML configuration supports environment variable substitution **only for sensitive data**:

```yaml
# S3 Configuration (uses .env variables)
s3:
  access_key_id: "${LS_ML_S3_ACCESS_KEY_ID}"  # From .env file
  secret_access_key: "${LS_ML_S3_SECRET_ACCESS_KEY}"  # From .env file
  region: "${LS_ML_S3_DEFAULT_REGION:-us-east-1}"  # From .env with default
  endpoint: "${LS_ML_S3_ENDPOINT:-}"  # From .env (optional)

# Regular configuration (no env vars needed)
training:
  epochs: 50
  batch_size: 8
  image_size: 640
```
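
A minimal sketch of how `${VAR}` / `${VAR:-default}` substitution can be implemented is shown below; the toolkit's `config_loader.py` may differ in detail, so treat it as illustrative only:

```python
import os
import re

_ENV_PATTERN = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}")

def substitute_env_vars(value: str) -> str:
    """Replace ${VAR_NAME} and ${VAR_NAME:-default} with values from the environment."""
    def _resolve(match: re.Match) -> str:
        default = match.group("default")
        return os.environ.get(match.group("name"), default if default is not None else "")
    return _ENV_PATTERN.sub(_resolve, value)

# Resolves to LS_ML_S3_DEFAULT_REGION if set, otherwise "us-east-1".
print(substitute_env_vars("${LS_ML_S3_DEFAULT_REGION:-us-east-1}"))
```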

**Naming Convention**: `LS_ML_<CATEGORY>_<SETTING>`
- `LS_ML_S3_ACCESS_KEY_ID` - S3 credentials
- `LS_ML_S3_SECRET_ACCESS_KEY` - S3 credentials  
- `LS_ML_S3_DEFAULT_REGION` - S3 configuration
- `LS_ML_S3_ENDPOINT` - S3 endpoint

## Configuration Best Practices

### ✅ Use .env for:
- **API Keys & Secrets**: `LS_ML_S3_ACCESS_KEY_ID`, `LS_ML_S3_SECRET_ACCESS_KEY`
- **Environment-specific settings**: `LS_ML_S3_DEFAULT_REGION`, `LS_ML_S3_ENDPOINT`
- **Values that change between deployments**

### ✅ Use YAML for:
- **Regular configuration**: epochs, batch_size, image_size
- **Default values**: model paths, directory structures
- **Platform settings**: device detection, optimization levels
- **All non-sensitive settings**

### 🔒 Security:
- Never commit `.env` files to version control
- Use `env.example` as a template
- Keep sensitive data separate from code

## Model Export Configuration

### Model Paths
- **`model_path`**: Path for the regular ONNX export (required)
- **`optimized_model_path`**: Path for the optimized ONNX model (optional)

### Fallback Behavior
If `optimized_model_path` is not specified in the configuration, a name derived from the source model is used (see the sketch after this list):
- **Training script**: Uses `{model_path}_optimized.onnx` as fallback
- **Optimization script**: Uses `{input_model}_optimized.onnx` as fallback
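
As an illustration only (the scripts' exact naming may differ), deriving an `_optimized` sibling path for the example below could look like this:

```python
from pathlib import Path

def default_optimized_path(model_path: str) -> str:
    """Illustrative fallback: insert '_optimized' before the .onnx suffix."""
    p = Path(model_path)
    return str(p.with_name(p.stem + "_optimized" + p.suffix))

print(default_optimized_path("models/my_model.onnx"))  # models/my_model_optimized.onnx
```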

### Examples
```yaml
export:
  model_path: "models/my_model.onnx"
  optimized_model_path: "models/my_model_optimized.onnx"  # Optional
  optimize: true
  optimization_level: "all"
```

## File Structure

```
ls-ml-toolkit/
├── src/
│   └── ls_ml_toolkit/         # Main package source
│       ├── __init__.py
│       ├── train.py            # Main training script
│       ├── config_loader.py    # Configuration management with .env support
│       ├── env_loader.py       # Environment variable loader
│       ├── optimize_onnx.py    # ONNX optimization
│       └── ui.py               # CLI UI components
├── tests/                      # Test files
├── requirements.txt            # Dependencies
├── pyproject.toml             # Package configuration
├── setup.py                   # Setup script
├── ls-ml-toolkit.yaml         # Main configuration with env var substitution
├── env.example                # Environment template
├── .env                       # Your environment variables (create from env.example)
└── README.md                  # This file
```

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Submit a pull request

## Troubleshooting

### NMS Time Limit Warnings

If you see `WARNING ⚠️ NMS time limit 2.800s exceeded`:

**What it means:**
- NMS (Non-Maximum Suppression) operation is taking too long
- This can slow down validation and inference
- Usually happens with many objects or suboptimal settings

**How to fix:**
1. **Optimize NMS settings** in `ls-ml-toolkit.yaml`:
   ```yaml
   training:
     nms:
       iou_threshold: 0.8    # Higher = fewer detections (0.7-0.9)
       conf_threshold: 0.3   # Higher = fewer detections (0.25-0.5)
       max_det: 200          # Lower = fewer detections (100-300)
   ```

2. **Reduce batch size** if memory allows:
   ```yaml
   training:
     batch_size: 4  # Reduce from 8 to 4
   ```

3. **Tune incrementally**: Adjust `iou_threshold`, `conf_threshold`, and `max_det` one at a time and re-run validation to find the best speed/accuracy trade-off

### Environment Variables Not Loading

If your `.env` file is not being loaded:

1. **Check file location**: Ensure `.env` is in the project root directory
2. **Verify file format**: Use `KEY=value` format (no spaces around `=`)
3. **Check permissions**: Ensure the file is readable
4. **Copy from template**: Use `cp env.example .env` as a starting point
5. **Check naming**: Use exact variable names like `LS_ML_S3_ACCESS_KEY_ID`

### YAML Variable Substitution Issues

If environment variables are not substituted in YAML:

1. **Check variable names**: Use exact names like `LS_ML_S3_ACCESS_KEY_ID`
2. **Verify syntax**: Use `${VAR_NAME}` or `${VAR_NAME:-default}` format
3. **Test loading**: Run `python -c "from ls_ml_toolkit.config_loader import ConfigLoader; print(ConfigLoader().get_s3_config())"`
4. **Remember**: Only use env vars for sensitive data, not regular config

### Import Errors

If you get import errors when running scripts:

1. **Install in development mode**: `pip install -e .`
2. **Check Python path**: Ensure the package is in your Python path
3. **Use absolute imports**: The toolkit supports both relative and absolute imports

### Training Directory Issues

If the script can't find the latest training directory:

1. **Check YOLO output**: Ensure `runs/detect/` directory exists
2. **Verify permissions**: Make sure the script can read the directory
3. **Automatic detection**: No manual path is needed; the script finds the latest `train*` directory on its own (see the sketch below)
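
A rough, toolkit-independent version of that lookup (not the actual implementation) is:

```python
from pathlib import Path

# Pick the most recently modified runs/detect/train* directory, if any exist.
run_dirs = [p for p in Path("runs/detect").glob("train*") if p.is_dir()]
latest = max(run_dirs, key=lambda p: p.stat().st_mtime, default=None)
print(latest or "No training runs found under runs/detect/")
```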

### ONNX Optimization Issues

If ONNX optimization fails:

1. **Install dependencies**: `pip install onnx onnxruntime`
2. **Check model format**: Ensure input is a valid ONNX model
3. **Use fallback**: The script will use default naming if config path is missing

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

            
