# LS-ML-Toolkit
[Python 3.8+](https://python.org) · [MIT License](https://opensource.org/licenses/MIT) · [PyPI](https://badge.fury.io/py/ls-ml-toolkit)
A comprehensive machine learning toolkit for converting Label Studio annotations, training object detection models, and optimizing for deployment.
## Features
- **Label Studio to YOLO Conversion**: Convert Label Studio JSON exports to YOLO format
- **Image Downloading**: Download images from S3/HTTP sources with progress tracking
- **YOLO Model Training**: Train YOLOv11 models with automatic device detection
- **ONNX Export & Optimization**: Export and optimize models for mobile deployment
- **Cross-Platform GPU Support**: MPS (macOS), CUDA (NVIDIA), ROCm (AMD)
- **Centralized Configuration**: YAML-based configuration with environment variable support
- **Automatic .env Loading**: Seamless integration with .env files for sensitive credentials
- **Environment Variable Substitution**: Support for `${VAR_NAME}` and `${VAR_NAME:-default}` syntax in YAML
- **Flexible Import System**: Works both as a Python module and as standalone scripts
- **Secure Configuration**: Sensitive data in .env, regular settings in YAML
- **Modern CLI Interface**: Beautiful terminal output with progress indicators and status displays
- **Smart NMS Configuration**: Optimized Non-Maximum Suppression settings to reduce warnings
- **Automatic Training Directory Detection**: Finds the latest YOLO training output automatically
## Quick Start
### Installation
```bash
# Install package (includes GPU support for all platforms)
pip install ls-ml-toolkit
# PyTorch automatically detects and uses:
# - macOS: Metal Performance Shaders (MPS)
# - Linux: CUDA/ROCm (if available)
# - Windows: CUDA (if available)
```
### Basic Usage
```bash
# 1. Create .env file with your S3 credentials
cp env.example .env
# Edit .env with your AWS credentials
# 2. Train a model from a Label Studio dataset
lsml-train dataset/v0.json --epochs 50 --batch 8 --device auto
# 3. Optimize an ONNX model
lsml-optimize model.onnx
# PyTorch automatically detects your platform and GPU
# All configuration is loaded automatically from .env and ls-ml-toolkit.yaml
```
### Python API
```python
from ls_ml_toolkit import LabelStudioToYOLOConverter, YOLOTrainer
# Convert dataset
converter = LabelStudioToYOLOConverter('dataset_name', 'path/to/labelstudio.json')
converter.process_dataset()
# Train model
trainer = YOLOTrainer('path/to/dataset')
trainer.train_model(epochs=50, device='auto')
```
## Configuration
### Environment Variables (.env)
Create a `.env` file with your sensitive credentials only:
```bash
# S3 Credentials (Sensitive Data)
LS_ML_S3_ACCESS_KEY_ID=your_access_key
LS_ML_S3_SECRET_ACCESS_KEY=your_secret_key
# Optional: Environment-specific settings
LS_ML_S3_DEFAULT_REGION=us-east-1
LS_ML_S3_ENDPOINT=https://custom-s3.example.com
```
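Parsing such a file boils down to reading `KEY=value` lines and exporting them. The following is a minimal standard-library sketch of that idea — an illustrative stand-in, not the toolkit's actual `env_loader` code:

```python
import os

def load_dotenv(path: str = ".env") -> None:
    """Parse simple KEY=value lines and export them to the environment.

    Illustrative sketch only; skips blank lines and '#' comments, and
    lets variables already set in the shell take precedence.
    """
    try:
        with open(path) as fh:
            lines = fh.readlines()
    except FileNotFoundError:
        return  # no .env file is fine; use the shell environment as-is
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault: an existing shell variable wins over the .env value
        os.environ.setdefault(key.strip(), value.strip())
```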
**Important**:
- Only use `.env` for **sensitive data** (API keys, passwords, tokens)
- All other configuration should be in `ls-ml-toolkit.yaml`
- Copy `env.example` to `.env` and configure your credentials
- The toolkit automatically loads these variables and makes them available throughout the application
### YAML Configuration (ls-ml-toolkit.yaml)
All regular settings are configured in `ls-ml-toolkit.yaml`. Environment variables are used only for sensitive data:
```yaml
# Dataset Configuration
dataset:
  base_dir: "dataset"
  train_split: 0.8
  val_split: 0.2

# Training Configuration
training:
  epochs: 50
  batch_size: 8
  image_size: 640
  device: "auto"

  # NMS (Non-Maximum Suppression) settings
  nms:
    iou_threshold: 0.7    # IoU overlap threshold for NMS (0.0-1.0)
    conf_threshold: 0.25  # Confidence threshold for predictions (0.0-1.0); higher = fewer detections
    max_det: 300          # Maximum detections per image; lower = faster processing

# Model Export Configuration
export:
  model_path: "shared/models/layout_yolo_universal.onnx"
  optimized_model_path: "shared/models/layout_yolo_universal_optimized.onnx"  # Optional
  optimize: true
  optimization_level: "all"

# S3 Configuration (uses .env for sensitive data)
s3:
  access_key_id: "${LS_ML_S3_ACCESS_KEY_ID}"          # From .env file
  secret_access_key: "${LS_ML_S3_SECRET_ACCESS_KEY}"  # From .env file
  region: "${LS_ML_S3_DEFAULT_REGION:-us-east-1}"     # From .env file, with default
  endpoint: "${LS_ML_S3_ENDPOINT:-}"                  # From .env file (optional)

# Platform-specific settings
platform:
  auto_detect_gpu: true
  force_device: null
  macos:
    device: "mps"
    batch_size: 16
  linux:
    device: "auto"  # PyTorch will auto-detect GPU
    batch_size: 16
```
## Platform Support
### macOS
- **MPS Support**: Automatic Metal Performance Shaders detection
- **Installation**: `pip install ls-ml-toolkit`
### Linux
- **CUDA Support**: Automatic NVIDIA GPU detection and configuration
- **ROCm Support**: Automatic AMD GPU detection
- **Installation**: `pip install ls-ml-toolkit`
- **Requirements**: NVIDIA drivers + CUDA toolkit OR ROCm drivers
### Windows
- **CUDA Support**: Automatic NVIDIA GPU detection
- **Installation**: `pip install ls-ml-toolkit`
- **Requirements**: NVIDIA drivers + CUDA toolkit
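The detection order above can be sketched as a single helper built on the public `torch` capability checks. This is a hedged approximation of the behavior described in this section, not the toolkit's own detection code; it falls back to `"cpu"` when PyTorch is not installed:

```python
def pick_device() -> str:
    """Return the best available compute device name.

    Illustrative sketch of the platform detection described above.
    """
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed; train on CPU
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA CUDA (also true on ROCm builds of PyTorch)
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"  # Apple Metal Performance Shaders
    return "cpu"
```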
## Development
### Setup Development Environment
```bash
git clone https://github.com/bavix/ls-ml-toolkit.git
cd ls-ml-toolkit
pip install -e .
pip install -r requirements-dev.txt
```
### Running Tests
```bash
pytest tests/
```
### Building Packages
```bash
# Build package
python -m build
# Install in development mode
pip install -e .
```
## Command Line Tools
- **`lsml-train`**: Train YOLO models from Label Studio datasets
- **`lsml-optimize`**: Optimize ONNX models for deployment
### CLI Features
- **Beautiful Interface**: Modern terminal UI with colors, icons, and progress indicators
- **Status Tracking**: Real-time progress updates during training and optimization
- **Configuration Display**: Shows current settings in a formatted table
- **File Tree Display**: Visual representation of training results and file structure
- **Error Handling**: Clear error messages and troubleshooting guidance
## Examples
### Training with Custom Configuration
```bash
# Method 1: Use .env file (recommended for secrets)
echo "LS_ML_S3_ACCESS_KEY_ID=your_key" >> .env
echo "LS_ML_S3_SECRET_ACCESS_KEY=your_secret" >> .env
# Method 2: Use environment variables
export LS_ML_S3_ACCESS_KEY_ID="your_key"
export LS_ML_S3_SECRET_ACCESS_KEY="your_secret"
# Train with custom settings
lsml-train dataset/v0.json \
  --epochs 100 \
  --batch 16 \
  --device mps \
  --imgsz 640 \
  --optimize \
  --force-download
```
### Using Configuration File
```bash
# Use custom YAML configuration
lsml-train dataset/v0.json --config custom-config.yaml
# Override specific settings via command line
lsml-train dataset/v0.json --epochs 100 --batch 16 --device mps
```
### Advanced Usage Examples
```bash
# Force re-download of existing images
lsml-train dataset/v0.json --force-download
# Train with custom NMS settings (via YAML config)
# Edit ls-ml-toolkit.yaml:
# training:
#   nms:
#     iou_threshold: 0.8
#     conf_threshold: 0.3
#     max_det: 200
# Optimize existing ONNX model
lsml-optimize model.onnx --level extended
# Use custom output path for optimization
lsml-optimize model.onnx --output optimized_model.onnx
```
### Quick Setup Guide
```bash
# 1. Clone and install
git clone https://github.com/bavix/ls-ml-toolkit.git
cd ls-ml-toolkit
pip install -e .
# 2. Setup credentials
cp env.example .env
# Edit .env with your AWS credentials
# 3. Train your model
lsml-train your_dataset.json --epochs 50 --batch 8
```
### Environment Variable Substitution
The YAML configuration supports environment variable substitution **only for sensitive data**:
```yaml
# S3 Configuration (uses .env variables)
s3:
  access_key_id: "${LS_ML_S3_ACCESS_KEY_ID}"          # From .env file
  secret_access_key: "${LS_ML_S3_SECRET_ACCESS_KEY}"  # From .env file
  region: "${LS_ML_S3_DEFAULT_REGION:-us-east-1}"     # From .env, with default
  endpoint: "${LS_ML_S3_ENDPOINT:-}"                  # From .env (optional)

# Regular configuration (no env vars needed)
training:
  epochs: 50
  batch_size: 8
  image_size: 640
```
**Naming Convention**: `LS_ML_<CATEGORY>_<SETTING>`
- `LS_ML_S3_ACCESS_KEY_ID` - S3 credentials
- `LS_ML_S3_SECRET_ACCESS_KEY` - S3 credentials
- `LS_ML_S3_DEFAULT_REGION` - S3 configuration
- `LS_ML_S3_ENDPOINT` - S3 endpoint
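The `${VAR}` / `${VAR:-default}` substitution can be reproduced with a small regex pass over loaded string values. This is an illustrative sketch of the mechanism, not the toolkit's `config_loader` implementation:

```python
import os
import re

# Matches ${VAR} and ${VAR:-default}
_VAR_RE = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}")

def substitute_env(value: str) -> str:
    """Replace ${VAR} / ${VAR:-default} occurrences with environment values."""
    def repl(m: re.Match) -> str:
        # Fall back to the inline default ('' when none was given)
        return os.environ.get(m.group("name"), m.group("default") or "")
    return _VAR_RE.sub(repl, value)
```

With `LS_ML_S3_DEFAULT_REGION` unset, `substitute_env("${LS_ML_S3_DEFAULT_REGION:-us-east-1}")` yields `us-east-1`.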
## Configuration Best Practices
### ✅ Use .env for:
- **API Keys & Secrets**: `LS_ML_S3_ACCESS_KEY_ID`, `LS_ML_S3_SECRET_ACCESS_KEY`
- **Environment-specific settings**: `LS_ML_S3_DEFAULT_REGION`, `LS_ML_S3_ENDPOINT`
- **Values that change between deployments**
### ✅ Use YAML for:
- **Regular configuration**: epochs, batch_size, image_size
- **Default values**: model paths, directory structures
- **Platform settings**: device detection, optimization levels
## Model Export Configuration
### Model Paths
- **`model_path`**: Path for the regular ONNX export (required)
- **`optimized_model_path`**: Path for the optimized ONNX model (optional)
### Fallback Behavior
If `optimized_model_path` is not specified in the configuration:
- **Training script**: Uses `{model_path}_optimized.onnx` as fallback
- **Optimization script**: Uses `{input_model}_optimized.onnx` as fallback
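Assuming the `_optimized` suffix is inserted before the `.onnx` extension, the fallback path can be derived with `pathlib`. This is a sketch of one plausible reading of the naming rule above, not the toolkit's exact code:

```python
from pathlib import Path

def fallback_optimized_path(model_path: str) -> str:
    """Derive '<stem>_optimized.onnx' next to the input model (illustrative)."""
    p = Path(model_path)
    return str(p.with_name(p.stem + "_optimized" + p.suffix))
```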
### Examples
```yaml
export:
  model_path: "models/my_model.onnx"
  optimized_model_path: "models/my_model_optimized.onnx"  # Optional
  optimize: true
  optimization_level: "all"
```
### 🔒 Security:
- Never commit `.env` files to version control
- Use `env.example` as a template
- Keep sensitive data separate from code
## File Structure
```
ls-ml-toolkit/
├── src/
│   └── ls_ml_toolkit/       # Main package source
│       ├── __init__.py
│       ├── train.py         # Main training script
│       ├── config_loader.py # Configuration management with .env support
│       ├── env_loader.py    # Environment variable loader
│       ├── optimize_onnx.py # ONNX optimization
│       └── ui.py            # CLI UI components
├── tests/                   # Test files
├── requirements.txt         # Dependencies
├── pyproject.toml           # Package configuration
├── setup.py                 # Setup script
├── ls-ml-toolkit.yaml       # Main configuration with env var substitution
├── env.example              # Environment template
├── .env                     # Your environment variables (create from env.example)
└── README.md                # This file
```
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Submit a pull request
## Troubleshooting
### NMS Time Limit Warnings
If you see `WARNING ⚠️ NMS time limit 2.800s exceeded`:
**What it means:**
- NMS (Non-Maximum Suppression) operation is taking too long
- This can slow down validation and inference
- Usually happens with many objects or suboptimal settings
**How to fix:**
1. **Optimize NMS settings** in `ls-ml-toolkit.yaml`:
```yaml
training:
  nms:
    iou_threshold: 0.8   # NMS overlap threshold (0.7-0.9)
    conf_threshold: 0.3  # Higher = fewer candidate boxes (0.25-0.5)
    max_det: 200         # Cap on detections per image (100-300)
```
2. **Reduce batch size** if memory allows:
```yaml
training:
  batch_size: 4  # Reduce from 8 to 4
```
3. **Optimize other parameters**: Focus on `iou_threshold`, `conf_threshold`, and `max_det` for better performance
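To see what `iou_threshold` actually compares, here is a minimal IoU computation for two axis-aligned boxes in `(x1, y1, x2, y2)` form. This is illustrative only; Ultralytics' NMS is vectorized and considerably more involved:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# During NMS, a box whose IoU with an already-kept, higher-confidence box
# exceeds iou_threshold is suppressed.
```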
### Environment Variables Not Loading
If your `.env` file is not being loaded:
1. **Check file location**: Ensure `.env` is in the project root directory
2. **Verify file format**: Use `KEY=value` format (no spaces around `=`)
3. **Check permissions**: Ensure the file is readable
4. **Copy from template**: Use `cp env.example .env` as a starting point
5. **Check naming**: Use exact variable names like `LS_ML_S3_ACCESS_KEY_ID`
### YAML Variable Substitution Issues
If environment variables are not substituted in YAML:
1. **Check variable names**: Use exact names like `LS_ML_S3_ACCESS_KEY_ID`
2. **Verify syntax**: Use `${VAR_NAME}` or `${VAR_NAME:-default}` format
3. **Test loading**: Run `python -c "from ls_ml_toolkit.config_loader import ConfigLoader; print(ConfigLoader().get_s3_config())"`
4. **Remember**: Only use env vars for sensitive data, not regular config
### Import Errors
If you get import errors when running scripts:
1. **Install in development mode**: `pip install -e .`
2. **Check Python path**: Ensure the package is in your Python path
3. **Use absolute imports**: The toolkit supports both relative and absolute imports
### Training Directory Issues
If the script can't find the latest training directory:
1. **Check YOLO output**: Ensure `runs/detect/` directory exists
2. **Verify permissions**: Make sure the script can read the directory
3. **Automatic detection**: The script picks the latest `train*` directory on its own; no manual path is needed
### ONNX Optimization Issues
If ONNX optimization fails:
1. **Install dependencies**: `pip install onnx onnxruntime`
2. **Check model format**: Ensure input is a valid ONNX model
3. **Use fallback**: The script will use default naming if config path is missing
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.