# OpenVINO-Easy 🚀
**Framework-agnostic Python wrapper for OpenVINO 2025**
Load and run AI models with three functions:
```python
import oe
oe.load("runwayml/stable-diffusion-v1-5") # auto-download & convert
img = oe.infer("a neon cyber-city at night") # chooses NPU>GPU>CPU
stats = oe.benchmark() # JSON perf report
```
## 🎯 Installation
**Pick the variant that matches your hardware:**
```bash
# CPU-only (40MB wheel, fastest install)
pip install "openvino-easy[cpu]"
# or
pip install "openvino-easy[runtime]"
# Intel® Arc/Xe GPU support
pip install "openvino-easy[gpu]"
# Intel® NPU support (Arrow Lake/Lunar Lake with FP16-NF4)
pip install "openvino-easy[npu]"
# With INT8 quantization support
pip install "openvino-easy[quant]"
# Audio model support (Whisper, TTS)
pip install "openvino-easy[audio]"
# Full development environment (OpenVINO, NNCF, optimum ~1GB)
pip install "openvino-easy[full]"
# Everything (for development)
pip install "openvino-easy[all]"
```
### 🩺 Installation Troubleshooting
**Something not working?** Run the doctor:
```bash
# Comprehensive diagnostics
oe doctor
# Get fix suggestions for specific hardware
oe doctor --fix gpu
oe doctor --fix npu
# JSON output for CI systems (parsed in the sketch below)
oe doctor --json
# Check device status
oe devices
```
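For CI, the `--json` report can be consumed programmatically. A minimal sketch — the report schema isn't documented here, so this only runs the command and pretty-prints whatever comes back:
```python
import json
import subprocess

# check=True fails the CI step if diagnostics exit non-zero.
proc = subprocess.run(
    ["oe", "doctor", "--json"],
    capture_output=True, text=True, check=True,
)
report = json.loads(proc.stdout)
print(json.dumps(report, indent=2))
```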
**Common issues:**
| Problem | Solution |
|---------|----------|
| `ImportError: OpenVINO runtime not found` | Install with hardware extras: `pip install "openvino-easy[cpu]"` |
| NPU detected but not functional | Install Intel NPU drivers from intel.com |
| GPU detected but not functional | Install Intel GPU drivers (`intel-opencl-icd` on Linux) |
| `NNCF not available` for INT8 quantization | Install quantization support: `pip install "openvino-easy[quant]"` |
| FP16-NF4 not supported | Requires Arrow Lake/Lunar Lake NPU with OpenVINO 2025.2+ |
| Version warnings | Upgrade OpenVINO: `pip install --upgrade "openvino>=2025.2,<2026.0"` |
| **PyTorch model (.pt/.pth) not loading** | **Convert to ONNX first** (sketched below): `torch.onnx.export(model, dummy_input, "model.onnx")`, then `oe.load("model.onnx")` |
| **"Native PyTorch model conversion failed"** | **Upload to Hugging Face Hub** with config.json or **use ONNX format** for best compatibility |
### 📦 What Each Variant Includes
| Variant | OpenVINO Package | Size | Best For |
|---------|------------------|------|----------|
| `[cpu]` / `[runtime]` | `openvino` runtime | ~40MB | Production deployments, CPU-only inference |
| `[gpu]` | `openvino` runtime | ~40MB | Intel GPU acceleration |
| `[npu]` | `openvino` runtime | ~40MB | Intel NPU acceleration |
| `[quant]` | `openvino` + NNCF | ~440MB | INT8 quantization support |
| `[audio]` | `openvino` + librosa | ~100MB | Audio models (Whisper, TTS) |
| `[full]` | `openvino` + NNCF + optimum | ~1GB | Development, model optimization, research |
## ⚡ Quick Start
### Basic Usage
```python
import oe
# Load any model (Hugging Face, ONNX, or OpenVINO IR)
oe.load("microsoft/DialoGPT-medium")
# Run inference (automatic tokenization for text models)
response = oe.infer("Hello, how are you?")
print(response) # "I'm doing well, thank you for asking!"
# Benchmark performance
stats = oe.benchmark()
print(f"Average latency: {stats['avg_latency_ms']:.2f}ms")
print(f"Throughput: {stats['throughput_fps']:.1f} FPS")
# Explicitly free memory when done
oe.unload()
```
### Advanced Usage
```python
# Specify device preference and precision
oe.load(
    "runwayml/stable-diffusion-v1-5",
    device_preference=["NPU", "GPU", "CPU"],  # Try NPU first, fall back to GPU, then CPU
    dtype="fp16-nf4"  # New FP16-NF4 precision for Arrow Lake/Lunar Lake NPUs
)
# Generate image
image = oe.infer(
    "a serene mountain landscape at sunset",
    num_inference_steps=20,
    guidance_scale=7.5
)
# Get detailed model info
info = oe.get_info()
print(f"Running on: {info['device']}")
print(f"Model type: {info['dtype']}")
print(f"Quantized: {info['quantized']}")
# Context manager for automatic cleanup
with oe.load("runwayml/stable-diffusion-v1-5") as pipe:
    image = pipe.infer("a serene mountain landscape")
    # Model automatically unloaded when exiting context
```
### Audio Models
```python
# Speech-to-text with Whisper
oe.load("openai/whisper-base")
transcription = oe.infer("path/to/audio.wav")
print(transcription) # "Hello, this is the transcribed audio"
# Text-to-speech (OpenVINO 2025.2+)
oe.load("microsoft/speecht5_tts")
audio = oe.infer("Hello world!")
# Save or play the generated audio
```
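The final comment above leaves saving to you. A minimal sketch with `soundfile`, assuming the result is a 1-D float waveform; the 16 kHz sample rate is a guess — check the model card for the real value:
```python
import numpy as np
import soundfile as sf  # pip install soundfile

waveform = np.asarray(audio, dtype=np.float32)      # assumes a 1-D float array
sf.write("speech.wav", waveform, samplerate=16000)  # sample rate is an assumption
```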
### Memory Management
OpenVINO-Easy provides flexible memory management for production applications:
```python
# Method 1: Explicit unload
oe.load("large-model")
result = oe.infer(data)
oe.unload() # Free memory immediately
# Method 2: Context manager (recommended)
with oe.load("large-model") as pipe:
    result = pipe.infer(data)
    # Model automatically unloaded when exiting
# Method 3: Multiple model switching
oe.load("text-model")
result1 = oe.infer("Hello world")
oe.unload()
oe.load("image-model")
result2 = oe.infer(image_data)
oe.unload()
# Check if model is still loaded
if oe.is_loaded():
    result = oe.infer(data)
else:
    print("Model has been unloaded")
```
### Model Management & Discovery
OpenVINO-Easy provides comprehensive model management capabilities:
```python
# Search for models on Hugging Face Hub
results = oe.models.search("stable diffusion", limit=5, model_type="image")
for model in results:
    print(f"{model['id']}: {model['downloads']:,} downloads")
# Get detailed model information
info = oe.models.info("microsoft/DialoGPT-medium")
print(f"Local: {info['local']}, Remote: {info['remote']}")
print(f"Requirements: {info['requirements']['min_memory_mb']} MB")
# Install models without loading them
result = oe.models.install("runwayml/stable-diffusion-v1-5", dtype="fp16")
print(f"Installed: {result['size_mb']:.1f} MB")
# Validate model integrity
results = oe.models.validate()
print(f"Validation: {results['passed']}/{results['validated']} models valid")
# Benchmark all installed models
results = oe.models.benchmark_all()
best = results['summary']['fastest_model']
print(f"Fastest model: {best['id']} ({best['fps']:.1f} FPS)")
```
### Model Storage & Cache Management
OpenVINO-Easy uses a clean, Ollama-style directory structure:
```python
# Check where models are stored
print("Models directory:", oe.models.dir())
# Windows: C:\Users\username\AppData\Local\openvino-easy\models\
# Linux/Mac: ~/.openvino-easy/models/
# List all cached models
models_list = oe.models.list()
for model in models_list:
    print(f"{model['name']}: {model['size_mb']:.1f} MB")
# Check cache usage
cache_info = oe.cache.size()
print(f"Total cache size: {cache_info['total_size_mb']:.1f} MB")
print(f"Models: {cache_info['model_count']}")
# Clean up temporary files only (keeps models)
oe.cache.clear()
# Remove a specific model (exact name required for safety)
result = oe.models.remove("microsoft--DialoGPT-medium--fp16--a1b2c3d4")
print(result) # Shows what was removed
# Clear everything including models (requires confirmation)
result = oe.models.clear() # Shows safety warning, requires confirm=False
result = oe.models.clear(confirm=False) # Actually performs deletion
# Clear temp cache only (safe)
oe.cache.clear()
# Clear both temp cache and models (dangerous, requires confirmation)
oe.cache.clear(models=True) # Shows safety warning
oe.cache.clear(models=True, confirm=False) # Actually performs deletion
```
**Directory Structure:**
```
~/.openvino-easy/                            # Linux/Mac
C:\Users\user\AppData\Local\openvino-easy\   # Windows
├── models/                                  # Downloaded/converted models (permanent)
│   ├── microsoft--DialoGPT-medium--fp16--a1b2c3d4/
│   └── openai--whisper-base--int8--e5f6g7h8/
├── cache/                                   # Temporary conversion files
└── config/                                  # User settings
```
**Environment Override:**
```bash
# Custom models directory
export OE_MODELS_DIR="/shared/ai-models"
# or
OE_MODELS_DIR="/shared/ai-models" python app.py
```
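A sketch of the resolution order this implies — explicit `OE_MODELS_DIR` first, then the per-platform default documented above. The library's internal logic may differ:
```python
import os
from pathlib import Path

def models_dir() -> Path:
    # Explicit override wins.
    override = os.environ.get("OE_MODELS_DIR")
    if override:
        return Path(override)
    if os.name == "nt":  # Windows default, per the docs above
        return Path(os.environ["LOCALAPPDATA"]) / "openvino-easy" / "models"
    return Path.home() / ".openvino-easy" / "models"  # Linux/Mac default

print(models_dir())
```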
## 🔧 Command Line Interface
```bash
# Text inference
oe run "microsoft/DialoGPT-medium" --prompt "Hello there"
# Audio inference (speech-to-text)
oe run "openai/whisper-base" --input-file "audio.wav"
# Image generation
oe run "runwayml/stable-diffusion-v1-5" --prompt "a beautiful sunset"
# Benchmark with latest NPU precision
oe bench "runwayml/stable-diffusion-v1-5" --dtype fp16-nf4
# System diagnostics
oe doctor
# List available devices
oe devices
# Enhanced NPU diagnostics (Arrow Lake/Lunar Lake detection)
oe npu-doctor
# Cache management
oe cache list # List cached models
oe cache size # Show cache usage
oe cache remove <model> # Remove specific model (with confirmation)
oe cache clear # Clear temp cache only (safe)
oe cache clear --models # Clear all models (DANGEROUS - requires confirmation)
oe cache clear --models --force # Override safety (VERY DANGEROUS)
# Advanced model management
oe models search "stable diffusion" --limit 5 # Search HuggingFace Hub
oe models info microsoft/DialoGPT-medium # Get model details
oe models install runwayml/stable-diffusion-v1-5 --dtype fp16 # Install model
oe models validate # Validate all models
oe models benchmark # Benchmark all installed models
```
## 🏗️ Architecture
OpenVINO-Easy wraps OpenVINO's API:
```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Your Code     │    │  OpenVINO-Easy   │    │    OpenVINO     │
│                 │    │                  │    │                 │
│ oe.load(...)    │───▶│ • Model Loading  │───▶│ • IR Conversion │
│ oe.infer(...)   │    │ • Device Select  │    │ • Compilation   │
│ oe.benchmark()  │    │ • Preprocessing  │    │ • Inference     │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```
### Key Features
- **Device Selection**: Chooses NPU → GPU → CPU based on availability
- **Model Loading**: Supports Hugging Face, ONNX, and OpenVINO IR formats
- **Conversion**: Converts models to OpenVINO IR format
- **INT8 Quantization**: Quantization with NNCF for faster inference
- **Benchmarking**: Performance metrics and timing
- **Caching**: SHA-256 based model caching for fast re-loading (key scheme sketched after this list)
- **Memory Management**: Explicit unload() and context manager support
- **Hardware Diagnostics**: Tools for troubleshooting device issues
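The cache directory names seen earlier (e.g. `microsoft--DialoGPT-medium--fp16--a1b2c3d4`) suggest a key built from model ID, dtype, and a short hash. The derivation below is illustrative only — the actual hashing inputs are internal to the library:
```python
import hashlib

def cache_key(model_id: str, dtype: str) -> str:
    # Hypothetical scheme: the same model/dtype pair always maps to the same folder.
    digest = hashlib.sha256(f"{model_id}:{dtype}".encode()).hexdigest()[:8]
    return f"{model_id.replace('/', '--')}--{dtype}--{digest}"

print(cache_key("microsoft/DialoGPT-medium", "fp16"))
# e.g. microsoft--DialoGPT-medium--fp16--3f2a9c1b  (digest value illustrative)
```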
## 🤖 Supported Models
### Text Models
- **Conversational**: DialoGPT, BlenderBot, ChatGLM
- **Text Generation**: GPT-2, GPT-J, OPT, BLOOM
- **Question Answering**: BERT, RoBERTa, DeBERTa
- **Text Classification**: DistilBERT, ALBERT
### Vision Models
- **Image Generation**: Stable Diffusion, DALL-E 2
- **Object Detection**: YOLO, SSD, RetinaNet
- **Image Classification**: ResNet, EfficientNet, Vision Transformer
- **Segmentation**: U-Net, DeepLab, Mask R-CNN
### Audio Models
- **Speech Recognition**: Whisper, Wav2Vec2, WavLM
- **Text-to-Speech**: SpeechT5, Bark (coming soon)
- **Audio Classification**: HuBERT, Audio Transformers
### Multimodal Models
- **Vision-Language**: CLIP, BLIP, LLaVA
- **Image Captioning**: BLIP-2, GIT, OFA
## 🚀 Performance
Representative throughput and latency by device:
| Model | Hardware | Throughput | Latency |
|-------|----------|------------|---------|
| Stable Diffusion 1.5 | Intel Core Ultra 7 Lunar Lake (NPU) | **2.3+ img/s** | 420ms |
| Stable Diffusion 1.5 | Intel Core Ultra 7 Arrow Lake (NPU) | **2.2+ img/s** | 450ms |
| Stable Diffusion 1.5 | Intel Core Ultra 7 (1st gen NPU) | 1.8 img/s | 556ms |
| Stable Diffusion 1.5 | Intel Arc A770 (GPU) | 1.6 img/s | 625ms |
| Stable Diffusion 1.5 | Intel Core i7-13700K (CPU) | 0.4 img/s | 2.5s |
| DialoGPT-medium | Intel Core Ultra 7 Lunar Lake (NPU) | **50+ tok/s** | 20ms |
| DialoGPT-medium | Intel Core Ultra 7 Arrow Lake (NPU) | **48+ tok/s** | 21ms |
| DialoGPT-medium | Intel Core Ultra 7 (1st gen NPU) | 40 tok/s | 25ms |
| DialoGPT-medium | Intel Arc A770 (GPU) | 38 tok/s | 26ms |
| DialoGPT-medium | Intel Core i7-13700K (CPU) | 12 tok/s | 83ms |
*Benchmarks with FP16-NF4 precision on Arrow Lake/Lunar Lake NPUs (OpenVINO 2025.2+)*
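To produce comparable numbers on your own hardware, the documented API is enough; results will vary with drivers, power limits, and thermals:
```python
import oe

oe.load(
    "runwayml/stable-diffusion-v1-5",
    device_preference=["NPU", "GPU", "CPU"],  # falls back if the NPU is absent
    dtype="fp16-nf4",
)
stats = oe.benchmark()
print(f"{stats['throughput_fps']:.1f} FPS, {stats['avg_latency_ms']:.0f} ms avg")
oe.unload()
```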
## 🔬 Text Processing Details
OpenVINO-Easy handles text preprocessing automatically:
```python
# For text models, tokenization is automatic
pipe = oe.load("microsoft/DialoGPT-medium")
# Multiple input formats supported:
response = pipe.infer("Hello!") # String input
response = pipe.infer(["Hello!", "How are you?"]) # Batch input
response = pipe.infer({"text": "Hello!"}) # Dict input
```
**Tokenization Strategy:**
1. **HuggingFace Models**: Uses `transformers.AutoTokenizer` with model-specific settings (sketched after this list)
2. **ONNX Models**: Attempts to infer tokenizer from model metadata
3. **OpenVINO IR**: Falls back to basic text preprocessing
4. **Custom Models**: Provides hooks for custom tokenization
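Roughly what step 1 amounts to — illustration only, since `oe.infer` does this for you. Note that GPT-2-family tokenizers like DialoGPT's ship without a pad token, so one must be assigned before batching:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
tokenizer.pad_token = tokenizer.eos_token  # DialoGPT has no pad token by default

batch = tokenizer(["Hello!", "How are you?"], return_tensors="np", padding=True)
print(batch["input_ids"].shape)  # (2, seq_len) arrays, ready for a compiled model
```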
## 🧪 Development & Testing
### **Modern Python Packaging (Recommended)**
```bash
# Install in editable mode with development dependencies
pip install -e ".[dev]"
# Or install specific extras for testing
pip install -e ".[full,dev]" # Full OpenVINO + dev tools
```
### **Comprehensive Testing Framework**
OpenVINO-Easy includes a robust testing framework with multiple test categories:
```bash
# Quick tests (unit tests only, fast)
python test_runner.py --mode fast
# All tests except slow ones
python test_runner.py --mode full
# Integration tests with real OpenVINO models
python test_runner.py --mode integration
# End-to-end tests with real HuggingFace models (requires internet)
python test_runner.py --mode e2e
# Performance regression testing
python test_runner.py --mode performance
# Model compatibility validation
pytest tests/test_model_compatibility.py -v
# Cache management and safety tests
pytest tests/test_model_management.py -v
# CLI functionality tests
pytest tests/test_cli_models.py -v
# Run with coverage
python test_runner.py --mode coverage
```
### **Quality Assurance Features**
#### **Performance Regression Testing**
```python
# Automated performance baselines
from tests.test_performance_regression_enhanced import (
    PerformanceRegression,
    PerformanceTest,
)

tester = PerformanceRegression()
test = PerformanceTest(
    model_id="microsoft/DialoGPT-medium",
    tolerance_percent=15.0  # Allow 15% regression
)
results = tester.run_performance_test(test)
if results['regressions']:
    print("Performance regressions detected!")
```
#### **Model Compatibility Validation**
```python
# Automated compatibility testing across devices/precisions
from tests.test_model_compatibility import ModelCompatibilityValidator
validator = ModelCompatibilityValidator()
result = validator.validate_model_compatibility("runwayml/stable-diffusion-v1-5")
if not result['overall_compatible']:
    print(f"Compatibility issues: {result['issues']}")
```
#### **Enhanced Error Recovery**
```python
# Automatic device fallback and retry logic
oe.load(
    "microsoft/DialoGPT-medium",
    device_preference=["NPU", "GPU", "CPU"],
    retry_on_failure=True,
    fallback_device="CPU"
)
# Tries NPU -> GPU -> CPU, then retries CPU with a default config
```
### **Test Categories**
| Test Type | Command | Purpose |
|-----------|---------|----------|
| **Unit Tests** | `pytest tests/ -m "not slow and not integration"` | Core functionality |
| **Integration Tests** | `pytest tests/ -m "integration"` | Real model loading |
| **Performance Tests** | `pytest tests/ -m "performance"` | Regression detection |
| **Compatibility Tests** | `pytest tests/ -m "compatibility"` | Device/model validation |
| **End-to-End Tests** | `pytest tests/test_e2e_real_models.py` | Full workflows |
| **CLI Tests** | `pytest tests/test_cli*.py` | Command-line interface |
| **Safety Tests** | `pytest tests/test_model_management.py` | Security validation |
### **Development Workflow**
```bash
# Format code
black oe/ tests/
isort oe/ tests/
# Type checking
mypy oe/
# Run all quality checks
python test_runner.py --mode full
pytest tests/test_model_compatibility.py -x
pytest tests/test_performance_regression_enhanced.py -x
```
## 📚 Examples
Check out the `examples/` directory:
- **[Stable Diffusion Notebook](examples/stable_diffusion.ipynb)**: Image generation with automatic optimization
- **Text Generation**: Conversational AI with DialoGPT
- **ONNX Models**: Loading and running ONNX models
- **Custom Models**: Integrating your own models
## 🤝 Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
## 📄 License
Apache License 2.0 - see [LICENSE](LICENSE) for details.
## 🙏 Acknowledgments
- Intel OpenVINO Team for the inference engine
- Hugging Face for the transformers ecosystem
- ONNX Community for the model format standards