docling-onnx-models


- **Name**: docling-onnx-models
- **Version**: 0.1.3
- **Summary**: ONNX Runtime implementations for Docling AI models
- **Upload time**: 2025-09-09 08:45:47
- **Requires Python**: >=3.10
- **License**: MIT
- **Keywords**: docling, onnx, document, ai, layout, table, classification, ocr, document-processing

# Docling ONNX Models

ONNX Runtime implementations for Docling AI models, providing improved performance and cross-platform compatibility.

## Overview

`docling-onnx-models` is a drop-in replacement for the `docling-ibm-models` package, offering the same APIs but powered by ONNX Runtime. This provides several advantages:

- **Improved Performance**: ONNX Runtime optimizations for better inference speed
- **Cross-Platform**: Consistent behavior across different operating systems
- **Hardware Acceleration**: Support for CPU, CUDA, and other execution providers
- **Reduced Dependencies**: Lighter weight than full PyTorch models

## Installation

```bash
pip install docling-onnx-models
```

For GPU support:
```bash
# Quotes keep shells such as zsh from treating the brackets as a glob pattern
pip install "docling-onnx-models[gpu]"
```

## Supported Models

### Layout Model
- **Input**: Document page images
- **Output**: Layout element detection (text, tables, figures, etc.)
- **Compatible with**: `docling.models.layout_model.LayoutModel`

### Document Figure Classifier
- **Input**: Figure/image regions
- **Output**: Classification into 16 figure types (charts, logos, maps, etc.)
- **Compatible with**: `docling.models.document_picture_classifier.DocumentPictureClassifier`

### Table Structure Predictor
- **Input**: Table region images and token data
- **Output**: Table structure with rows, columns, and cell relationships
- **Compatible with**: `docling.models.table_structure_model.TableStructureModel`

## Usage

### Basic Usage

The ONNX models are designed as drop-in replacements for the original models:

```python
# Instead of:
# from docling_ibm_models.layoutmodel.layout_predictor import LayoutPredictor

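# Use:
from PIL import Image

from docling_onnx_models.layoutmodel.layout_predictor import LayoutPredictor

# Same API as the docling-ibm-models predictor
predictor = LayoutPredictor(
    artifact_path="/path/to/onnx/model/directory",
    device="cpu",
    num_threads=4,
)

# Placeholder path: load one or more page images to run through the model
images = [Image.open("/path/to/page.png").convert("RGB")]
predictions = predictor.predict_batch(images)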
```
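
The return value of `predict_batch` is not described in this README. As a rough sketch, assuming it mirrors `docling-ibm-models` (one list per input image, each entry a dict with `label`, `confidence`, and `l`/`t`/`r`/`b` box coordinates), low-confidence detections could be filtered as follows. The helper name and the sample data are hypothetical.

```python
def filter_predictions(page_predictions, min_confidence=0.5):
    """Keep detections at or above min_confidence, page by page."""
    return [
        [det for det in detections if det["confidence"] >= min_confidence]
        for detections in page_predictions
    ]


# Hypothetical sample shaped like the assumed output for a single page
sample = [[
    {"label": "Text", "confidence": 0.92, "l": 10, "t": 20, "r": 300, "b": 60},
    {"label": "Table", "confidence": 0.31, "l": 15, "t": 80, "r": 310, "b": 400},
]]
kept = filter_predictions(sample)
```

If the actual keys differ, only the dictionary lookups need to change.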

### With Docling

To use ONNX models with Docling, ensure your model directories contain ONNX files:

```
model_directory/
├── model.onnx              # ONNX model file
├── config.json             # Model configuration
├── preprocessor_config.json # Preprocessing config
└── ...                     # Other model files
```

The models will be automatically detected and used by Docling when available.
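
Before pointing Docling at a directory, it can help to confirm that the files listed above are present. The following is a minimal sketch using only the standard library; the helper name `missing_model_files` is hypothetical and not part of this package.

```python
from pathlib import Path

# File names taken from the directory layout shown above
EXPECTED_FILES = ("model.onnx", "config.json", "preprocessor_config.json")


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list means every expected file was found; anything else names the files that still need to be exported or copied.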

### Custom Execution Providers

```python
from docling_onnx_models.layoutmodel.layout_predictor import LayoutPredictor

# Use specific execution providers
predictor = LayoutPredictor(
    artifact_path="/path/to/model",
    device="cuda",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"]
)
```
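
A provider list is only useful if the installed ONNX Runtime build actually offers those providers. The sketch below queries `onnxruntime.get_available_providers()` and keeps the preferred providers that are present. The helper `choose_providers` is hypothetical, and the CPU-only fallback when `onnxruntime` is not installed is there purely so the example runs anywhere.

```python
def choose_providers(available, prefer_gpu=True):
    """Return providers in priority order, keeping only those available."""
    wanted = ["CUDAExecutionProvider"] if prefer_gpu else []
    wanted.append("CPUExecutionProvider")
    return [name for name in wanted if name in available]


try:
    import onnxruntime as ort

    available = ort.get_available_providers()
except ImportError:
    # onnxruntime is not installed; assume CPU only for illustration
    available = ["CPUExecutionProvider"]

providers = choose_providers(available)
```

The resulting list can be passed as the `providers` argument shown above.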

## Model Conversion

If you have PyTorch models that need to be converted to ONNX:

```python
# Example for converting a layout model
import torch
from transformers import AutoModelForObjectDetection

model = AutoModelForObjectDetection.from_pretrained("path/to/pytorch/model")
model.eval()

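# Dummy input (adjust dimensions to match the model's expected input size)
dummy_input = torch.randn(1, 3, 640, 640)

# Export to ONNX
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    export_params=True,        # store the trained weights in the ONNX file
    opset_version=11,          # raise this if the exporter reports unsupported operators
    do_constant_folding=True,  # pre-compute constant subgraphs
    input_names=["input"],
    output_names=["output"],
    # Allow a variable batch size at inference time
    dynamic_axes={
        "input": {0: "batch_size"},
        "output": {0: "batch_size"}
    }
)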
```
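
After exporting, a quick sanity check can catch a malformed graph before the file is handed to a predictor. The sketch below assumes the separate `onnx` package is installed (it is not listed under Requirements) and that the model takes a single float32 image tensor. The function name and default shape are illustrative.

```python
def check_exported_model(onnx_path, input_shape=(1, 3, 640, 640)):
    """Validate an exported ONNX file and run one dummy inference on CPU."""
    # Imported lazily so this module loads even without the optional packages
    import numpy as np
    import onnx
    import onnxruntime as ort

    # Structural validation of the ONNX graph
    onnx.checker.check_model(onnx.load(onnx_path))

    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name  # feeds only the first input
    dummy = np.random.rand(*input_shape).astype(np.float32)
    outputs = session.run(None, {input_name: dummy})
    return [out.shape for out in outputs]
```

Calling `check_exported_model("model.onnx")` returns the output shapes, which can be compared against what the PyTorch model produces for the same input size.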

## Performance Tips

1. **Use appropriate execution providers** for your hardware (for example, CUDA on a supported GPU, CPU otherwise)
2. **Tune thread counts** for CPU inference via `num_threads`
3. **Batch inputs** when possible for better throughput
4. **Quantize models** to reduce memory usage (if supported)
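
Tips 2 and 3 can be sketched with the standard library alone. Both helpers below are hypothetical, and halving the logical CPU count is only a starting heuristic (an assumption, not a recommendation from this package) to tune against real measurements.

```python
import os


def default_num_threads():
    """Heuristic starting point: half the logical CPUs, at least one."""
    return max(1, (os.cpu_count() or 1) // 2)


def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Intended use with a predictor (not executed here):
# for batch in batched(images, 8):
#     predictions.extend(predictor.predict_batch(batch))
```

Larger batches generally improve throughput until memory becomes the limit, so the batch size is worth measuring per machine.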

## API Compatibility

This package maintains full API compatibility with `docling-ibm-models`:

- All method signatures are identical
- Input/output formats are preserved
- Configuration files use the same structure
- Error handling behavior is consistent
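
Because the module paths differ only in the package name, code can prefer the ONNX implementation and fall back to the original one when it is not installed. This is a minimal sketch; `first_importable` is a hypothetical helper, and the two module paths are the ones used in the Basic Usage example.

```python
import importlib

CANDIDATE_MODULES = (
    "docling_onnx_models.layoutmodel.layout_predictor",
    "docling_ibm_models.layoutmodel.layout_predictor",
)


def first_importable(module_names):
    """Import and return the first module in module_names that is available."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"None of these modules could be imported: {list(module_names)}")


# Intended use (requires one of the two packages to be installed):
# LayoutPredictor = first_importable(CANDIDATE_MODULES).LayoutPredictor
```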

## Requirements

- Python 3.10+
- ONNX Runtime 1.15.0+
- NumPy 1.21.0+
- Pillow 8.3.0+
- OpenCV 4.5.0+
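
The minimum versions above can be checked against the current environment with `importlib.metadata`. In this sketch the distribution names are assumptions: OpenCV is commonly installed as `opencv-python` or `opencv-python-headless`, and a GPU install may provide `onnxruntime-gpu` instead of `onnxruntime`. The version parsing is deliberately simple and ignores pre-release suffixes.

```python
import re
from importlib import metadata

# Minimums from the list above; distribution names are assumptions
MINIMUMS = {
    "onnxruntime": "1.15.0",
    "numpy": "1.21.0",
    "Pillow": "8.3.0",
    "opencv-python": "4.5.0",
}


def version_tuple(version):
    """Parse the leading numeric components, padded to three parts."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements(minimums=MINIMUMS):
    """Map each distribution to 'ok', 'not installed', or a version mismatch."""
    report = {}
    for dist, minimum in minimums.items():
        try:
            installed = metadata.version(dist)
        except metadata.PackageNotFoundError:
            report[dist] = "not installed"
            continue
        ok = meets_minimum(installed, minimum)
        report[dist] = "ok" if ok else f"{installed} < {minimum}"
    return report
```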

## Contributing

Please see the main [Docling repository](https://github.com/docling-project/docling) for contribution guidelines.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "docling-onnx-models",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "IBM Research <docling@zurich.ibm.com>",
    "keywords": "docling, onnx, document, ai, layout, table, classification, ocr, document-processing",
    "author": null,
    "author_email": "IBM Research <docling@zurich.ibm.com>",
    "download_url": "https://files.pythonhosted.org/packages/48/b4/f03df36e31beccc991c6c1d7fb5a4b7d0c19ef7c211bf08b545d788e891d/docling_onnx_models-0.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "ONNX Runtime implementations for Docling AI models",
    "version": "0.1.3",
    "project_urls": {
        "Bug Reports": "https://github.com/asmud/docling-onnx-models/issues",
        "Changelog": "https://github.com/asmud/docling-onnx-models/blob/main/CHANGELOG.md",
        "Documentation": "https://asmud.github.io/docling-onnx-models/",
        "Homepage": "https://github.com/asmud/docling-onnx-models",
        "PyPI": "https://pypi.org/project/docling-onnx-models/",
        "Repository": "https://github.com/asmud/docling-onnx-models.git",
        "Source Code": "https://github.com/asmud/docling-onnx-models"
    },
    "split_keywords": [
        "docling",
        " onnx",
        " document",
        " ai",
        " layout",
        " table",
        " classification",
        " ocr",
        " document-processing"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "69a9d16c6a18e4c3734214c100394671b65ff0d0b6cdd2cecb8605d70e8cda8b",
                "md5": "956751350eda583ffac5c75b00f7e33f",
                "sha256": "edc7fdfcb62087753358f09b971e76ec493398d54e87c52d680485f57ccc46cc"
            },
            "downloads": -1,
            "filename": "docling_onnx_models-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "956751350eda583ffac5c75b00f7e33f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 26019,
            "upload_time": "2025-09-09T08:45:45",
            "upload_time_iso_8601": "2025-09-09T08:45:45.732226Z",
            "url": "https://files.pythonhosted.org/packages/69/a9/d16c6a18e4c3734214c100394671b65ff0d0b6cdd2cecb8605d70e8cda8b/docling_onnx_models-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "48b4f03df36e31beccc991c6c1d7fb5a4b7d0c19ef7c211bf08b545d788e891d",
                "md5": "e0de8b4fa966c5107ec95a569ecc9872",
                "sha256": "b72c4d03affeb30c6af7773b3e792d1dd99b7e0bb448ba6bb82d8b6fa15730da"
            },
            "downloads": -1,
            "filename": "docling_onnx_models-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "e0de8b4fa966c5107ec95a569ecc9872",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 32641,
            "upload_time": "2025-09-09T08:45:47",
            "upload_time_iso_8601": "2025-09-09T08:45:47.649953Z",
            "url": "https://files.pythonhosted.org/packages/48/b4/f03df36e31beccc991c6c1d7fb5a4b7d0c19ef7c211bf08b545d788e891d/docling_onnx_models-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-09 08:45:47",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "asmud",
    "github_project": "docling-onnx-models",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "docling-onnx-models"
}
        