docuglean-ocr

Name	docuglean-ocr
Version	1.0.0
home_page	None
Summary	An SDK for intelligent document processing using SOTA VLLM models
upload_time	2025-09-02 13:19:12
maintainer	None
docs_url	None
author	None
requires_python	>=3.11
license	Apache-2.0
keywords	document document-processing llm ocr text-extraction
requirements	No requirements were recorded.

# Docuglean OCR - Python SDK

A unified Python SDK for intelligent document processing using state-of-the-art AI models.

## Features

- 🚀 **Easy to Use**: Simple, intuitive API with detailed documentation
- 🔍 **OCR Capabilities**: Extract text from images and scanned documents  
- 📊 **Structured Data Extraction**: Use Pydantic models for type-safe data extraction
- 📄 **Multimodal Support**: Process PDFs and images with ease
- 🤖 **Multiple AI Providers**: Support for OpenAI, Mistral, Google Gemini, and Hugging Face
- 🔒 **Type Safety**: Full Python type hints with Pydantic validation

## Installation

```bash
pip install docuglean-ocr
```

## Quick Start

### OCR Processing

```python
import asyncio

from docuglean import ocr


async def main() -> None:
    # Mistral OCR
    result = await ocr(
        file_path="./document.pdf",
        provider="mistral",
        model="mistral-ocr-latest",
        api_key="your-api-key",
    )

    # Google Gemini OCR
    result = await ocr(
        file_path="./document.pdf",
        provider="gemini",
        model="gemini-2.5-flash",
        api_key="your-gemini-api-key",
        prompt="Extract all text from this document",
    )

    # Hugging Face OCR (no API key needed)
    result = await ocr(
        file_path="https://example.com/image.jpg",  # Supports URLs, local files, base64
        provider="huggingface",
        model="Qwen/Qwen2.5-VL-3B-Instruct",
        prompt="Extract all text from this image",
    )


# `await` is only valid inside a coroutine, so drive main() with an event loop
asyncio.run(main())
```
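When several documents need OCR, the calls can run concurrently instead of one after another. The following is a minimal sketch, not part of the SDK: `ocr_many` is a hypothetical helper, and it assumes only what the examples above show, namely that `ocr` is awaitable and accepts a `file_path` keyword. The OCR function is passed in as an argument so the helper works with any awaitable that has this shape.

```python
import asyncio
from collections.abc import Awaitable, Callable, Sequence
from typing import Any


async def ocr_many(
    ocr_fn: Callable[..., Awaitable[Any]],
    file_paths: Sequence[str],
    max_concurrency: int = 3,
    **ocr_kwargs: Any,
) -> list[Any]:
    """Call ocr_fn once per path, with at most max_concurrency calls in flight.

    Results are returned in the same order as file_paths.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(path: str) -> Any:
        # The semaphore caps how many requests run at the same time
        async with semaphore:
            return await ocr_fn(file_path=path, **ocr_kwargs)

    return await asyncio.gather(*(run_one(path) for path in file_paths))


# Usage inside a coroutine (assumes `from docuglean import ocr`):
#     results = await ocr_many(
#         ocr,
#         ["./a.pdf", "./b.pdf"],
#         provider="mistral",
#         model="mistral-ocr-latest",
#         api_key="your-api-key",
#     )
```

A small `max_concurrency` is a conservative default for hosted providers, which commonly rate-limit requests; the right value depends on the provider and plan.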

### Structured Data Extraction

```python
import asyncio

from docuglean import extract
from pydantic import BaseModel


class ReceiptItem(BaseModel):
    name: str
    price: float


class Receipt(BaseModel):
    date: str
    total: float
    items: list[ReceiptItem]


async def main() -> None:
    # Extract structured data with OpenAI
    receipt = await extract(
        file_path="./receipt.pdf",
        provider="openai",
        api_key="your-api-key",
        response_format=Receipt,
        prompt="Extract receipt information",
    )

    # Extract structured data with Gemini
    receipt = await extract(
        file_path="./receipt.pdf",
        provider="gemini",
        api_key="your-gemini-api-key",
        response_format=Receipt,
        prompt="Extract receipt information including date, total, and all items",
    )


# `await` is only valid inside a coroutine, so drive main() with an event loop
asyncio.run(main())
```
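Schema validation confirms that each field has the right type, but not that the values agree with each other. The sketch below adds a simple consistency check on the extracted receipt. It assumes that `extract` returns an object shaped like the `Receipt` model above (with `total` and `items[].price` attributes), which this README implies but does not state; `items_match_total` is a hypothetical helper, not part of the SDK.

```python
from typing import Any


def items_match_total(receipt: Any, tolerance: float = 0.01) -> bool:
    """Return True when the item prices sum to receipt.total within tolerance."""
    items_sum = sum(item.price for item in receipt.items)
    return abs(items_sum - receipt.total) <= tolerance


# Usage, given a `receipt` returned by extract():
#     if not items_match_total(receipt):
#         print("Item prices do not add up to the total; review this receipt")
```

A mismatch does not always mean the extraction failed, because real receipts often add tax, tips, or discounts that are not listed as items. Treat the result as a flag for manual review.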

## Development

### Setup

```bash
# Install dependencies with uv
uv sync
```

### Testing

```bash
# Run all tests
uv run pytest tests/ -v

# Run specific test files
uv run pytest tests/test_basic.py -v                    # Basic tests only
uv run pytest tests/test_ocr.py tests/test_extract.py -v  # Mistral tests (requires MISTRAL_API_KEY)
uv run pytest tests/test_openai.py -v                   # OpenAI tests (requires OPENAI_API_KEY)

# Run with output (shows print statements)
uv run pytest tests/ -v -s

# Run specific test function
uv run pytest tests/test_openai.py::test_openai_extract_unstructured_pdf -v -s

# Set API keys for real testing
export MISTRAL_API_KEY=your_mistral_key_here
export OPENAI_API_KEY=your_openai_key_here
export GEMINI_API_KEY=your_gemini_key_here
uv run pytest tests/ -v -s
```

### Code Quality

```bash
# Run linting
uv run ruff check src/ tests/

# Fix linting issues automatically
uv run ruff check src/ tests/ --fix

# Format code
uv run ruff format src/ tests/
```

## License

Apache 2.0 - see the [LICENSE](LICENSE) file for details.

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "docuglean-ocr",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": null,
    "keywords": "document, document-processing, llm, ocr, text-extraction",
    "author": null,
    "author_email": "docuglean-ai <contact@docuglean.ai>",
    "download_url": "https://files.pythonhosted.org/packages/26/b1/c74ec358a51ab4da502672b784325050fba76846540fb325c6c2beb13672/docuglean_ocr-1.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "An SDK for intelligent document processing using SOTA VLLM models",
    "version": "1.0.0",
    "project_urls": {
        "Issues": "https://github.com/docuglean-ai/docuglean-ocr/issues",
        "Repository": "https://github.com/docuglean-ai/docuglean-ocr"
    },
    "split_keywords": [
        "document",
        " document-processing",
        " llm",
        " ocr",
        " text-extraction"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d50b5f73d512e2a52c674aaf1f964bbebc8b62c0a5500374fbdeae7d87480524",
                "md5": "0267f6759a7d97c2b7581a894f6e647a",
                "sha256": "3f37e8b4b608406a04cbf7d6fbb62aa5870770a62263caf60775977fe006033a"
            },
            "downloads": -1,
            "filename": "docuglean_ocr-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "0267f6759a7d97c2b7581a894f6e647a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 17751,
            "upload_time": "2025-09-02T13:19:11",
            "upload_time_iso_8601": "2025-09-02T13:19:11.209690Z",
            "url": "https://files.pythonhosted.org/packages/d5/0b/5f73d512e2a52c674aaf1f964bbebc8b62c0a5500374fbdeae7d87480524/docuglean_ocr-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "26b1c74ec358a51ab4da502672b784325050fba76846540fb325c6c2beb13672",
                "md5": "9980e20590c331c08a7c6b30e324692d",
                "sha256": "b65d961c69b3e734151df7747448fe8344ed1dab1eee3288c75b52f3b47b8685"
            },
            "downloads": -1,
            "filename": "docuglean_ocr-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "9980e20590c331c08a7c6b30e324692d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 93936,
            "upload_time": "2025-09-02T13:19:12",
            "upload_time_iso_8601": "2025-09-02T13:19:12.451984Z",
            "url": "https://files.pythonhosted.org/packages/26/b1/c74ec358a51ab4da502672b784325050fba76846540fb325c6c2beb13672/docuglean_ocr-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-02 13:19:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "docuglean-ai",
    "github_project": "docuglean-ocr",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "docuglean-ocr"
}
