# Docuglean OCR - Python SDK
A unified Python SDK for intelligent document processing using state-of-the-art AI models.
## Features
- 🚀 **Easy to Use**: Simple, intuitive API with detailed documentation
- 🔍 **OCR Capabilities**: Extract text from images and scanned documents
- 📊 **Structured Data Extraction**: Use Pydantic models for type-safe data extraction
- 📄 **Multimodal Support**: Process PDFs and images with ease
- 🤖 **Multiple AI Providers**: Support for OpenAI, Mistral, Google Gemini, and Hugging Face
- 🔒 **Type Safety**: Full Python type hints with Pydantic validation
## Installation
```bash
pip install docuglean-ocr
```
## Quick Start
### OCR Processing
```python
import asyncio

from docuglean import ocr


async def main():
    # `await` is only valid inside a coroutine, so the calls are wrapped in main()

    # Mistral OCR
    result = await ocr(
        file_path="./document.pdf",
        provider="mistral",
        model="mistral-ocr-latest",
        api_key="your-api-key",
    )

    # Google Gemini OCR
    result = await ocr(
        file_path="./document.pdf",
        provider="gemini",
        model="gemini-2.5-flash",
        api_key="your-gemini-api-key",
        prompt="Extract all text from this document",
    )

    # Hugging Face OCR (no API key needed)
    result = await ocr(
        file_path="https://example.com/image.jpg",  # Supports URLs, local files, base64
        provider="huggingface",
        model="Qwen/Qwen2.5-VL-3B-Instruct",
        prompt="Extract all text from this image",
    )


asyncio.run(main())
```
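Because `ocr` is awaited in the examples above, several documents can in principle be processed concurrently with `asyncio.gather`. The sketch below is only an illustration: `fake_ocr` and `ocr_many` are hypothetical names, and `fake_ocr` is a stand-in coroutine so the example runs without network access or API keys. In real use you would swap it for a call to `docuglean.ocr`; its return type is not described in this README, so the string result here is an assumption.

```python
import asyncio


async def fake_ocr(file_path: str, provider: str) -> str:
    """Hypothetical stand-in for docuglean.ocr; returns a dummy string."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"text from {file_path} via {provider}"


async def ocr_many(paths: list[str], provider: str) -> list[str]:
    # gather() schedules all coroutines at once and preserves input order.
    return await asyncio.gather(*(fake_ocr(p, provider) for p in paths))


if __name__ == "__main__":
    results = asyncio.run(ocr_many(["a.pdf", "b.pdf"], "mistral"))
    for text in results:
        print(text)
```

Whether concurrent requests are appropriate depends on the provider's rate limits, which this sketch does not handle.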
### Structured Data Extraction
```python
import asyncio
from typing import List

from docuglean import extract
from pydantic import BaseModel


class ReceiptItem(BaseModel):
    name: str
    price: float


class Receipt(BaseModel):
    date: str
    total: float
    items: List[ReceiptItem]


async def main():
    # `await` is only valid inside a coroutine, so the calls are wrapped in main()

    # Extract structured data with OpenAI
    receipt = await extract(
        file_path="./receipt.pdf",
        provider="openai",
        api_key="your-api-key",
        response_format=Receipt,
        prompt="Extract receipt information",
    )

    # Extract structured data with Gemini
    receipt = await extract(
        file_path="./receipt.pdf",
        provider="gemini",
        api_key="your-gemini-api-key",
        response_format=Receipt,
        prompt="Extract receipt information including date, total, and all items",
    )


asyncio.run(main())
```
## Development
### Setup
```bash
# Install dependencies with uv
uv sync
```
### Testing
```bash
# Run all tests
uv run pytest tests/ -v
# Run specific test files
uv run pytest tests/test_basic.py -v # Basic tests only
uv run pytest tests/test_ocr.py tests/test_extract.py -v # Mistral tests (requires MISTRAL_API_KEY)
uv run pytest tests/test_openai.py -v # OpenAI tests (requires OPENAI_API_KEY)
# Run with output (shows print statements)
uv run pytest tests/ -v -s
# Run specific test function
uv run pytest tests/test_openai.py::test_openai_extract_unstructured_pdf -v -s
# Set API keys to run the tests against the real provider APIs
export MISTRAL_API_KEY=your_mistral_key_here
export OPENAI_API_KEY=your_openai_key_here
export GEMINI_API_KEY=your_gemini_key_here
uv run pytest tests/ -v -s
```
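The environment variables exported above can also be used to avoid hardcoding keys in application code. The helper below is a minimal sketch, not part of the SDK: `get_api_key` is a hypothetical function, and it assumes the `<PROVIDER>_API_KEY` naming used by the test suite. This README does not say whether the SDK reads these variables on its own, so the key is looked up explicitly.

```python
import os


def get_api_key(provider: str) -> str:
    """Look up '<PROVIDER>_API_KEY' in the environment (hypothetical helper)."""
    var_name = f"{provider.upper()}_API_KEY"
    key = os.environ.get(var_name)
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set {var_name} before using the {provider} provider")
    return key
```

The result can then be passed as the `api_key` argument shown in the Quick Start, for example `api_key=get_api_key("mistral")`.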
### Code Quality
```bash
# Run linting
uv run ruff check src/ tests/
# Fix linting issues automatically
uv run ruff check src/ tests/ --fix
# Format code
uv run ruff format src/ tests/
```
## License
Apache 2.0 - see the [LICENSE](LICENSE) file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "docuglean-ocr",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "document, document-processing, llm, ocr, text-extraction",
"author": null,
"author_email": "docuglean-ai <contact@docuglean.ai>",
"download_url": "https://files.pythonhosted.org/packages/26/b1/c74ec358a51ab4da502672b784325050fba76846540fb325c6c2beb13672/docuglean_ocr-1.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "An SDK for intelligent document processing using SOTA VLLM models",
"version": "1.0.0",
"project_urls": {
"Issues": "https://github.com/docuglean-ai/docuglean-ocr/issues",
"Repository": "https://github.com/docuglean-ai/docuglean-ocr"
},
"split_keywords": [
"document",
" document-processing",
" llm",
" ocr",
" text-extraction"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "d50b5f73d512e2a52c674aaf1f964bbebc8b62c0a5500374fbdeae7d87480524",
"md5": "0267f6759a7d97c2b7581a894f6e647a",
"sha256": "3f37e8b4b608406a04cbf7d6fbb62aa5870770a62263caf60775977fe006033a"
},
"downloads": -1,
"filename": "docuglean_ocr-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0267f6759a7d97c2b7581a894f6e647a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 17751,
"upload_time": "2025-09-02T13:19:11",
"upload_time_iso_8601": "2025-09-02T13:19:11.209690Z",
"url": "https://files.pythonhosted.org/packages/d5/0b/5f73d512e2a52c674aaf1f964bbebc8b62c0a5500374fbdeae7d87480524/docuglean_ocr-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "26b1c74ec358a51ab4da502672b784325050fba76846540fb325c6c2beb13672",
"md5": "9980e20590c331c08a7c6b30e324692d",
"sha256": "b65d961c69b3e734151df7747448fe8344ed1dab1eee3288c75b52f3b47b8685"
},
"downloads": -1,
"filename": "docuglean_ocr-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "9980e20590c331c08a7c6b30e324692d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 93936,
"upload_time": "2025-09-02T13:19:12",
"upload_time_iso_8601": "2025-09-02T13:19:12.451984Z",
"url": "https://files.pythonhosted.org/packages/26/b1/c74ec358a51ab4da502672b784325050fba76846540fb325c6c2beb13672/docuglean_ocr-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-02 13:19:12",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "docuglean-ai",
"github_project": "docuglean-ocr",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "docuglean-ocr"
}