# PDF OCR Processor
> Advanced PDF processing with AI-powered OCR, text extraction, and selectable text overlays using Ollama models
## 🚀 Features
- **AI-Powered OCR** using Ollama models (llava, moondream, etc.)
- **Modular Architecture** with clear separation of concerns
- **Multiple Output Formats**:
  - SVG with selectable text overlays (see the sketch after this list)
  - Raw text extraction
  - JSON metadata
- **Image Enhancement** with multiple strategies
- **Robust Error Handling** with configurable retries
- **Parallel Processing** for batch operations
- **CLI Interface** with progress tracking
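
The selectable-text output works by rendering each page as an image and layering the recognized text on top as transparent SVG `<text>` elements, so the page still looks like the original scan but can be selected and searched. A minimal hand-written sketch of the idea (not the actual `svg_generator` output; the file names and coordinates are made up):

```python
# Sketch of a "selectable text overlay": the rendered page as a background
# image, with OCR text drawn transparently at its detected position.
# Illustrative only -- not what svg_generator.py actually emits.
page_width, page_height = 850, 1100  # page size in pixels at the chosen DPI

svg = f"""<svg xmlns="http://www.w3.org/2000/svg" width="{page_width}" height="{page_height}">
  <image href="page_001.png" width="{page_width}" height="{page_height}"/>
  <!-- hypothetical OCR result placed at its detected position -->
  <text x="72" y="96" font-size="14" fill="transparent">Invoice #2024-001</text>
</svg>"""

with open("page_001_overlay.svg", "w", encoding="utf-8") as fh:
    fh.write(svg)
```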
## 🛠️ System Architecture
```
┌─────────────────────────────────────────────────┐
│                PDF OCR Processor                │
├─────────────────┬───────────────────────────────┤
│  ┌────────────┐ │  ┌─────────────────────────┐  │
│  │    PDF     │ │  │      OCRProcessor       │  │
│  │ Processor  ├─┼─▶│ - Text extraction       │  │
│  └────────────┘ │  │ - Ollama integration    │  │
│                 │  └─────────────┬───────────┘  │
│  ┌────────────┐ │  ┌─────────────▼───────────┐  │
│  │   Image    │ │  │      SVG Generator      │  │
│  │  Enhancer  ├─┼─▶│ - Text overlay          │  │
│  └────────────┘ │  │ - Searchable output     │  │
└─────────────────┴───────────────────────────────┘
```
## 📦 Installation
### Prerequisites
- Python 3.8+
- [Ollama](https://ollama.ai) (for OCR processing)
- System dependencies:

  ```bash
  # Ubuntu/Debian
  sudo apt-get install -y tesseract-ocr poppler-utils

  # macOS
  brew install tesseract poppler
  ```
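
Before installing, it can help to confirm that Ollama is running and that a vision model is available. A small stdlib-only check, assuming the default Ollama host and its standard model-listing endpoint (`/api/tags`):

```python
# Check that Ollama is reachable and list the locally installed models.
# Assumes the default host; pull a vision model with `ollama pull llava:7b`.
import json
import urllib.request

host = "http://localhost:11434"
with urllib.request.urlopen(f"{host}/api/tags", timeout=5) as resp:
    models = [m["name"] for m in json.load(resp).get("models", [])]

print("Installed models:", models)
if not any(name.startswith("llava") for name in models):
    print("No llava model found; run: ollama pull llava:7b")
```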
### Install from source
```bash
# Clone the repository
git clone https://github.com/wronai/ocr.git
cd ocr
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate # Linux/macOS
# Install dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt # For development
```
## 🏁 Quick Start
### Basic Usage
```bash
# Process a single PDF
python -m pdf_processor --input document.pdf --output output/
# Process all PDFs in a directory
python -m pdf_processor --input ./documents --output ./output --model llava:7b
# Show help
python -m pdf_processor --help
```
### Python API
```python
from pdf_processor import PDFProcessor
from pdf_processor.processing.pdf_processor import PDFProcessorConfig
# Configure the processor
config = PDFProcessorConfig(
    input_path="document.pdf",
    output_dir="./output",
    ocr_model="llava:7b",
    dpi=300,
    max_workers=4
)
# Process a document
processor = PDFProcessor(config)
result = processor.process_pdf("document.pdf")
print(f"Processed {result['pages_processed']} pages")
```
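
For batch work, the same API can be pointed at each file in a directory. A rough sketch using only the names shown above (the error handling and the shape of `result` beyond `pages_processed` are assumptions):

```python
from pathlib import Path

from pdf_processor import PDFProcessor
from pdf_processor.processing.pdf_processor import PDFProcessorConfig

# Process every PDF in ./documents, one processor instance per file.
for pdf_path in sorted(Path("./documents").glob("*.pdf")):
    config = PDFProcessorConfig(
        input_path=str(pdf_path),
        output_dir="./output",
        ocr_model="llava:7b",
        dpi=300,
        max_workers=4,
    )
    processor = PDFProcessor(config)
    try:
        result = processor.process_pdf(str(pdf_path))
        print(f"{pdf_path.name}: {result['pages_processed']} pages")
    except Exception as exc:  # keep going if a single document fails
        print(f"{pdf_path.name}: failed ({exc})")
```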
## ⚙️ Configuration
### Configuration File
Create a `config.yaml` file:
```yaml
# config.yaml
input_path: ./documents # Input file or directory
output_dir: ./output # Output directory
ocr_model: llava:7b # Ollama model to use
dpi: 300 # Image resolution
max_workers: 4 # Number of worker threads
timeout: 300 # Timeout in seconds
max_retries: 3 # Max retry attempts
log_level: INFO # Logging level
log_file: pdf_processor.log # Log file path
# Image enhancement strategies
enhancement_strategies:
- original # Keep original image
- grayscale # Convert to grayscale
- adaptive_threshold # Apply adaptive thresholding
- contrast_stretch # Stretch contrast
- sharpen # Sharpen image
- denoise # Remove noise
```
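
No loader for `config.yaml` is shown here, but a straightforward approach is to read the file with PyYAML and forward the fields used in the Python API example above (whether the remaining keys such as `timeout` or `log_level` are also valid constructor arguments is an assumption):

```python
import yaml  # requires PyYAML

from pdf_processor.processing.pdf_processor import PDFProcessorConfig

with open("config.yaml", "r", encoding="utf-8") as fh:
    raw = yaml.safe_load(fh)

# Forward only the fields used in the Python API example above.
known_fields = {"input_path", "output_dir", "ocr_model", "dpi", "max_workers"}
config = PDFProcessorConfig(**{k: v for k, v in raw.items() if k in known_fields})
```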
### Environment Variables
```bash
export OLLAMA_HOST="http://localhost:11434"
export OLLAMA_MODEL="llava:7b"
export LOG_LEVEL="DEBUG"
```
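
Inside Python, the same settings can be picked up with `os.getenv`, falling back to the defaults used elsewhere in this README (the fallbacks here are illustrative):

```python
import os

# Read the variables exported above, with illustrative defaults.
ollama_host = os.getenv("OLLAMA_HOST", "http://localhost:11434")
ollama_model = os.getenv("OLLAMA_MODEL", "llava:7b")
log_level = os.getenv("LOG_LEVEL", "INFO")

print(f"Using {ollama_model} at {ollama_host} (log level: {log_level})")
```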
## 🚀 Advanced Usage
### Processing Options
```bash
# Process with specific DPI
python -m pdf_processor --input document.pdf --output output/ --dpi 400
# Limit number of pages to process
python -m pdf_processor --input document.pdf --output output/ --max-pages 10
# Use a specific enhancement strategy
python -m pdf_processor --input document.pdf --output output/ --enhance grayscale
# Process in verbose mode
python -m pdf_processor --input document.pdf --output output/ --verbose
```
### Available Enhancement Strategies
- `original`: Keep original image (fastest)
- `grayscale`: Convert to grayscale (good for text-heavy documents)
- `adaptive_threshold`: Apply adaptive thresholding (good for low-quality scans)
- `contrast_stretch`: Stretch contrast to improve readability
- `sharpen`: Apply sharpening filter
- `denoise`: Remove image noise
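
To give a feel for what these strategies do, here is a rough Pillow-based sketch (this is not the project's `image_enhancement` module; `adaptive_threshold` is omitted because it typically needs OpenCV):

```python
from PIL import Image, ImageFilter, ImageOps

def enhance(image: Image.Image, strategy: str) -> Image.Image:
    """Approximate the named enhancement strategies with Pillow."""
    if strategy == "original":
        return image
    if strategy == "grayscale":
        return ImageOps.grayscale(image)
    if strategy == "contrast_stretch":
        return ImageOps.autocontrast(image)
    if strategy == "sharpen":
        return image.filter(ImageFilter.SHARPEN)
    if strategy == "denoise":
        return image.filter(ImageFilter.MedianFilter(size=3))
    raise ValueError(f"Unknown strategy: {strategy}")

page = Image.open("page_001.png")  # hypothetical rendered page
enhance(page, "grayscale").save("page_001_grayscale.png")
```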
## 🛠️ Development
### Project Structure
```
pdf_processor/
├── __init__.py              # Package initialization
├── cli.py                   # Command-line interface
├── config/                  # Configuration files
├── models/                  # Data models
│   ├── __init__.py
│   ├── ocr_result.py        # OCR result data structures
│   └── retry_config.py      # Retry configuration
├── processing/              # Core processing modules
│   ├── __init__.py
│   ├── image_enhancement.py # Image processing
│   ├── ocr_processor.py     # OCR processing
│   ├── pdf_processor.py     # Main PDF processing
│   └── svg_generator.py     # SVG output generation
└── utils/                   # Utility functions
    ├── file_utils.py        # File operations
    ├── logging_utils.py     # Logging configuration
    └── validation_utils.py  # Input validation
```
### Running Tests
```bash
# Install test dependencies
pip install -r requirements-dev.txt
# Run all tests
pytest
# Run tests with coverage report
pytest --cov=pdf_processor --cov-report=html
```
## 🤝 Contributing
Contributions are welcome! Please follow these steps:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

Please ensure your code follows our coding standards and includes appropriate tests.
## 📄 License
This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.
## 📚 Resources
- [Ollama Documentation](https://ollama.ai/docs)
- [PyMuPDF Documentation](https://pymupdf.readthedocs.io/)
- [Pillow Documentation](https://pillow.readthedocs.io/)
## 🙏 Acknowledgments
- The Ollama team for their amazing AI models
- The PyMuPDF team for excellent PDF processing
- All contributors who have helped improve this project
## 🛠️ Development Workflow
This project uses a script-based workflow for development tasks. All scripts are located in the `scripts/` directory and can be run directly or via the Makefile.
### Setup
1. Clone the repository and navigate to the project directory:

   ```bash
   git clone https://github.com/wronai/ocr.git
   cd ocr
   ```

2. Set up the development environment:

   ```bash
   make install-dev
   ```

   This will:
   - Create and activate a virtual environment
   - Install all development dependencies
   - Set up pre-commit hooks
### Common Development Tasks
```bash
# Run tests
make test
# Run tests with coverage
make test-cov
# Format code
make format
# Run linters
make lint
# Start development server
make dev-server
# Build documentation
make docs
make docs-serve # Serve docs locally
```
### Scripts Directory
All development and build scripts are located in the `scripts/` directory. See [scripts/README.md](scripts/README.md) for detailed documentation of each script.
### Docker Development
```bash
# Build Docker image
make docker-build
# Start services with Docker Compose
make docker-run
# Stop services
make docker-stop
```
## 📜 Changelog

See [CHANGELOG.md](CHANGELOG.md) for a list of changes in each version.

## 📊 Viewing Results

After a processing run, for example:

```bash
python proc.py --model llava:7b --workers 4
```

inspect the generated output:

- Open `output/*_complete.svg` in your browser
- Check the details in `output/processing_report.json`
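
The exact schema of `processing_report.json` is not documented here, so the safest way to explore it is to load it and look at the top-level structure:

```python
import json

with open("output/processing_report.json", "r", encoding="utf-8") as fh:
    report = json.load(fh)

# Print the top-level keys (or entry count) without assuming a schema.
if isinstance(report, dict):
    print("Report sections:", sorted(report))
else:
    print("Report entries:", len(report))
```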
## 📚 Documentation

Full documentation is available in the [docs/](docs/) directory:

- [📖 User Guide](docs/user-guide/README.md)
- [⚙️ Installation and Configuration](docs/getting-started/installation.md)
- [🔧 API Reference](docs/api-reference/README.md)
- [❓ Frequently Asked Questions](docs/faq/README.md)
- [👨‍💻 Development and Contributing](docs/development/contributing.md)