pdf-ocr-processor

Name: pdf-ocr-processor
Version: 2.0.3
Home page: https://github.com/your-username/pdf-ocr-processor
Summary: Advanced PDF OCR processing with AI-powered text extraction and selectable text overlays
Upload time: 2025-07-11 21:11:24
Author: PDF OCR Processor Team
Requires Python: >=3.8
License: Apache-2.0
Keywords: pdf, ocr, text-extraction, ai, ollama, document-processing, computer-vision
Requirements: none recorded
# PDF OCR Processor

> Advanced PDF processing with AI-powered OCR, text extraction, and selectable text overlays using Ollama models

[![Python](https://img.shields.io/badge/Python-3.8%2B-blue.svg)](https://python.org)
[![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](LICENSE)
[![Documentation](https://img.shields.io/badge/Docs-GitHub%20Wiki-blueviolet)](https://github.com/wronai/ocr/wiki)
[![Tests](https://github.com/wronai/ocr/actions/workflows/tests.yml/badge.svg)](https://github.com/wronai/ocr/actions)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

## πŸš€ Features

- **AI-Powered OCR** using Ollama models (llava, moondream, etc.)
- **Modular Architecture** with clear separation of concerns
- **Multiple Output Formats**:
  - SVG with selectable text overlays
  - Raw text extraction
  - JSON metadata
- **Image Enhancement** with multiple strategies
- **Robust Error Handling** with configurable retries
- **Parallel Processing** for batch operations
- **CLI Interface** with progress tracking

## πŸ› οΈ System Architecture

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚               PDF OCR Processor                 β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€€
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚ PDF        β”‚ β”‚  β”‚      OCRProcessor       β”‚  β”‚
β”‚  β”‚ Processor  β”œβ”€β”Όβ”€β–Άβ”‚  - Text extraction      β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚  β”‚  - Ollama integration   β”‚  β”‚
β”‚                 β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
β”‚  β”‚ Image      β”‚ β”‚  β”‚      SVG Generator      β”‚  β”‚
β”‚  β”‚ Enhancer   β”œβ”€β”Όβ”€β–Άβ”‚  - Text overlay         β”‚  β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚  β”‚  - Searchable output    β”‚  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
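
In code terms, the flow is: render each PDF page to an image, optionally enhance it, run OCR through an Ollama vision model, and wrap the result in an SVG with a text overlay. Below is a minimal, hypothetical sketch of that flow using PyMuPDF and Pillow; `ocr_with_ollama` is a placeholder, not the package's actual API:

```python
import io

import fitz  # PyMuPDF
from PIL import Image, ImageOps


def render_page(pdf_path: str, page_number: int = 0, dpi: int = 300) -> Image.Image:
    """PDF Processor step: render one page to a PIL image."""
    with fitz.open(pdf_path) as doc:
        pix = doc[page_number].get_pixmap(dpi=dpi)
        return Image.open(io.BytesIO(pix.tobytes("png")))


def enhance(image: Image.Image) -> Image.Image:
    """Image Enhancer step: apply a simple grayscale enhancement."""
    return ImageOps.grayscale(image)


def ocr_with_ollama(image: Image.Image) -> str:
    """OCRProcessor step: placeholder for the Ollama vision-model call."""
    raise NotImplementedError("send the image to your Ollama model here")


def to_svg(image: Image.Image, text: str) -> str:
    """SVG Generator step: bare-bones page with a selectable text overlay."""
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{image.width}" height="{image.height}">'
        f'<text x="0" y="20">{text}</text></svg>'
    )
```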

## πŸ“¦ Installation

### Prerequisites
- Python 3.8+
- [Ollama](https://ollama.ai) (for OCR processing)
- System dependencies:
  ```bash
  # Ubuntu/Debian
  sudo apt-get install -y tesseract-ocr poppler-utils
  
  # macOS
  brew install tesseract poppler
  ```

### Install from source
```bash
# Clone the repository
git clone https://github.com/wronai/ocr.git
cd ocr

# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate  # Linux/macOS

# Install dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt  # For development
```

## 🏁 Quick Start

### Basic Usage
```bash
# Process a single PDF
python -m pdf_processor --input document.pdf --output output/

# Process all PDFs in a directory
python -m pdf_processor --input ./documents --output ./output --model llava:7b

# Show help
python -m pdf_processor --help
```

### Python API
```python
from pdf_processor import PDFProcessor
from pdf_processor.processing.pdf_processor import PDFProcessorConfig

# Configure the processor
config = PDFProcessorConfig(
    input_path="document.pdf",
    output_dir="./output",
    ocr_model="llava:7b",
    dpi=300,
    max_workers=4
)

# Process a document
processor = PDFProcessor(config)
result = processor.process_pdf("document.pdf")
print(f"Processed {result['pages_processed']} pages")
```
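
The same API works for batch runs; here is a minimal sketch, assuming `process_pdf` accepts a file path and returns the `pages_processed` count used above:

```python
from pathlib import Path

from pdf_processor import PDFProcessor
from pdf_processor.processing.pdf_processor import PDFProcessorConfig

# Reuse one processor for every file in the input directory.
config = PDFProcessorConfig(
    input_path="./documents",
    output_dir="./output",
    ocr_model="llava:7b",
    dpi=300,
    max_workers=4
)
processor = PDFProcessor(config)

for pdf_path in sorted(Path("./documents").glob("*.pdf")):
    result = processor.process_pdf(str(pdf_path))
    print(f"{pdf_path.name}: {result['pages_processed']} pages")
```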

## βš™οΈ Configuration

### Configuration File
Create a `config.yaml` file:

```yaml
# config.yaml
input_path: ./documents    # Input file or directory
output_dir: ./output       # Output directory
ocr_model: llava:7b        # Ollama model to use
dpi: 300                   # Image resolution
max_workers: 4             # Number of worker threads
timeout: 300               # Timeout in seconds
max_retries: 3             # Max retry attempts
log_level: INFO            # Logging level
log_file: pdf_processor.log # Log file path

# Image enhancement strategies
enhancement_strategies:
  - original            # Keep original image
  - grayscale           # Convert to grayscale
  - adaptive_threshold  # Apply adaptive thresholding
  - contrast_stretch    # Stretch contrast
  - sharpen             # Sharpen image
  - denoise             # Remove noise
```
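
To drive the Python API from this file, one option is to load the YAML yourself and unpack it into the config object. This is a minimal sketch, assuming `PDFProcessorConfig` accepts every key above as a keyword argument (only the keys from the Quick Start example are confirmed):

```python
import yaml  # PyYAML

from pdf_processor import PDFProcessor
from pdf_processor.processing.pdf_processor import PDFProcessorConfig

with open("config.yaml", "r", encoding="utf-8") as fh:
    settings = yaml.safe_load(fh)

# Unpack the YAML mapping into the config object (keys assumed to match).
config = PDFProcessorConfig(**settings)
processor = PDFProcessor(config)
```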

### Environment Variables

```bash
export OLLAMA_HOST="http://localhost:11434"
export OLLAMA_MODEL="llava:7b"
export LOG_LEVEL="DEBUG"
```
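
A script can read these variables with `os.environ`, falling back to the defaults above when a variable is unset; a minimal sketch:

```python
import os

# Fall back to the documented defaults when a variable is unset.
ollama_host = os.environ.get("OLLAMA_HOST", "http://localhost:11434")
ollama_model = os.environ.get("OLLAMA_MODEL", "llava:7b")
log_level = os.environ.get("LOG_LEVEL", "INFO")

print(f"Using {ollama_model} at {ollama_host} (log level {log_level})")
```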

## πŸš€ Advanced Usage

### Processing Options

```bash
# Process with specific DPI
python -m pdf_processor --input document.pdf --output output/ --dpi 400

# Limit number of pages to process
python -m pdf_processor --input document.pdf --output output/ --max-pages 10

# Use a specific enhancement strategy
python -m pdf_processor --input document.pdf --output output/ --enhance grayscale

# Process in verbose mode
python -m pdf_processor --input document.pdf --output output/ --verbose
```

### Available Enhancement Strategies

- `original`: Keep original image (fastest)
- `grayscale`: Convert to grayscale (good for text-heavy documents)
- `adaptive_threshold`: Apply adaptive thresholding (good for low-quality scans)
- `contrast_stretch`: Stretch contrast to improve readability
- `sharpen`: Apply sharpening filter
- `denoise`: Remove image noise
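
These strategies correspond to common image operations. As a rough illustration, here is what a few of them might look like with Pillow; the package's `image_enhancement` module may implement them differently:

```python
from PIL import Image, ImageFilter, ImageOps

image = Image.open("page.png")

grayscale = ImageOps.grayscale(image)                 # grayscale
stretched = ImageOps.autocontrast(image)              # contrast_stretch
sharpened = image.filter(ImageFilter.SHARPEN)         # sharpen
denoised = image.filter(ImageFilter.MedianFilter(3))  # denoise (median filter)

# A crude stand-in for adaptive_threshold: binarize the grayscale image.
thresholded = grayscale.point(lambda p: 255 if p > 128 else 0)
```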

## πŸ› οΈ Development

### Project Structure

```
pdf_processor/
β”œβ”€β”€ __init__.py          # Package initialization
β”œβ”€β”€ cli.py               # Command-line interface
β”œβ”€β”€ config/              # Configuration files
β”œβ”€β”€ models/              # Data models
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ ocr_result.py    # OCR result data structures
β”‚   └── retry_config.py  # Retry configuration
β”œβ”€β”€ processing/          # Core processing modules
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ image_enhancement.py  # Image processing
β”‚   β”œβ”€β”€ ocr_processor.py      # OCR processing
β”‚   β”œβ”€β”€ pdf_processor.py      # Main PDF processing
β”‚   └── svg_generator.py      # SVG output generation
└── utils/               # Utility functions
    β”œβ”€β”€ file_utils.py    # File operations
    β”œβ”€β”€ logging_utils.py # Logging configuration
    └── validation_utils.py # Input validation
```

### Running Tests

```bash
# Install test dependencies
pip install -r requirements-dev.txt

# Run all tests
pytest

# Run tests with coverage report
pytest --cov=pdf_processor --cov-report=html
```
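
A new test can follow the existing pytest layout; below is a minimal hypothetical sketch that exercises the configuration object (field names are taken from the Quick Start example, and attribute access is an assumption):

```python
# tests/test_config_example.py  (hypothetical)
from pdf_processor.processing.pdf_processor import PDFProcessorConfig


def test_config_keeps_ocr_model():
    config = PDFProcessorConfig(
        input_path="document.pdf",
        output_dir="./output",
        ocr_model="llava:7b",
        dpi=300,
        max_workers=4,
    )
    assert config.ocr_model == "llava:7b"
    assert config.dpi == 300
```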

## 🀝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

Please ensure your code follows our coding standards and includes appropriate tests.

## πŸ“„ License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

## πŸ“š Resources

- [Ollama Documentation](https://ollama.ai/docs)
- [PyMuPDF Documentation](https://pymupdf.readthedocs.io/)
- [Pillow Documentation](https://pillow.readthedocs.io/)

## πŸ™ Acknowledgments

- The Ollama team for their amazing AI models
- The PyMuPDF team for excellent PDF processing
- All contributors who have helped improve this project

## πŸ› οΈ Development Workflow

This project uses a script-based workflow for development tasks. All scripts are located in the `scripts/` directory and can be run directly or via the Makefile.

### Setup

1. Clone the repository and navigate to the project directory:
   ```bash
   git clone https://github.com/wronai/ocr.git
   cd ocr
   ```

2. Set up the development environment:
   ```bash
   make install-dev
   ```
   This will:
   - Create and activate a virtual environment
   - Install all development dependencies
   - Set up pre-commit hooks

### Common Development Tasks

```bash
# Run tests
make test

# Run tests with coverage
make test-cov

# Format code
make format

# Run linters
make lint

# Start development server
make dev-server

# Build documentation
make docs
make docs-serve  # Serve docs locally
```

### Scripts Directory

All development and build scripts are located in the `scripts/` directory. See [scripts/README.md](scripts/README.md) for detailed documentation of each script.

### Docker Development

```bash
# Build Docker image
make docker-build

# Start services with Docker Compose
make docker-run

# Stop services
make docker-stop
```


## πŸ“œ Changelog

See [CHANGELOG.md](CHANGELOG.md) for a list of changes in each version.

## πŸ” Viewing Results

Run the processor, then inspect the output:

```bash
python proc.py --model llava:7b --workers 4
```

- Open `output/*_complete.svg` in your browser
- Check details in `output/processing_report.json`

## πŸ“š Documentation

Full documentation is available in the [docs/](docs/) directory:

- [πŸ“– User Guide](docs/user-guide/README.md)
- [βš™οΈ Installation and Configuration](docs/getting-started/installation.md)
- [πŸ”§ API Reference](docs/api-reference/README.md)
- [❓ Frequently Asked Questions](docs/faq/README.md)
- [πŸ‘¨β€πŸ’» Development and Contributing](docs/development/contributing.md)

            
