| Field | Value |
|-------|-------|
| Name | ai-batch |
| Version | 0.2.0 |
| Summary | Batch processing for Anthropic's Claude API with structured output |
| Author | Agam More |
| License | MIT |
| Requires Python | >=3.12 |
| Upload time | 2025-07-11 04:22:25 |

# AI Batch

Python SDK for **batch processing** with structured output and citation mapping.

- **50% cost savings** via Anthropic's batch API pricing
- **Automatic cost tracking** with token usage and pricing
- **Structured output** with Pydantic models  
- **Field-level citations** map results to source documents
- **Type safety** with full validation

Currently supports Anthropic Claude. OpenAI support coming soon.

## API Reference

- [`batch()`](#batch) - Process message conversations or PDF files
- [`BatchJob`](#batchjob) - Job status and results
- [`BatchManager`](#batchmanager) - Manage large-scale batch processing with parallel execution

## Quick Start

```python
import time

from ai_batch import batch
from pydantic import BaseModel

class Invoice(BaseModel):
    company_name: str
    total_amount: str
    date: str

# Process PDFs with structured output + citations
job = batch(
    files=["invoice1.pdf", "invoice2.pdf", "invoice3.pdf"],
    prompt="Extract the company name, total amount, and date.",
    model="claude-3-5-sonnet-20241022",
    response_model=Invoice,
    enable_citations=True
)

# Wait for completion
while not job.is_complete():
    time.sleep(30)
    
results = job.results()
# Results now contain both data and citations together:
# [{"result": Invoice(...), "citations": {"company_name": [Citation(...)], ...}}, ...]
```
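
The wait loop above polls without an upper bound. Because batch jobs can take a long time, a bounded wait may be more convenient. The following is a minimal sketch that relies only on the `is_complete()` method shown in this README; the helper name `wait_until_complete` and its parameters are hypothetical and not part of ai-batch.

```python
import time
from typing import Callable, Protocol


class SupportsIsComplete(Protocol):
    """Anything exposing is_complete(), such as the job returned by batch()."""

    def is_complete(self) -> bool: ...


def wait_until_complete(
    job: SupportsIsComplete,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 24 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll job.is_complete() until it returns True or the timeout elapses.

    Returns True if the job completed, False if the timeout was reached first.
    """
    deadline = clock() + timeout_seconds
    while not job.is_complete():
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
    return True
```

With this helper, `if wait_until_complete(job): results = job.results()` replaces the open-ended loop. The `sleep` and `clock` parameters exist only so the function can be exercised without real delays.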

## Installation

```bash
pip install ai-batch
```

## Usage

Create a `.env` file in your project root containing your Anthropic API key:

```bash
ANTHROPIC_API_KEY=your-api-key
```
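
A missing key is easier to diagnose before a batch is submitted than after. The sketch below is a hypothetical preflight check (the name `require_api_key` is not part of ai-batch). It only inspects the process environment and does not parse `.env` itself, so it assumes the variable has already been loaded or exported.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Anthropic API key, or raise a clear error if it is unset or blank."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; add it to .env or export it in your shell."
        )
    return key
```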

## API Functions

### batch()

Process multiple message conversations with optional structured output.

```python
from ai_batch import batch
from pydantic import BaseModel

class SpamResult(BaseModel):
    is_spam: bool
    confidence: float
    reason: str

# Process messages
job = batch(
    messages=[
        [{"role": "user", "content": "Is this spam? You've won $1000!"}],
        [{"role": "user", "content": "Meeting at 3pm tomorrow"}],
        [{"role": "user", "content": "URGENT: Click here now!"}]
    ],
    model="claude-3-haiku-20240307",
    response_model=SpamResult
)

# Get results (call after job.is_complete() returns True)
results = job.results()
```

**Response:**
```python
[
    SpamResult(is_spam=True, confidence=0.95, reason="Contains monetary prize claim"),
    SpamResult(is_spam=False, confidence=0.98, reason="Normal meeting reminder"),
    SpamResult(is_spam=True, confidence=0.92, reason="Urgent call-to-action pattern")
]
```
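
Once results come back as model instances, they can be filtered like any other Python objects. The sketch below uses a dataclass stand-in for `SpamResult` so that it runs without ai-batch or an API key; the helper `confident_spam` and its threshold are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class SpamResult:
    """Stand-in for the Pydantic model above, so this sketch runs on its own."""

    is_spam: bool
    confidence: float
    reason: str


def confident_spam(results: list[SpamResult], min_confidence: float = 0.9) -> list[SpamResult]:
    """Keep only results flagged as spam with at least min_confidence."""
    return [r for r in results if r.is_spam and r.confidence >= min_confidence]


results = [
    SpamResult(True, 0.95, "Contains monetary prize claim"),
    SpamResult(False, 0.98, "Normal meeting reminder"),
    SpamResult(True, 0.92, "Urgent call-to-action pattern"),
]

for r in confident_spam(results, min_confidence=0.93):
    print(f"{r.confidence:.2f}  {r.reason}")
# 0.95  Contains monetary prize claim
```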

### batch() with files

Process PDF files with optional structured output and citations.

```python
from ai_batch import batch
from pydantic import BaseModel

class Invoice(BaseModel):
    company_name: str
    total_amount: str
    date: str

# Process PDFs with citations
job = batch(
    files=["invoice1.pdf", "invoice2.pdf"],
    prompt="Extract the company name, total amount, and date.",
    model="claude-3-5-sonnet-20241022",
    response_model=Invoice,
    enable_citations=True
)

results = job.results()
# Results now contain both data and citations together
```

**Response:**
```python
# Each entry pairs the parsed model with its per-field citations
[
    {
        "result": Invoice(company_name="TechCorp Solutions", total_amount="$12,500.00", date="March 15, 2024"),
        "citations": {
            "company_name": [Citation(cited_text="TechCorp Solutions", start_page=1)],
            "total_amount": [Citation(cited_text="TOTAL: $12,500.00", start_page=2)],
            "date": [Citation(cited_text="Date: March 15, 2024", start_page=1)]
        }
    },
    {
        "result": Invoice(company_name="DataFlow Systems", total_amount="$8,750.00", date="March 18, 2024"),
        "citations": {
            "company_name": [Citation(cited_text="DataFlow Systems", start_page=1)],
            "total_amount": [Citation(cited_text="Total Due: $8,750.00", start_page=3)],
            "date": [Citation(cited_text="Invoice Date: March 18, 2024", start_page=1)]
        }
    }
]
```
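
To review an extraction by hand, it can help to print each field next to the text it was drawn from. This is a minimal sketch that assumes only the two `Citation` attributes visible in the response above (`cited_text` and `start_page`); the `Citation` dataclass here is a stand-in and `describe_citations` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    """Stand-in with only the two attributes shown in the response above."""

    cited_text: str
    start_page: int


def describe_citations(citations: dict[str, list[Citation]]) -> list[str]:
    """Render one 'field: "text" (p. N)' line per citation, in field order."""
    lines = []
    for field, cites in citations.items():
        for c in cites:
            lines.append(f'{field}: "{c.cited_text}" (p. {c.start_page})')
    return lines


entry_citations = {
    "company_name": [Citation("TechCorp Solutions", 1)],
    "total_amount": [Citation("TOTAL: $12,500.00", 2)],
}
for line in describe_citations(entry_citations):
    print(line)
# company_name: "TechCorp Solutions" (p. 1)
# total_amount: "TOTAL: $12,500.00" (p. 2)
```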

### BatchJob

The job object returned by `batch()`.

```python
# Check completion status
if job.is_complete():
    results = job.results()

# Get processing statistics with cost tracking
stats = job.stats(print_stats=True)
# Output:
# 📊 Batch Statistics
#    ID: msgbatch_01BPtdnmEwxtaDcdJ2eUsq4T
#    Status: ended
#    Complete: ✅
#    Elapsed: 41.8s
#    Mode: Text + Citations
#    Results: 0
#    Citations: 0
#    Input tokens: 2,117
#    Output tokens: 81
#    Total cost: $0.0038
#    (50% batch discount applied)

# Citations are now included in results (if enabled)
# Access via: results[0]["citations"]

# Save raw API responses
job = batch(..., raw_results_dir="./raw_responses")
```

### BatchManager

Manage large-scale batch processing with automatic job splitting, parallel execution, state persistence, and cost management.

```python
from ai_batch import BatchManager
from pydantic import BaseModel

class Invoice(BaseModel):
    company_name: str
    total_amount: float
    invoice_number: str

# Initialize BatchManager for large-scale processing
manager = BatchManager(
    files=["invoice1.pdf", "invoice2.pdf", ...],  # 100+ files
    prompt="Extract invoice data",
    model="claude-3-5-sonnet-20241022",
    response_model=Invoice,
    enable_citations=True,
    items_per_job=10,      # Process 10 files per job
    max_parallel_jobs=5,   # 5 jobs in parallel
    max_cost=50.0,         # Stop if cost exceeds $50
    state_path="batch_state.json",  # Auto-resume capability
    save_results_dir="results/"     # Save results to disk
)

# Run processing (auto-resumes if interrupted)
summary = manager.run(print_progress=True)

# Retry failed items
if summary['failed_items'] > 0:
    retry_summary = manager.retry_failed()

# Get statistics
stats = manager.stats
print(f"Completed: {stats['completed_items']}/{stats['total_items']}")
print(f"Total cost: ${stats['total_cost']:.2f}")

# Load results from disk
results = manager.get_results_from_disk()
```
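
The `state_path` argument above indicates that progress is written to a JSON file so an interrupted run can resume. The general pattern can be sketched as follows. This is not ai-batch's implementation: the `{"done": [...]}` file layout and the function names are assumptions made for illustration.

```python
import json
import os
import tempfile


def load_done(state_path: str) -> set[str]:
    """Return the set of items already recorded as done, or an empty set."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path, encoding="utf-8") as fh:
        return set(json.load(fh)["done"])


def save_done(state_path: str, done: set[str]) -> None:
    """Write the state atomically: dump to a temp file, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(state_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"done": sorted(done)}, fh)
    os.replace(tmp_path, state_path)


def run_resumable(items: list[str], state_path: str, process) -> list[str]:
    """Process only items not yet recorded; return the items handled in this run."""
    done = load_done(state_path)
    handled = []
    for item in items:
        if item in done:
            continue  # finished in an earlier run
        process(item)
        done.add(item)
        save_done(state_path, done)  # persist after every item
        handled.append(item)
    return handled
```

Saving after each item costs extra writes but means a crash loses at most the item in flight. Writing to a temporary file and renaming it avoids leaving a half-written state file behind.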

**Key Features:**
- **Automatic job splitting**: Breaks large batches into smaller chunks
- **Parallel processing**: Multiple jobs run concurrently with ThreadPoolExecutor
- **State persistence**: Resume from interruptions with JSON state files
- **Cost management**: Stop processing when budget limit is reached
- **Progress monitoring**: Real-time progress updates with statistics
- **Retry mechanism**: Easily retry failed items
- **Result saving**: Organized directory structure for results

## Citations

Citations work in two modes depending on whether you use structured output:

### 1. Text + Citations (Flat List)

When `enable_citations=True` without a response model, citations are returned as a flat list:

```python
job = batch(
    files=["document.pdf"],
    prompt="Summarize the key findings",
    enable_citations=True
)

results = job.results()   # List of {"result": str, "citations": List[Citation]}

# Example result structure:
[
    {
        "result": "Summary text...",
        "citations": [
            Citation(cited_text="AI reduces errors by 30%", start_page=2),
            Citation(cited_text="Implementation cost: $50,000", start_page=5)
        ]
    }
]
```
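
A flat citation list is often easier to check against the source PDF when grouped by page. The following sketch assumes only the `cited_text` and `start_page` attributes shown above; the `Citation` dataclass is a stand-in and `citations_by_page` is a hypothetical helper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Citation:
    """Stand-in with only the two attributes shown in the example above."""

    cited_text: str
    start_page: int


def citations_by_page(citations: list[Citation]) -> dict[int, list[str]]:
    """Group cited text by the page it starts on, with pages in ascending order."""
    pages: dict[int, list[str]] = defaultdict(list)
    for c in citations:
        pages[c.start_page].append(c.cited_text)
    return dict(sorted(pages.items()))


flat = [
    Citation("Implementation cost: $50,000", 5),
    Citation("AI reduces errors by 30%", 2),
]
print(citations_by_page(flat))
# {2: ['AI reduces errors by 30%'], 5: ['Implementation cost: $50,000']}
```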

### 2. Structured + Field Citations (Mapping)

When using both `response_model` and `enable_citations=True`, citations are mapped to specific fields:

```python
job = batch(
    files=["document.pdf"],
    prompt="Extract the data",
    response_model=MyModel,
    enable_citations=True
)

results = job.results()   # List of {"result": Model, "citations": Dict[str, List[Citation]]}

# Example result structure:
[
    {
        "result": MyModel(title="Annual Report 2024", revenue="$1.2M", growth="25%"),
        "citations": {
            "title": [Citation(cited_text="Annual Report 2024", start_page=1)],
            "revenue": [Citation(cited_text="Revenue: $1.2M", start_page=3)],
            "growth": [Citation(cited_text="YoY Growth: 25%", start_page=3)]
        }
    }
]
```

The field mapping allows you to trace exactly which part of the source document was used to populate each field in your structured output.
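
Since citations are keyed by field name, it is straightforward to flag fields that came back without any supporting citation. The helper below is a hypothetical sketch, not an ai-batch function. With Pydantic v2, the field names could be taken from `MyModel.model_fields`; here they are passed in explicitly so the example runs on its own.

```python
def uncited_fields(field_names: list[str], citations: dict[str, list]) -> list[str]:
    """Return the field names that have no citation, preserving the given order."""
    return [name for name in field_names if not citations.get(name)]


citations = {"title": ["Annual Report 2024"], "revenue": []}
print(uncited_fields(["title", "revenue", "growth"], citations))
# ['revenue', 'growth']
```

A field counts as uncited both when its key is missing and when its list is empty.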

### Robust Citation Parsing

AI Batch uses proper JSON parsing for citation field mapping, ensuring reliability with complex JSON structures:

**Handles Complex Scenarios:**
- ✅ Escaped quotes in JSON values: `"name": "John \"The Great\" Doe"`
- ✅ URLs with colons: `"website": "http://example.com:8080"`
- ✅ Nested objects and arrays: `"metadata": {"nested": {"deep": "value"}}`
- ✅ Multi-line strings and special characters
- ✅ Fields with numbers/underscores: `user_name`, `age_2`

**Previous Limitations (Fixed):**
The old regex-based approach would fail on complex JSON patterns. The new JSON parser reliably handles any valid JSON structure that Claude produces, making citation mapping robust for production use.
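
The difference is easy to reproduce. The sketch below compares a deliberately naive key/value regex with `json.loads` on input containing escaped quotes and a URL with colons. The regex is a hypothetical illustration of the failure mode and is not the pattern ai-batch previously used.

```python
import json
import re

# Valid JSON with escaped quotes in one value and colons in another.
raw = r'{"name": "John \"The Great\" Doe", "website": "http://example.com:8080"}'

# Naive pattern: a quoted key, a colon, then a quoted value with no quotes inside.
NAIVE_PAIR = re.compile(r'"(\w+)":\s*"([^"]*)"')


def fields_by_regex(text: str) -> dict[str, str]:
    """Extract string fields with the naive pattern; stops at the first inner quote."""
    return dict(NAIVE_PAIR.findall(text))


def fields_by_json(text: str) -> dict:
    """Extract fields with a real JSON parser, which understands escapes."""
    return json.loads(text)


print(fields_by_regex(raw)["name"])  # John \
print(fields_by_json(raw)["name"])   # John "The Great" Doe
```

The regex cuts the value off at the first escaped quote, while the JSON parser returns the full string.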

## Cost Tracking

AI Batch automatically tracks token usage and costs for all batch operations:

```python
from ai_batch import batch

job = batch(
    messages=[...],
    model="claude-3-5-sonnet-20241022"
)

# Get cost information
stats = job.stats()
print(f"Total cost: ${stats['total_cost']:.4f}")
print(f"Input tokens: {stats['total_input_tokens']:,}")
print(f"Output tokens: {stats['total_output_tokens']:,}")

# Or print formatted statistics
job.stats(print_stats=True)
```
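
The arithmetic behind a cost figure can be checked by hand. The sketch below is not the library's calculation. It assumes list prices of $3.00 per million input tokens and $15.00 per million output tokens (an assumption about Claude 3.5 Sonnet pricing that may be out of date) and applies the 50% batch discount mentioned above. `estimate_batch_cost` is a hypothetical helper.

```python
def estimate_batch_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
    batch_discount: float = 0.5,
) -> float:
    """Estimate the dollar cost of a batch job from token counts and list prices."""
    full_price = (
        input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    ) / 1_000_000
    return full_price * (1.0 - batch_discount)


# Token counts from the sample statistics in the BatchJob section.
# Prices are assumed list prices, not values taken from ai-batch.
cost = estimate_batch_cost(2_117, 81, input_price_per_mtok=3.00, output_price_per_mtok=15.00)
print(f"${cost:.4f}")
# $0.0038
```

Under these assumed prices the estimate reproduces the `$0.0038` total shown in the sample `job.stats()` output.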

## Example Scripts

- [`examples/spam_detection.py`](examples/spam_detection.py) - Email classification
- [`examples/pdf_extraction.py`](examples/pdf_extraction.py) - PDF data extraction
- [`examples/citation_example.py`](examples/citation_example.py) - Basic citation usage
- [`examples/citation_with_pydantic.py`](examples/citation_with_pydantic.py) - Structured output with citations
- [`examples/batch_manager_example.py`](examples/batch_manager_example.py) - Large-scale batch processing with BatchManager

## Limitations

- Citation mapping only works with flat Pydantic models (no nested models)
- OpenAI is not supported yet
- PDFs work best with Opus or Sonnet models
- Batch jobs can take up to 24 hours to process
- Use `job.is_complete()` to check status before getting results
- Citations may not be available in all batch API responses

## Comparison with Alternatives

| Feature | ai-batch | LangChain | Instructor | PydanticAI |
|---------|----------|-----------|------------|------------|
| **Batch Requests** | ✅ Native (50% cost savings) | ❌ No native batch API | ✅ Via OpenAI Batch API ([#1092](https://github.com/instructor-ai/instructor/issues/1092)) | ⚠️ Planned ([#1771](https://github.com/pydantic/pydantic-ai/issues/1771)) |
| **Structured Output** | ✅ Full support | ✅ Via parsers | ✅ Core feature | ✅ Native |
| **PDF File Input** | ✅ Native support | ✅ Via document loaders | ✅ Via multimodal models | ✅ Via file handling |
| **Citation Mapping** | ✅ Field-level citations | ❌ Manual implementation | ❌ Manual implementation | ❌ Manual implementation |
| **Cost Tracking** | ✅ Automatic with tokencost | ❌ Manual implementation | ❌ Manual implementation | ❌ Manual implementation |
| **Cost Limits** | ✅ max_cost parameter | ❌ Manual implementation | ❌ Manual implementation | ❌ Manual implementation |
| **Batch Providers** | 1/2 (Anthropic; OpenAI planned) | 0/2 | 1/2 (OpenAI only) | 0/2 |
| **Focus** | Streamlined batch requests | General LLM orchestration | Structured outputs CLI | Agent framework |

## License

MIT

## Todos

- [x] ~~Add pricing metadata and max_spend controls~~ (Cost tracking implemented)
- [x] ~~Auto batch manager (parallel batches, retry, spend control)~~ (BatchManager implemented)
- [ ] Test mode to run on 1% sample before full batch
- [ ] Quick batch - split into smaller chunks for faster results
- [ ] Support text/other file types (not just PDFs)
- [ ] Support for OpenAI
            
