# LLM Batch Helper
[PyPI](https://badge.fury.io/py/llm_batch_helper)
[Downloads](https://pepy.tech/project/llm_batch_helper)
[Documentation](https://llm-batch-helper.readthedocs.io/en/latest/?badge=latest)
[License: MIT](https://opensource.org/licenses/MIT)
A Python package that enables batch submission of prompts to LLM APIs, with built-in async capabilities, response caching, prompt verification, and more. This package is designed to streamline applications like LLM simulation, LLM-as-a-judge, and other batch processing scenarios.
📖 **[Complete Documentation](https://llm-batch-helper.readthedocs.io/)** | 🚀 **[Quick Start Guide](https://llm-batch-helper.readthedocs.io/en/latest/quickstart.html)**
## Why we designed this package
Imagine you have 5,000 prompts to send to an LLM. Running them sequentially can be painfully slow, sometimes taking hours or even days. Worse, if the process fails midway, you're forced to start all over again. We've struggled with this exact frustration, which is why we built this package to tackle these pain points directly:
1. **Efficient Batch Processing**: How do you run LLM calls in batches efficiently? Our async implementation is 3x-100x faster than multi-thread/multi-process approaches; in our experience, it has cut a 24-hour job down to about 10 minutes.
2. **API Reliability**: LLM APIs can be unstable, so we need robust retry mechanisms when calls get interrupted.
3. **Long-Running Simulations**: During long-running LLM simulations, computers can crash and APIs can fail. Can we cache LLM API calls to avoid repeating completed work?
4. **Output Validation**: LLM outputs often have format requirements. If the output isn't right, we need to retry with validation.
This package is designed to solve these exact pain points with async processing, intelligent caching, and comprehensive error handling. If you need additional features, please open an issue.
## Features
- **🚀 Dramatic Speed Improvements**: **10-100x faster** than sequential processing ([see demo](https://github.com/TianyiPeng/LLM_batch_helper/blob/main/tutorials/performance_comparison_tutorial.ipynb))
- **⚡ Async Processing**: Submit multiple prompts concurrently for maximum throughput
- **💾 Smart Caching**: Automatically cache responses and resume interrupted work seamlessly
- **📝 Multiple Input Formats**: Support for strings, tuples, dictionaries, and file-based prompts
- **🌐 Multi-Provider Support**: Works with OpenAI (all models), OpenRouter (100+ models), and Together.ai
- **🔄 Intelligent Retry Logic**: Built-in retry mechanism with exponential backoff and detailed logging
- **✅ Quality Control**: Custom verification callbacks for response validation
- **📊 Progress Tracking**: Real-time progress bars and comprehensive statistics
- **🎯 Simplified API**: No async/await complexity - works seamlessly in Jupyter notebooks (v0.3.0+)
- **🔧 Tunable Performance**: Adjust concurrency on the fly for the best trade-off between speed and rate limits
## Installation
```bash
# Install from PyPI
pip install llm_batch_helper
```
## Quick Start
### 1. Set up environment variables
**Option A: Environment Variables**
```bash
# For OpenAI (all OpenAI models including GPT-5)
export OPENAI_API_KEY="your-openai-api-key"
# For OpenRouter (100+ models - Recommended)
export OPENROUTER_API_KEY="your-openrouter-api-key"
# For Together.ai
export TOGETHER_API_KEY="your-together-api-key"
```
**Option B: .env File (Recommended for Development)**
Create a `.env` file in your project:
```
OPENAI_API_KEY=your-openai-api-key
```
```python
# In your script, before importing llm_batch_helper
from dotenv import load_dotenv
load_dotenv() # Load from .env file
# Then use the package normally
from llm_batch_helper import LLMConfig, process_prompts_batch
```
### 2. Interactive Tutorials (Recommended)
**🎯 NEW: Performance Comparison Tutorial**
See the dramatic speed improvements! Our [Performance Comparison Tutorial](https://github.com/TianyiPeng/LLM_batch_helper/blob/main/tutorials/performance_comparison_tutorial.ipynb) demonstrates:
- **10-100x speedup** vs naive sequential processing
- Processing **5,000 prompts** in minutes instead of hours
- **Smart caching** that lets you resume interrupted work
- **Tunable concurrency** for optimal performance
**📚 Complete Feature Tutorial**
Check out the comprehensive [main tutorial](https://github.com/TianyiPeng/LLM_batch_helper/blob/main/tutorials/llm_batch_helper_tutorial.ipynb) covering all features with interactive examples!
### 3. Basic usage
```python
from dotenv import load_dotenv # Optional: for .env file support
from llm_batch_helper import LLMConfig, process_prompts_batch
# Optional: Load environment variables from .env file
load_dotenv()
# Create configuration
config = LLMConfig(
    model_name="gpt-4o-mini",
    temperature=1.0,
    max_completion_tokens=100,
    max_concurrent_requests=100  # Number of concurrent asyncio requests; this largely determines
                                 # how fast the pipeline runs. Use as large a value as possible
                                 # (e.g., 300) while staying within your provider's rate limits.
)

# Process prompts
prompts = [
    "What is the capital of France?",
    "What is 2+2?",
    "Who wrote 'Hamlet'?"
]

results = process_prompts_batch(
    config=config,
    provider="openai",
    prompts=prompts,
    cache_dir="cache"
)

# Print results
for prompt_id, response in results.items():
    print(f"{prompt_id}: {response['response_text']}")
```
**🎉 New in v0.3.0**: `process_prompts_batch` now handles async operations **implicitly** - no more async/await syntax needed! Works seamlessly in Jupyter notebooks.
### 4. Multiple Input Formats
The package supports three different input formats for maximum flexibility:
```python
from llm_batch_helper import LLMConfig, process_prompts_batch
config = LLMConfig(
    model_name="gpt-4o-mini",
    temperature=1.0,
    max_completion_tokens=100
)

# Mix different input formats in the same list
prompts = [
    # String format - ID will be auto-generated from hash
    "What is the capital of France?",

    # Tuple format - (custom_id, prompt_text)
    ("custom_id_1", "What is 2+2?"),

    # Dictionary format - {"id": custom_id, "text": prompt_text}
    {"id": "shakespeare_q", "text": "Who wrote 'Hamlet'?"},
    {"id": "science_q", "text": "Explain photosynthesis briefly."}
]

results = process_prompts_batch(
    config=config,
    provider="openai",
    prompts=prompts,
    cache_dir="cache"
)

# Print results with custom IDs
for prompt_id, response in results.items():
    print(f"{prompt_id}: {response['response_text']}")
```
**Input Format Requirements:**
- **String**: Plain text prompt (ID auto-generated)
- **Tuple**: `(prompt_id, prompt_text)` - both elements required
- **Dictionary**: `{"id": "prompt_id", "text": "prompt_text"}` - both keys required
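
To make the three formats concrete, here is a minimal sketch of how they could be normalized into `(id, text)` pairs. The `normalize_prompt` helper and the particular hash used for auto-generated IDs are illustrative assumptions, not the package's internal implementation:

```python
import hashlib

def normalize_prompt(prompt):
    """Hypothetical helper: map any supported prompt format to an (id, text) pair."""
    if isinstance(prompt, str):
        # String format: derive a stable ID from a hash of the text (illustrative scheme)
        prompt_id = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]
        return prompt_id, prompt
    if isinstance(prompt, tuple):
        prompt_id, text = prompt  # (prompt_id, prompt_text)
        return prompt_id, text
    if isinstance(prompt, dict):
        return prompt["id"], prompt["text"]  # {"id": ..., "text": ...}
    raise TypeError(f"Unsupported prompt format: {type(prompt)!r}")

for p in ["What is the capital of France?",
          ("custom_id_1", "What is 2+2?"),
          {"id": "shakespeare_q", "text": "Who wrote 'Hamlet'?"}]:
    print(normalize_prompt(p))
```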
### 🔄 Backward Compatibility
For users who prefer the async version or have existing code, the async API is still available:
```python
import asyncio
from llm_batch_helper import process_prompts_batch_async
async def main():
    results = await process_prompts_batch_async(
        prompts=["Hello world!"],
        config=config,  # reuses the LLMConfig from the earlier examples
        provider="openai"
    )
    return results

results = asyncio.run(main())
```
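
Note that `asyncio.run()` cannot be called where an event loop is already running (e.g., inside a Jupyter notebook). With the async API you can `await` the coroutine directly in that case; a minimal sketch, assuming the `config` object from the examples above:

```python
# In a Jupyter/IPython cell, where an event loop is already running,
# await the coroutine directly instead of calling asyncio.run():
results = await process_prompts_batch_async(
    prompts=["Hello world!"],
    config=config,
    provider="openai"
)
```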
## Usage Examples
### OpenRouter (Recommended - 100+ Models)
```python
from llm_batch_helper import LLMConfig, process_prompts_batch
# Access 100+ models through OpenRouter
config = LLMConfig(
    model_name="deepseek/deepseek-v3.1-base",  # or openai/gpt-4o, anthropic/claude-3-5-sonnet
    temperature=1.0,
    max_completion_tokens=500
)

prompts = [
    "Explain quantum computing briefly.",
    "What are the benefits of renewable energy?",
    "How does machine learning work?"
]

results = process_prompts_batch(
    prompts=prompts,
    config=config,
    provider="openrouter"  # Access to 100+ models!
)

for prompt_id, result in results.items():
    print(f"Response: {result['response_text']}")
```
### File-based Prompts
```python
from llm_batch_helper import LLMConfig, process_prompts_batch
config = LLMConfig(
    model_name="gpt-4o-mini",
    temperature=1.0,
    max_completion_tokens=200
)

# Process all .txt files in a directory
results = process_prompts_batch(
    config=config,
    provider="openai",
    input_dir="prompts",  # Directory containing .txt files
    cache_dir="cache",
    force=False  # Use cached responses if available
)
print(f"Processed {len(results)} prompts from files")
```
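
As a minimal sketch of how such a directory might be prepared, assuming each `.txt` file holds a single prompt (the filenames below are illustrative):

```python
from pathlib import Path

# Create the prompts/ directory and write one prompt per .txt file
prompt_dir = Path("prompts")
prompt_dir.mkdir(exist_ok=True)

sample_prompts = {
    "capital_question.txt": "What is the capital of France?",
    "math_question.txt": "What is 2+2?",
}
for filename, text in sample_prompts.items():
    (prompt_dir / filename).write_text(text, encoding="utf-8")
```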
### Custom Verification
```python
from llm_batch_helper import LLMConfig
def verify_response(prompt_id, llm_response_data, original_prompt_text, **kwargs):
    """Custom verification callback."""
    response_text = llm_response_data.get("response_text", "")

    # Check minimum length
    if len(response_text) < kwargs.get("min_length", 10):
        return False

    # Check for specific keywords
    if "error" in response_text.lower():
        return False

    return True

config = LLMConfig(
    model_name="gpt-4o-mini",
    temperature=1.0,
    verification_callback=verify_response,
    verification_callback_args={"min_length": 20}
)
```
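
The config with a verification callback is passed to `process_prompts_batch` like any other; per the design goals above, responses that fail verification are retried rather than returned as-is. A minimal usage sketch (the prompt text is illustrative):

```python
from llm_batch_helper import process_prompts_batch

results = process_prompts_batch(
    config=config,  # the LLMConfig with verification_callback set above
    provider="openai",
    prompts=["Summarize the plot of 'Hamlet' in two sentences."],
    cache_dir="cache"
)
```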
## API Reference
### LLMConfig
Configuration class for LLM requests.
```python
LLMConfig(
    model_name: str,
    temperature: float = 1.0,
    max_completion_tokens: Optional[int] = None,  # Preferred parameter
    max_tokens: Optional[int] = None,             # Deprecated, kept for backward compatibility
    system_instruction: Optional[str] = None,
    max_retries: int = 5,
    max_concurrent_requests: int = 30,
    verification_callback: Optional[Callable] = None,
    verification_callback_args: Optional[Dict] = None
)
```
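
For example, a config that also sets a system instruction and explicit retry/concurrency limits (the values below are illustrative):

```python
from llm_batch_helper import LLMConfig

config = LLMConfig(
    model_name="gpt-4o-mini",
    temperature=0.7,
    max_completion_tokens=256,
    system_instruction="You are a concise assistant. Answer in one sentence.",
    max_retries=5,
    max_concurrent_requests=50,
)
```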
### process_prompts_batch
Main function for batch processing of prompts (async operations handled implicitly).
```python
def process_prompts_batch(
    config: LLMConfig,
    provider: str,  # "openai", "openrouter" (recommended), or "together"
    prompts: Optional[List[str]] = None,
    input_dir: Optional[str] = None,
    cache_dir: str = "llm_cache",
    force: bool = False,
    desc: str = "Processing prompts"
) -> Dict[str, Dict[str, Any]]
```
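
For example, to re-run a batch while ignoring previously cached responses (the values below are illustrative):

```python
from llm_batch_helper import process_prompts_batch

results = process_prompts_batch(
    config=config,  # any LLMConfig from the examples above
    provider="openrouter",
    prompts=["Explain the difference between a list and a tuple in Python."],
    cache_dir="llm_cache",
    force=True,               # ignore cached responses and re-query the API
    desc="Re-running batch"   # label shown while processing
)
```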
### process_prompts_batch_async
Async version for backward compatibility and advanced use cases.
```python
async def process_prompts_batch_async(
    config: LLMConfig,
    provider: str,  # "openai", "openrouter" (recommended), or "together"
    prompts: Optional[List[str]] = None,
    input_dir: Optional[str] = None,
    cache_dir: str = "llm_cache",
    force: bool = False,
    desc: str = "Processing prompts"
) -> Dict[str, Dict[str, Any]]
```
### LLMCache
Caching functionality for responses.
```python
cache = LLMCache(cache_dir="my_cache")
# Check for cached response
cached = cache.get_cached_response(prompt_id)
# Save response to cache
cache.save_response(prompt_id, prompt_text, response_data)
# Clear all cached responses
cache.clear_cache()
```
## Project Structure
```
llm_batch_helper/
├── pyproject.toml          # Poetry configuration
├── poetry.lock             # Locked dependencies
├── README.md               # This file
├── LICENSE                 # License file
├── llm_batch_helper/       # Main package
│   ├── __init__.py         # Package exports
│   ├── cache.py            # Response caching
│   ├── config.py           # Configuration classes
│   ├── providers.py        # LLM provider implementations
│   ├── input_handlers.py   # Input processing utilities
│   └── exceptions.py       # Custom exceptions
├── examples/               # Usage examples
│   ├── example.py          # Basic usage example
│   ├── prompts/            # Sample prompt files
│   └── llm_cache/          # Example cache directory
└── tutorials/              # Interactive tutorials
    ├── llm_batch_helper_tutorial.ipynb        # Comprehensive feature tutorial
    └── performance_comparison_tutorial.ipynb  # Performance demo (NEW!)
```
## Supported Models
### OpenAI
- **All OpenAI models**
### OpenRouter (Recommended - 100+ Models)
- **OpenAI models**: `openai/gpt-4o`, `openai/gpt-4o-mini`
- **Anthropic models**: `anthropic/claude-3-5-sonnet`, `anthropic/claude-3-haiku`
- **DeepSeek models**: `deepseek/deepseek-v3.1-base`, `deepseek/deepseek-chat`
- **Meta models**: `meta-llama/llama-3.1-405b-instruct`
- **Google models**: `google/gemini-pro-1.5`
- **And 90+ more models** from all major providers
### Together.ai
- `meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo`
- `meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo`
- `mistralai/Mixtral-8x7B-Instruct-v0.1`
- And many other open-source models
## Documentation
π **[Complete Documentation](https://llm-batch-helper.readthedocs.io/)** - Comprehensive docs on Read the Docs
### Quick Links:
- [Quick Start Guide](https://llm-batch-helper.readthedocs.io/en/latest/quickstart.html) - Get started quickly
- [API Reference](https://llm-batch-helper.readthedocs.io/en/latest/api.html) - Complete API documentation
- [Examples](https://llm-batch-helper.readthedocs.io/en/latest/examples.html) - Practical usage examples
- [Tutorials](https://llm-batch-helper.readthedocs.io/en/latest/tutorials.html) - Step-by-step tutorials
- [Provider Guide](https://llm-batch-helper.readthedocs.io/en/latest/providers.html) - OpenAI, OpenRouter & Together.ai setup
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Run the test suite
6. Submit a pull request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Changelog
### v0.3.1
- **🔧 Configuration Updates**: Optimized default values for better performance
- Updated `max_retries` from 10 to 5 for faster failure detection
- Updated `max_concurrent_requests` from 5 to 30 for improved batch processing performance
### v0.3.0
- **🎉 Major Update**: Simplified API - async operations handled implicitly, no async/await required!
- **📓 Jupyter Support**: Works seamlessly in notebooks without event loop issues
- **🔍 Detailed Retry Logging**: See exactly what happens during retries with timestamps
- **🔄 Backward Compatibility**: Original async API still available as `process_prompts_batch_async`
- **📚 Updated Examples**: All documentation updated to show simplified usage
- **⚡ Smart Event Loop Handling**: Automatically detects and handles different Python environments
### v0.2.0
- Enhanced API stability
- Improved error handling
- Better documentation
### v0.1.5
- Added Together.ai provider support
- Support for open-source models (Llama, Mixtral, etc.)
- Enhanced documentation with Read the Docs
- Updated examples and tutorials
### v0.1.0
- Initial release
- Support for OpenAI API
- Async batch processing
- Response caching
- File and list-based input support
- Custom verification callbacks
- Poetry package management