**llm-batch-helper** · version 0.1.2 · PyPI package metadata

- Summary: A Python package that enables batch submission of prompts to LLM APIs, with built-in async capabilities and response caching.
- Author: Tianyi Peng
- License: MIT
- Requires Python: >=3.11, <4.0
- Keywords: llm, openai, batch, async, ai, nlp, api
- Repository: https://github.com/TianyiPeng/LLM_batch_helper
- Uploaded: 2025-07-16 17:35:51

# LLM Batch Helper

A Python package that enables batch submission of prompts to LLM APIs, with built-in async capabilities and response caching.

## Features

- **Async Processing**: Submit multiple prompts concurrently for faster processing
- **Response Caching**: Automatically cache responses to avoid redundant API calls
- **Multiple Input Formats**: Support for both file-based and list-based prompts
- **Provider Support**: Works with OpenAI API
- **Retry Logic**: Built-in retry mechanism with exponential backoff (see the sketch after this list)
- **Verification Callbacks**: Custom verification for response quality
- **Progress Tracking**: Real-time progress bars for batch operations
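
The library's retry internals aren't shown in this README; as a rough illustration of the pattern the retry feature describes, here is a minimal exponential-backoff sketch (the names and defaults below are illustrative, not the library's actual API):

```python
import asyncio
import random

async def call_with_backoff(make_request, max_retries=10, base_delay=1.0):
    """Illustrative retry-with-exponential-backoff pattern (not the library's code)."""
    for attempt in range(max_retries):
        try:
            return await make_request()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the last error
            # Wait 1s, 2s, 4s, ... plus random jitter before retrying
            await asyncio.sleep(base_delay * 2**attempt + random.random())
```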

## Installation

### For Users (Recommended)

```bash
# Install from PyPI
pip install llm_batch_helper
```

### For Development

```bash
# Clone the repository
git clone https://github.com/TianyiPeng/LLM_batch_helper.git
cd LLM_batch_helper

# Install with Poetry
poetry install

# Activate the virtual environment
poetry shell
```

## Quick Start

### 1. Set up environment variables

```bash
# For OpenAI
export OPENAI_API_KEY="your-openai-api-key"
```
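
If you'd rather keep the key out of your shell profile, the optional python-dotenv package can load it from a local `.env` file before you use the library:

```python
# Optional: load OPENAI_API_KEY from a .env file
# (requires `pip install python-dotenv`)
from dotenv import load_dotenv

load_dotenv()  # reads .env in the current directory into os.environ
```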

### 2. Interactive Tutorial (Recommended)

Check out the comprehensive Jupyter notebook [tutorial](https://github.com/TianyiPeng/LLM_batch_helper/blob/main/tutorials/llm_batch_helper_tutorial.ipynb).

The tutorial covers all features with interactive examples!

### 3. Basic usage

```python
import asyncio
from llm_batch_helper import LLMConfig, process_prompts_batch

async def main():
    # Create configuration
    config = LLMConfig(
        model_name="gpt-4o-mini",
        temperature=0.7,
        max_tokens=100,
        max_concurrent_requests=30  # number of concurrent requests with asyncio
    )
    
    # Process prompts
    prompts = [
        "What is the capital of France?",
        "What is 2+2?",
        "Who wrote 'Hamlet'?"
    ]
    
    results = await process_prompts_batch(
        config=config,
        provider="openai",
        prompts=prompts,
        cache_dir="cache"
    )
    
    # Print results
    for prompt_id, response in results.items():
        print(f"{prompt_id}: {response['response_text']}")

if __name__ == "__main__":
    asyncio.run(main())
```
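
Each entry in `results` is keyed by a prompt ID and contains at least a `response_text` field, as used above; for list-based prompts the IDs are presumably derived from the prompts themselves (the exact scheme isn't documented here).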

## Usage Examples

### File-based Prompts

```python
import asyncio
from llm_batch_helper import LLMConfig, process_prompts_batch

async def process_files():
    config = LLMConfig(
        model_name="gpt-4o-mini",
        temperature=0.7,
        max_tokens=200
    )
    
    # Process all .txt files in a directory
    results = await process_prompts_batch(
        config=config,
        provider="openai",
        input_dir="prompts",  # Directory containing .txt files
        cache_dir="cache",
        force=False  # Use cached responses if available
    )
    
    return results

asyncio.run(process_files())
```
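
For reference, a quick way to create such a directory of prompt files (assumption: each `.txt` file holds one prompt, and its filename presumably becomes the prompt ID):

```python
from pathlib import Path

# Write one prompt per .txt file into the prompts/ directory
prompt_dir = Path("prompts")
prompt_dir.mkdir(exist_ok=True)
(prompt_dir / "capital.txt").write_text("What is the capital of France?")
(prompt_dir / "hamlet.txt").write_text("Who wrote 'Hamlet'?")
```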

### Custom Verification

```python
from llm_batch_helper import LLMConfig

def verify_response(prompt_id, llm_response_data, original_prompt_text, **kwargs):
    """Custom verification callback"""
    response_text = llm_response_data.get("response_text", "")
    
    # Check minimum length
    if len(response_text) < kwargs.get("min_length", 10):
        return False
    
    # Check for specific keywords
    if "error" in response_text.lower():
        return False
    
    return True

config = LLMConfig(
    model_name="gpt-4o-mini",
    temperature=0.7,
    verification_callback=verify_response,
    verification_callback_args={"min_length": 20}
)
```
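
When the callback returns `False`, the response is treated as failed; presumably the request is then retried under the same `max_retries` budget described above, so a strict callback trades extra API calls for higher-quality responses.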



## API Reference

### LLMConfig

Configuration class for LLM requests.

```python
LLMConfig(
    model_name: str,
    temperature: float = 0.7,
    max_tokens: Optional[int] = None,
    system_instruction: Optional[str] = None,
    max_retries: int = 10,
    max_concurrent_requests: int = 5,
    verification_callback: Optional[Callable] = None,
    verification_callback_args: Optional[Dict] = None
)
```
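
For example, a configuration that adds a system instruction and tighter limits (the values here are illustrative):

```python
config = LLMConfig(
    model_name="gpt-4o-mini",
    temperature=0.2,
    max_tokens=256,
    system_instruction="You are a concise assistant. Answer in one sentence.",
    max_retries=5,
    max_concurrent_requests=10,
)
```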

### process_prompts_batch

Main function for batch processing of prompts.

```python
async def process_prompts_batch(
    config: LLMConfig,
    provider: str,  # "openai"
    prompts: Optional[List[str]] = None,
    input_dir: Optional[str] = None,
    cache_dir: str = "llm_cache",
    force: bool = False,
    desc: str = "Processing prompts"
) -> Dict[str, Dict[str, Any]]
```
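
For example, to bypass the cache and re-query every prompt (run inside an async function):

```python
results = await process_prompts_batch(
    config=config,
    provider="openai",
    prompts=["What is the capital of France?"],
    cache_dir="llm_cache",
    force=True,  # ignore any cached responses and re-query the API
    desc="Re-running prompts",
)
```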

### LLMCache

Caching functionality for responses.

```python
cache = LLMCache(cache_dir="my_cache")

# Check for cached response
cached = cache.get_cached_response(prompt_id)

# Save response to cache
cache.save_response(prompt_id, prompt_text, response_data)

# Clear all cached responses
cache.clear_cache()
```

## Project Structure

```
llm_batch_helper/
├── pyproject.toml              # Poetry configuration
├── poetry.lock                 # Locked dependencies
├── README.md                   # This file
├── LICENSE                     # License file
├── llm_batch_helper/          # Main package
│   ├── __init__.py            # Package exports
│   ├── cache.py               # Response caching
│   ├── config.py              # Configuration classes
│   ├── providers.py           # LLM provider implementations
│   ├── input_handlers.py      # Input processing utilities
│   └── exceptions.py          # Custom exceptions
├── examples/                   # Usage examples
│   ├── example.py             # Basic usage example
│   ├── prompts/               # Sample prompt files
│   └── llm_cache/             # Example cache directory
└── tutorials/                 # Interactive tutorials
    └── llm_batch_helper_tutorial.ipynb  # Comprehensive Jupyter notebook tutorial
```

## Supported Models

### OpenAI
- gpt-4o-mini
- gpt-4o
- gpt-4
- gpt-3.5-turbo

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Run the test suite
6. Submit a pull request

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Changelog

### v0.1.0
- Initial release
- Support for OpenAI API
- Async batch processing
- Response caching
- File and list-based input support
- Custom verification callbacks
- Poetry package management