async-llm-handler

Name: async-llm-handler
Version: 0.2.0
Home page: None
Summary: An asynchronous handler for multiple LLM APIs
Upload time: 2024-08-03 00:22:29
Maintainer: None
Docs URL: None
Author: None
Requires Python: >=3.7
License: None
Keywords: api, async, llm, nlp
Requirements: No requirements were recorded.
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.
# Async LLM Handler

Async LLM Handler is a Python package that provides a unified asynchronous interface for multiple large language model (LLM) APIs. It currently supports the Gemini, Claude, and OpenAI APIs.

## Features

- Asynchronous API calls
- Support for multiple LLM providers:
  - Gemini (model: `gemini_flash`)
  - Claude (models: `claude_3_5_sonnet`, `claude_3_haiku`)
  - OpenAI (models: `gpt_4o`, `gpt_4o_mini`)
- Automatic rate limiting for each API
- Token counting and prompt clipping utilities

## Installation

Install the Async LLM Handler using pip:

```bash
pip install async-llm-handler
```

## Configuration

Before using the package, set up your environment variables in a `.env` file in your project's root directory:

```
GEMINI_API_KEY=your_gemini_api_key
CLAUDE_API_KEY=your_claude_api_key
OPENAI_API_KEY=your_openai_api_key
```
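
The handler is expected to read these keys from the environment. If your process does not load `.env` files automatically, one option is to load the file yourself before creating the handler; this is a sketch that assumes the separate `python-dotenv` package is installed (it is not a dependency declared by this package):

```python
# Sketch: load the .env file explicitly with python-dotenv (an extra install,
# not a dependency declared by async-llm-handler).
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory into os.environ

# Quick sanity check that the keys are visible to the process.
for key in ("GEMINI_API_KEY", "CLAUDE_API_KEY", "OPENAI_API_KEY"):
    if not os.getenv(key):
        print(f"Warning: {key} is not set")
```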

## Usage

### Basic Usage

```python
import asyncio
from async_llm_handler import Handler

async def main():
    handler = Handler()

    # Using the default model
    response = await handler.query("What is the capital of France?")
    print(response)

    # Specifying a model
    response = await handler.query("Explain quantum computing", model="claude_3_5_sonnet")
    print(response)

asyncio.run(main())
```

### Advanced Usage

#### Using Multiple Models Concurrently

```python
import asyncio
from async_llm_handler import Handler

async def main():
    handler = Handler()
    prompt = "Explain the theory of relativity"
    
    tasks = [
        handler.query(prompt, model='gemini_flash'),
        handler.query(prompt, model='gpt_4o'),
        handler.query(prompt, model='claude_3_5_sonnet')
    ]
    
    responses = await asyncio.gather(*tasks)
    
    for model, response in zip(['Gemini Flash', 'GPT-4o', 'Claude 3.5 Sonnet'], responses):
        print(f"Response from {model}:")
        print(response)
        print()

asyncio.run(main())
```
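
With plain `asyncio.gather`, the first provider that raises an exception propagates that error and the other responses are discarded. If you would rather collect per-provider failures alongside successful responses, `return_exceptions=True` is one way to do it; the sketch below reuses the example above:

```python
import asyncio
from async_llm_handler import Handler

async def main():
    handler = Handler()
    prompt = "Explain the theory of relativity"
    models = ['gemini_flash', 'gpt_4o', 'claude_3_5_sonnet']

    # return_exceptions=True keeps one provider's failure from discarding the rest.
    results = await asyncio.gather(
        *(handler.query(prompt, model=m) for m in models),
        return_exceptions=True,
    )

    for model, result in zip(models, results):
        if isinstance(result, Exception):
            print(f"{model} failed: {result!r}")
        else:
            print(f"Response from {model}:")
            print(result)
            print()

asyncio.run(main())
```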

#### Limiting Input and Output Tokens

```python
import asyncio
from async_llm_handler import Handler

async def main():
    handler = Handler()

    long_prompt = "Provide a detailed explanation of the entire history of artificial intelligence, including all major milestones and breakthroughs."

    response = await handler.query(long_prompt, model="gpt_4o", max_input_tokens=1000, max_output_tokens=500)
    print(response)

asyncio.run(main())
```

### Supported Models

The package supports the following models:

1. Gemini:
   - `gemini_flash`

2. Claude:
   - `claude_3_5_sonnet`
   - `claude_3_haiku`

3. OpenAI:
   - `gpt_4o`
   - `gpt_4o_mini`

You can specify these models using the `model` parameter in the `query` method.

### Error Handling

The package uses custom exceptions for error handling. Wrap your API calls in try-except blocks to handle potential errors:

```python
import asyncio
from async_llm_handler import Handler
from async_llm_handler.exceptions import LLMAPIError, RateLimitTimeoutError

async def main():
    handler = Handler()

    try:
        response = await handler.query("What is the meaning of life?", model="gpt_4o")
        print(response)
    except LLMAPIError as e:
        print(f"An API error occurred: {e}")
    except RateLimitTimeoutError as e:
        print(f"Rate limit exceeded: {e}")

asyncio.run(main())
```
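
If you would rather retry after a rate-limit timeout than give up, a small backoff loop around `query` is one option. This is a sketch of that pattern, not behaviour built into the package:

```python
import asyncio
from async_llm_handler import Handler
from async_llm_handler.exceptions import RateLimitTimeoutError

async def query_with_retry(handler, prompt, model, attempts=3, base_delay=2.0):
    """Retry on rate-limit timeouts with exponential backoff (illustrative pattern)."""
    for attempt in range(attempts):
        try:
            return await handler.query(prompt, model=model)
        except RateLimitTimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error
            await asyncio.sleep(base_delay * 2 ** attempt)  # 2s, 4s, 8s, ...

async def main():
    handler = Handler()
    response = await query_with_retry(handler, "What is the meaning of life?", model="gpt_4o")
    print(response)

asyncio.run(main())
```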

### Rate Limiting

The package automatically handles rate limiting for each API. The current rate limits are:

- Gemini Flash: 30 requests per minute
- Claude 3.5 Sonnet: 5 requests per minute
- Claude 3 Haiku: 5 requests per minute
- GPT-4o: 5 requests per minute
- GPT-4o mini: 5 requests per minute

If you exceed these limits, the package will automatically wait before making the next request.
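
The handler applies these limits for you, so no extra code is needed in normal use. Purely as an illustration of the waiting behaviour described above (not the package's actual implementation), a per-model sliding-window limiter can be sketched like this:

```python
import asyncio
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` calls per `window` seconds (illustration only)."""

    def __init__(self, max_requests, window=60.0):
        self.max_requests = max_requests
        self.window = window
        self._timestamps = deque()
        self._lock = asyncio.Lock()

    async def acquire(self):
        async with self._lock:
            now = time.monotonic()
            # Discard requests that have already left the window.
            while self._timestamps and now - self._timestamps[0] >= self.window:
                self._timestamps.popleft()
            if len(self._timestamps) >= self.max_requests:
                # Sleep until the oldest request leaves the window, then free its slot.
                await asyncio.sleep(self.window - (now - self._timestamps[0]))
                self._timestamps.popleft()
            self._timestamps.append(time.monotonic())

# Usage idea: limiter = SlidingWindowLimiter(30)   # 30 requests per minute
# then `await limiter.acquire()` before each API call.
```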

## Utility Functions

The package includes utility functions for token counting and prompt clipping:

```python
from async_llm_handler.utils import count_tokens, clip_prompt

text = "This is a sample text for token counting."
token_count = count_tokens(text)
print(f"Token count: {token_count}")

long_text = "This is a very long text that needs to be clipped..." * 100
clipped_text = clip_prompt(long_text, max_tokens=50)
print(f"Clipped text: {clipped_text}")
```

These utilities use the `cl100k_base` encoding by default, the tokenizer used by OpenAI's GPT-4 and GPT-3.5 models; for other providers the resulting counts are an approximation.
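
If you need matching counts outside this package, the same encoding is available directly through `tiktoken`; a rough equivalent of the counting step (assuming `tiktoken` is installed):

```python
# Rough equivalent of count_tokens using tiktoken directly
# (tiktoken is assumed to be installed; it is not imported from this package).
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")
text = "This is a sample text for token counting."
print(f"Token count: {len(encoding.encode(text))}")
```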

## Logging

The package uses Python's built-in logging module. You can configure logging in your application to see debug information, warnings, and errors from the Async LLM Handler:

```python
import logging

logging.basicConfig(level=logging.INFO)
```

This displays INFO-level messages and above from the Async LLM Handler, along with output from any other library that logs to the root logger.
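
To change the verbosity of this package without affecting the rest of your application, you can target its logger directly. The logger name below assumes the package logs under its module name, which is the usual convention but is not confirmed here:

```python
import logging

logging.basicConfig(level=logging.WARNING)  # keep other libraries quiet
# Assumed logger name: the package's module name ("async_llm_handler").
logging.getLogger("async_llm_handler").setLevel(logging.DEBUG)
```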

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License.



            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "async-llm-handler",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "api, async, llm, nlp",
    "author": null,
    "author_email": "Bryan Nsoh <bryan.anye.5@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/eb/30/904348605a9997982b47ec2ca7c91dedbbd38e2f272081dcce64fb78e3bd/async_llm_handler-0.2.0.tar.gz",
    "platform": null,
    "description": "# Async LLM Handler\n\nAsync LLM Handler is a Python package that provides a unified interface for interacting with multiple Language Model APIs asynchronously. It currently supports Gemini, Claude, and OpenAI APIs.\n\n## Features\n\n- Asynchronous API calls\n- Support for multiple LLM providers:\n  - Gemini (model: gemini_flash)\n  - Claude (models: claude_3_5_sonnet, claude_3_haiku)\n  - OpenAI (models: gpt_4o, gpt_4o_mini)\n- Automatic rate limiting for each API\n- Token counting and prompt clipping utilities\n\n## Installation\n\nInstall the Async LLM Handler using pip:\n\n```bash\npip install async-llm-handler\n```\n\n## Configuration\n\nBefore using the package, set up your environment variables in a `.env` file in your project's root directory:\n\n```\nGEMINI_API_KEY=your_gemini_api_key\nCLAUDE_API_KEY=your_claude_api_key\nOPENAI_API_KEY=your_openai_api_key\n```\n\n## Usage\n\n### Basic Usage\n\n```python\nimport asyncio\nfrom async_llm_handler import Handler\n\nasync def main():\n    handler = Handler()\n\n    # Using the default model\n    response = await handler.query(\"What is the capital of France?\")\n    print(response)\n\n    # Specifying a model\n    response = await handler.query(\"Explain quantum computing\", model=\"claude_3_5_sonnet\")\n    print(response)\n\nasyncio.run(main())\n```\n\n### Advanced Usage\n\n#### Using Multiple Models Concurrently\n\n```python\nimport asyncio\nfrom async_llm_handler import Handler\n\nasync def main():\n    handler = Handler()\n    prompt = \"Explain the theory of relativity\"\n    \n    tasks = [\n        handler.query(prompt, model='gemini_flash'),\n        handler.query(prompt, model='gpt_4o'),\n        handler.query(prompt, model='claude_3_5_sonnet')\n    ]\n    \n    responses = await asyncio.gather(*tasks)\n    \n    for model, response in zip(['Gemini Flash', 'GPT-4o', 'Claude 3.5 Sonnet'], responses):\n        print(f\"Response from {model}:\")\n        print(response)\n        print()\n\nasyncio.run(main())\n```\n\n#### Limiting Input and Output Tokens\n\n```python\nimport asyncio\nfrom async_llm_handler import Handler\n\nasync def main():\n    handler = Handler()\n\n    long_prompt = \"Provide a detailed explanation of the entire history of artificial intelligence, including all major milestones and breakthroughs.\"\n\n    response = await handler.query(long_prompt, model=\"gpt_4o\", max_input_tokens=1000, max_output_tokens=500)\n    print(response)\n\nasyncio.run(main())\n```\n\n### Supported Models\n\nThe package supports the following models:\n\n1. Gemini:\n   - `gemini_flash`\n\n2. Claude:\n   - `claude_3_5_sonnet`\n   - `claude_3_haiku`\n\n3. OpenAI:\n   - `gpt_4o`\n   - `gpt_4o_mini`\n\nYou can specify these models using the `model` parameter in the `query` method.\n\n### Error Handling\n\nThe package uses custom exceptions for error handling. Wrap your API calls in try-except blocks to handle potential errors:\n\n```python\nimport asyncio\nfrom async_llm_handler import Handler\nfrom async_llm_handler.exceptions import LLMAPIError, RateLimitTimeoutError\n\nasync def main():\n    handler = Handler()\n\n    try:\n        response = await handler.query(\"What is the meaning of life?\", model=\"gpt_4o\")\n        print(response)\n    except LLMAPIError as e:\n        print(f\"An API error occurred: {e}\")\n    except RateLimitTimeoutError as e:\n        print(f\"Rate limit exceeded: {e}\")\n\nasyncio.run(main())\n```\n\n### Rate Limiting\n\nThe package automatically handles rate limiting for each API. 
The current rate limits are:\n\n- Gemini Flash: 30 requests per minute\n- Claude 3.5 Sonnet: 5 requests per minute\n- Claude 3 Haiku: 5 requests per minute\n- GPT-4o: 5 requests per minute\n- GPT-4o mini: 5 requests per minute\n\nIf you exceed these limits, the package will automatically wait before making the next request.\n\n## Utility Functions\n\nThe package includes utility functions for token counting and prompt clipping:\n\n```python\nfrom async_llm_handler.utils import count_tokens, clip_prompt\n\ntext = \"This is a sample text for token counting.\"\ntoken_count = count_tokens(text)\nprint(f\"Token count: {token_count}\")\n\nlong_text = \"This is a very long text that needs to be clipped...\" * 100\nclipped_text = clip_prompt(long_text, max_tokens=50)\nprint(f\"Clipped text: {clipped_text}\")\n```\n\nThese utilities use the `cl100k_base` encoding by default, which is suitable for most modern language models.\n\n## Logging\n\nThe package uses Python's built-in logging module. You can configure logging in your application to see debug information, warnings, and errors from the Async LLM Handler:\n\n```python\nimport logging\n\nlogging.basicConfig(level=logging.INFO)\n```\n\nThis will display INFO level logs and above from the Async LLM Handler.\n\n## Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request.\n\n## License\n\nThis project is licensed under the MIT License.\n\n\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "An asynchronous handler for multiple LLM APIs",
    "version": "0.2.0",
    "project_urls": {
        "Homepage": "https://github.com/BryanNsoh/async_llm_handler"
    },
    "split_keywords": [
        "api",
        " async",
        " llm",
        " nlp"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ae2923ea6a5be4c7ab2910ec8705d5504f33845b03c013a8fdc829b55ce47eda",
                "md5": "b535a479086d881ed2ebed45a476c8c3",
                "sha256": "32c1edd4e86862787d25ebaaa28bf1186bc26ba3b7920ef0acce29329e0f8f45"
            },
            "downloads": -1,
            "filename": "async_llm_handler-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b535a479086d881ed2ebed45a476c8c3",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 11602,
            "upload_time": "2024-08-03T00:22:27",
            "upload_time_iso_8601": "2024-08-03T00:22:27.911361Z",
            "url": "https://files.pythonhosted.org/packages/ae/29/23ea6a5be4c7ab2910ec8705d5504f33845b03c013a8fdc829b55ce47eda/async_llm_handler-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "eb30904348605a9997982b47ec2ca7c91dedbbd38e2f272081dcce64fb78e3bd",
                "md5": "a0d19ba157361277f3e29158ed666988",
                "sha256": "244f25db0316cfffe7efdcf519cd3a8cd5dcb544bffbbb64d504f0e3ba827abe"
            },
            "downloads": -1,
            "filename": "async_llm_handler-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "a0d19ba157361277f3e29158ed666988",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 12349,
            "upload_time": "2024-08-03T00:22:29",
            "upload_time_iso_8601": "2024-08-03T00:22:29.535942Z",
            "url": "https://files.pythonhosted.org/packages/eb/30/904348605a9997982b47ec2ca7c91dedbbd38e2f272081dcce64fb78e3bd/async_llm_handler-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-03 00:22:29",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "BryanNsoh",
    "github_project": "async_llm_handler",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "async-llm-handler"
}
        