chat-limiter

Name: chat-limiter
Version: 0.6.0
Summary: A Pythonic rate limiter for OpenAI, Anthropic, and OpenRouter APIs
Upload time: 2025-07-17 15:33:09
Requires Python: >=3.10
Keywords: anthropic, api, openai, openrouter, rate-limiting
# chat-limiter

A Pythonic rate limiter for OpenAI, Anthropic, and OpenRouter APIs that provides a high-level chat completion interface with automatic rate limit management.

## Features

- 🚀 **High-Level Chat Interface**: OpenAI/Anthropic-style chat completion methods
- 📡 **Automatic Rate Limit Discovery**: Fetches current limits from API response headers
- ⚡ **Sync & Async Support**: Use with `async/await` or synchronous code
- 📦 **Batch Processing**: Process multiple requests efficiently with concurrency control
- 🔄 **Intelligent Retry Logic**: Exponential backoff with provider-specific optimizations
- 🌐 **Multi-Provider Support**: Works seamlessly with OpenAI, Anthropic, and OpenRouter
- 🎯 **Pythonic Design**: Context manager interface with proper error handling
- 🛡️ **Fully Tested**: Comprehensive test suite with 93% coverage
- 🔧 **Token Estimation**: Basic token counting for better rate limit management
- 🔑 **Environment Variable Support**: Automatic API key detection from env vars
- 🔀 **Provider Override**: Manually specify provider for custom models

## Installation

```bash
pip install chat-limiter
```

Or with uv:

```bash
uv add chat-limiter
```

## Quick Start

### High-Level Chat Completion Interface (Recommended)

```python
import asyncio
from chat_limiter import ChatLimiter, Message, MessageRole

async def main():
    # Auto-detect provider and use environment variable for API key
    async with ChatLimiter.for_model("gpt-4o") as limiter:
        response = await limiter.chat_completion(
            model="gpt-4o",
            messages=[Message(role=MessageRole.USER, content="Hello!")]
        )
        print(response.choices[0].message.content)

    # Or provide API key explicitly
    async with ChatLimiter.for_model("claude-3-5-sonnet-20241022", api_key="sk-ant-...") as limiter:
        response = await limiter.simple_chat(
            model="claude-3-5-sonnet-20241022",
            prompt="What is Python?",
            max_tokens=100
        )
        print(response)

asyncio.run(main())
```

### Environment Variables

Set your API keys as environment variables:

```bash
export OPENAI_API_KEY="sk-your-openai-key"
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"  
export OPENROUTER_API_KEY="sk-or-your-openrouter-key"
```

The library will automatically detect the provider from the model name and use the appropriate environment variable.
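
Conceptually, the key lookup is a mapping from the detected provider to one of the environment variables above. The sketch below is only an illustration of that idea; `resolve_api_key` is a hypothetical helper, not part of the public API:

```python
import os

# Hypothetical helper mirroring the documented environment variables;
# chat-limiter performs an equivalent lookup internally.
ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}

def resolve_api_key(provider: str) -> str:
    """Return the API key for a provider, or raise if the env var is unset."""
    var = ENV_VARS[provider]
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"Set {var} or pass api_key= to ChatLimiter.for_model()")
    return key
```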

### Provider Override

For custom models or when auto-detection fails:

```python
async with ChatLimiter.for_model(
    "custom-model-name",
    provider="openai",  # or "anthropic", "openrouter"
    api_key="sk-key"
) as limiter:
    response = await limiter.chat_completion(
        model="custom-model-name",
        messages=[Message(role=MessageRole.USER, content="Hello!")]
    )
```

### Synchronous Usage

```python
from chat_limiter import ChatLimiter, Message, MessageRole

with ChatLimiter.for_model("gpt-4o") as limiter:
    response = limiter.chat_completion_sync(
        model="gpt-4o",
        messages=[Message(role=MessageRole.USER, content="Hello!")]
    )
    print(response.choices[0].message.content)

    # Or use the simple interface
    text_response = limiter.simple_chat_sync(
        model="gpt-4o",
        prompt="What is the capital of France?",
        max_tokens=50
    )
    print(text_response)
```

### Batch Processing with High-Level Interface

```python
import asyncio
from chat_limiter import (
    ChatLimiter, 
    Message, 
    MessageRole, 
    ChatCompletionRequest,
    process_chat_completion_batch,
    create_chat_completion_requests,
    BatchConfig
)

async def batch_example():
    # Create requests from simple prompts
    requests = create_chat_completion_requests(
        model="gpt-4o",
        prompts=["Hello!", "How are you?", "What is Python?"],
        max_tokens=50,
        temperature=0.7
    )
    
    async with ChatLimiter.for_model("gpt-4o") as limiter:
        # Process with custom configuration
        config = BatchConfig(
            max_concurrent_requests=5,
            max_retries_per_item=3,
            group_by_model=True
        )
        
        results = await process_chat_completion_batch(limiter, requests, config)
        
        # Extract successful responses
        for result in results:
            if result.success:
                response = result.result
                print(response.choices[0].message.content)

asyncio.run(batch_example())
```

## Provider Support

### Auto-Detection from Model Names

The library automatically detects providers based on model names:

- **OpenAI**: `gpt-4o`, `gpt-4o-mini`, `gpt-3.5-turbo`, etc.
- **Anthropic**: `claude-3-5-sonnet-20241022`, `claude-3-haiku-20240307`, etc.
- **OpenRouter**: `openai/gpt-4o`, `anthropic/claude-3-sonnet`, etc.
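
A reasonable mental model is a prefix check on the model name, with `provider/model` strings routed to OpenRouter. The helper below is a simplified illustration of that heuristic, not the library's actual detection code:

```python
def detect_provider(model: str) -> str:
    """Guess the provider from a model name (simplified illustration)."""
    if "/" in model:                 # e.g. "openai/gpt-4o", "anthropic/claude-3-sonnet"
        return "openrouter"
    if model.startswith("gpt-"):     # e.g. "gpt-4o", "gpt-3.5-turbo"
        return "openai"
    if model.startswith("claude"):   # e.g. "claude-3-5-sonnet-20241022"
        return "anthropic"
    raise ValueError(f"Cannot infer provider for {model!r}; pass provider= explicitly")
```

When detection fails or you use a custom model name, specify the provider explicitly (see Provider Override above).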

### Provider-Specific Features

**OpenAI**
- ✅ Automatic header parsing (`x-ratelimit-*`)
- ✅ Request and token rate limiting
- ✅ Exponential backoff with jitter
- ✅ Model-specific optimizations

**Anthropic**
- ✅ Claude-specific headers (`anthropic-ratelimit-*`)
- ✅ Separate input/output token tracking
- ✅ System message handling
- ✅ Retry-after header support

**OpenRouter**
- ✅ Multi-model proxy support
- ✅ Dynamic limit discovery
- ✅ Model-specific rate adjustments
- ✅ Credit-based limiting

## Advanced Usage

### Low-Level Interface

For advanced users who need direct HTTP access:

```python
from chat_limiter import ChatLimiter, Provider

async with ChatLimiter(
    provider=Provider.OPENAI,
    api_key="sk-your-key"
) as limiter:
    # Direct HTTP requests
    response = await limiter.request(
        "POST", "/chat/completions",
        json={
            "model": "gpt-4o",
            "messages": [{"role": "user", "content": "Hello!"}]
        }
    )
    
    result = response.json()
    print(result["choices"][0]["message"]["content"])
```

### Custom HTTP Clients

```python
import httpx
from chat_limiter import ChatLimiter, Message, MessageRole

# Use custom HTTP client
custom_client = httpx.AsyncClient(
    timeout=httpx.Timeout(60.0),
    headers={"Custom-Header": "value"}
)

async with ChatLimiter.for_model(
    "gpt-4o",
    http_client=custom_client
) as limiter:
    response = await limiter.chat_completion(
        model="gpt-4o",
        messages=[Message(role=MessageRole.USER, content="Hello!")]
    )
```

### Provider Configuration

```python
from chat_limiter import ChatLimiter, Message, MessageRole, Provider, ProviderConfig

# Custom provider configuration
config = ProviderConfig(
    provider=Provider.OPENAI,
    base_url="https://api.openai.com/v1",
    default_request_limit=100,
    default_token_limit=50000,
    max_retries=5,
    base_backoff=2.0,
    request_buffer_ratio=0.8  # Use 80% of limits
)

async with ChatLimiter(config=config, api_key="sk-key") as limiter:
    response = await limiter.chat_completion(
        model="gpt-4o",
        messages=[Message(role=MessageRole.USER, content="Hello!")]
    )
```

### Error Handling

```python
from chat_limiter import ChatLimiter, Message, MessageRole
from tenacity import RetryError
import httpx

async with ChatLimiter.for_model("gpt-4o") as limiter:
    try:
        response = await limiter.chat_completion(
            model="gpt-4o",
            messages=[Message(role=MessageRole.USER, content="Hello!")]
        )
    except RetryError as e:
        print(f"Request failed after retries: {e}")
    except httpx.HTTPStatusError as e:
        print(f"HTTP error: {e.response.status_code}")
    except httpx.RequestError as e:
        print(f"Request error: {e}")
```

### Monitoring and Metrics

```python
async with ChatLimiter.for_model("gpt-4o") as limiter:
    # Make some requests...
    await limiter.chat_completion(
        model="gpt-4o",
        messages=[Message(role=MessageRole.USER, content="Hello!")]
    )
    
    # Check current limits and usage
    limits = limiter.get_current_limits()
    print(f"Requests used: {limits['requests_used']}/{limits['request_limit']}")
    print(f"Tokens used: {limits['tokens_used']}/{limits['token_limit']}")
    
    # Reset usage tracking
    limiter.reset_usage_tracking()
```

## Message Types and Parameters

### Message Structure

```python
from chat_limiter import Message, MessageRole

messages = [
    Message(role=MessageRole.SYSTEM, content="You are a helpful assistant."),
    Message(role=MessageRole.USER, content="Hello!"),
    Message(role=MessageRole.ASSISTANT, content="Hi there!"),
    Message(role=MessageRole.USER, content="How are you?")
]
```

### Chat Completion Parameters

```python
response = await limiter.chat_completion(
    model="gpt-4o",
    messages=messages,
    max_tokens=100,           # Maximum tokens to generate
    temperature=0.7,          # Sampling temperature (0-2)
    top_p=0.9,               # Top-p sampling
    stop=["END"],            # Stop sequences
    stream=False,            # Streaming response
    frequency_penalty=0.0,   # Frequency penalty (-2 to 2)
    presence_penalty=0.0,    # Presence penalty (-2 to 2)
    top_k=40,               # Top-k sampling (Anthropic/OpenRouter)
)
```

## Batch Processing

### Simple Batch Processing

```python
from chat_limiter import ChatLimiter, create_chat_completion_requests, process_chat_completion_batch

# Create requests from prompts
requests = create_chat_completion_requests(
    model="gpt-4o",
    prompts=["Question 1", "Question 2", "Question 3"],
    max_tokens=50
)

async with ChatLimiter.for_model("gpt-4o") as limiter:
    results = await process_chat_completion_batch(limiter, requests)
    
    # Process results
    for result in results:
        if result.success:
            print(result.result.choices[0].message.content)
        else:
            print(f"Error: {result.error}")
```

### Batch Configuration

```python
from chat_limiter import BatchConfig

config = BatchConfig(
    max_concurrent_requests=10,     # Concurrent request limit
    max_workers=4,                  # Thread pool size for sync
    max_retries_per_item=3,         # Retries per failed item
    retry_delay=1.0,                # Base retry delay
    stop_on_first_error=False,      # Continue on individual failures
    group_by_model=True,            # Group requests by model
    adaptive_batch_size=True        # Adapt batch size to rate limits
)
```

## Rate Limiting Details

### How It Works

1. **Header Parsing**: Automatically extracts rate limit information from API response headers
2. **Token Bucket Algorithm**: Uses PyrateLimiter for smooth rate limiting with burst support
3. **Adaptive Limits**: Updates limits based on server responses in real-time
4. **Intelligent Queuing**: Coordinates requests to stay under limits while maximizing throughput
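
To make the header-parsing step concrete, the snippet below reads the documented OpenAI `x-ratelimit-*` headers from an `httpx` response. The `parse_openai_limits` helper is an illustration of the idea, not the library's internal implementation; Anthropic exposes the same kind of information under `anthropic-ratelimit-*` header names.

```python
import httpx

def parse_openai_limits(response: httpx.Response) -> dict[str, int]:
    """Read request/token limits from OpenAI-style x-ratelimit-* headers."""
    h = response.headers
    return {
        "request_limit": int(h.get("x-ratelimit-limit-requests", 0)),
        "requests_remaining": int(h.get("x-ratelimit-remaining-requests", 0)),
        "token_limit": int(h.get("x-ratelimit-limit-tokens", 0)),
        "tokens_remaining": int(h.get("x-ratelimit-remaining-tokens", 0)),
    }
```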

### Provider-Specific Behavior

| Provider   | Request Limits | Token Limits | Dynamic Discovery | Special Features |
|------------|---------------|--------------|-------------------|------------------|
| OpenAI     | ✅ RPM        | ✅ TPM       | ✅ Headers        | Model detection, batch optimization |
| Anthropic  | ✅ RPM        | ✅ Input/Output TPM | ✅ Headers | Tier handling, system messages |
| OpenRouter | ✅ RPM        | ✅ TPM       | ✅ Auth endpoint  | Multi-model, credit tracking |

## Testing

The library includes a comprehensive test suite:

```bash
# Run tests
uv run pytest

# Run with coverage
uv run pytest --cov=chat_limiter

# Run specific test file
uv run pytest tests/test_high_level_interface.py -v
```

## Development

```bash
# Clone the repository
git clone https://github.com/your-repo/chat-limiter.git
cd chat-limiter

# Install dependencies
uv sync --group dev

# Run linting
uv run ruff check src/ tests/

# Run type checking
uv run mypy src/

# Format code
uv run ruff format src/ tests/
```

## Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Run the test suite and linting
5. Submit a pull request

## License

MIT License - see LICENSE file for details.

## Changelog

### 0.2.0 (Latest)

- 🚀 **High-level chat completion interface** - OpenAI/Anthropic-style methods
- 🔑 **Environment variable support** - Automatic API key detection
- 🔀 **Provider override** - Manual provider specification for custom models
- 📦 **Enhanced batch processing** - High-level batch operations with ChatCompletionRequest
- 🎯 **Unified message types** - Cross-provider message and response compatibility
- 🧪 **Improved testing** - 93% test coverage with comprehensive high-level interface tests

### 0.1.0 (Initial Release)

- Multi-provider support (OpenAI, Anthropic, OpenRouter)
- Async and sync interfaces
- Batch processing with concurrency control
- Automatic rate limit discovery
- Comprehensive test suite
- Type hints and documentation
            
