modelbridge 1.0.2 (PyPI)

- Summary: Enterprise-Grade Multi-Provider LLM Gateway with Intelligent Routing
- Home page: https://github.com/code-mohanprakash/modelbridge
- Author: Mohan Prakash
- License: MIT
- Requires Python: >=3.8
- Keywords: llm, ai, gateway, routing, openai, anthropic, google, groq, multi-provider, load-balancing
- Uploaded: 2025-08-08 05:00:03
            # ModelBridge

[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![PyPI version](https://badge.fury.io/py/modelbridge.svg)](https://badge.fury.io/py/modelbridge)

Unified Python interface for multiple LLM providers with automatic failover, caching, and rate limiting.

## Installation

```bash
pip install modelbridge
```

## Quick Start

```python
import asyncio
from modelbridge import ModelBridge

async def main():
    # Initialize with API keys from environment variables
    bridge = ModelBridge()
    await bridge.initialize()
    
    # Generate text using automatic provider selection
    response = await bridge.generate_text(
        prompt="Explain the concept of recursion",
        model="balanced"  # Uses optimal model based on availability
    )
    
    print(response.content)

asyncio.run(main())
```

## Features

### Core Capabilities
- **Multi-provider support**: OpenAI, Anthropic, Google Gemini, Groq
- **Automatic failover**: Seamlessly switches providers on failure
- **Response caching**: Redis and in-memory caching with configurable TTL
- **Rate limiting**: Token bucket and sliding window algorithms
- **Circuit breakers**: Prevents cascading failures with exponential backoff
- **Cost tracking**: Per-request and cumulative cost calculation
- **Performance monitoring**: Response time tracking and success rate metrics

### Supported Models

#### OpenAI (GPT-5 Family - August 2025)
- `gpt-5`: Best for coding & agents, state-of-the-art reasoning
- `gpt-5-mini`: Balanced performance and cost
- `gpt-5-nano`: Ultra-fast, cheapest option ($0.05/1M input tokens)
- `gpt-4.1`, `gpt-4.1-mini`: Better instruction following
- `o3`, `o3-mini`, `o4-mini`: Advanced reasoning models

#### Anthropic (Claude 4 Series)
- `claude-opus-4-1`: Most capable, best for complex analysis
- `claude-opus-4`: Previous Opus version
- `claude-sonnet-4`: Balanced performance and cost
- `claude-3-5-sonnet-20241022`: Legacy but excellent performance

#### Google (Gemini 2.5 Series)
- `gemini-2.5-pro`: 1M context, multimodal capabilities
- `gemini-2.5-flash`: 250+ tokens/second, ultra-fast
- `gemini-2.5-flash-lite`: $0.10/1M tokens, most cost-effective
- `gemini-1.5-pro`, `gemini-1.5-flash`: Legacy models

#### Groq (Ultra-Fast Inference)
- `llama-3.3-70b-versatile`: 276 tokens/second, latest Meta model
- `mixtral-8x7b-32768`: 500+ tokens/second, fastest option in this lineup
- `llama-3.1-405b-reasoning`: Exceptional reasoning capability
- `llama-3.1-8b-instant`: Lightning fast for simple tasks

## Configuration

### Environment Variables

```bash
# Required API Keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."
export GROQ_API_KEY="gsk_..."

# Optional Configuration
export REDIS_URL="redis://localhost:6379"
export MODEL_BRIDGE_CACHE_TTL="3600"
export MODEL_BRIDGE_MAX_RETRIES="3"
```
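
The keys are read from the environment when the bridge initializes. As a pre-flight check you can verify which provider keys are actually set before calling `initialize()`; this is a minimal sketch using only the standard library (the key-to-provider mapping is taken from the variables above, and it is not part of modelbridge itself):

```python
import os

# Hypothetical pre-flight check: report which provider keys are present.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "groq": "GROQ_API_KEY",
}

available = [name for name, var in PROVIDER_KEYS.items() if os.environ.get(var)]
missing = [var for var in PROVIDER_KEYS.values() if not os.environ.get(var)]

print(f"Providers with keys set: {available}")
if missing:
    print(f"Keys not set (those providers are presumably unavailable): {missing}")
```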

### Programmatic Configuration

```python
from modelbridge import ModelBridge

bridge = ModelBridge()
await bridge.initialize()

# Custom configuration per request
response = await bridge.generate_text(
    prompt="Your prompt here",
    model="gpt-5",
    temperature=0.7,
    max_tokens=1000,
    system_message="You are a helpful assistant"
)
```

## Model Aliases

Pre-configured routing strategies for common use cases:

```python
# Quality-first routing
response = await bridge.generate_text(prompt="...", model="best")
# Routes: gpt-5 → claude-opus-4-1 → gpt-4.1

# Speed-optimized routing  
response = await bridge.generate_text(prompt="...", model="fastest")
# Routes: mixtral-8x7b → llama-3.3-70b → gpt-5-nano

# Cost-optimized routing
response = await bridge.generate_text(prompt="...", model="cheapest")
# Routes: gpt-5-nano → llama-3.1-8b → gemini-2.5-flash-lite

# Balanced routing
response = await bridge.generate_text(prompt="...", model="balanced")
# Routes: gpt-5-mini → claude-sonnet-4 → llama-3.3-70b
```

## Advanced Usage

### Structured Output

```python
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "skills": {"type": "array", "items": {"type": "string"}}
    },
    "required": ["name", "age"]
}

response = await bridge.generate_structured_output(
    prompt="Extract person information from: John Doe, 30 years old, knows Python and JavaScript",
    schema=schema,
    model="gpt-5"
)

import json
data = json.loads(response.content)
# {"name": "John Doe", "age": 30, "skills": ["Python", "JavaScript"]}
```
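
Since `response.content` is a JSON string, you can optionally check the parsed result against the same schema with the third-party `jsonschema` package. This validation step is an illustration added here, not something modelbridge is documented to do for you:

```python
import json
from jsonschema import ValidationError, validate  # pip install jsonschema

data = json.loads(response.content)
try:
    # Re-use the schema passed to generate_structured_output above.
    validate(instance=data, schema=schema)
except ValidationError as exc:
    print(f"Model output did not match the schema: {exc.message}")
```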

### Caching

```python
from modelbridge.cache import CacheFactory

# Configure Redis caching
cache = CacheFactory.create("redis", {
    "url": "redis://localhost:6379",
    "ttl": 3600,
    "namespace": "modelbridge"
})

bridge = ModelBridge(cache=cache)
```

### Rate Limiting

```python
from modelbridge.ratelimit import RateLimitFactory

# Token bucket rate limiting
rate_limiter = RateLimitFactory.create("token_bucket", {
    "capacity": 1000,
    "refill_rate": 100,
    "refill_interval": 60
})

bridge = ModelBridge(rate_limiter=rate_limiter)
```

### Performance Monitoring

```python
# Get performance statistics
stats = bridge.get_performance_stats()
print(f"Average response time: {stats['openai:gpt-5']['avg_response_time']:.2f}s")
print(f"Success rate: {stats['openai:gpt-5']['success_rate']:.2%}")

# Health check all providers
health = await bridge.health_check()
for provider, status in health['providers'].items():
    print(f"{provider}: {status['status']}")
```
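
Because `get_performance_stats()` returns a dictionary keyed by `provider:model`, the entries can be compared side by side. Only the keys shown above (`avg_response_time` and `success_rate`) are assumed here:

```python
stats = bridge.get_performance_stats()

# Sort models by average response time, fastest first.
for key, entry in sorted(stats.items(), key=lambda kv: kv[1]["avg_response_time"]):
    print(f"{key:30s} {entry['avg_response_time']:.2f}s "
          f"{entry['success_rate']:.1%} success")
```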

## Architecture

### Component Overview

```
ModelBridge
├── Providers (OpenAI, Anthropic, Google, Groq)
│   ├── Retry Logic (exponential backoff)
│   ├── Circuit Breakers (failure detection)
│   └── Response Validation
├── Intelligent Router
│   ├── Performance Tracking
│   ├── Cost Optimization
│   └── Capability Matching
├── Cache Layer
│   ├── Redis Cache (distributed)
│   └── Memory Cache (local)
├── Rate Limiter
│   ├── Token Bucket
│   └── Sliding Window
└── Monitoring
    ├── Metrics Collection
    ├── Health Checks
    └── Cost Tracking
```
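
The retry and circuit-breaker boxes above follow the standard patterns: retries back off exponentially, and a circuit opens after repeated failures so a struggling provider is skipped for a cool-down period. A conceptual sketch of that combination (not modelbridge's actual internals):

```python
import asyncio
import random
import time

class SimpleCircuitBreaker:
    """Opens after `threshold` consecutive failures; retries after `cooldown` seconds."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def available(self) -> bool:
        if self.opened_at is None:
            return True
        return time.monotonic() - self.opened_at >= self.cooldown

    def record(self, success: bool) -> None:
        if success:
            self.failures, self.opened_at = 0, None
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()

async def call_with_backoff(fn, breaker: SimpleCircuitBreaker, max_retries: int = 3):
    for attempt in range(max_retries):
        if not breaker.available():
            raise RuntimeError("circuit open: provider temporarily skipped")
        try:
            result = await fn()
            breaker.record(success=True)
            return result
        except Exception:
            breaker.record(success=False)
            # Exponential backoff with jitter: ~1s, ~2s, ~4s, ...
            await asyncio.sleep(2 ** attempt + random.random())
    raise RuntimeError("all retries exhausted")
```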

### Request Flow

1. **Request Reception**: Validate input parameters
2. **Cache Check**: Return cached response if available
3. **Rate Limit Check**: Ensure within rate limits
4. **Provider Selection**: Choose optimal provider based on:
   - Model availability
   - Historical performance
   - Cost constraints
   - Current health status
5. **Request Execution**: Send to provider with retry logic
6. **Response Processing**: Validate and format response
7. **Cache Storage**: Store successful responses
8. **Metrics Update**: Track performance and costs

## Testing

```bash
# Run test suite
python -m pytest tests/

# Run with coverage
python -m pytest tests/ --cov=modelbridge --cov-report=html

# Run specific test categories
python -m pytest tests/test_providers.py
python -m pytest tests/test_routing.py
python -m pytest tests/test_cache.py
```

## Performance Benchmarks

| Provider | Model | Avg Response Time | Tokens/Second | Cost per 1M Tokens |
|----------|-------|------------------|---------------|-------------------|
| Groq | mixtral-8x7b | 0.2s | 500+ | $0.27 |
| Groq | llama-3.3-70b | 0.4s | 276 | $0.59 |
| OpenAI | gpt-5-nano | 0.8s | 150 | $0.05 |
| OpenAI | gpt-5 | 1.2s | 100 | $1.25 |
| Anthropic | claude-opus-4-1 | 1.5s | 80 | $15.00 |

## Error Handling

```python
response = await bridge.generate_text(
    prompt="Your prompt",
    model="gpt-5"
)

if response.error:
    print(f"Error: {response.error}")
    # Automatic fallback already attempted
else:
    print(f"Success: {response.content}")
    print(f"Provider used: {response.provider_name}")
    print(f"Fallback used: {response.metadata.get('fallback_used', False)}")
```
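
`generate_text` is a coroutine, so independent prompts can be dispatched concurrently with `asyncio.gather`, applying the same per-response error check to each result. This assumes the bridge is safe to use concurrently, which is typical for async clients but not stated explicitly in this README:

```python
import asyncio

prompts = [
    "Summarize TCP slow start",
    "Explain what a Bloom filter is",
    "What is a B-tree used for?",
]

responses = await asyncio.gather(
    *(bridge.generate_text(prompt=p, model="balanced") for p in prompts)
)

for prompt, response in zip(prompts, responses):
    status = "error" if response.error else "ok"
    print(f"[{status}] {prompt} -> {response.provider_name}")
```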

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/improvement`)
3. Make your changes
4. Add/update tests
5. Run test suite (`python -m pytest`)
6. Commit changes (`git commit -am 'Add feature'`)
7. Push branch (`git push origin feature/improvement`)
8. Create Pull Request

## License

MIT License - see [LICENSE](LICENSE) file for details.

## Support

- Issues: [GitHub Issues](https://github.com/code-mohanprakash/modelbridge/issues)
- Documentation: [GitHub Wiki](https://github.com/code-mohanprakash/modelbridge/wiki)
- PyPI: [modelbridge](https://pypi.org/project/modelbridge/)

            
