<div align="center">
# Model Router
[CI](https://github.com/mfbatra/model-router/actions)
[License](LICENSE)
</div>
## Quick Start
```python
from model_router.core.container import DIContainer
router = DIContainer.create_router(openai_key="sk-your-key")
print(router.complete("Summarize the request").content)
```
## Installation
```bash
git clone https://github.com/mfbatra/model-router
cd model-router
poetry install
```
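The package is also published on PyPI as `llm-model-router`, so installing the released version with pip should work as well:

```bash
pip install llm-model-router
```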
## Basic Usage Examples
```python
router = DIContainer.create_router(openai_key="sk-test")

response = router.complete("Top 5 databases in 2024", max_cost=0.05)
print(response.content)

chat = router.chat(
    [
        {"role": "user", "content": "Help me deploy a service"},
        {"role": "assistant", "content": "What's the stack?"}
    ]
)
print(chat.content)
```
## Advanced Usage
- **Constraints**: `router.complete(prompt, max_cost=0.1, max_latency=500, min_quality=0.8)`
- **Fallback Chain**: Configure via `RouterConfig(fallback_models=["gpt-3.5", "claude-3"])`
- **Analytics**: Access `router.analytics.get_summary("last_7_days")`
- **Custom Middleware**: Inject `MiddlewareChain([...])` via `DIContainer.create_custom_router`
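A minimal sketch tying these options together. The `RouterConfig` fields, constraint keywords, and `create_custom_router` call mirror the names above, but the exact signatures and import paths are assumptions rather than a verified API:

```python
from model_router.core.container import DIContainer
# from model_router... import RouterConfig, MiddlewareChain  # adjust to the actual module paths

config = RouterConfig(fallback_models=["gpt-3.5", "claude-3"])
router = DIContainer.create_custom_router(
    openai_key="sk-...",
    config=config,
    middleware=MiddlewareChain([]),  # inject custom middleware here
)

# Constrained request: cap cost and latency, require a minimum quality score.
response = router.complete(
    "Draft a short release announcement",
    max_cost=0.1,
    max_latency=500,
    min_quality=0.8,
)
print(response.content)

# Usage analytics for the last week.
print(router.analytics.get_summary("last_7_days"))
```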
## Comparing LLMs & Benchmarking
### Multi-turn chat with guardrails
```python
response = router.chat(
    [
        {"role": "user", "content": "I need to deploy a microservice"},
        {"role": "assistant", "content": "What technology stack are you using?"},
        {"role": "user", "content": "Python FastAPI with PostgreSQL"}
    ],
    max_cost=0.03,
    min_quality=0.75,
)
print(response.content)
```
### Force specific models for side-by-side comparisons
```python
import pandas as pd
from model_router.core.container import DIContainer

router = DIContainer.create_router(
    openai_api_key="sk-...",
    anthropic_api_key="sk-ant-...",
    google_api_key="..."
)

def compare_providers(prompt: str, models: list[str]) -> pd.DataFrame:
    """Compare the same prompt across different models."""
    rows = []
    for model in models:
        response = router.complete(prompt, metadata={"force_model": model})
        rows.append(
            {
                "Model": model,
                "Response": response.content[:100] + "...",
                "Cost": f"${response.cost:.4f}",
                "Latency": f"{response.latency * 1000:.0f}ms",
                "Tokens": response.tokens,
            }
        )
    return pd.DataFrame(rows)

df = compare_providers(
    "Explain quantum computing in simple terms",
    ["gpt-3.5-turbo", "gpt-4-turbo", "claude-sonnet-3.5", "gemini-pro"],
)
print(df.to_markdown(index=False))
```
### Track cost vs latency across a test suite
```python
# Assumes a provider `factory`, per-provider `configs`, and the `Request` type
# are already constructed elsewhere (see the A/B harness below for one way to
# reach them through the router's internals).
def benchmark(prompts: list[str], targets: list[tuple[str, str]]):
    results = {model: {"cost": 0.0, "latency": 0.0, "runs": 0} for _, model in targets}
    for prompt in prompts:
        for provider_key, model_name in targets:
            provider = factory.create(model_name, configs[provider_key])
            response = provider.complete(Request(prompt=prompt))
            stats = results[model_name]
            stats["cost"] += response.cost
            stats["latency"] += response.latency
            stats["runs"] += 1
    for model, stats in results.items():
        stats["avg_cost"] = stats["cost"] / stats["runs"]
        stats["avg_latency"] = stats["latency"] / stats["runs"]
    return results

bench = benchmark(
    [
        "Translate 'hello' to French",
        "Summarize the causes of World War II",
        "Write a Python function that reverses a list",
    ],
    [("openai", "gpt-4o"), ("anthropic", "claude-3")],
)
print(bench)
```
### Simple A/B harness
```python
def ab_test(router, prompt: str, model_a: str, model_b: str, provider_key="openai"):
    provider = router._provider_factory
    config = router._provider_configs[provider_key]
    response_a = provider.create(model_a, config).complete(Request(prompt=prompt))
    response_b = provider.create(model_b, config).complete(Request(prompt=prompt))
    return {
        model_a: {"cost": response_a.cost, "latency": response_a.latency},
        model_b: {"cost": response_b.cost, "latency": response_b.latency},
    }

print(ab_test(router, "Describe a scalable e-commerce platform", "gpt-4o", "gpt-4o-mini"))
```
Use these snippets to build richer reports (DataFrames, Matplotlib charts, or analytics exports via `router.analytics.to_dataframe()`).
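For instance, a rough sketch of turning the analytics export into a per-model spend chart; it assumes `to_dataframe()` returns one row per routed request with `model` and `cost` columns, which the snippets above do not guarantee:

```python
import matplotlib.pyplot as plt

# Assumed shape: one row per request with at least "model" and "cost" columns.
usage = router.analytics.to_dataframe()

spend = usage.groupby("model")["cost"].sum().sort_values(ascending=False)
spend.plot(kind="bar", title="Total spend per model")
plt.ylabel("USD")
plt.tight_layout()
plt.savefig("spend_per_model.png")
```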
## Configuration Options
| Option | Environment variable | Description |
| --- | --- | --- |
| `default_strategy` | `ROUTER_DEFAULT_STRATEGY` | Routing strategy, e.g. `balanced` or `cost_optimized` |
| `enable_analytics` | `ROUTER_ENABLE_ANALYTICS` | Toggle the `UsageTracker` analytics |
| `enable_cache` | `ROUTER_ENABLE_CACHE` | Enable response caching middleware |
| `fallback_models` | `ROUTER_FALLBACK_MODELS` | Comma-separated list of fallback models |
| `max_retries` | `ROUTER_MAX_RETRIES` | Number of provider retry attempts |
| `timeout_seconds` | `ROUTER_TIMEOUT_SECONDS` | Per-request timeout in seconds |
Load from env (`RouterConfig.from_env()`) or JSON/YAML file (`RouterConfig.from_file("config.yaml")`).
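For example, a minimal environment-driven setup using the variables from the table; the value formats shown (booleans as strings, the comma-separated model list) are assumptions:

```python
import os

# Variable names match the table above; the exact accepted formats are assumptions.
os.environ["ROUTER_DEFAULT_STRATEGY"] = "cost_optimized"
os.environ["ROUTER_ENABLE_CACHE"] = "true"
os.environ["ROUTER_FALLBACK_MODELS"] = "gpt-3.5,claude-3"
os.environ["ROUTER_MAX_RETRIES"] = "3"
os.environ["ROUTER_TIMEOUT_SECONDS"] = "30"

config = RouterConfig.from_env()  # RouterConfig import path depends on the package layout
```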
## Architecture Overview
See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for clean architecture diagrams, SOLID mappings, and sequence flows.
## Contributing Guide
1. Fork the repo and create a feature branch.
2. `poetry install && poetry run pytest`
3. Follow the existing coding style (Black, Ruff, Mypy).
4. Add tests for new code and update docs if behavior changes.
5. Open a PR describing changes and testing steps.