llamux

Name: llamux
Version: 0.1.9 ([PyPI](https://pypi.org/project/llamux/))
Summary: A simple llm router
Author: Andrea Pinto
Requires-Python: <4.0,>=3.11
License: MIT
Upload time: 2024-12-12 02:46:39

[![GitHub stars](https://img.shields.io/github/stars/andreakiro/llamux-llm-router?style=social)](https://github.com/andreakiro/llamux-llm-router/stargazers)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![PyPI version](https://img.shields.io/pypi/v/llamux)](https://pypi.org/project/llamux/)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)

# llamux 🦙

A simple router that rotates across your configured LLM endpoints, balancing load and avoiding rate limits. The router selects your preferred provider-model pair based on an implicit preference list, ensuring the token and request limits (per minute, hour, and day) are never exceeded. State persists across sessions, with quotas stored in a local cache.
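To make the selection rule concrete, here is a minimal sketch of the idea, not llamux's actual implementation (`Endpoint`, `pick_endpoint`, and the bookkeeping are hypothetical names): walk the endpoints in preference order and return the first one whose rolling request and token counters are still under every configured limit.

```python
import time

# Illustrative only: a preference-ordered, quota-aware picker mirroring the
# behaviour described above. llamux's real internals may differ.

WINDOWS = {"pm": 60, "ph": 3600, "pd": 86400}  # minute / hour / day, in seconds

class Endpoint:
    def __init__(self, provider: str, model: str, limits: dict[str, int | None]):
        self.provider = provider
        self.model = model
        self.limits = limits  # e.g. {"rpm": 30, "tpm": 60000, ...}; None = unlimited
        self.events: list[tuple[float, int]] = []  # (timestamp, n_tokens) per request

    def record(self, n_tokens: int, now: float | None = None) -> None:
        # Log one request so future quota checks can count it.
        self.events.append((now or time.time(), n_tokens))

    def under_quota(self, now: float) -> bool:
        for suffix, horizon in WINDOWS.items():
            recent = [(t, n) for t, n in self.events if now - t < horizon]
            reqs, toks = len(recent), sum(n for _, n in recent)
            if (lim := self.limits.get("r" + suffix)) is not None and reqs >= lim:
                return False
            if (lim := self.limits.get("t" + suffix)) is not None and toks >= lim:
                return False
        return True

def pick_endpoint(endpoints: list[Endpoint], now: float | None = None) -> Endpoint:
    # First endpoint in the list (i.e. highest preference) with quota to spare.
    now = now or time.time()
    for ep in endpoints:
        if ep.under_quota(now):
            return ep
    raise RuntimeError("all endpoints are over quota")
```

Because the list is walked in order, the row ordering of your endpoints table is exactly the implicit preference list described below.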

# Install

Requires Python 3.11+

```bash
pip install llamux
```

# Usage

You first need to define the list of endpoints you want to allow routing over:

```markdown
$ > endpoints.csv

| provider | model                   | rpm | tpm   | rph   | tph     | rpd   | tpd     |
| -------- | ----------------------- | --- | ----- | ----- | ------- | ----- | ------- |
| cerebras | llama3.3-70b            | 30  | 60000 | 900   | 1000000 | 14400 | 1000000 |
| groq     | llama-3.3-70b-versatile | 30  | 6000  | 14400 |         | 14400 |         |
```

where rpm and tpm are the request and token limits per minute; rph/tph and rpd/tpd are the corresponding limits per hour and per day. **Important note 🔊** Your implicit preference list is given by the ordering of the endpoints in this table. In the example above, Cerebras is always preferred over Groq, as long as its quota limits are not exceeded.
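Note that the markdown table above is shown for readability; `Router.from_csv` presumably expects a plain comma-separated file (an assumption, check the repository for the exact format), along the lines of:

```csv
provider,model,rpm,tpm,rph,tph,rpd,tpd
cerebras,llama3.3-70b,30,60000,900,1000000,14400,1000000
groq,llama-3.3-70b-versatile,30,6000,14400,,14400,
```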

## Use it as a standalone router

```python
import os

from llamux import Router

os.environ["CEREBRAS_API_KEY"] = "sk-..."
os.environ["GROQ_API_KEY"] = "sk-..."

router = Router.from_csv("endpoints.csv")
messages = [{"role": "user", "content": "Hello, how are you?"}]

provider, model, id, props = router.query(messages)
# provider: cerebras, model: llama3.3-70b
```

## Or use it directly as a completion endpoint

```python
from llamux import Router

router = Router.from_csv("endpoints.csv")
messages = [{"role": "user", "content": "hey" * 59999}]

response = router.completion(messages) # calls cerebras
response = router.completion(messages) # calls groq (the first call spent cerebras's token quota!)
```

The above builds on the [litellm](https://github.com/BerriAI/litellm) LLM proxy.
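For reference, a direct litellm call to the first endpoint above looks roughly like this (a sketch of the call the router delegates to; llamux adds the endpoint selection and quota bookkeeping on top):

```python
import os
import litellm

os.environ["CEREBRAS_API_KEY"] = "sk-..."

# One fixed provider-model pair, addressed litellm-style as "provider/model".
response = litellm.completion(
    model="cerebras/llama3.3-70b",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)
```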

## More features

Contributions are welcome :)

- [ ] Add support for speed- and cost-based routing preferences
- [ ] Add other routing strategies (currently preferential ordering only)
- [ ] Decouple the preference list from the table's row ordering

            
