# any-llm-client
A unified and lightweight asynchronous Python API for communicating with LLMs.
Supports multiple providers, including OpenAI Chat Completions API (and any OpenAI-compatible API, such as Ollama and vLLM) and YandexGPT API.
## How To Use
Before you start using any-llm-client, make sure it is installed:
```sh
# with uv
uv add any-llm-client
# or with poetry
poetry add any-llm-client
```
### Response API
Here's a full example that uses Ollama and Qwen2.5-Coder:
```python
import asyncio

import any_llm_client


config = any_llm_client.OpenAIConfig(url="http://127.0.0.1:11434/v1/chat/completions", model_name="qwen2.5-coder:1.5b")


async def main() -> None:
    async with any_llm_client.get_client(config) as client:
        print(await client.request_llm_message("Hey, how's life behind bars?"))


asyncio.run(main())
```
To use `YandexGPT`, replace the config:
```python
import os

config = any_llm_client.YandexGPTConfig(
    auth_header=os.environ["YANDEX_AUTH_HEADER"], folder_id=os.environ["YANDEX_FOLDER_ID"], model_name="yandexgpt"
)
```
### Streaming API
LLMs often take a long time to respond fully. Here's an example of using the streaming API:
```python
import asyncio

import any_llm_client


config = any_llm_client.OpenAIConfig(url="http://127.0.0.1:11434/v1/chat/completions", model_name="qwen2.5-coder:1.5b")


async def main() -> None:
    async with (
        any_llm_client.get_client(config) as client,
        client.stream_llm_message_chunks("Hey, how's life behind bars?") as message_chunks,
    ):
        async for chunk in message_chunks:
            print(chunk, end="", flush=True)


asyncio.run(main())
```
### Passing chat history and temperature
You can pass a list of messages instead of a `str` as the first argument, and set the `temperature`:
```python
async with (
    any_llm_client.get_client(config) as client,
    client.stream_llm_message_chunks(
        messages=[
            any_llm_client.SystemMessage("You are an experienced assistant"),
            any_llm_client.UserMessage("Hey, how's life behind bars?"),
        ],
        temperature=1.0,
    ) as message_chunks,
):
    ...
```
### Other
#### Mock client
You can use a mock client for testing:
```python
config = any_llm_client.MockLLMConfig(
    response_message=...,
    stream_messages=["Hi!"],
)

async with any_llm_client.get_client(config, ...) as client:
    ...
```
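For example, a test can swap in the mock client without changing the code under test. The following is only a sketch: the plain-string `response_message` value and the shape of the returned result are assumptions, not part of the documented API.
```python
import any_llm_client


async def test_my_chat_feature() -> None:
    # Hypothetical values: response_message may need to be a message object
    # rather than a plain string in the real API.
    config = any_llm_client.MockLLMConfig(
        response_message="Hi!",
        stream_messages=["Hi!"],
    )
    async with any_llm_client.get_client(config) as client:
        result = await client.request_llm_message("any prompt")
        assert result  # the mocked response stands in for a real LLM reply
```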
#### Configuration with environment variables
##### Credentials
Instead of passing credentials directly, you can set the corresponding environment variables (see the sketch after this list):
- OpenAI: `ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN`,
- YandexGPT: `ANY_LLM_CLIENT_YANDEXGPT_AUTH_HEADER`, `ANY_LLM_CLIENT_YANDEXGPT_FOLDER_ID`.
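For instance, with the token exported in the environment, the OpenAI config can be created without an explicit `auth_token`. This is a sketch assuming the variable is picked up automatically, as described above; the token value is a placeholder.
```python
import os

import any_llm_client


# Assumption: when auth_token is not passed explicitly, it is read from
# the ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN environment variable.
os.environ["ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN"] = "my-token"  # placeholder value

config = any_llm_client.OpenAIConfig(
    url="https://api.openai.com/v1/chat/completions",
    model_name="gpt-4o-mini",
)
```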
##### LLM model config (with [pydantic-settings](https://docs.pydantic.dev/latest/concepts/pydantic_settings/))
```python
import os

import pydantic_settings

import any_llm_client


class Settings(pydantic_settings.BaseSettings):
    llm_model: any_llm_client.AnyLLMConfig


os.environ["LLM_MODEL"] = """{
    "api_type": "openai",
    "url": "http://127.0.0.1:11434/v1/chat/completions",
    "model_name": "qwen2.5-coder:1.5b"
}"""
settings = Settings()

async with any_llm_client.get_client(settings.llm_model, ...) as client:
    ...
```
Combined with the environment variables from the previous section, this lets you keep the LLM model configuration and secrets separate.
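For example, a deployment could define the non-secret model config and the credential in separate variables (a sketch; the variable names follow the sections above, the token value is a placeholder):
```python
import os

# Non-secret model configuration, as in the pydantic-settings example above.
os.environ["LLM_MODEL"] = (
    '{"api_type": "openai", "url": "http://127.0.0.1:11434/v1/chat/completions", "model_name": "qwen2.5-coder:1.5b"}'
)
# Secret credential, kept separately (see the credentials section above).
os.environ["ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN"] = "my-token"  # placeholder value
```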
#### Using clients directly
The recommended way to get an LLM client is to call `any_llm_client.get_client()`; this makes it easy to swap LLM models. If you prefer, you can use `any_llm_client.OpenAIClient` or `any_llm_client.YandexGPTClient` directly:
```python
import os

import pydantic

import any_llm_client


config = any_llm_client.OpenAIConfig(
    url=pydantic.HttpUrl("https://api.openai.com/v1/chat/completions"),
    auth_token=os.environ["OPENAI_API_KEY"],
    model_name="gpt-4o-mini",
)

async with any_llm_client.OpenAIClient(config, ...) as client:
    ...
```
#### Errors
`any_llm_client.LLMClient.request_llm_message()` and `any_llm_client.LLMClient.stream_llm_message_chunks()` will raise `any_llm_client.LLMError` or `any_llm_client.OutOfTokensOrSymbolsError` when the LLM API responds with a failed HTTP status.
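A minimal handling sketch, assuming `request_llm_message()` returns the response text as `str` (as the Response API example suggests):
```python
import any_llm_client


async def ask_safely(client: any_llm_client.LLMClient, prompt: str) -> str | None:
    try:
        return await client.request_llm_message(prompt)
    except any_llm_client.OutOfTokensOrSymbolsError:
        # The request hit the provider's token or symbol limits.
        return None
    except any_llm_client.LLMError:
        # Any other failed HTTP status from the LLM API.
        raise
```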
#### Timeouts, proxy & other HTTP settings
Pass custom [HTTPX](https://www.python-httpx.org) kwargs to `any_llm_client.get_client()`:
```python
import httpx

import any_llm_client


async with any_llm_client.get_client(
    ...,
    mounts={"https://api.openai.com": httpx.AsyncHTTPTransport(proxy="http://localhost:8030")},
    timeout=httpx.Timeout(None, connect=5.0),
) as client:
    ...
```
The default timeout is `httpx.Timeout(None, connect=5.0)` (5 seconds to connect; no limit on read, write, or pool).
#### Retries
By default, requests are retried 3 times on HTTP status errors. You can change the retry behaviour by supplying the `request_retry` parameter:
```python
async with any_llm_client.get_client(..., request_retry=any_llm_client.RequestRetryConfig(attempts=5, ...)) as client:
    ...
```
#### Passing extra data to LLM
To pass extra data to the LLM API along with a request, use the `extra` argument:
```python
await client.request_llm_message("Hey, how's life behind bars?", extra={"best_of": 3})
```