# any-llm-client

A unified and lightweight asynchronous Python API for communicating with LLMs.

Supports multiple providers, including OpenAI Chat Completions API (and any OpenAI-compatible API, such as Ollama and vLLM) and YandexGPT API.

## How To Use

Before you start using any-llm-client, make sure it is installed (with uv or poetry):

```sh
uv add any-llm-client
poetry add any-llm-client
```

### Response API

Here's a full example that uses Ollama and Qwen2.5-Coder:

```python
import asyncio

import any_llm_client


config = any_llm_client.OpenAIConfig(
    url="http://127.0.0.1:11434/v1/chat/completions",
    model_name="qwen2.5-coder:1.5b",
    request_extra={"best_of": 3}
)


async def main() -> None:
    async with any_llm_client.get_client(config) as client:
        print(await client.request_llm_message("Кек, чо как вообще на нарах?"))


asyncio.run(main())
```

To use `YandexGPT`, replace the config:

```python
import os

config = any_llm_client.YandexGPTConfig(
    auth_header=os.environ["YANDEX_AUTH_HEADER"], folder_id=os.environ["YANDEX_FOLDER_ID"], model_name="yandexgpt"
)
```

### Streaming API

LLMs often take a long time to respond fully. Here's an example of the streaming API:

```python
import asyncio

import any_llm_client


config = any_llm_client.OpenAIConfig(
    url="http://127.0.0.1:11434/v1/chat/completions",
    model_name="qwen2.5-coder:1.5b",
    request_extra={"best_of": 3}
)


async def main() -> None:
    async with (
        any_llm_client.get_client(config) as client,
        client.stream_llm_message_chunks("Кек, чо как вообще на нарах?") as message_chunks,
    ):
        async for chunk in message_chunks:
            print(chunk, end="", flush=True)


asyncio.run(main())
```

### Passing chat history and temperature

You can pass a list of messages instead of a `str` as the first argument, and set the `temperature`:

```python
async with (
    any_llm_client.get_client(config) as client,
    client.stream_llm_message_chunks(
        messages=[
            any_llm_client.SystemMessage("Ты — опытный ассистент"),
            any_llm_client.UserMessage("Кек, чо как вообще на нарах?"),
        ],
        temperature=1.0,
    ) as message_chunks,
):
    ...
```
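
If you need the full response text rather than printing chunks as they arrive, you can collect them yourself. A minimal sketch, assuming each chunk is a plain `str` as in the streaming example above:

```python
collected_chunks: list[str] = []

async with (
    any_llm_client.get_client(config) as client,
    client.stream_llm_message_chunks(
        messages=[
            any_llm_client.SystemMessage("Ты — опытный ассистент"),
            any_llm_client.UserMessage("Кек, чо как вообще на нарах?"),
        ],
        temperature=1.0,
    ) as message_chunks,
):
    async for chunk in message_chunks:
        collected_chunks.append(chunk)

full_message = "".join(collected_chunks)  # assumes chunks are plain strings
```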

### Reasoning models

You can also access reasoning models through OpenAI-compatible APIs and retrieve their reasoning content:

```python
async def main() -> None:
    async with any_llm_client.get_client(config) as client:
        llm_response = await client.request_llm_message("Кек, чо как вообще на нарах?")
        print(f"Just a regular LLM response content: {llm_response.content}")
        print(f"LLM reasoning response content: {llm_response.reasoning_content}")

    ...
```

### Other

#### Mock client

You can use a mock client for testing:

```python
config = any_llm_client.MockLLMConfig(
    response_message=...,
    stream_messages=["Hi!"],
)

async with any_llm_client.get_client(config, ...) as client:
    ...
```

#### Configuration with environment variables

##### Credentials

Instead of passing credentials directly, you can set the corresponding environment variables:

- OpenAI: `ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN`,
- YandexGPT: `ANY_LLM_CLIENT_YANDEXGPT_AUTH_HEADER`, `ANY_LLM_CLIENT_YANDEXGPT_FOLDER_ID`.
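
For example, with the OpenAI client you could set the token in the environment and omit it from the config. A minimal sketch, assuming the config picks the token up from `ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN` when `auth_token` is not passed explicitly:

```python
import os

import any_llm_client


# Assumption: when `auth_token` is not passed, the OpenAI config reads it
# from the ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN environment variable.
os.environ["ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN"] = "sk-..."

config = any_llm_client.OpenAIConfig(
    url="https://api.openai.com/v1/chat/completions",
    model_name="gpt-4o-mini",
)
```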

##### LLM model config (with [pydantic-settings](https://docs.pydantic.dev/latest/concepts/pydantic_settings/))

```python
import os

import pydantic_settings

import any_llm_client


class Settings(pydantic_settings.BaseSettings):
    llm_model: any_llm_client.AnyLLMConfig


os.environ["LLM_MODEL"] = """{
    "api_type": "openai",
    "url": "http://127.0.0.1:11434/v1/chat/completions",
    "model_name": "qwen2.5-coder:1.5b",
    "request_extra": {"best_of": 3}
}"""
settings = Settings()

async with any_llm_client.get_client(settings.llm_model, ...) as client:
    ...
```

Combined with the environment variables from the previous section, this lets you keep the LLM model configuration and secrets separate.
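
For example (a sketch reusing the `Settings` class above; the auth token variable name comes from the credentials section, and the assumption is that the client reads it from the environment when it is not set in the config):

```python
import os

# Non-secret model configuration, e.g. from a config file or config map:
os.environ["LLM_MODEL"] = """{
    "api_type": "openai",
    "url": "https://api.openai.com/v1/chat/completions",
    "model_name": "gpt-4o-mini"
}"""
# Secret credentials, injected separately:
os.environ["ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN"] = "sk-..."

settings = Settings()

async with any_llm_client.get_client(settings.llm_model) as client:
    ...
```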

#### Using clients directly

The recommended way to get an LLM client is to call `any_llm_client.get_client()`. This way you can easily swap LLM models. If you prefer, you can use `any_llm_client.OpenAIClient` or `any_llm_client.YandexGPTClient` directly:

```python
import os

import pydantic

config = any_llm_client.OpenAIConfig(
    url=pydantic.HttpUrl("https://api.openai.com/v1/chat/completions"),
    auth_token=os.environ["OPENAI_API_KEY"],
    model_name="gpt-4o-mini",
    request_extra={"best_of": 3},
)

async with any_llm_client.OpenAIClient(config, ...) as client:
    ...
```

#### Errors

`any_llm_client.LLMClient.request_llm_message()` and `any_llm_client.LLMClient.stream_llm_message_chunks()` will raise:

- `any_llm_client.LLMError` or `any_llm_client.OutOfTokensOrSymbolsError` when the LLM API responds with a failed HTTP status,
- `any_llm_client.LLMRequestValidationError` when images are passed to the YandexGPT client,
- `any_llm_client.LLMResponseValidationError` when an invalid response comes from the LLM API (reraised from `pydantic.ValidationError`).

All these exceptions inherit from the base class `any_llm_client.AnyLLMClientError`.
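
For example, to treat token-limit failures differently from other errors (a sketch; the helper name is hypothetical, and it relies only on the exception classes listed above and on the `.content` attribute shown in the reasoning section):

```python
async def request_or_none(client: any_llm_client.LLMClient, prompt: str) -> str | None:
    try:
        response = await client.request_llm_message(prompt)
    except any_llm_client.OutOfTokensOrSymbolsError:
        # The prompt or context exceeded the model's limits.
        return None
    except any_llm_client.AnyLLMClientError as error:
        # Any other library error: failed HTTP status or request/response validation.
        raise RuntimeError("LLM request failed") from error
    return response.content
```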

#### Timeouts, proxy & other HTTP settings

Pass custom [HTTPX](https://www.python-httpx.org) kwargs to `any_llm_client.get_client()`:

```python
import httpx

import any_llm_client


async with any_llm_client.get_client(
    ...,
    mounts={"https://api.openai.com": httpx.AsyncHTTPTransport(proxy="http://localhost:8030")},
    timeout=httpx.Timeout(None, connect=5.0),
) as client:
    ...
```

The default timeout is `httpx.Timeout(None, connect=5.0)` (5 seconds to connect, unlimited on read, write, or pool).
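
For example, to also cap read and write time instead of leaving them unlimited (a sketch; the values are arbitrary):

```python
import httpx

import any_llm_client


async with any_llm_client.get_client(
    ...,
    timeout=httpx.Timeout(30.0, connect=5.0),  # 5 s to connect, 30 s for read/write/pool
) as client:
    ...
```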

#### Retries

By default, requests are retried 3 times on HTTP status errors. You can change the retry behaviour by supplying the `request_retry` parameter:

```python
async with any_llm_client.get_client(..., request_retry=any_llm_client.RequestRetryConfig(attempts=5, ...)) as client:
    ...
```
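
For example, to make requests fail fast (a sketch; it assumes `attempts` counts the total number of attempts, so `1` means no retries):

```python
no_retry = any_llm_client.RequestRetryConfig(attempts=1)

async with any_llm_client.get_client(..., request_retry=no_retry) as client:
    ...
```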

#### Passing extra data to LLM

```python
await client.request_llm_message("Кек, чо как вообще на нарах?", extra={"best_of": 3})
```

The `extra` parameter is merged with `request_extra` from `OpenAIConfig`.
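
For instance, with the config from the earlier examples (a sketch; `seed` is just an arbitrary extra field, and the precedence comment is an assumption since key-collision behaviour is not documented here):

```python
config = any_llm_client.OpenAIConfig(
    url="http://127.0.0.1:11434/v1/chat/completions",
    model_name="qwen2.5-coder:1.5b",
    request_extra={"best_of": 3},  # sent with every request
)

async with any_llm_client.get_client(config) as client:
    # The request payload gets both keys: {"best_of": 3, "seed": 42}.
    # Assumption: per-request `extra` values win on key collisions.
    await client.request_llm_message("Кек, чо как вообще на нарах?", extra={"seed": 42})
```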

#### Passing images

You can pass images to the OpenAI client (YandexGPT doesn't support images yet):

```python
await client.request_llm_message(
    messages=[
        any_llm_client.TextContentItem("What's on the image?"),
        any_llm_client.ImageContentItem("https://upload.wikimedia.org/wikipedia/commons/a/a9/Example.jpg"),
    ]
)
```

You can also pass a data URL with a base64-encoded image:

```python
await client.request_llm_message(
    messages=[
        any_llm_client.TextContentItem("What's on the image?"),
        any_llm_client.ImageContentItem(
            f"data:image/jpeg;base64,{base64.b64encode(image_content_bytes).decode('utf-8')}"
        ),
    ]
)
```

            
