# any-llm-client
A unified and lightweight asynchronous Python API for communicating with LLMs.
Supports multiple providers, including OpenAI Chat Completions API (and any OpenAI-compatible API, such as Ollama and vLLM) and YandexGPT API.
## How To Use
Before you start using any-llm-client, make sure it is installed:
```sh
# with uv
uv add any-llm-client
# or with poetry
poetry add any-llm-client
```
### Response API
Here's a full example that uses Ollama and Qwen2.5-Coder:
```python
import asyncio

import any_llm_client


config = any_llm_client.OpenAIConfig(url="http://127.0.0.1:11434/v1/chat/completions", model_name="qwen2.5-coder:1.5b")


async def main() -> None:
    async with any_llm_client.get_client(config) as client:
        print(await client.request_llm_message("Hey, how's life behind bars?"))


asyncio.run(main())
```
To use `YandexGPT`, replace the config:
```python
import os

config = any_llm_client.YandexGPTConfig(
    auth_header=os.environ["YANDEX_AUTH_HEADER"], folder_id=os.environ["YANDEX_FOLDER_ID"], model_name="yandexgpt"
)
```
### Streaming API
LLMs often take a long time to respond fully. Here's an example of using the streaming API:
```python
import asyncio

import any_llm_client


config = any_llm_client.OpenAIConfig(url="http://127.0.0.1:11434/v1/chat/completions", model_name="qwen2.5-coder:1.5b")


async def main() -> None:
    async with (
        any_llm_client.get_client(config) as client,
        client.stream_llm_message_chunks("Hey, how's life behind bars?") as message_chunks,
    ):
        async for chunk in message_chunks:
            print(chunk, end="", flush=True)


asyncio.run(main())
```
### Passing chat history and temperature
You can pass a list of messages instead of a `str` as the first argument, and set the `temperature`:
```python
async with (
    any_llm_client.get_client(config) as client,
    client.stream_llm_message_chunks(
        messages=[
            any_llm_client.SystemMessage("You are an experienced assistant"),
            any_llm_client.UserMessage("Hey, how's life behind bars?"),
        ],
        temperature=1.0,
    ) as message_chunks,
):
    ...
```
### Other
#### Mock client
You can use a mock client for testing:
```python
config = any_llm_client.MockLLMConfig(
    response_message=...,
    stream_messages=["Hi!"],
)

async with any_llm_client.get_client(config, ...) as client:
    ...
```
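For example, a test can swap in the mock client without changing the code under test. The following is only a sketch: the plain-string `response_message` value and the shape of the returned result are assumptions, not part of the documented API.
```python
import any_llm_client


async def test_my_chat_feature() -> None:
    # Hypothetical values: response_message may need to be a message object
    # rather than a plain string in the real API.
    config = any_llm_client.MockLLMConfig(
        response_message="Hi!",
        stream_messages=["Hi!"],
    )
    async with any_llm_client.get_client(config) as client:
        result = await client.request_llm_message("any prompt")
        assert result  # the mocked response stands in for a real LLM reply
```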
#### Configuration with environment variables
##### Credentials
Instead of passing credentials directly, you can set the corresponding environment variables (see the sketch after this list):
- OpenAI: `ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN`,
- YandexGPT: `ANY_LLM_CLIENT_YANDEXGPT_AUTH_HEADER`, `ANY_LLM_CLIENT_YANDEXGPT_FOLDER_ID`.
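For instance, with the token exported in the environment, the OpenAI config can be created without an explicit `auth_token`. This is a sketch assuming the variable is picked up automatically, as described above; the token value is a placeholder.
```python
import os

import any_llm_client


# Assumption: when auth_token is not passed explicitly, it is read from
# the ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN environment variable.
os.environ["ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN"] = "my-token"  # placeholder value

config = any_llm_client.OpenAIConfig(
    url="https://api.openai.com/v1/chat/completions",
    model_name="gpt-4o-mini",
)
```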
##### LLM model config (with [pydantic-settings](https://docs.pydantic.dev/latest/concepts/pydantic_settings/))
```python
import os

import pydantic_settings

import any_llm_client


class Settings(pydantic_settings.BaseSettings):
    llm_model: any_llm_client.AnyLLMConfig


os.environ["LLM_MODEL"] = """{
    "api_type": "openai",
    "url": "http://127.0.0.1:11434/v1/chat/completions",
    "model_name": "qwen2.5-coder:1.5b"
}"""
settings = Settings()

async with any_llm_client.get_client(settings.llm_model, ...) as client:
    ...
```
Combined with the environment variables from the previous section, this lets you keep the LLM model configuration and secrets separate.
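For example, a deployment could define the non-secret model config and the credential in separate variables (a sketch; the variable names follow the sections above, the token value is a placeholder):
```python
import os

# Non-secret model configuration, as in the pydantic-settings example above.
os.environ["LLM_MODEL"] = (
    '{"api_type": "openai", "url": "http://127.0.0.1:11434/v1/chat/completions", "model_name": "qwen2.5-coder:1.5b"}'
)
# Secret credential, kept separately (see the credentials section above).
os.environ["ANY_LLM_CLIENT_OPENAI_AUTH_TOKEN"] = "my-token"  # placeholder value
```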
#### Using clients directly
The recommended way to get an LLM client is to call `any_llm_client.get_client()`; this makes it easy to swap LLM models. If you prefer, you can use `any_llm_client.OpenAIClient` or `any_llm_client.YandexGPTClient` directly:
```python
import os

import pydantic

import any_llm_client


config = any_llm_client.OpenAIConfig(
    url=pydantic.HttpUrl("https://api.openai.com/v1/chat/completions"),
    auth_token=os.environ["OPENAI_API_KEY"],
    model_name="gpt-4o-mini",
)

async with any_llm_client.OpenAIClient(config, ...) as client:
    ...
```
#### Errors
`any_llm_client.LLMClient.request_llm_message()` and `any_llm_client.LLMClient.stream_llm_message_chunks()` will raise `any_llm_client.LLMError` or `any_llm_client.OutOfTokensOrSymbolsError` when the LLM API responds with a failed HTTP status.
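A minimal handling sketch, assuming `request_llm_message()` returns the response text as `str` (as the Response API example suggests):
```python
import any_llm_client


async def ask_safely(client: any_llm_client.LLMClient, prompt: str) -> str | None:
    try:
        return await client.request_llm_message(prompt)
    except any_llm_client.OutOfTokensOrSymbolsError:
        # The request hit the provider's token or symbol limits.
        return None
    except any_llm_client.LLMError:
        # Any other failed HTTP status from the LLM API.
        raise
```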
#### Timeouts, proxy & other HTTP settings
Pass custom [HTTPX](https://www.python-httpx.org) kwargs to `any_llm_client.get_client()`:
```python
import httpx

import any_llm_client


async with any_llm_client.get_client(
    ...,
    mounts={"https://api.openai.com": httpx.AsyncHTTPTransport(proxy="http://localhost:8030")},
    timeout=httpx.Timeout(None, connect=5.0),
) as client:
    ...
```
The default timeout is `httpx.Timeout(None, connect=5.0)` (5 seconds to connect; no limit on read, write, or pool).
#### Retries
By default, requests are retried 3 times on HTTP status errors. You can change the retry behaviour by supplying the `request_retry` parameter:
```python
async with any_llm_client.get_client(..., request_retry=any_llm_client.RequestRetryConfig(attempts=5, ...)) as client:
    ...
```
#### Passing extra data to LLM
To pass extra data to the LLM API along with a request, use the `extra` argument:
```python
await client.request_llm_message("Hey, how's life behind bars?", extra={"best_of": 3})
```