langchain-openai-limiter


Name: langchain-openai-limiter
Version: 0.0.2.5
Home page: https://github.com/alex4321/langchain-openai-limiter
Summary: Wrapper for Langchain & OpenAI API calls which uses OpenAI headers to deal with TPM & RPM.
Author: Alexander Pozharskii
Upload time: 2023-11-12 22:25:44
# Langchain OpenAI limiter

## Goal

By default, LangChain only retries when OpenAI queries hit rate limits, which can waste a lot of time and resources in some cases.

Moreover, OpenAI has *very* different rate-limit tiers for different users.

By default, GPT-4 gets something like 10 000 TPM (tokens per minute) and 1 000 RPM (requests per minute), scaling up to something like 150 000 TPM and 10 000 RPM (in my personal case).

Fortunately, OpenAI provides response headers with all the required rate-limit information, so we don't have to track usage ourselves.
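For reference, these are the documented `x-ratelimit-*` headers returned with each response; the parsing helper below is purely illustrative (it is not part of this package), showing how OpenAI's duration strings like `"6m0s"` can be turned into seconds:

```python
import re


def parse_reset(value: str) -> float:
    """Parse OpenAI reset durations like '1s', '6m0s', '120ms' into seconds."""
    units = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
    total = 0.0
    for amount, unit in re.findall(r"(\d+(?:\.\d+)?)(ms|s|m|h)", value):
        total += float(amount) * units[unit]
    return total


# Example rate-limit headers as returned with an OpenAI API response
headers = {
    "x-ratelimit-remaining-requests": "999",
    "x-ratelimit-remaining-tokens": "9500",
    "x-ratelimit-reset-requests": "6ms",
    "x-ratelimit-reset-tokens": "6m0s",
}
remaining_tokens = int(headers["x-ratelimit-remaining-tokens"])
reset_tokens = parse_reset(headers["x-ratelimit-reset-tokens"])
print(remaining_tokens, reset_tokens)  # 9500 360.0
```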

Unfortunately, neither the OpenAI Python library nor LangChain (built on top of it) provides easy built-in access to those headers.

So I made this package.

## Installation

You can install it via pip:
```bash
pip install langchain_openai_limiter
```

## Examples
See the `example.ipynb` notebook for full examples. In short:

### Chat completion

```python
import os

from langchain.chat_models import ChatOpenAI
from langchain_openai_limiter import ChooseKeyChatOpenAI, LimitAwaitChatOpenAI

# LangChain built-in model
chat_model = ChatOpenAI(
    model_name="gpt-4-0613",
    streaming=True,
)
# Wrapper that waits until rate/token limits allow the request
chat_model_limit_await = LimitAwaitChatOpenAI(
    chat_openai=chat_model,
    limit_await_timeout=60.0,
    limit_await_sleep=0.1,
)
# Wrapper that rotates between API keys
chat_model_key_choose = ChooseKeyChatOpenAI(
    chat_openai=chat_model_limit_await,
    openai_api_keys=[
        os.environ["OPENAI_API_KEY0"],
        os.environ["OPENAI_API_KEY1"],
    ]
)
```
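The key-rotation idea can be sketched generically: try each key in turn and move on when one hits its limit. This is an illustrative stand-alone helper, not this package's internals:

```python
from itertools import cycle
from typing import Callable, List


def call_with_key_rotation(keys: List[str], call: Callable[[str], str]) -> str:
    """Try each API key in round-robin order; skip to the next on a rate-limit error."""
    last_error: Exception | None = None
    for _, key in zip(range(len(keys)), cycle(keys)):
        try:
            return call(key)
        except RuntimeError as exc:  # stand-in for a real rate-limit error type
            last_error = exc
    raise last_error


# Simulated backend: the first key is exhausted, the second one works
def fake_request(key: str) -> str:
    if key == "sk-exhausted":
        raise RuntimeError("rate limit")
    return f"ok via {key}"


result = call_with_key_rotation(["sk-exhausted", "sk-fresh"], fake_request)
print(result)  # ok via sk-fresh
```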
All three objects are compatible with LangChain's chat model interface, so:
```python
from langchain.schema import HumanMessage, SystemMessage

history = [
    SystemMessage(
        content="You are a helpful assistant that translates English to French."
    ),
    HumanMessage(
        content="Translate this sentence from English to French. I love programming."
    ),
]
print(chat_model_key_choose.invoke(history).content)
```
> J'aime la programmation.

Async and streaming methods are implemented as well.
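Conceptually, the limit-await behaviour is a polling loop: check the remaining budget reported by the headers, sleep, and give up after a timeout. This sketch is generic (the parameter names mirror `limit_await_timeout` / `limit_await_sleep` from the constructor, but the logic is illustrative, not the package's actual implementation):

```python
import time
from typing import Callable


def await_capacity(get_remaining: Callable[[], int], needed: int,
                   timeout: float = 60.0, sleep: float = 0.1) -> bool:
    """Poll until enough capacity is reported, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_remaining() >= needed:
            return True
        time.sleep(sleep)
    return False


# Simulated budget that replenishes after a couple of polls
state = {"polls": 0}

def fake_remaining() -> int:
    state["polls"] += 1
    return 0 if state["polls"] < 3 else 10_000


ok = await_capacity(fake_remaining, needed=500, timeout=5.0, sleep=0.01)
print(ok)  # True
```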

### Embeddings

Often we need not only chat models but also embeddings (for RAG, for instance):

```python
import os

from langchain.embeddings import OpenAIEmbeddings
from langchain_openai_limiter import ChooseKeyOpenAIEmbeddings, LimitAwaitOpenAIEmbeddings

# LangChain built-in model
embedder_model = OpenAIEmbeddings(
    model="text-embedding-ada-002",
)
# Wrapper that waits until rate/token limits allow the request
embedder_model_limit_await = LimitAwaitOpenAIEmbeddings(
    openai_embeddings=embedder_model,
    limit_await_timeout=60.0,
    limit_await_sleep=0.1,
)
# Wrapper that rotates between API keys
embedder_model_key_choose = ChooseKeyOpenAIEmbeddings(
    openai_embeddings=embedder_model_limit_await,
    openai_api_keys=[
        os.environ["OPENAI_API_KEY0"],
        os.environ["OPENAI_API_KEY1"],
    ]
)
```

```python
docs = embedder_model_key_choose.embed_documents([
    "Markdown is a lightweight markup language",
    "Brainfuck is an esoteric programming language",
])
query = embedder_model_key_choose.embed_query("What is Markdown?")
```

> `-0.01  0.03 -0.00 -0.00  0.00 ...`
> `-0.02  0.00 -0.01 -0.00 -0.00 ...`
> `-0.01  0.01  0.00 -0.01  0.00 ...`
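The returned vectors can then be compared for retrieval, e.g. by cosine similarity. A minimal sketch in pure Python, using toy 3-d vectors in place of the real 1536-dimensional ada-002 embeddings:

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Toy vectors standing in for embed_query / embed_documents output
query_vec = [0.1, 0.3, 0.0]
doc_vecs = [[0.1, 0.29, 0.01], [-0.2, 0.0, 0.4]]

# Pick the document most similar to the query
best = max(range(len(doc_vecs)), key=lambda i: cosine_similarity(query_vec, doc_vecs[i]))
print(best)  # 0
```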

## Testing

To run the tests:

```bash
pip install langchain_openai_limiter[dev]
pytest
```

            
