# yandex-chain - LangChain-compatible integrations with YandexGPT and YandexGPT Embeddings
This library is a community-maintained Python package that provides support for [Yandex GPT](https://cloud.yandex.ru/docs/yandexgpt/) LLM and Embeddings in the [LangChain Framework](https://www.langchain.com/).
> Currently, Yandex GPT is in the preview stage, so this library may occasionally break. Use it at your own risk!
## What's Included
The library includes the following main classes:
* **YandexLLM** is a class representing [YandexGPT Text Generation](https://cloud.yandex.ru/docs/yandexgpt/api-ref/TextGeneration/).
* **ChatYandexGPT** exposes the same model through a chat interface that expects messages as input.
* **YandexEmbeddings** represents [YandexGPT Embeddings](https://cloud.yandex.ru/docs/yandexgpt/api-ref/Embeddings/) service.
* **YandexGPTClassifier** supports [zero-shot and few-shot classification](https://yandex.cloud/en/docs/foundation-models/concepts/classifier).
## Usage
You can use `YandexLLM` in the following manner:
```python
from yandex_chain import YandexLLM
LLM = YandexLLM(folder_id="...", api_key="...")
print(LLM("How are you today?"))
```
You can use `YandexEmbeddings` to compute embedding vectors:
```python
from yandex_chain import YandexEmbeddings
embeddings = YandexEmbeddings(...)
print(embeddings("How are you today?"))
```
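Embedding vectors are typically compared by cosine similarity, e.g. to find the documents closest to a query. A minimal, dependency-free sketch (the stand-in vectors below are placeholders; in practice they would come from the embeddings object):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Stand-in vectors; real ones would be produced by YandexEmbeddings
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # → 1.0
```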
Use `ChatYandexGPT` to execute a dialog with the model:
```python
from yandex_chain import ChatYandexGPT
from langchain.schema import HumanMessage
gpt = ChatYandexGPT(...)
print(gpt(
[
    HumanMessage(content='Hello! Come up with 10 new words for a greeting.')
]))
```
## Authentication
In order to use Yandex GPT, you need to provide one of the following authentication methods, which you can pass as parameters to the `YandexLLM`, `ChatYandexGPT` and `YandexEmbeddings` classes:
* A pair of `folder_id` and `api_key`
* A pair of `folder_id` and `iam_token`
* A path to a [`config.json`](tests/config_sample.json) file, which may in turn contain the parameters listed above in a convenient JSON format.
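The `config.json` route can be sketched as follows. This is a placeholder, assuming the field names mirror the constructor parameters (`folder_id` with either `api_key` or `iam_token`); see `tests/config_sample.json` for the authoritative format:

```json
{
    "folder_id": "...",
    "api_key": "..."
}
```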
## Complete Example
A pair of LLM and Embeddings are a good combination to create problem-oriented chatbots using Retrieval-Augmented Generation (RAG). Here is a short example of this approach, inspired by [this LangChain tutorial](https://python.langchain.com/docs/expression_language/cookbook/retrieval).
To begin with, we have a set of documents `docs` (for simplicity, let's assume it is just a list of strings), which we store in vector storage. We can use `YandexEmbeddings` to compute embedding vectors:
```python
from yandex_chain import YandexLLM, YandexEmbeddings
from langchain.vectorstores import FAISS
embeddings = YandexEmbeddings(config="config.json")
vectorstore = FAISS.from_texts(docs, embedding=embeddings)
retriever = vectorstore.as_retriever()
```
We can now retrieve a set of documents relevant to a query:
```python
query = "Which library can be used to work with Yandex GPT?"
res = retriever.get_relevant_documents(query)
```
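The retriever returns a list of LangChain `Document` objects. If you want to inspect them or build a context string by hand, a common helper is to join their `page_content` fields (the `format_docs` name is ours, not part of the library; the stand-in documents below just mimic the `.page_content` attribute):

```python
from types import SimpleNamespace

def format_docs(docs):
    """Join the page_content of retrieved documents into one context string."""
    return "\n\n".join(d.page_content for d in docs)

# Stand-in documents; a real retriever returns objects with .page_content
docs = [SimpleNamespace(page_content="yandex-chain wraps YandexGPT."),
        SimpleNamespace(page_content="It integrates with LangChain.")]
print(format_docs(docs))
```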
Now, to provide a full-text answer to the query, we can use the LLM. We will prompt it with the retrieved documents as context together with the input query, and ask it to answer the question. This can be done using LangChain *chains*:
```python
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = YandexLLM(config="config.json")
chain = (
{"context": retriever, "question": RunnablePassthrough()}
| prompt
| model
| StrOutputParser()
)
```
This chain can now answer our questions:
```python
chain.invoke(query)
```
## Lite vs. Full Models
The YandexGPT model comes in several flavours: YandexGPT Lite (current and RC), YandexGPT Pro, and a Summarization model. By default, YandexGPT Lite is used. If you want to use a different model, specify it in the constructor of the `YandexLLM` or `ChatYandexGPT` language model classes:
* **Pro** (based on Yandex GPT 3): `model=YandexGPTModel.Pro`
* **Lite** (based on Yandex GPT 3): `model=YandexGPTModel.Lite`
* **Pro RC** (based on Yandex GPT 4): `model=YandexGPTModel.ProRC`
* **Lite RC** (based on Yandex GPT 4): `model=YandexGPTModel.LiteRC`
* **Pro 32k** (based on Yandex GPT 4): `model=YandexGPTModel.Pro32k`
* **Summarization** (based on Yandex GPT 2): `model=YandexGPTModel.Summarization`
> In previous versions, we were using `use_lite` flag to switch between Lite and Pro models. This behavior is still supported, but is deprecated, and will be removed in the next version.
## Async Operations
The library supports the explicit async mode of the Yandex GPT API. Provided `model` is a `YandexLLM` model,
you can call `model.invokeAsync(...)` to obtain the `id` of the async operation. You can then call `model.checkAsyncResult(id)` to check whether the result is ready. `checkAsyncResult` returns `None` while the result is not ready; otherwise it returns the result of the operation (a string, or a Message if the `return_message` argument is `True`).
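The resulting polling pattern can be sketched as follows. To keep the sketch runnable without Yandex Cloud credentials, a stand-in model mimics the `invokeAsync` / `checkAsyncResult` behavior described above; with the real `YandexLLM` you would also sleep between polls:

```python
# Stand-in model that mimics the described async API: checkAsyncResult
# returns None until the result is ready (here, on the third poll).
class FakeModel:
    def __init__(self):
        self._calls = 0

    def invokeAsync(self, prompt):
        return "op-123"  # hypothetical operation id

    def checkAsyncResult(self, op_id):
        self._calls += 1
        return None if self._calls < 3 else "Done!"

model = FakeModel()
op_id = model.invokeAsync("How are you today?")
result = None
while result is None:
    # With the real API, add e.g. time.sleep(2) between polls
    result = model.checkAsyncResult(op_id)
print(result)  # → Done!
```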
## Testing
This repository contains some basic unit tests. To run them, place a configuration file `config.json` with your credentials into the `tests` folder, using `config_sample.json` as a reference. After that, run the following in the repository root directory:
```bash
python -m unittest discover -s tests
```
## Credits
* This library has originally been developed by [Dmitri Soshnikov](https://soshnikov.com).