# A Serverless RAG Chatbot Framework using AWS OpenSearch, AWS Bedrock and LangChain
This is a RAG chatbot application built on AWS components. It is designed for a serverless architecture
and cost-optimized for a high-volume mobile application use case. The AWS services used are:
- AWS Bedrock with the Titan LLM and Titan embeddings
- OpenSearch (a fork of Elasticsearch) as the vector database - this is instance-based but can be serverless (not implemented yet)
The frameworks I created here abstract out a variety of components so that variations are easy to test. This makes it easier to tune the implementation based on the
content being used in my applications. The goal is to experiment with different LLMs, different embeddings, and parameters like temperature, top_p and more.
Content loading also needed to be highly refined, allowing me to control each source easily.
This library is meant more to provide a complete working set of examples and less to be an out-of-the-box library to use as-is.
Because this framework imports so many variants of embeddings, LLM models, etc. it is not suitable for a direct deployment into something like AWS Lambda... it is a :pig:.
Serious package bloat.
## Features and aspects of interest
- **LangChain** - Great framework for building Generative AI applications.
- **LLM Model Support** - Defaults to AWS Bedrock Titan LLM, but supports other Bedrock models (Llama 2, Jurassic, Anthropic Claude), OpenAI (GPT-3.5 and GPT-4), and Google Gemini
- **LangChain LLM Callbacks** - Example of using LLM Callbacks, provided here for custom costing
- **Conversational Memory** - Designed to manage chat history memory from the client side, to support a serverless model
- **OpenSearch** - an AWS hosted semantic search service which has a vector database that is valuable in the retrieval feature of RAG
- **OpenSearch loading library** - making it easier to load multiple content sources into an OpenSearch vector database.
- **Langchain LCEL Chains** - for more flexible chaining of steps in the RAG app
- **Multiple embedding models** - Defaults to Bedrock Titan; also supports OpenAI, Hugging Face and Cohere
- **Web and directory crawlers** - for loading content into the vector DB. Lots of fine-tuning of document selection features: whitelisting, blacklisting, etc.
- **Prompt library management** - making it easier to implement query routing to optimize for specific domains of questions
- **LangSmith logging** - for logging and debugging
- **Ragas Evaluation** - A simple starter module that uses [ragas](https://docs.ragas.io/en/stable/) to run evaluations on the RAG chain configurations you want to compare.
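
The custom costing idea from the list above can be sketched without any LangChain dependency. The following is a minimal, hypothetical illustration of the arithmetic a costing callback might accumulate. The class name `CostTracker`, the model keys, and the per-1K-token prices are all assumptions made for illustration; they are not taken from this project or from any provider's price list.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1,000 tokens: (input, output).
# These numbers are placeholders, not real provider pricing.
PRICE_PER_1K = {
    "model-a": (0.001, 0.002),
    "model-b": (0.010, 0.030),
}


@dataclass
class CostTracker:
    """Accumulates token counts and an estimated cost across LLM calls."""
    model_key: str
    input_tokens: int = 0
    output_tokens: int = 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        # A real callback would call this each time an LLM call finishes.
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def total_cost(self) -> float:
        in_price, out_price = PRICE_PER_1K[self.model_key]
        return (self.input_tokens / 1000) * in_price + (self.output_tokens / 1000) * out_price


tracker = CostTracker("model-a")
tracker.record(input_tokens=1500, output_tokens=500)
tracker.record(input_tokens=500, output_tokens=500)
print(f"Estimated cost: ${tracker.total_cost:.4f}")
```

Keeping the price table separate from the tracker makes it straightforward to compare the same traffic across models when tuning for cost.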
## How to use
### Prerequisites
- Python 3.11 (may work on older, but this was the target version used)
- AWS Account with keys defined in .aws/credentials file
- AWS OpenSearch Domain created and accessible to the AWS account credentials above
- AWS Bedrock with Titan Express LLM (or other LLM supported by this code) - you may need to request access
- Ideally a LangChain (LangSmith) API Key defined in .env file (or removed if not using LangSmith) for logging and monitoring
- If using Google Gemini or OpenAI, you will need an account and keys for those services, defined in the .env file
- Python dependencies installed (managed with Poetry, see the next section)
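
A quick pre-flight check can catch missing keys before the first request. This is a minimal sketch only: the variable names `LANGCHAIN_API_KEY`, `OPENAI_API_KEY` and `GOOGLE_API_KEY` are names commonly used by LangChain integrations and are assumed here, not confirmed by this project. The provider labels and the `missing_keys` helper are hypothetical.

```python
import os

# Assumed variable names; adjust to whatever your .env file actually defines.
REQUIRED_KEYS = {
    "langsmith": ["LANGCHAIN_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
    "gemini": ["GOOGLE_API_KEY"],
}


def missing_keys(providers, env=None):
    """Return the variable names that are unset or empty for the given providers."""
    env = os.environ if env is None else env
    return [
        name
        for provider in providers
        for name in REQUIRED_KEYS[provider]
        if not env.get(name)
    ]


# Example using an explicit mapping instead of the real environment
example_env = {"OPENAI_API_KEY": "sk-example"}
print(missing_keys(["openai", "gemini"], env=example_env))
```

Passing `env` explicitly keeps the check easy to test; calling `missing_keys(["openai"])` with no `env` argument reads from the real environment.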
### Setting up the project locally
This project uses Poetry to manage dependencies. You can install it with the following command:
```bash
pip install poetry
```
Then you can install the dependencies with the following command:
```bash
poetry install
```
Alternatively, you can install the package from PyPI with the following command:
```bash
pip install aws-rag-bot
```
The PyPI package is available at https://pypi.org/project/aws-rag-bot/
****
### Sample Code
This is a very simple, high-level example. Check out the rag_bot_code_samples.ipynb for more.
The first step is to have content in your vector database.
```python
from open_search_vector_db.aws_opensearch_vector_database import OpenSearchVectorDBLoader

# my_open_search_endpoint and my_index_name are placeholders for your own values
content_sources = [{"name": "Internal KB Docs", "type": "PDF", "location": "kb-docs"}]
vectordb_loader = OpenSearchVectorDBLoader(os_endpoint=my_open_search_endpoint,
                                           index_name=my_index_name,
                                           data_sources=content_sources)
vectordb_loader.load()
```
Then you can start asking it questions:
```python
from rag_chatbot import RagChatbot, LlmModelTypes
from prompt_library import DefaultPrompts

chatbot = RagChatbot(my_open_search_endpoint,
                     model_key=LlmModelTypes.BEDROCK_TITAN_EXPRESS,
                     prompt_model=DefaultPrompts)  # the prompt set imported above
chat_history = []
question = "What...?"  # Ask a question related to the content you loaded
response = chatbot.ask_question(question, chat_history, verbose=True)
print(response)
```
### Provisioning a test index in OpenSearch
You can use tests/provision_test_index.py to create a test index in OpenSearch. The content it loads
supports the test cases in this project.
### Running chatbot_client.py
A very simple command-line client, chatbot_client.py, is included as an example and testing tool.
It asks a question and prints the response while retaining the chat history for context.
```bash
python chatbot_client.py my-opensearch-domain-name
```
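
Retaining the chat history on the client is what lets the server side stay stateless. Below is a minimal sketch of that loop. It uses a stand-in `ask` function in place of `chatbot.ask_question`, and the `(question, answer)` tuple format and the `max_turns` cap are assumptions made for illustration, not necessarily the format this project uses.

```python
def ask(question, chat_history):
    """Stand-in for chatbot.ask_question; reports how much context it received."""
    return f"answer to {question!r} with {len(chat_history)} prior turns"


def chat_turn(question, chat_history, max_turns=5):
    """Ask one question and return (response, updated history)."""
    response = ask(question, chat_history)
    # The client owns the history and sends it back with every request
    updated = chat_history + [(question, response)]
    # Keep only the most recent turns to bound the prompt size
    return response, updated[-max_turns:]


history = []
for q in ["first question", "second question", "third question"]:
    reply, history = chat_turn(q, history, max_turns=2)
    print(reply)
```

Capping the history is one simple way to keep prompt size, and therefore cost, predictable for long conversations.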
### Running tests
There are two test modules in the tests folder. They exercise search and the RAG bot to make sure everything is working, and they provide
some additional samples of how to use the framework.
### Using Ragas evaluation framework
Ragas is one of a variety of evaluation tools for RAG applications.
It can be used to evaluate both the retrieval and generation aspects of the RAG bot.
With an evaluation tool you can then use this project's features to vary whatever aspects you need
and compare the results to make decisions and tune.
A simple example of this can be found in the tests folder.
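
The compare-and-tune workflow can be sketched independently of Ragas. The example below deliberately uses a simple stand-in metric (the fraction of questions whose expected fact appears in the retrieved context), which is not a Ragas metric. The configuration names and sample data are invented for illustration.

```python
def context_hit_rate(samples):
    """Fraction of samples whose expected fact appears in any retrieved context."""
    hits = sum(
        any(s["expected"].lower() in ctx.lower() for ctx in s["contexts"])
        for s in samples
    )
    return hits / len(samples)


# Invented results for two hypothetical retriever configurations
runs = {
    "config_a": [
        {"expected": "42 days", "contexts": ["The trial lasts 42 days."]},
        {"expected": "blue", "contexts": ["The sky is grey today."]},
    ],
    "config_b": [
        {"expected": "42 days", "contexts": ["The trial lasts 42 days."]},
        {"expected": "blue", "contexts": ["Paint it blue.", "Or green."]},
    ],
}

scores = {name: context_hit_rate(samples) for name, samples in runs.items()}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```

The same loop shape applies when the scoring function is replaced by a real evaluation run: hold the question set fixed, vary one configuration aspect at a time, and compare the scores.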
## References
**Vector Database:** Many references show using Chroma and FAISS, but I needed a solution that worked well in a Lambda serverless environment.
Ideally it would be at AWS, keeping my stack uniform.
- https://aws.amazon.com/what-is/vector-databases/
Ultimately I chose OpenSearch because of cost, LangChain support, and a serverless version I plan to evaluate.
**Vector Database Loader:** LangChain has a great library of document loaders for loading data into OpenSearch. I wanted an effective way
to scrape a website and, with help from this article, chose to use Selenium.
- https://www.comet.com/site/blog/langchain-document-loaders-for-web-data/
I also used the directory loader and plan to implement cloud-based directory loaders in the future, primarily S3 and Google Drive.
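
Document selection for a crawler often comes down to allow and deny rules on URLs. The following is a minimal standard-library sketch of that idea. The function name `should_crawl`, the prefix-based rule format, and the example URLs are assumptions made for illustration; this is not this project's loader.

```python
from urllib.parse import urlparse


def should_crawl(url, allowed_domains, denied_path_prefixes=()):
    """Return True if the URL is on an allowed domain and not under a denied path."""
    parsed = urlparse(url)
    if parsed.hostname not in allowed_domains:
        return False
    return not any(parsed.path.startswith(p) for p in denied_path_prefixes)


urls = [
    "https://example.com/docs/intro",
    "https://example.com/blog/2020/old-post",
    "https://other.example.org/docs/intro",
]
selected = [
    u for u in urls
    if should_crawl(u, {"example.com"}, denied_path_prefixes=("/blog/",))
]
print(selected)
```

Checking the allow list first means anything off-domain is rejected before the deny rules are evaluated.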
**LangSmith Logging and Debugging:** I used LangSmith for logging and debugging. It is a great tool for this purpose.
- https://docs.smith.langchain.com