[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://githubtocolab.com/snexus/llm-search/blob/main/notebooks/llmsearch_google_colab_demo.ipynb)
# pyLLMSearch - Advanced RAG
[Documentation](https://llm-search.readthedocs.io/en/latest/)
The purpose of this package is to offer a convenient question-answering (RAG) system with a simple YAML-based configuration that enables interaction with multiple collections of local documents. Special attention is given to improving various components of the system **beyond a basic LLM-based RAG** - better document parsing, hybrid search, HyDE-enabled search, chat history, deep linking, re-ranking, customizable embeddings, and more. The package is designed to work with custom Large Language Models (LLMs) – whether from OpenAI or installed locally.
## Features
* Supported formats
  * Built-in parsers:
    * `.md` - Divides files based on logical components such as headings, subheadings, and code blocks. Supports additional features like cleaning image links, adding custom metadata, and more.
    * `.pdf` - MuPDF-based parser.
    * `.docx` - custom parser, supports nested tables.
  * Other common formats are supported by the `Unstructured` pre-processor:
    * For the full list of supported formats, see [here](https://unstructured-io.github.io/unstructured/core/partition.html).
* Support for table parsing via the open-source [gmft](https://github.com/conjuncts/gmft) library or Azure Document Intelligence.
* Optional support for image parsing using the Gemini API.
* Supports multiple collections of documents, with the ability to filter results by collection.
* Ability to update the embeddings incrementally, without needing to re-index the entire document base.
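Incremental updates are usually built on document fingerprinting: hash each file's content and re-embed only what changed since the last indexing run. A minimal sketch of that general technique (illustrative names, not the package's actual implementation):

```python
import hashlib
from pathlib import Path

def file_fingerprint(path: Path) -> str:
    """Content hash used to detect changed documents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(paths, previous: dict) -> list:
    """Return files whose content differs from the stored index.

    `previous` maps path -> hash recorded at the last indexing run;
    only the files returned here need to be re-embedded.
    """
    return [p for p in paths if previous.get(str(p)) != file_fingerprint(p)]
```

Unchanged documents keep their existing embeddings, so re-indexing cost scales with the size of the change, not the size of the collection.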
* Generates dense embeddings from a folder of documents and stores them in a vector database ([ChromaDB](https://github.com/chroma-core/chroma)).
  * The following embedding models are supported:
    * Hugging Face embeddings.
    * Sentence-transformers-based models, e.g., `multilingual-e5-base`.
    * Instructor-based models, e.g., `instructor-large`.
* Generates sparse embeddings using SPLADE (https://github.com/naver/splade) to enable hybrid search (sparse + dense).
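Hybrid search combines the two signals per document. One common scheme - a sketch of the general technique, not necessarily how this package fuses scores - is a convex combination of min-max-normalized dense and sparse scores:

```python
def normalize(scores: dict) -> dict:
    """Min-max normalize document scores into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
    return {doc: (s - lo) / span for doc, s in scores.items()}

def hybrid_scores(dense: dict, sparse: dict, alpha: float = 0.5) -> dict:
    """Blend normalized dense and sparse scores; alpha weights the dense side."""
    dn, sn = normalize(dense), normalize(sparse)
    docs = dn.keys() | sn.keys()
    return {d: alpha * dn.get(d, 0.0) + (1 - alpha) * sn.get(d, 0.0) for d in docs}
```

Normalization matters because dense cosine similarities and sparse SPLADE scores live on very different scales.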
* Supports the "Retrieve and Re-rank" strategy for semantic search, see [here](https://www.sbert.net/examples/applications/retrieve_rerank/README.html).
  * Besides the original `ms-marco-MiniLM` cross-encoder, the more modern `bge-reranker` is also supported.
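The strategy is two-stage: a fast retriever over-fetches candidates, then a cross-encoder re-scores every (query, passage) pair and only the best survive. A schematic sketch with pluggable scoring callables (in the real system the second stage would be a cross-encoder such as `bge-reranker`):

```python
def retrieve_and_rerank(query, retrieve, cross_score, k=3, fetch_k=20):
    """Two-stage search: over-fetch `fetch_k` candidates with a cheap
    retriever, then keep the top `k` after re-scoring every
    (query, passage) pair with a more accurate scorer."""
    candidates = retrieve(query, fetch_k)                      # stage 1: fast, approximate
    scored = [(cross_score(query, c), c) for c in candidates]  # stage 2: accurate, expensive
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:k]]
```

The cross-encoder sees query and passage together, so it is far more accurate than a bi-encoder - but too slow to run over the whole corpus, which is why it only re-ranks a short candidate list.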
* Supports HyDE (Hypothetical Document Embeddings) - see [here](https://arxiv.org/pdf/2212.10496.pdf).
  * WARNING: Enabling HyDE (via config or webapp) can significantly alter the quality of the results. Please make sure to read the paper before enabling it.
  * In my own experiments, enabling HyDE significantly boosts output quality on topics where the user can't formulate the question in the domain-specific language of the topic - e.g., when learning new topics.
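The core HyDE trick: instead of embedding the question directly, ask an LLM to write a hypothetical answer and embed *that*, so retrieval matches the vocabulary of the documents rather than of the user. A sketch with injected callables standing in for the LLM and the embedding model (names are illustrative):

```python
def hyde_embedding(question: str, generate, embed):
    """Embed an LLM-written hypothetical answer instead of the raw question.

    generate: question -> hypothetical document (an LLM call in practice)
    embed:    text -> dense vector (the embedding model)
    """
    hypothetical_doc = generate(question)
    return embed(hypothetical_doc)
```

The hypothetical answer may be factually wrong - that is fine, since only its embedding is used to find real passages that discuss the same concepts.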
* Support for multi-querying, inspired by `RAG Fusion` - https://towardsdatascience.com/forget-rag-the-future-is-rag-fusion-1147298d8ad1
  * When multi-querying is turned on (via config or webapp), the original query is replaced by three variants of the same query, helping to bridge terminology gaps and "offer different angles or perspectives", as the article puts it.
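RAG Fusion pairs multi-querying with reciprocal rank fusion (RRF) to merge the per-variant result lists into one ranking. A standard RRF implementation (the fusion step only; generating the query variants would be an LLM call):

```python
def reciprocal_rank_fusion(rankings, k: int = 60):
    """Merge several ranked result lists: each document scores
    sum(1 / (k + rank)) over the lists it appears in (rank is 1-based).
    k=60 is the constant suggested in the original RRF paper."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Documents that appear near the top of several variant result lists rise above documents that rank highly for only one phrasing.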
* Supports optional chat history with question contextualization.
* Allows interaction with embedded documents, internally supporting the following models and methods (including locally hosted):
  * OpenAI models (ChatGPT 3.5/4 and Azure OpenAI).
  * HuggingFace models.
  * Llama.cpp-supported models - for the full list, see [here](https://github.com/ggerganov/llama.cpp#description).
  * AutoGPTQ models (temporarily disabled due to broken dependencies).
* Interoperability with LiteLLM + Ollama via the OpenAI API, supporting hundreds of different models (see [Model configuration for LiteLLM](sample_templates/llm/litellm.yaml)).
* Other features
  * Simple CLI and web interfaces.
  * Deep linking into document sections - jump to an individual PDF page or a header in a markdown file.
  * Ability to save responses to an offline database for future analysis.
  * Experimental API.
## Demo
![Demo](media/llmsearch-demo-v2.gif)
## Documentation
[Browse Documentation](https://llm-search.readthedocs.io/en/latest/)