# RAG-X Library
## Overview
RAG-X is a library for optimizing Retrieval-Augmented Generation (RAG) pipelines. It provides tools that automatically determine the best parameters for processing a specific document, including the chunking technique, embedding model, vector database, and large language model (LLM) configuration.
## Flow Chart
![Flow Chart](pictures/structure.png)
### Key Features:
- **Adaptive Chunking:** Incorporates four advanced text chunking methodologies to enhance the handling of diverse document structures.
  - Specific Text Splitting
  - Recursive Text Splitting
  - Sentence Window Splitting
  - Semantic Window Splitting
- **Expandability:** Future versions will introduce additional chunking strategies and enhancements based on user feedback and ongoing research.
- **Compatibility:** Designed to seamlessly integrate with a wide range of embedding models and vector databases.
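
To give a feel for one of the chunking strategies listed above, here is a minimal sketch of the general sentence-window idea: each sentence becomes a chunk together with a fixed number of neighbouring sentences. This is a generic illustration, not RAG-X's implementation; the function name `sentence_windows` and the regex-based sentence splitter are assumptions made for the example.

```python
import re
from typing import List


def sentence_windows(text: str, window: int = 1) -> List[str]:
    """Return one chunk per sentence, padded with `window` neighbours on each side."""
    if window < 0:
        raise ValueError("window must be non-negative")
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [
        " ".join(sentences[max(0, i - window): i + window + 1])
        for i in range(len(sentences))
    ]


if __name__ == "__main__":
    for chunk in sentence_windows("A one. B two. C three.", window=1):
        print(chunk)
```

A larger window gives each chunk more surrounding context at the cost of more repeated text across chunks.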
## Getting Started
### Installation
To get started, install the test_RAG_X library using the following command:
```bash
pip install test-RAG-X
```
To verify the installation and view library details, execute:
```bash
pip show test_RAG_X
```
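
The same check can be done from Python. The sketch below relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is illustrative and not part of test-RAG-X.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("test-RAG-X")
    print(version if version else "test-RAG-X is not installed")
```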
### Setting Up Your Environment
Before using test-RAG-X, make sure your OpenAI API key and your Hugging Face token are available as environment variables. The example below sets the OpenAI key:
```python
import os
os.environ['OPENAI_API_KEY'] = "YOUR_OPENAI_API_KEY"
```
**Note:** API keys from free-tier OpenAI accounts are not supported.
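
A script can confirm that the key is set before any evaluation starts. The following is a minimal sketch that assumes only the `OPENAI_API_KEY` variable named above; the helper `missing_env_vars` is hypothetical and not part of the library.

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(["OPENAI_API_KEY"])
    if missing:
        print("Set these variables before running:", ", ".join(missing))
```

Passing the environment as an optional argument keeps the check easy to test without touching the real process environment.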
## Usage
The following example shows how to use the test-RAG-X library to find suitable RAG parameters for a document:
```python
import hagrid as hg
# Specify the path to your PDF document
file_path = "PATH_TO_YOUR_PDF_FILE"
# Initialize the RAG-X instance
model = hg.ChunkEvaluator(file_path)
# Generate the optimal RAG parameters for your document
score_card = model.evaluate_parameters()
# Output the results
print(score_card)
```
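
Since `ChunkEvaluator` is given a path to a PDF, it can help to validate that path before starting an evaluation. The sketch below is an optional pre-check using only the standard library; `looks_like_pdf` is a hypothetical helper, and it assumes the file begins with the usual `%PDF-` header, which holds for most PDF files.

```python
from pathlib import Path


def looks_like_pdf(path: str) -> bool:
    """Return True if `path` is an existing file that starts with the PDF header bytes."""
    file = Path(path)
    if not file.is_file():
        return False
    with file.open("rb") as handle:
        return handle.read(5) == b"%PDF-"


if __name__ == "__main__":
    file_path = "PATH_TO_YOUR_PDF_FILE"
    if not looks_like_pdf(file_path):
        print(f"{file_path!r} does not look like a readable PDF file")
```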
## Set parameters for evaluation
To evaluate the performance of specific parameter values, pass them as keyword arguments, as shown below:
```python
import hagrid as hg

kwargs = {
    'number_of_questions': 5,          # Number of questions used to evaluate the process, type(int)
    'chunk_size': 250,                 # Chunk size, type(int)
    'chunk_overlap': 0,                # Chunk overlap size, type(int)
    'separator': '',                   # Separator to be used for chunking, if any, type(str)
    'strip_whitespace': False,         # Strip whitespace, type(bool)
    'sentence_buffer_window': 3,       # Sentence buffer window, type(int)
    'sentence_cutoff_percentile': 80,  # Sentence chunk split percentile for splitting context, type(int), range(1,100)
}

# Specify the path to your PDF document
file_path = "PATH_TO_YOUR_PDF_FILE"

# Initialize the RAG-X instance with the custom parameters
model = hg.ChunkEvaluator(file_path, **kwargs)

# Evaluate the RAG parameters for your document
score_card = model.evaluate_parameters()

# Output the results (a pandas DataFrame)
print(score_card)
```
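
To make `chunk_size` and `chunk_overlap` concrete, the sketch below splits text into fixed-size chunks in which each chunk repeats the last `chunk_overlap` characters of the previous one. It is a generic illustration and rests on the assumption that sizes are counted in characters; RAG-X may measure them differently, and `split_with_overlap` is a hypothetical helper, not part of the library.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int = 0) -> List[str]:
    """Split `text` into chunks of at most `chunk_size` characters with shared overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= chunk_overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current chunk already reaches the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2))
    # ['abcd', 'cdef', 'efgh', 'ghij']
```

With `chunk_overlap` set to 0, as in the parameters above, the chunks simply follow one another with no repeated text.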
## Output
![output_image](pictures/result.png)
## Contribution
We welcome contributions and feedback from users. Feel free to contact us.
## Contact Us:
If you wish to integrate GenAI into your company, please contact us at ...
Raw data
{
"_id": null,
"home_page": "https://github.com/hidevscommunity/gen-ai-library/tree/main/Ankit",
"name": "test-RAG-X",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Ankit",
"author_email": "a.baliyan008@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/ae/b4/03db5c19e554ed962f9a29984e92c5401463b36827ccda01dd5914952589/test_RAG_X-0.2.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "This library is to search the best parameters across different steps of the RAG process.",
"version": "0.2.1",
"project_urls": {
"Homepage": "https://github.com/hidevscommunity/gen-ai-library/tree/main/Ankit"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "aeb403db5c19e554ed962f9a29984e92c5401463b36827ccda01dd5914952589",
"md5": "1054a399ffa98a5dd2851b1865916f27",
"sha256": "73cf8b79c4e36a8354364bcef282ba29f791498db9c2abe48078ce28e754d7cc"
},
"downloads": -1,
"filename": "test_RAG_X-0.2.1.tar.gz",
"has_sig": false,
"md5_digest": "1054a399ffa98a5dd2851b1865916f27",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 13951,
"upload_time": "2024-04-13T10:50:11",
"upload_time_iso_8601": "2024-04-13T10:50:11.460934Z",
"url": "https://files.pythonhosted.org/packages/ae/b4/03db5c19e554ed962f9a29984e92c5401463b36827ccda01dd5914952589/test_RAG_X-0.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-13 10:50:11",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "hidevscommunity",
"github_project": "gen-ai-library",
"github_not_found": true,
"lcname": "test-rag-x"
}