test-RAG-X


Name: test-RAG-X
Version: 0.2.1
Home page: https://github.com/hidevscommunity/gen-ai-library/tree/main/Ankit
Summary: This library is to search the best parameters across different steps of the RAG process.
Upload time: 2024-04-13 10:50:11
Author: Ankit
License: MIT
            
---

# RAG-X Library

## Overview

RAG-X is a library for optimizing Retrieval-Augmented Generation (RAG) pipelines. It provides tools that automatically determine the best parameters for processing a specific document, including the chunking technique, embedding model, vector database, and large language model (LLM) configuration.

## Flow Chart

![Flow Chart](pictures/structure.png)

### Key Features:
- **Adaptive Chunking:** Incorporates four advanced text chunking methodologies to enhance the handling of diverse document structures.
  - Specific Text Splitting
  - Recursive Text Splitting
  - Sentence Window Splitting
  - Semantic Window Splitting
- **Expandability:** Future versions will introduce additional chunking strategies and enhancements based on user feedback and ongoing research.
- **Compatibility:** Designed to seamlessly integrate with a wide range of embedding models and vector databases.
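
To make the chunking terminology concrete, here is a minimal sketch of fixed-size splitting with overlap, one of the simplest strategies in this family. It is not RAG-X's implementation: the function name `split_with_overlap` and its parameters are hypothetical and chosen only for illustration.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int = 0) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share chunk_overlap characters.
    Hypothetical helper, not part of the RAG-X API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new chunk starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

A larger overlap keeps more shared context between neighbouring chunks at the cost of producing more chunks for the same document.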

## Getting Started

### Installation

To get started, install the test-RAG-X library with the following command:

```bash
pip install test-RAG-X
```

To verify the installation and view library details, execute:

```bash
pip show test_RAG_X
```

### Setting Up Your Environment

Before using test-RAG-X, make sure your environment variables are configured with your OpenAI API key and your Hugging Face token. The example below sets the OpenAI key; the variable name for the Hugging Face token is not specified here:

```python
import os

os.environ['OPENAI_API_KEY'] = "YOUR_OPENAI_API_KEY"

```
**Note:** API keys from free-tier OpenAI accounts are not supported.

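
A missing key usually surfaces only once the first API call fails, so it can help to check for it up front. The following is a small sketch of such a check; `require_env` is a hypothetical helper and not part of the library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or blank.

    Hypothetical helper, not part of the RAG-X API.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Report a missing key early instead of failing in the middle of an evaluation run.
try:
    require_env("OPENAI_API_KEY")
except RuntimeError as err:
    print(err)
```
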
## Usage

The following steps show how to use the test-RAG-X library to optimize your RAG parameters:

```python
import hagrid as hg

# Specify the path to your PDF document
file_path = "PATH_TO_YOUR_PDF_FILE"

# Initialize the RAG-X instance
model = hg.ChunkEvaluator(file_path)

# Generate the optimal RAG parameters for your document
score_card = model.evaluate_parameters()

# Output the results
print(score_card)
```


## Set parameters for evaluation

To analyse how your own parameter values perform, pass them to `ChunkEvaluator` as keyword arguments:

```python
import hagrid as hg

kwargs = {
    'number_of_questions': 5,          # Number of questions used to evaluate the process (int)
    'chunk_size': 250,                 # Chunk size (int)
    'chunk_overlap': 0,                # Chunk overlap size (int)
    'separator': '',                   # Separator used for chunking, if any (str)
    'strip_whitespace': False,         # Whether to strip whitespace (bool)
    'sentence_buffer_window': 3,       # Sentence buffer window (int)
    'sentence_cutoff_percentile': 80,  # Percentile at which sentence chunks are split (int, range 1-100)
}

# Specify the path to your PDF document
file_path = "PATH_TO_YOUR_PDF_FILE"

# Initialize the RAG-X instance with the custom parameters
model = hg.ChunkEvaluator(file_path, **kwargs)

# Evaluate the document with these parameters
score_card = model.evaluate_parameters()

# Output the results; the output is a pandas DataFrame
print(score_card)
```
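
The inline comments above document a type for each option and a range for `sentence_cutoff_percentile`. A pre-flight check along those lines can catch typos before an evaluation run starts. This is only a sketch: `validate_kwargs` and `EXPECTED_TYPES` are hypothetical names, the library may perform its own validation, and treating both 1 and 100 as allowed percentiles is an assumption.

```python
# Expected types, taken from the inline comments in the example above.
EXPECTED_TYPES = {
    "number_of_questions": int,
    "chunk_size": int,
    "chunk_overlap": int,
    "separator": str,
    "strip_whitespace": bool,
    "sentence_buffer_window": int,
    "sentence_cutoff_percentile": int,
}


def validate_kwargs(kwargs: dict) -> None:
    """Raise if an option is unknown, has the wrong type, or is out of range.

    Hypothetical helper, not part of the RAG-X API.
    """
    for key, value in kwargs.items():
        if key not in EXPECTED_TYPES:
            raise ValueError(f"Unknown option: {key!r}")
        expected = EXPECTED_TYPES[key]
        # bool is a subclass of int, so reject True/False where an int is expected.
        if expected is int and isinstance(value, bool):
            raise TypeError(f"{key} must be int, got bool")
        if not isinstance(value, expected):
            raise TypeError(
                f"{key} must be {expected.__name__}, got {type(value).__name__}"
            )

    # Assumption: percentiles from 1 to 100 inclusive are accepted.
    percentile = kwargs.get("sentence_cutoff_percentile")
    if percentile is not None and not 1 <= percentile <= 100:
        raise ValueError("sentence_cutoff_percentile must be between 1 and 100")


validate_kwargs({"chunk_size": 250, "chunk_overlap": 0, "sentence_cutoff_percentile": 80})
print("options look valid")
```

Calling `validate_kwargs(kwargs)` just before constructing the evaluator turns a misspelled key or a string passed where an integer is expected into an immediate, readable error.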

## Output
![output_image](pictures/result.png)

## Contribution
We welcome contributions and feedback from users. Feel free to contact us.

## Contact Us:
If you wish to integrate GenAI into your company, please contact us at ...

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/hidevscommunity/gen-ai-library/tree/main/Ankit",
    "name": "test-RAG-X",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": null,
    "author": "Ankit",
    "author_email": "a.baliyan008@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/ae/b4/03db5c19e554ed962f9a29984e92c5401463b36827ccda01dd5914952589/test_RAG_X-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "This library is to search the best parameters across different steps of the RAG process.",
    "version": "0.2.1",
    "project_urls": {
        "Homepage": "https://github.com/hidevscommunity/gen-ai-library/tree/main/Ankit"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "aeb403db5c19e554ed962f9a29984e92c5401463b36827ccda01dd5914952589",
                "md5": "1054a399ffa98a5dd2851b1865916f27",
                "sha256": "73cf8b79c4e36a8354364bcef282ba29f791498db9c2abe48078ce28e754d7cc"
            },
            "downloads": -1,
            "filename": "test_RAG_X-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "1054a399ffa98a5dd2851b1865916f27",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 13951,
            "upload_time": "2024-04-13T10:50:11",
            "upload_time_iso_8601": "2024-04-13T10:50:11.460934Z",
            "url": "https://files.pythonhosted.org/packages/ae/b4/03db5c19e554ed962f9a29984e92c5401463b36827ccda01dd5914952589/test_RAG_X-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-13 10:50:11",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "hidevscommunity",
    "github_project": "gen-ai-library",
    "github_not_found": true,
    "lcname": "test-rag-x"
}
        