rage-toolkit

Name	rage-toolkit
Version	0.0.2
home_page	https://github.com/othr-nlp/rage_toolkit
Summary	A framework for retrieval augmented generation evaluation (RAGE)
upload_time	2024-09-08 19:35:52
maintainer	None
docs_url	None
author	Vinzent Penzkofer, Timo Baumann
requires_python	>=3.6
license	MIT
keywords	retrieval augmented generation evaluation rag nlp
requirements	No requirements were recorded.

# RAGE - Retrieval Augmented Generation Evaluation

## TL;DR
RAGE is a tool for evaluating how well Large Language Models (LLMs) cite relevant sources in Retrieval Augmented Generation (RAG) tasks.

## What am I looking at?

RAGE is a framework designed to evaluate Large Language Models (LLMs) regarding their suitability for Retrieval Augmented Generation (RAG) applications.
In RAG settings, LLMs are augmented with documents that are relevant to a given search query.
The key element evaluated is the ability of an LLM to cite the sources it used for answer generation.

The main idea is to present the LLM with a query and with relevant, irrelevant, and seemingly relevant documents. 
Seemingly relevant documents are from the same area as the relevant documents but don't contain the actual answer.
RAGE then measures how well the LLM recognizes the relevant documents.
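
As a rough illustration of the idea above, the following sketch compares the document IDs a model cited with the IDs that are actually relevant. It is not the toolkit's API: the function name `citation_scores`, the document IDs, and the choice of precision, recall, and F1 are assumptions made for this example; the metrics RAGE really uses are described in the paper.

```python
from typing import Dict, Iterable, Set


def citation_scores(cited: Iterable[str], relevant: Iterable[str]) -> Dict[str, float]:
    """Compare cited document IDs with the relevant ones (hypothetical helper)."""
    cited_set: Set[str] = set(cited)
    relevant_set: Set[str] = set(relevant)
    hits = cited_set & relevant_set  # relevant documents the model actually cited

    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(cited_set) if cited_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up IDs: d1 and d2 are relevant, d3 is seemingly relevant, d4 is irrelevant.
# The model cites one relevant document (d1) and one distractor (d3).
scores = citation_scores(cited=["d1", "d3"], relevant=["d1", "d2"])
print(scores)
```

Citing a seemingly relevant document lowers precision, while missing a relevant one lowers recall, so both kinds of error show up in the scores.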

![Rage Evaluation Process](docs/rage_evaluation_process.svg)
*Figure 1: RAGE Evaluation Process. Examples are extracted from the [Natural Questions](https://ai.google.com/research/NaturalQuestions) Dataset.*

For a more detailed description of the inner workings, dataset creation, and metrics, see our paper:<br>
→ [Evaluating and Fine-Tuning Retrieval-Augmented Language Models to Generate Text With Accurate Citations](URL)

## Installation

Pip:
```bash
pip install rage-toolkit
```

Build from source:

```bash
$ git clone https://github.com/othr-nlp/rage_toolkit.git
$ cd rage_toolkit
$ pip install -e .
```
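
To confirm that the installation succeeded, one option is to ask Python for the installed distribution's version. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this example, and `importlib.metadata` requires Python 3.8 or newer.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if rage-toolkit is installed, otherwise None.
print(installed_version("rage-toolkit"))
```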

## Get Started

We recommend starting at the [`rage_getting_started.ipynb`](rage_getting_started.ipynb) Jupyter Notebook. 
It gives you a quick introduction to setting up and running an evaluation with a custom LLM.

## Datasets

Note that RAGE works with any dataset that complies with our format. Feel free to create your own datasets to suit your needs.

For guidance on creating one, take a look at our preprocessed examples or refer to our [paper](URL).

Our datasets are built on top of those from the [BEIR Benchmark](https://doi.org/10.48550/arXiv.2104.08663).

Our preprocessed datasets can be found here:

| Original Dataset       | Website                                         | RAGE version on Hugging Face                                              |
|------------------------|-------------------------------------------------|---------------------------------------------------------------------------|
| Natural Questions (NQ) | https://ai.google.com/research/NaturalQuestions | [RAGE - NQ](https://huggingface.co/datasets/othr-nlp/rage_nq)             |
| HotpotQA               | https://hotpotqa.github.io/                     | [RAGE - HotpotQA](https://huggingface.co/datasets/othr-nlp/rage_hotpotqa) |

## License

This project is licensed under the MIT License - see the [LICENSE](./LICENSE) file for details.

## Contributing

Contributions are welcome! Feel free to open issues or submit pull requests for improvements.

Raw data

{
    "_id": null,
    "home_page": "https://github.com/othr-nlp/rage_toolkit",
    "name": "rage-toolkit",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "retrieval, augmented generation, evaluation, RAG, NLP",
    "author": "Vinzent Penzkofer, Timo Baumann",
    "author_email": "vinzent.penzkofer@outlook.de, timo.baumann@oth-regensburg.de",
    "download_url": "https://files.pythonhosted.org/packages/bd/a3/32006fa525acdd757ebecd6943bd38eb92633976820a44e62c8b26cc0008/rage_toolkit-0.0.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A framework for retrieval augmented generation evaluation (RAGE)",
    "version": "0.0.2",
    "project_urls": {
        "Documentation": "https://github.com/othr-nlp/rage_toolkit",
        "Homepage": "https://github.com/othr-nlp/rage_toolkit"
    },
    "split_keywords": [
        "retrieval",
        " augmented generation",
        " evaluation",
        " rag",
        " nlp"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "43d685d05789d289cf4f2a4cb46ce69788e847e02faf6ed8bd783fadf1cf43c2",
                "md5": "56964b00e4100f2117915d5fb47844c8",
                "sha256": "7f76e51a8b7ff6931babe201634eae894e24516db06d9d35a1831ded3cec7052"
            },
            "downloads": -1,
            "filename": "rage_toolkit-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "56964b00e4100f2117915d5fb47844c8",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 7313,
            "upload_time": "2024-09-08T19:35:50",
            "upload_time_iso_8601": "2024-09-08T19:35:50.251913Z",
            "url": "https://files.pythonhosted.org/packages/43/d6/85d05789d289cf4f2a4cb46ce69788e847e02faf6ed8bd783fadf1cf43c2/rage_toolkit-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bda332006fa525acdd757ebecd6943bd38eb92633976820a44e62c8b26cc0008",
                "md5": "520da57403475547a062ae06bf9e56e7",
                "sha256": "7f7b676bae2b09cf523a3cb1dd98baac99cafd97d8689094de9f74041a84b7dc"
            },
            "downloads": -1,
            "filename": "rage_toolkit-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "520da57403475547a062ae06bf9e56e7",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 153089,
            "upload_time": "2024-09-08T19:35:52",
            "upload_time_iso_8601": "2024-09-08T19:35:52.018051Z",
            "url": "https://files.pythonhosted.org/packages/bd/a3/32006fa525acdd757ebecd6943bd38eb92633976820a44e62c8b26cc0008/rage_toolkit-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-08 19:35:52",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "othr-nlp",
    "github_project": "rage_toolkit",
    "github_not_found": true,
    "lcname": "rage-toolkit"
}
