llama-index-packs-rag-evaluator

Name: llama-index-packs-rag-evaluator
Version: 0.1.3
Summary: llama-index packs rag_evaluator integration
Upload time: 2024-02-22 01:29:47
Maintainer: nerdai
Author: Your Name
Requires Python: >=3.8.1,<4.0
License: MIT
Keywords: benchmarks, evaluation, rag
# Retrieval-Augmented Generation (RAG) Evaluation Pack

Get benchmark scores for your own RAG pipeline (i.e., a `QueryEngine`) on a RAG
dataset (i.e., a `LabelledRagDataset`). Specifically, this pack takes as input a
query engine and a `LabelledRagDataset`, which can also be downloaded from
[llama-hub](https://llamahub.ai).

## CLI Usage

You can download LlamaPacks directly using `llamaindex-cli`, which is installed with the `llama-index` Python package:

```bash
llamaindex-cli download-llamapack RagEvaluatorPack --download-dir ./rag_evaluator_pack
```

You can then inspect the files at `./rag_evaluator_pack` and use them as a template for your own project!
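
Alternatively, since this pack is published on PyPI as `llama-index-packs-rag-evaluator`, you can install it with pip and import the pack class directly. A minimal sketch, assuming the import path follows the usual `llama_index.packs.*` naming convention (the path below is an assumption, not taken from this README):

```python
# assumed alternative to the CLI download:
#   pip install llama-index-packs-rag-evaluator
# then import the pack class directly (import path assumed from the
# usual llama_index.packs.* naming convention)
from llama_index.packs.rag_evaluator import RagEvaluatorPack
```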

## Code Usage

You can also download the pack to the `./rag_evaluator_pack` directory through Python
code. The sample script below demonstrates how to construct `RagEvaluatorPack`
using a `LabelledRagDataset` downloaded from `llama-hub` and a simple RAG pipeline
built from its source documents.

```python
from llama_index.core.llama_dataset import download_llama_dataset
from llama_index.core.llama_pack import download_llama_pack
from llama_index.core import VectorStoreIndex

# download a LabelledRagDataset from llama-hub
rag_dataset, documents = download_llama_dataset(
    "PaulGrahamEssayDataset", "./paul_graham"
)

# build a basic RAG pipeline off of the source documents
index = VectorStoreIndex.from_documents(documents=documents)
query_engine = index.as_query_engine()

# Time to benchmark/evaluate this RAG pipeline
# Download and install dependencies
RagEvaluatorPack = download_llama_pack(
    "RagEvaluatorPack", "./rag_evaluator_pack"
)

# construction requires a query_engine, a rag_dataset, and optionally a judge_llm
rag_evaluator_pack = RagEvaluatorPack(
    query_engine=query_engine, rag_dataset=rag_dataset
)

# PERFORM EVALUATION
benchmark_df = rag_evaluator_pack.run()  # async arun() also supported
print(benchmark_df)
```

`Output:`

```text
rag                            base_rag
metrics
mean_correctness_score         4.511364
mean_relevancy_score           0.931818
mean_faithfulness_score        1.000000
mean_context_similarity_score  0.945952
```
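
As noted in the constructor comment above, you can optionally supply your own judge LLM rather than relying on the default. A minimal sketch, assuming a `judge_llm` keyword argument (taken from that comment, not verified for this version) and that the `llama-index-llms-openai` integration is installed:

```python
from llama_index.llms.openai import OpenAI

# assumption: use a dedicated judge model for scoring; the judge_llm keyword
# is taken from the constructor comment above, not verified for this version
judge_llm = OpenAI(model="gpt-4", temperature=0.0)

rag_evaluator_pack = RagEvaluatorPack(
    query_engine=query_engine,
    rag_dataset=rag_dataset,
    judge_llm=judge_llm,
)
benchmark_df = rag_evaluator_pack.run()
```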

Note that `rag_evaluator_pack.run()` will also save two files in the directory
from which the pack was invoked:

```bash
.
├── benchmark.csv (CSV format of the benchmark scores)
└── _evaluations.json (raw evaluation results for all examples & predictions)
```
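
If you want to work with these results afterward, you can read the saved files back in. A minimal sketch (the exact layout of `_evaluations.json` is not documented here, so treat that part as an assumption):

```python
import json

import pandas as pd

# reload the benchmark scores written by rag_evaluator_pack.run()
benchmark_df = pd.read_csv("benchmark.csv", index_col=0)
print(benchmark_df)

# reload the raw per-example evaluation results (structure assumed)
with open("_evaluations.json") as f:
    evaluations = json.load(f)
print(type(evaluations))
```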

            
