llama-index-packs-rag-evaluator


Name: llama-index-packs-rag-evaluator
Version: 0.3.0
Summary: llama-index packs rag_evaluator integration
Upload time: 2024-11-18 01:31:47
Maintainer: nerdai
Author: Your Name
Requires Python: <4.0,>=3.9
License: MIT
Keywords: benchmarks, evaluation, rag

# Retrieval-Augmented Generation (RAG) Evaluation Pack

Get benchmark scores for your own RAG pipeline (i.e., a `QueryEngine`) on a RAG
dataset (i.e., a `LabelledRagDataset`). This pack takes a query engine and a
`LabelledRagDataset` as input; datasets can be downloaded from
[llama-hub](https://llamahub.ai).

## CLI Usage

You can download LlamaPacks directly using `llamaindex-cli`, which is installed with the `llama-index` Python package:

```bash
llamaindex-cli download-llamapack RagEvaluatorPack --download-dir ./rag_evaluator_pack
```

You can then inspect the files at `./rag_evaluator_pack` and use them as a template for your own project!

## Code Usage

You can also download the pack to the `./rag_evaluator_pack` directory from Python
code. The sample script below constructs a `RagEvaluatorPack` from a
`LabelledRagDataset` downloaded from `llama-hub` and a simple RAG pipeline
built from its source documents.

```python
from llama_index.core.llama_dataset import download_llama_dataset
from llama_index.core.llama_pack import download_llama_pack
from llama_index.core import VectorStoreIndex

# download a LabelledRagDataset from llama-hub
rag_dataset, documents = download_llama_dataset(
    "PaulGrahamEssayDataset", "./paul_graham"
)

# build a basic RAG pipeline off of the source documents
index = VectorStoreIndex.from_documents(documents=documents)
query_engine = index.as_query_engine()

# Time to benchmark/evaluate this RAG pipeline
# Download and install dependencies
RagEvaluatorPack = download_llama_pack(
    "RagEvaluatorPack", "./rag_evaluator_pack"
)

# construction requires a query_engine, a rag_dataset, and optionally a judge_llm
rag_evaluator_pack = RagEvaluatorPack(
    query_engine=query_engine, rag_dataset=rag_dataset
)

# PERFORM EVALUATION
benchmark_df = rag_evaluator_pack.run()  # async arun() also supported
print(benchmark_df)
```

`Output:`

```text
rag                            base_rag
metrics
mean_correctness_score         4.511364
mean_relevancy_score           0.931818
mean_faithfulness_score        1.000000
mean_context_similarity_score  0.945952
```
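Each row in this table is a mean over per-example scores. The sketch below illustrates that aggregation on made-up data: the `mean_scores` helper and the score lists are hypothetical (not part of the pack's API), and it assumes pass/fail style metrics are recorded as `1.0` or `0.0`. The numbers were chosen only so that the means match two of the rows above; they are not taken from a real evaluation file.

```python
from statistics import mean


def mean_scores(per_example: dict[str, list[float]]) -> dict[str, float]:
    """Average each metric's per-example scores into one benchmark value."""
    return {
        f"mean_{name}_score": mean(scores)
        for name, scores in per_example.items()
    }


# Hypothetical per-example scores (illustrative only, not real output).
per_example = {
    "relevancy": [1.0] * 41 + [0.0] * 3,
    "faithfulness": [1.0] * 44,
}

for metric, value in mean_scores(per_example).items():
    print(f"{metric:<30} {value:.6f}")
```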

Note that `rag_evaluator_pack.run()` also saves two files in the directory
from which the pack was invoked:

```text
.
├── benchmark.csv (CSV format of the benchmark scores)
└── _evaluations.json (raw evaluation results for all examples & predictions)
```
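
To reuse the saved scores later, the CSV can be read back into a dictionary. This is a minimal sketch using only the standard library. It assumes `benchmark.csv` has a header row followed by one `metric,score` row per metric, which is a guess at the layout rather than something this README specifies, so inspect your own file first. `load_benchmark` is a hypothetical helper name.

```python
import csv
import io


def load_benchmark(csv_file) -> dict[str, float]:
    """Read metric/score rows from an open CSV file, skipping the header."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row (assumed to be present)
    return {row[0]: float(row[1]) for row in reader if len(row) >= 2}


# Stand-in for the file contents, reusing two of the scores printed above.
sample = io.StringIO(
    "metrics,base_rag\n"
    "mean_correctness_score,4.511364\n"
    "mean_relevancy_score,0.931818\n"
)
scores = load_benchmark(sample)
print(scores["mean_correctness_score"])
```

For a real run, pass `open("benchmark.csv", newline="")` in place of the `StringIO` stand-in.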

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "llama-index-packs-rag-evaluator",
    "maintainer": "nerdai",
    "docs_url": null,
    "requires_python": "<4.0,>=3.9",
    "maintainer_email": null,
    "keywords": "benchmarks, evaluation, rag",
    "author": "Your Name",
    "author_email": "you@example.com",
    "download_url": "https://files.pythonhosted.org/packages/34/6a/e9f47b7bc45739ae59e1fd2195173e0b359b245d32915db155a527e5862f/llama_index_packs_rag_evaluator-0.3.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "llama-index packs rag_evaluator integration",
    "version": "0.3.0",
    "project_urls": null,
    "split_keywords": [
        "benchmarks",
        " evaluation",
        " rag"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a4bc2f64714f10a10338d3cd39496ed453a4649fc9d2ae6aefa683cddf720bfc",
                "md5": "2155965141359bb6a8220774ac6e7abe",
                "sha256": "d2626989b21c0ce4bf2fdbc866598639b9ae3a05b0baea3a9bbad58855f8cd56"
            },
            "downloads": -1,
            "filename": "llama_index_packs_rag_evaluator-0.3.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2155965141359bb6a8220774ac6e7abe",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.9",
            "size": 5944,
            "upload_time": "2024-11-18T01:31:46",
            "upload_time_iso_8601": "2024-11-18T01:31:46.308955Z",
            "url": "https://files.pythonhosted.org/packages/a4/bc/2f64714f10a10338d3cd39496ed453a4649fc9d2ae6aefa683cddf720bfc/llama_index_packs_rag_evaluator-0.3.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "346ae9f47b7bc45739ae59e1fd2195173e0b359b245d32915db155a527e5862f",
                "md5": "e1156ebacef06a1974f2d255719b2b98",
                "sha256": "b4a093f127dabab5434fc69c1fb35ab6a35b55de1004b2551c14ea2abb69a742"
            },
            "downloads": -1,
            "filename": "llama_index_packs_rag_evaluator-0.3.0.tar.gz",
            "has_sig": false,
            "md5_digest": "e1156ebacef06a1974f2d255719b2b98",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.9",
            "size": 5665,
            "upload_time": "2024-11-18T01:31:47",
            "upload_time_iso_8601": "2024-11-18T01:31:47.826168Z",
            "url": "https://files.pythonhosted.org/packages/34/6a/e9f47b7bc45739ae59e1fd2195173e0b359b245d32915db155a527e5862f/llama_index_packs_rag_evaluator-0.3.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-18 01:31:47",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "llama-index-packs-rag-evaluator"
}
        