indomee

Name	indomee
Version	0.1.4
home_page	None
Summary	Python package for evaluation of retrieval-augmented generation (RAG) models
upload_time	2024-12-04 02:49:40
maintainer	None
docs_url	None
author	None
requires_python	>=3.11
license	None
keywords
VCS
bugtrack_url
requirements	No requirements were recorded.

# Indomee

**Indomee** is a Python package designed to simplify the evaluation of retrieval-augmented generation (RAG) models and other retrieval-based systems. With `indomee`, you can compute common evaluation metrics like **recall** and **mean reciprocal rank (MRR)** at various levels of _k_, all through a straightforward API.

Indomee also supports simple bootstrapping and t-tests for comparing results across methods.

## Installation

```bash
pip install indomee
```

The examples below show how to get started with `indomee`.

### 1. Metrics

Indomee provides functions to calculate retrieval metrics such as mean reciprocal rank (MRR) and recall.

#### Example Usage

```python
from indomee import calculate_mrr, calculate_recall, calculate_metrics_at_k

# Calculate MRR
mrr = calculate_mrr([1, 2, 3], [2, 3, 4])
print("MRR:", mrr)
# > MRR: 0.5

# Calculate Recall
recall = calculate_recall([1, 2, 3], [2])
print("Recall:", recall)
# > Recall: 1

# Calculate metrics at specific k values
metrics = calculate_metrics_at_k(
    metrics=["recall"], preds=[1, 2, 3], labels=[2], k=[1, 2, 3]
)
print("Metrics at k:", metrics)
# > Metrics at k: {'recall@1': 0.0, 'recall@2': 1.0, 'recall@3': 1.0}
```
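To make the numbers in the example above concrete, here is a minimal pure-Python sketch of how recall@k and reciprocal rank are conventionally defined. The helper names `recall_at_k` and `reciprocal_rank` are hypothetical and are not part of Indomee's API; the library's own implementation may differ in details such as how empty label lists are handled.

```python
def recall_at_k(preds, labels, k):
    """Fraction of relevant labels that appear in the top-k predictions."""
    if not labels:
        return 0.0  # assumption: no relevant items means recall is 0
    top_k = set(preds[:k])
    return sum(1 for label in labels if label in top_k) / len(labels)


def reciprocal_rank(preds, labels):
    """1 / rank of the first relevant prediction, or 0.0 if none is relevant."""
    relevant = set(labels)
    for rank, pred in enumerate(preds, start=1):
        if pred in relevant:
            return 1.0 / rank
    return 0.0


print(reciprocal_rank([1, 2, 3], [2, 3, 4]))  # 0.5
print([recall_at_k([1, 2, 3], [2], k) for k in (1, 2, 3)])  # [0.0, 1.0, 1.0]
```

MRR is the mean of the reciprocal rank over all queries; with a single query, as here, the two coincide.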

### 2. Bootstrapping

Indomee also supports bootstrapping, which resamples the evaluation data to estimate how much a metric varies from sample to sample.

#### Example Usage

```python
from indomee import bootstrap_sample, bootstrap

preds = [["a", "b"], ["c", "d"], ["e", "f"]]
labels = [["a", "b"], ["c", "d"], ["e", "f"]]

# Bootstrapping a sample
result = bootstrap_sample(
    preds=preds, labels=labels, n_samples=10, metrics=["recall"], k=[1, 2, 3]
)
print("Bootstrap Sample Metrics:", result.sample_metrics)

# Bootstrapping multiple samples
result = bootstrap(
    preds=preds, labels=labels, n_samples=10, n_iterations=10,
    metrics=["recall"], k=[1, 2, 3],
)
print("Bootstrap Metrics:", result.sample_metrics)
```
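For readers new to the technique, the following is a rough standard-library sketch of what bootstrapping a metric involves: draw resamples with replacement, compute the metric on each, and read an interval off the resulting distribution. It is not Indomee's implementation. The function `bootstrap_mean`, the example scores, and the reading of `n_samples` as the size of each resample and `n_iterations` as the number of resamples are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_mean(scores, n_samples, n_iterations, seed=0):
    """Resample `scores` with replacement; return the mean of each resample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        statistics.fmean(rng.choices(scores, k=n_samples))
        for _ in range(n_iterations)
    ]


# Hypothetical per-query recall@1 scores
scores = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
means = sorted(bootstrap_mean(scores, n_samples=len(scores), n_iterations=1000))

# Percentile-based 95% interval taken from the sorted resampled means
lower, upper = means[int(0.025 * len(means))], means[int(0.975 * len(means))]
print(f"mean={statistics.fmean(scores):.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

A wide interval suggests the evaluation set is too small to distinguish between systems whose scores are close.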

### 3. T-Testing

This last section shows how to run t-tests that compare the results obtained from different methods, for example a baseline against two alternatives.

```python
from indomee import perform_t_tests
import pandas as pd

df = pd.read_csv("./data.csv")

# Extract each method's scores as a list
method_1 = df["method_1"].tolist()
method_2 = df["method_2"].tolist()
baseline = df["baseline"].tolist()

results = perform_t_tests(
    baseline, method_1, method_2,
    names=["Baseline", "Method 1", "Method 2"],
    paired=True,
)
results
```
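As background on what a paired test measures, here is a small standard-library sketch of the paired t statistic: the mean of the per-item differences divided by its standard error. It does not show how `perform_t_tests` works internally. The function `paired_t_statistic` and the scores are hypothetical, and the sketch assumes both lists hold scores for the same queries in the same order.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return the t statistic and degrees of freedom for paired samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("paired samples need equal length and at least 2 items")
    diffs = [x - y for x, y in zip(a, b)]
    # Standard error of the mean difference (sample standard deviation, n - 1)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.fmean(diffs) / std_err, len(diffs) - 1


# Hypothetical per-query scores for two methods evaluated on the same queries
baseline = [0.50, 0.60, 0.55, 0.70, 0.65]
method_1 = [0.60, 0.65, 0.70, 0.75, 0.80]

t_stat, dof = paired_t_statistic(method_1, baseline)
print(f"t={t_stat:.3f}, dof={dof}")
```

Turning the statistic into a p-value requires the t distribution's CDF; `scipy.stats.ttest_rel` computes both for paired samples.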

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "indomee",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": null,
    "keywords": null,
    "author": null,
    "author_email": "Ivan Leo <ivanleomk@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/ae/4f/cd4c9277c08f2c45540676e171ce318f30521ee2a87ee1e9d722e1b942d4/indomee-0.1.4.tar.gz",
    "platform": null,
    "description": "# Indomee\n\n**Indomee** is a Python package designed to simplify the evaluation of retrieval-augmented generation (RAG) models and other retrieval-based systems. With `indomee`, you can compute common evaluation metrics like **recall** and **mean reciprocal rank (MRR)** at various levels of _k_, all through a straightforward API.\n\nWe also provide support for simple bootstrapping at the moment with t-testing coming soon.\n\n## Installation\n\n```bash\npip install indomee\n```\n\nYou can get started with `indomee` with the following example\n\nIndomee provides functions to calculate various metrics such as Mean Reciprocal Rank (MRR) and Recall.\n\n#### Example Usage\n\n```python\nfrom indomee import calculate_mrr, calculate_recall, calculate_metrics_at_k\n\nmrr = calculate_mrr([1, 2, 3], [2, 3, 4])\nprint(\"MRR:\", mrr)\n# > MRR: 0.5\n\n# Calculate Recall\nrecall = calculate_recall([1, 2, 3], [2])\nprint(\"Recall:\", recall)\n# > Recall: 1\n\n# Calculate metrics at specific k values\nmetrics = calculate_metrics_at_k(\n    metrics=[\"recall\"], preds=[1, 2, 3], labels=[2], k=[1, 2, 3]\n)\nprint(\"Metrics at k:\", metrics)\n# > {'recall@1': 0.0, 'recall@2': 1.0, 'recall@3': 1.0}\n```\n\n### 2. Bootstrapping\n\nIndomee also supports bootstrapping for more robust metric evaluation.\n\n#### Example Usage\n\n```python\nfrom indomee import bootstrap_sample, bootstrap\n\n# Bootstrapping a sample\nresult = bootstrap_sample(preds=[[\"a\", \"b\"], [\"c\", \"d\"], [\"e\", \"f\"]], labels=[[\"a\", \"b\"], [\"c\", \"d\"], [\"e\", \"f\"]], n_samples=10, metrics=[\"recall\"], k=[1, 2, 3])\nprint(\"Bootstrap Sample Metrics:\", result.sample_metrics)\n\n# Bootstrapping multiple samples\nresult = bootstrap(preds=[[\"a\", \"b\"], [\"c\", \"d\"], [\"e\", \"f\"]], labels=[[\"a\", \"b\"], [\"c\", \"d\"], [\"e\", \"f\"]], n_samples=10, n_iterations=10, metrics=[\"recall\"], k=[1, 2, 3])\nprint(\"Bootstrap Metrics:\", result.sample_metrics)\n```\n\n### 3. 
T-Testing\n\nFor the last portion, we'll show how to perform a t-test between two different results that we've obtained from the different methods.\n\n```python\nfrom indomee import perform_t_tests\nimport pandas as pd\n\ndf = pd.read_csv(\"./data.csv\")\n\n# Calculate the mean for each method\nmethod_1 = df[\"method_1\"].tolist()\nmethod_2 = df[\"method_2\"].tolist()\nbaseline = df[\"baseline\"].tolist()\n\nresults = perform_t_tests(\n    baseline, method_1, method_2,\n    names=[\"Baseline\", \"Method 1\", \"Method 2\"],\n    paired=True,\n)\nresults\n```\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Python package for evaluation of retrieval-augmented generation (RAG) models",
    "version": "0.1.4",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fde336e0a172b5c0a4d005d3cd8ec350d4d2832b66ed51aecc4f441ba2514785",
                "md5": "c3567054e848f07d5f56fc4395048563",
                "sha256": "6d56a06ec3ccc0dbae7f938e9b1bbdc477bea50887f3b7bcf92d8210cef718b4"
            },
            "downloads": -1,
            "filename": "indomee-0.1.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "c3567054e848f07d5f56fc4395048563",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 5385,
            "upload_time": "2024-12-04T02:49:38",
            "upload_time_iso_8601": "2024-12-04T02:49:38.451481Z",
            "url": "https://files.pythonhosted.org/packages/fd/e3/36e0a172b5c0a4d005d3cd8ec350d4d2832b66ed51aecc4f441ba2514785/indomee-0.1.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ae4fcd4c9277c08f2c45540676e171ce318f30521ee2a87ee1e9d722e1b942d4",
                "md5": "e26ca9be42271d69756bb0ff8cd73cdf",
                "sha256": "5c0066539a7fe112896cc501cc80bc705771cc08c7fe79639908e37cb5492e01"
            },
            "downloads": -1,
            "filename": "indomee-0.1.4.tar.gz",
            "has_sig": false,
            "md5_digest": "e26ca9be42271d69756bb0ff8cd73cdf",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 67114,
            "upload_time": "2024-12-04T02:49:40",
            "upload_time_iso_8601": "2024-12-04T02:49:40.346679Z",
            "url": "https://files.pythonhosted.org/packages/ae/4f/cd4c9277c08f2c45540676e171ce318f30521ee2a87ee1e9d722e1b942d4/indomee-0.1.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-04 02:49:40",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "indomee"
}
