| Field | Value |
| --- | --- |
| Name | indomee |
| Version | 0.1.4 |
| home_page | None |
| Summary | Python package for evaluation of retrieval-augmented generation (RAG) models |
| upload_time | 2024-12-04 02:49:40 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.11 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# Indomee
**Indomee** is a Python package designed to simplify the evaluation of retrieval-augmented generation (RAG) models and other retrieval-based systems. With `indomee`, you can compute common evaluation metrics like **recall** and **mean reciprocal rank (MRR)** at various levels of _k_, all through a straightforward API.
Indomee also supports simple bootstrapping at the moment, with t-testing coming soon.
## Installation
```bash
pip install indomee
```
You can get started with `indomee` using the following examples.
### 1. Metrics
Indomee provides functions to calculate metrics such as Mean Reciprocal Rank (MRR) and Recall, either over the full list of predictions or at specific values of _k_.
#### Example Usage
```python
from indomee import calculate_mrr, calculate_recall, calculate_metrics_at_k

# Calculate MRR
mrr = calculate_mrr([1, 2, 3], [2, 3, 4])
print("MRR:", mrr)
# > MRR: 0.5

# Calculate Recall
recall = calculate_recall([1, 2, 3], [2])
print("Recall:", recall)
# > Recall: 1

# Calculate metrics at specific k values
metrics = calculate_metrics_at_k(
    metrics=["recall"], preds=[1, 2, 3], labels=[2], k=[1, 2, 3]
)
print("Metrics at k:", metrics)
# > {'recall@1': 0.0, 'recall@2': 1.0, 'recall@3': 1.0}
```
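To make the numbers above easier to interpret, here is a minimal pure-Python sketch of how reciprocal rank and recall are conventionally defined. The helper names `reciprocal_rank`, `recall`, and `recall_at_k` are hypothetical and written only for illustration; this is not necessarily how `indomee` implements these metrics internally.

```python
def reciprocal_rank(preds, labels):
    """Return 1 / rank of the first relevant prediction, or 0.0 if none is relevant."""
    relevant = set(labels)
    for rank, item in enumerate(preds, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def recall(preds, labels):
    """Return the fraction of labels that appear anywhere in preds."""
    if not labels:
        return 0.0
    retrieved = set(preds)
    return sum(1 for label in labels if label in retrieved) / len(labels)


def recall_at_k(preds, labels, ks):
    """Compute recall using only the top-k predictions, for each k in ks."""
    return {f"recall@{k}": recall(preds[:k], labels) for k in ks}


# The first relevant item (2) sits at rank 2, so the reciprocal rank is 1/2.
print(reciprocal_rank([1, 2, 3], [2, 3, 4]))  # 0.5
# The single label (2) is retrieved, so recall is 1.0.
print(recall([1, 2, 3], [2]))  # 1.0
# At k=1 only [1] is considered, so the label is missed.
print(recall_at_k([1, 2, 3], [2], [1, 2, 3]))
# {'recall@1': 0.0, 'recall@2': 1.0, 'recall@3': 1.0}
```

MRR over a dataset is then the mean of the per-query reciprocal ranks.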
### 2. Bootstrapping
Indomee also supports bootstrapping, which resamples the evaluation set with replacement to estimate how much a metric varies from sample to sample.
#### Example Usage
```python
from indomee import bootstrap_sample, bootstrap

preds = [["a", "b"], ["c", "d"], ["e", "f"]]
labels = [["a", "b"], ["c", "d"], ["e", "f"]]

# Bootstrapping a sample
result = bootstrap_sample(
    preds=preds, labels=labels, n_samples=10, metrics=["recall"], k=[1, 2, 3]
)
print("Bootstrap Sample Metrics:", result.sample_metrics)

# Bootstrapping multiple samples
result = bootstrap(
    preds=preds, labels=labels, n_samples=10, n_iterations=10,
    metrics=["recall"], k=[1, 2, 3],
)
print("Bootstrap Metrics:", result.sample_metrics)
```
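For readers unfamiliar with the technique, the following is a rough standard-library sketch of what bootstrapping a retrieval metric involves: draw queries with replacement, compute the mean metric on each resample, and summarise the spread. The function `bootstrap_mean_recall`, its parameters, and the simple percentile interval are assumptions made for this illustration and do not describe `indomee`'s internals or return types.

```python
import random
from statistics import mean


def recall(preds, labels):
    """Fraction of labels that appear in preds (labels assumed non-empty)."""
    retrieved = set(preds)
    return sum(1 for label in labels if label in retrieved) / len(labels)


def bootstrap_mean_recall(preds, labels, n_samples, n_iterations, seed=0):
    """Estimate mean recall and a rough 95% percentile interval by resampling queries."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    indices = range(len(preds))
    means = []
    for _ in range(n_iterations):
        # Draw n_samples query indices with replacement.
        sample = rng.choices(indices, k=n_samples)
        means.append(mean(recall(preds[i], labels[i]) for i in sample))
    means.sort()
    lower = means[int(0.025 * (n_iterations - 1))]
    upper = means[int(0.975 * (n_iterations - 1))]
    return mean(means), (lower, upper)


# Hypothetical data: per-query recalls are 1.0, 0.0 and 0.5.
example_preds = [["a", "b"], ["x", "d"], ["e", "f"]]
example_labels = [["a"], ["c"], ["e", "g"]]

estimate, (low, high) = bootstrap_mean_recall(
    example_preds, example_labels, n_samples=10, n_iterations=200
)
print(f"mean recall ~ {estimate:.3f}, 95% interval ~ [{low:.3f}, {high:.3f}]")
```

A wide interval suggests the evaluation set is too small to distinguish systems whose scores are close.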
### 3. T-Testing
Finally, we show how to run t-tests comparing the results obtained from different methods, here a baseline and two alternatives.
```python
from indomee import perform_t_tests
import pandas as pd

df = pd.read_csv("./data.csv")

# Extract the scores for each method as lists
method_1 = df["method_1"].tolist()
method_2 = df["method_2"].tolist()
baseline = df["baseline"].tolist()

results = perform_t_tests(
    baseline, method_1, method_2,
    names=["Baseline", "Method 1", "Method 2"],
    paired=True,
)
results
```
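As background for the `paired=True` option, a paired t-test works on the per-query differences between two methods evaluated on the same queries. The sketch below computes the paired t statistic with the standard library only; `paired_t_statistic` is a hypothetical helper for illustration, not part of `indomee`, and the scores are made-up values. Turning the statistic into a p-value requires the t distribution, for example via `scipy.stats.ttest_rel`.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    if len(a) < 2:
        raise ValueError("need at least two pairs")
    # Per-query differences: positive values mean b scored higher than a.
    diffs = [y - x for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    if standard_error == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / standard_error, n - 1


# Hypothetical per-query scores for a baseline and one method.
baseline_scores = [0.5, 0.6, 0.7, 0.8]
method_scores = [0.6, 0.7, 0.9, 0.9]

t_stat, dof = paired_t_statistic(baseline_scores, method_scores)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

Pairing removes per-query difficulty from the comparison, which typically gives a more sensitive test than treating the two score lists as independent samples.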
Raw data
{
"_id": null,
"home_page": null,
"name": "indomee",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "Ivan Leo <ivanleomk@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/ae/4f/cd4c9277c08f2c45540676e171ce318f30521ee2a87ee1e9d722e1b942d4/indomee-0.1.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Python package for evaluation of retrieval-augmented generation (RAG) models",
"version": "0.1.4",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "fde336e0a172b5c0a4d005d3cd8ec350d4d2832b66ed51aecc4f441ba2514785",
"md5": "c3567054e848f07d5f56fc4395048563",
"sha256": "6d56a06ec3ccc0dbae7f938e9b1bbdc477bea50887f3b7bcf92d8210cef718b4"
},
"downloads": -1,
"filename": "indomee-0.1.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "c3567054e848f07d5f56fc4395048563",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 5385,
"upload_time": "2024-12-04T02:49:38",
"upload_time_iso_8601": "2024-12-04T02:49:38.451481Z",
"url": "https://files.pythonhosted.org/packages/fd/e3/36e0a172b5c0a4d005d3cd8ec350d4d2832b66ed51aecc4f441ba2514785/indomee-0.1.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ae4fcd4c9277c08f2c45540676e171ce318f30521ee2a87ee1e9d722e1b942d4",
"md5": "e26ca9be42271d69756bb0ff8cd73cdf",
"sha256": "5c0066539a7fe112896cc501cc80bc705771cc08c7fe79639908e37cb5492e01"
},
"downloads": -1,
"filename": "indomee-0.1.4.tar.gz",
"has_sig": false,
"md5_digest": "e26ca9be42271d69756bb0ff8cd73cdf",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 67114,
"upload_time": "2024-12-04T02:49:40",
"upload_time_iso_8601": "2024-12-04T02:49:40.346679Z",
"url": "https://files.pythonhosted.org/packages/ae/4f/cd4c9277c08f2c45540676e171ce318f30521ee2a87ee1e9d722e1b942d4/indomee-0.1.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-04 02:49:40",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "indomee"
}