# Information Retrieval Evaluation with numba
[![image](https://img.shields.io/pypi/v/ir_eval_numba.svg)](https://pypi.python.org/pypi/ir_eval_numba)
[![Actions status](https://github.com/plurch/ir_eval_numba/actions/workflows/ci-tests.yml/badge.svg)](https://github.com/plurch/ir_eval_numba/actions)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/plurch/ir_eval_numba/blob/main/LICENSE)
This project provides simple and tested [numba](https://github.com/numba/numba) implementations of popular information retrieval metrics. The source code is clear and easy to understand. All functions have pydoc help strings.
The metrics can be used to determine the quality of rankings that are returned by a retrieval or recommender system.
## Alternative library
If you don't need numba and want a library written in pure Python, check out [plurch/ir_evaluation](https://github.com/plurch/ir_evaluation).
## Installation
Requirements:
- Python 3.11 or 3.12
- numba>=0.60.0

`ir_eval_numba` can be installed from PyPI with:
```
pip install ir_eval_numba
```
## Usage
Metric functions will generally accept the following arguments:
`actual` (npt.NDArray[IntType]): An array of integer ground truth relevant items.
`predicted` (npt.NDArray[IntType]): An array of integer predicted items, ordered by relevance.
`k` (int): The number of top predictions to consider.
Each function returns the computed metric as a `float`.
## Unit tests
Unit tests with easy-to-follow scenarios and sample data are included.
### Run unit tests
```
uv run pytest
```
## Metrics
- [Recall](#recall)
- [Precision](#precision)
- [Average Precision (AP)](#average-precision-ap)
- [Mean Average Precision (MAP)](#mean-average-precision-map)
- [Normalized Discounted Cumulative Gain (nDCG)](#normalized-discounted-cumulative-gain-ndcg)
- [Reciprocal Rank (RR)](#reciprocal-rank-rr)
- [Mean Reciprocal Rank (MRR)](#mean-reciprocal-rank-mrr)
### Recall
Recall is defined as the ratio of the total number of relevant items retrieved within the top-k predictions to the total number of relevant items in the entire database.
Usage scenario: Prioritize returning all relevant items from the database. Early retrieval stages, where many candidates are returned, should focus on this metric.
```
from ir_eval_numba.metrics import recall
```
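The definition above can be sketched in plain Python, independent of the library. The name `recall_at_k` is hypothetical, lists stand in for the NumPy integer arrays, and returning `0.0` when there are no relevant items is an assumption rather than documented library behavior.

```python
def recall_at_k(actual, predicted, k):
    """Fraction of the relevant items that appear in the top-k predictions.

    Hypothetical reference sketch; assumes `predicted` has no duplicates.
    """
    relevant = set(actual)
    if not relevant:
        return 0.0  # assumption: no relevant items gives a score of 0.0
    hits = sum(1 for item in predicted[:k] if item in relevant)
    return hits / len(relevant)


actual = [1, 2, 3, 4]         # ground truth relevant items
predicted = [2, 7, 1, 9, 4]   # ranked predictions, best first
print(recall_at_k(actual, predicted, 3))  # 0.5 (items 2 and 1 found, out of 4 relevant)
```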
### Precision
Precision is defined as the ratio of the total number of relevant items retrieved within the top-k predictions to the total number of returned items (k).
Usage scenario: Minimize false positives in predictions. Later ranking stages should focus on this metric.
```
from ir_eval_numba.metrics import precision
```
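As a rough illustration of this ratio, here is a plain-Python sketch that does not call the library. `precision_at_k` is a hypothetical name, and dividing by `k` even when fewer than `k` predictions are supplied follows the wording above but is an assumption about edge-case handling.

```python
def precision_at_k(actual, predicted, k):
    """Fraction of the top-k predictions that are relevant.

    Hypothetical reference sketch; assumes `predicted` has no duplicates.
    """
    if k <= 0:
        return 0.0  # assumption: a non-positive k gives a score of 0.0
    relevant = set(actual)
    hits = sum(1 for item in predicted[:k] if item in relevant)
    return hits / k


actual = [1, 2, 3, 4]
predicted = [2, 7, 1, 9, 4]
print(precision_at_k(actual, predicted, 4))  # 0.5 (2 of the top 4 are relevant)
```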
### Average Precision (AP)
Average Precision is calculated as the mean of precision values at each rank where a relevant item is retrieved within the top `k` predictions.
Usage scenario: Evaluates how well relevant items are ranked within the top-k returned list.
```
from ir_eval_numba.metrics import average_precision
```
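The sketch below follows the wording above literally: it averages the precision values over the ranks where a relevant item was found. `average_precision_at_k` is a hypothetical name, and the normalization is an assumption. Other common conventions divide by the total number of relevant items or by `min(k, len(actual))`, so the library's result may differ.

```python
def average_precision_at_k(actual, predicted, k):
    """Mean of precision@rank over the ranks in the top k that hold a relevant item.

    Hypothetical reference sketch; normalizes by the number of hits (assumption).
    """
    relevant = set(actual)
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(predicted[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


actual = [1, 2, 3]
predicted = [1, 5, 2, 6]
# Relevant items at ranks 1 and 3: (1/1 + 2/3) / 2
print(round(average_precision_at_k(actual, predicted, 4), 4))  # 0.8333
```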
### Mean Average Precision (MAP)
MAP is the mean of the Average Precision (AP - see above) scores computed for multiple queries.
Usage scenario: Reflects overall performance of AP for multiple queries. A good holistic metric that balances the tradeoff between recall and precision.
```
from ir_eval_numba.metrics import mean_average_precision
```
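To make the averaging step concrete, here is a self-contained sketch that takes one list of relevant items and one ranked list per query. Both function names are hypothetical, the per-query AP is normalized by the number of hits (an assumption), and the way the library expects multiple queries to be passed is not shown here.

```python
def average_precision_at_k(actual, predicted, k):
    """Per-query AP: mean of precision@rank over ranks holding a relevant item."""
    relevant = set(actual)
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(predicted[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(actual_lists, predicted_lists, k):
    """Mean of the per-query AP scores (hypothetical reference sketch)."""
    scores = [
        average_precision_at_k(actual, predicted, k)
        for actual, predicted in zip(actual_lists, predicted_lists)
    ]
    return sum(scores) / len(scores) if scores else 0.0


actual_lists = [[1, 2, 3], [4]]
predicted_lists = [[1, 5, 2, 6], [9, 4]]
# Query 1 AP = (1/1 + 2/3) / 2, query 2 AP = 1/2
print(round(mean_average_precision_at_k(actual_lists, predicted_lists, 4), 4))  # 0.6667
```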
### Normalized Discounted Cumulative Gain (nDCG)
nDCG evaluates the quality of a predicted ranking by comparing it to an ideal ranking (i.e., perfect ordering of relevant items). It accounts for the position of relevant items in the ranking, giving higher weight to items appearing earlier.
Usage scenario: Prioritize returning relevant items higher in the returned top-k list. A good holistic metric.
```
from ir_eval_numba.metrics import ndcg
```
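A minimal sketch of this computation, assuming binary relevance (an item is either relevant or not) and the usual `1 / log2(rank + 1)` discount. `ndcg_at_k` is a hypothetical name, and the library may use a different gain or discount.

```python
import math


def ndcg_at_k(actual, predicted, k):
    """Binary-relevance nDCG: DCG of the top-k predictions over the ideal DCG.

    Hypothetical reference sketch; assumes `predicted` has no duplicates.
    """
    relevant = set(actual)
    # Each relevant item at a given rank contributes 1 / log2(rank + 1).
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(predicted[:k], start=1)
        if item in relevant
    )
    # Ideal ranking: all relevant items first, capped at k positions.
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


actual = [1, 2, 3]
predicted = [1, 5, 2]
# DCG = 1 + 0 + 0.5, ideal DCG = 1 + 1/log2(3) + 0.5
print(round(ndcg_at_k(actual, predicted, 3), 4))  # 0.7039
```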
### Reciprocal Rank (RR)
Reciprocal Rank (RR) assigns a score based on the reciprocal of the rank at which the first relevant item is found.
Usage scenario: Useful when the topmost recommendation holds significant value. Use this when users are presented with one or very few returned results.
```
from ir_eval_numba.metrics import reciprocal_rank
```
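The scoring rule can be sketched in a few lines of plain Python. `reciprocal_rank_at_k` is a hypothetical name; limiting the search to the top `k` predictions and returning `0.0` when no relevant item is found are assumptions.

```python
def reciprocal_rank_at_k(actual, predicted, k):
    """1 / rank of the first relevant item in the top-k predictions, else 0.0.

    Hypothetical reference sketch, independent of the library.
    """
    relevant = set(actual)
    for rank, item in enumerate(predicted[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0  # assumption: no relevant item in the top k


print(reciprocal_rank_at_k([3], [7, 3, 1], 3))  # 0.5 (first relevant item at rank 2)
```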
### Mean Reciprocal Rank (MRR)
MRR calculates the mean of the Reciprocal Rank (RR) scores for a set of queries.
Usage scenario: Reflects overall performance of RR for multiple queries.
```
from ir_eval_numba.metrics import mean_reciprocal_rank
```
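As a hedged illustration of the averaging over queries, the sketch below takes one list of relevant items and one ranked list per query. Both function names are hypothetical, and the input layout the library expects for multiple queries is not shown here.

```python
def reciprocal_rank_at_k(actual, predicted, k):
    """Per-query RR: 1 / rank of the first relevant item in the top k, else 0.0."""
    relevant = set(actual)
    for rank, item in enumerate(predicted[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank_at_k(actual_lists, predicted_lists, k):
    """Mean of the per-query RR scores (hypothetical reference sketch)."""
    scores = [
        reciprocal_rank_at_k(actual, predicted, k)
        for actual, predicted in zip(actual_lists, predicted_lists)
    ]
    return sum(scores) / len(scores) if scores else 0.0


actual_lists = [[3], [1], [9]]
predicted_lists = [[7, 3, 1], [1, 2], [1, 2]]
# Per-query RR: 0.5, 1.0, 0.0
print(mean_reciprocal_rank_at_k(actual_lists, predicted_lists, 3))  # 0.5
```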
## Online Resources
[Pinecone - Evaluation Measures in Information Retrieval](https://www.pinecone.io/learn/offline-evaluation/)
[Spot Intelligence - Mean Average Precision](https://spotintelligence.com/2023/09/07/mean-average-precision/)
[Spot Intelligence - Mean Reciprocal Rank](https://spotintelligence.com/2024/08/02/mean-reciprocal-rank-mrr/)
[google-research/ials](https://github.com/google-research/google-research/blob/943fffe2522da9e58667fb129eda84bd6c088035/ials/ncf_benchmarks/ials.py#L83)