[colab_badge]: https://img.shields.io/badge/Open%20In-Colab-blue?style=for-the-badge&logo=google-colab
[kaggle_badge]: https://img.shields.io/badge/Open%20In-Kaggle-blue?style=for-the-badge&logo=kaggle
[python_badge]: https://img.shields.io/badge/Python-3.10+-brightgreen?style=for-the-badge&logo=python&logoColor=white
[pypi_badge]: https://img.shields.io/pypi/v/xretrieval.svg?style=for-the-badge&logo=pypi&logoColor=white&label=PyPI&color=blue
[downloads_badge]: https://img.shields.io/pepy/dt/xretrieval.svg?style=for-the-badge&logo=pypi&logoColor=white&label=Downloads&color=purple
[license_badge]: https://img.shields.io/badge/License-Apache%202.0-green.svg?style=for-the-badge&logo=apache&logoColor=white
[![Python][python_badge]](https://pypi.org/project/xretrieval/)
[![PyPI version][pypi_badge]](https://pypi.org/project/xretrieval/)
[![Downloads][downloads_badge]](https://pypi.org/project/xretrieval/)
![License][license_badge]
<div align="center">
<img src="https://raw.githubusercontent.com/dnth/x.retrieval/main/assets/logo.png" alt="x.retrieval" width="600"/>
<br />
<br />
<a href="https://dnth.github.io/x.retrieval" target="_blank" rel="noopener noreferrer"><strong>Explore the docs »</strong></a>
<br />
<a href="#-quickstart" target="_blank" rel="noopener noreferrer">Quickstart</a>
·
<a href="https://github.com/dnth/x.retrieval/issues/new?assignees=&labels=Feature+Request&projects=&template=feature_request.md" target="_blank" rel="noopener noreferrer">Feature Request</a>
·
<a href="https://github.com/dnth/x.retrieval/issues/new?assignees=&labels=bug&projects=&template=bug_report.md" target="_blank" rel="noopener noreferrer">Report Bug</a>
·
<a href="https://github.com/dnth/x.retrieval/discussions" target="_blank" rel="noopener noreferrer">Discussions</a>
·
<a href="https://dicksonneoh.com/" target="_blank" rel="noopener noreferrer">About</a>
<br />
<br />
</div>
Evaluate your multimodal retrieval system in 3 lines of code.
## 🌟 Key Features
- ✅ Load datasets and models with one line of code.
- ✅ Built-in support for Sentence Transformers, TIMM, BM25, and Transformers models.
- ✅ Run benchmarks and get retrieval metrics like MRR, NormalizedDCG, Precision, Recall, HitRate, and MAP.
- ✅ Visualize retrieval results to understand how your model is performing.
- ✅ Combine retrieval results from multiple models using Reciprocal Rank Fusion (RRF).
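
For readers new to the rank-based metrics named above, the following is a minimal plain-Python sketch of MRR (per query), Precision@k, Recall@k, and HitRate@k using common textbook definitions. It is not taken from the xretrieval source, and the function names and the `img_*` identifiers are hypothetical; the library's own implementation may normalize differently.

```python
def reciprocal_rank(retrieved, relevant):
    """Return 1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(retrieved, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved items that are relevant."""
    return sum(item in relevant for item in retrieved[:k]) / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)


def hit_rate_at_k(retrieved, relevant, k):
    """1.0 if at least one relevant item appears in the top-k, else 0.0."""
    return 1.0 if any(item in relevant for item in retrieved[:k]) else 0.0


# Hypothetical ranking for a single query (IDs are made up for illustration).
retrieved = ["img_7", "img_2", "img_9", "img_4"]
relevant = {"img_2", "img_4"}

rr = reciprocal_rank(retrieved, relevant)      # first relevant item is at rank 2
p3 = precision_at_k(retrieved, relevant, k=3)  # 1 of the top 3 is relevant
r3 = recall_at_k(retrieved, relevant, k=3)     # 1 of 2 relevant items found
h3 = hit_rate_at_k(retrieved, relevant, k=3)
print(rr, p3, r3, h3)
```

A benchmark score is typically the mean of these per-query values over all queries.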
## 🚀 Quickstart
[![Open In Colab][colab_badge]](https://colab.research.google.com/github/dnth/x.retrieval/blob/main/nbs/quickstart.ipynb)
[![Open In Kaggle][kaggle_badge]](https://kaggle.com/kernels/welcome?src=https://github.com/dnth/x.retrieval/blob/main/nbs/quickstart.ipynb)
```python
import xretrieval
metrics, results_df = xretrieval.run_benchmark(
    dataset="coco-val-2017",
    model_id="transformers/Salesforce/blip2-itm-vit-g",
    mode="text-to-text",
)
```
```bash
Retrieval Metrics @ k=10
┏━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Metric ┃ Score ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ MRR │ 0.2358 │
│ NormalizedDCG │ 0.2854 │
│ Precision │ 0.1660 │
│ Recall │ 0.4248 │
│ HitRate │ 0.4248 │
│ MAP │ 0.2095 │
└───────────────┴────────┘
```
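
The table above also reports NormalizedDCG and MAP. As a rough illustration of what those two numbers measure, here is a plain-Python sketch with binary relevance. It assumes the common definitions (log2 discount for DCG, average precision normalized by the number of relevant items); the helper names are hypothetical and xretrieval may compute these differently.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(retrieved, relevant, k):
    """DCG of the ranking divided by the DCG of an ideal ranking."""
    gains = [1.0 if item in relevant else 0.0 for item in retrieved]
    ideal = [1.0] * min(len(relevant), k)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def average_precision(retrieved, relevant):
    """Mean of precision values taken at each rank where a relevant item occurs."""
    hits, total = 0, 0.0
    for rank, item in enumerate(retrieved, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical single-query example.
retrieved = ["a", "b", "c", "d"]
relevant = {"b", "d"}

ndcg = ndcg_at_k(retrieved, relevant, k=4)
ap = average_precision(retrieved, relevant)  # (1/2 + 2/4) / 2
perfect = ndcg_at_k(["b", "d", "a", "c"], relevant, k=4)
print(round(ndcg, 4), ap, perfect)
```

MAP is the mean of `average_precision` over all queries, and NormalizedDCG is likewise averaged per query.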
## 📦 Installation
From PyPI:
```bash
pip install xretrieval
```
From source:
```bash
pip install git+https://github.com/dnth/x.retrieval
```
## 🛠️ Usage
List datasets:
```python
xretrieval.list_datasets()
```
```bash
Available Datasets
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Dataset Name ┃ Description ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ coco-val-2017 │ The COCO Validation Set with 5k images. │
│ coco-val-2017-blip2-captions │ The COCO Validation Set with 5k images and BLIP2 captions. │
│ coco-val-2017-vlrm-captions │ The COCO Validation Set with 5k images and VLRM captions. │
└──────────────────────────────┴────────────────────────────────────────────────────────────┘
```
List models:
```python
xretrieval.list_models()
```
```bash
Available Models
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Model ID ┃ Model Input ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ transformers/Salesforce/blip2-itm-vit-g │ text-image │
│ transformers/Salesforce/blip2-itm-vit-g-text │ text │
│ transformers/Salesforce/blip2-itm-vit-g-image │ image │
│ xhluca/bm25s │ text │
│ sentence-transformers/paraphrase-MiniLM-L3-v2 │ text │
│ sentence-transformers/paraphrase-albert-small-v2 │ text │
│ sentence-transformers/multi-qa-distilbert-cos-v1 │ text │
│ sentence-transformers/all-MiniLM-L12-v2 │ text │
│ sentence-transformers/all-distilroberta-v1 │ text │
│ sentence-transformers/multi-qa-mpnet-base-dot-v1 │ text │
│ sentence-transformers/all-mpnet-base-v2 │ text │
│ sentence-transformers/multi-qa-MiniLM-L6-cos-v1 │ text │
│ sentence-transformers/all-MiniLM-L6-v2 │ text │
│ timm/resnet18.a1_in1k │ image │
└──────────────────────────────────────────────────┴─────────────┘
```
Run benchmarks:
```python
results, results_df = xretrieval.run_benchmark_bm25("coco-val-2017-blip2-captions")
```
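
To give a sense of what a BM25 baseline does with captions, the sketch below scores a tiny caption corpus against a text query using the Okapi BM25 formula. It is a from-scratch illustration, not the code behind `run_benchmark_bm25` or the `xhluca/bm25s` model; the function name, the whitespace tokenizer, the Lucene-style IDF variant, and the defaults `k1=1.5` and `b=0.75` are assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score every document in `corpus` against `query` with Okapi BM25."""
    if not corpus:
        return []
    docs = [doc.lower().split() for doc in corpus]  # naive whitespace tokenizer
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Made-up captions standing in for a captioned image dataset.
captions = [
    "a dog runs on the beach",
    "a cat sleeps on a sofa",
    "two dogs play with a ball",
]
scores = bm25_scores("dog on beach", captions)
best = max(range(len(captions)), key=scores.__getitem__)
print(best, scores)
```

Because the tokenizer does no stemming, "dogs" does not match "dog", which is one reason lexical baselines are often fused with embedding models.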
Visualize retrieval results:
```python
xretrieval.visualize_retrieval(results_df)
```
![Retrieval visualization example 1](assets/viz1.png)
![Retrieval visualization example 2](assets/viz2.png)
Run hybrid search with Reciprocal Rank Fusion (RRF):
```python
results_df = xretrieval.run_rrf([results_df, results_df], "coco-val-2017")
```
See the [RRF notebook](nbs/rrf.ipynb) for more details.
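
As background on the fusion step, here is a minimal plain-Python sketch of Reciprocal Rank Fusion over ranked lists of item IDs. It follows the commonly cited formulation in which each item receives `1 / (k + rank)` from every ranking it appears in; the constant `k=60`, the function name, and the `img_*` IDs are assumptions for illustration, and `xretrieval.run_rrf` operates on result DataFrames and may differ in detail.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists into one list ordered by summed RRF score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by descending score; break ties by item ID for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical top-3 results for one query from two different models.
text_ranking = ["img_3", "img_1", "img_7"]
image_ranking = ["img_1", "img_9", "img_3"]

fused = reciprocal_rank_fusion([text_ranking, image_ranking])
print(fused)
```

Items that rank well in both lists rise to the top, while items seen by only one model keep a smaller score, so no score calibration between models is needed.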
Raw data
{
"_id": null,
"home_page": null,
"name": "xretrieval",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "evaluation, machine-learning, multi-modal, retrieval",
"author": null,
"author_email": "Dickson Neoh <dickson.neoh@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/e1/f2/6dc93eb8dc91c810044381a690cd0a9e1ef537117398961fc075ef3d1d22/xretrieval-0.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Retrieve and Evaluate with X(any) models",
"version": "0.2.0",
"project_urls": {
"Bug Tracker": "https://github.com/dnth/x.retrieval/issues",
"Homepage": "https://github.com/dnth/x.retrieval"
},
"split_keywords": [
"evaluation",
" machine-learning",
" multi-modal",
" retrieval"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "dc2d03fa308d979e8ed5739bf64211dda381c4fef0c1271e29831f1971a39e15",
"md5": "7647a084edc6010d7f06ec3f63a77aa4",
"sha256": "1811b0ba9cae943c7c7f3df55a080dad2d63c787769c51dc8f1a75451643acd6"
},
"downloads": -1,
"filename": "xretrieval-0.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7647a084edc6010d7f06ec3f63a77aa4",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 17660,
"upload_time": "2024-12-04T07:13:47",
"upload_time_iso_8601": "2024-12-04T07:13:47.139509Z",
"url": "https://files.pythonhosted.org/packages/dc/2d/03fa308d979e8ed5739bf64211dda381c4fef0c1271e29831f1971a39e15/xretrieval-0.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e1f26dc93eb8dc91c810044381a690cd0a9e1ef537117398961fc075ef3d1d22",
"md5": "04e279423c0e7ab60d3dd631595916c9",
"sha256": "459fcb26e01791482acbe65e94d2d7a40970d52a016e2cfb1eaaeb00e1481771"
},
"downloads": -1,
"filename": "xretrieval-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "04e279423c0e7ab60d3dd631595916c9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 14170,
"upload_time": "2024-12-04T07:13:48",
"upload_time_iso_8601": "2024-12-04T07:13:48.718701Z",
"url": "https://files.pythonhosted.org/packages/e1/f2/6dc93eb8dc91c810044381a690cd0a9e1ef537117398961fc075ef3d1d22/xretrieval-0.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-04 07:13:48",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dnth",
"github_project": "x.retrieval",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "xretrieval"
}