xretrieval

Name	xretrieval
Version	0.2.0
home_page	None
Summary	Retrieve and Evaluate with X(any) models
upload_time	2024-12-04 07:13:48
maintainer	None
docs_url	None
author	None
requires_python	>=3.10
license	MIT
keywords	evaluation machine-learning multi-modal retrieval

[colab_badge]: https://img.shields.io/badge/Open%20In-Colab-blue?style=for-the-badge&logo=google-colab
[kaggle_badge]: https://img.shields.io/badge/Open%20In-Kaggle-blue?style=for-the-badge&logo=kaggle

[python_badge]: https://img.shields.io/badge/Python-3.10+-brightgreen?style=for-the-badge&logo=python&logoColor=white
[pypi_badge]: https://img.shields.io/pypi/v/xretrieval.svg?style=for-the-badge&logo=pypi&logoColor=white&label=PyPI&color=blue
[downloads_badge]: https://img.shields.io/pepy/dt/xretrieval.svg?style=for-the-badge&logo=pypi&logoColor=white&label=Downloads&color=purple
[license_badge]: https://img.shields.io/badge/License-Apache%202.0-green.svg?style=for-the-badge&logo=apache&logoColor=white

[![Python][python_badge]](https://pypi.org/project/xretrieval/)
[![PyPI version][pypi_badge]](https://pypi.org/project/xretrieval/)
[![Downloads][downloads_badge]](https://pypi.org/project/xretrieval/)
![License][license_badge]

<div align="center">
    <img src="https://raw.githubusercontent.com/dnth/x.retrieval/main/assets/logo.png" alt="x.retrieval" width="600"/>
    <br />
    <br />
    <a href="https://dnth.github.io/x.retrieval" target="_blank" rel="noopener noreferrer"><strong>Explore the docs »</strong></a>
    <br />
    <a href="#-quickstart" target="_blank" rel="noopener noreferrer">Quickstart</a>
    ·
    <a href="https://github.com/dnth/x.retrieval/issues/new?assignees=&labels=Feature+Request&projects=&template=feature_request.md" target="_blank" rel="noopener noreferrer">Feature Request</a>
    ·
    <a href="https://github.com/dnth/x.retrieval/issues/new?assignees=&labels=bug&projects=&template=bug_report.md" target="_blank" rel="noopener noreferrer">Report Bug</a>
    ·
    <a href="https://github.com/dnth/x.retrieval/discussions" target="_blank" rel="noopener noreferrer">Discussions</a>
    ·
    <a href="https://dicksonneoh.com/" target="_blank" rel="noopener noreferrer">About</a>
    <br />
    <br />
</div>

Evaluate your multimodal retrieval system in 3 lines of code.


## 🌟 Key Features

- ✅ Load datasets and models with one line of code.
- ✅ Built-in support for Sentence Transformers, TIMM, BM25, and Transformers models.
- ✅ Run benchmarks and get retrieval metrics like MRR, NormalizedDCG, Precision, Recall, HitRate, and MAP.
- ✅ Visualize retrieval results to understand how your model is performing.
- ✅ Combine retrieval results from multiple models using Reciprocal Rank Fusion (RRF).

## 🚀 Quickstart

[![Open In Colab][colab_badge]](https://colab.research.google.com/github/dnth/x.retrieval/blob/main/nbs/quickstart.ipynb)
[![Open In Kaggle][kaggle_badge]](https://kaggle.com/kernels/welcome?src=https://github.com/dnth/x.retrieval/blob/main/nbs/quickstart.ipynb)

```python
import xretrieval

metrics, results_df = xretrieval.run_benchmark(
    dataset="coco-val-2017",
    model_id="transformers/Salesforce/blip2-itm-vit-g",
    mode="text-to-text",
)

```

```text

 Retrieval Metrics @ k=10 
┏━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Metric        ┃ Score  ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ MRR           │ 0.2358 │
│ NormalizedDCG │ 0.2854 │
│ Precision     │ 0.1660 │
│ Recall        │ 0.4248 │
│ HitRate       │ 0.4248 │
│ MAP           │ 0.2095 │
└───────────────┴────────┘

```
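
To make the numbers in the table easier to interpret, here is a minimal sketch of the textbook definitions of these metrics for a single query with binary relevance. The function names are illustrative and are not part of the xretrieval API, and the library's own implementation may use different conventions (for example in how it normalizes or averages over queries).

```python
import math


def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved items that are relevant."""
    return sum(1 for item in retrieved[:k] if item in relevant) / k


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(1 for item in retrieved[:k] if item in relevant) / len(relevant)


def hit_rate_at_k(retrieved, relevant, k):
    """1.0 if at least one relevant item appears in the top-k, else 0.0."""
    return 1.0 if any(item in relevant for item in retrieved[:k]) else 0.0


def reciprocal_rank(retrieved, relevant, k):
    """1 / rank of the first relevant item in the top-k (0.0 if none)."""
    for rank, item in enumerate(retrieved[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def average_precision_at_k(retrieved, relevant, k):
    """Mean of precision values taken at each rank where a relevant item occurs."""
    hits, total = 0, 0.0
    for rank, item in enumerate(retrieved[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    denom = min(len(relevant), k)
    return total / denom if denom else 0.0


def ndcg_at_k(retrieved, relevant, k):
    """DCG of the ranking divided by the DCG of an ideal ranking (binary gains)."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(retrieved[:k], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


# Toy example: two relevant items, found at ranks 2 and 4.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(precision_at_k(retrieved, relevant, 4))  # 0.5
print(reciprocal_rank(retrieved, relevant, 4))  # 0.5
```

MRR and MAP are the means of `reciprocal_rank` and `average_precision_at_k` over all queries in the benchmark.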

## 📦 Installation

From PyPI:

```bash
pip install xretrieval
```

From source:

```bash
pip install git+https://github.com/dnth/x.retrieval
```
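
After installing, a quick sanity check can confirm that the interpreter meets the Python 3.10+ requirement and that the package is visible. This is a small sketch using only the standard library; `check_environment` is a hypothetical helper, not something xretrieval provides.

```python
import sys
from importlib import metadata


def check_environment(package="xretrieval", min_python=(3, 10)):
    """Return (python_ok, installed_version); the version is None if not installed."""
    python_ok = sys.version_info >= min_python
    try:
        installed_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed_version = None
    return python_ok, installed_version


python_ok, installed_version = check_environment()
print(f"Python >= 3.10: {python_ok}, xretrieval version: {installed_version}")
```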

## 🛠️ Usage

List datasets:

```python
xretrieval.list_datasets()
```

```text
                                     Available Datasets                                      
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Dataset Name                 ┃ Description                                                ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ coco-val-2017                │ The COCO Validation Set with 5k images.                    │
│ coco-val-2017-blip2-captions │ The COCO Validation Set with 5k images and BLIP2 captions. │
│ coco-val-2017-vlrm-captions  │ The COCO Validation Set with 5k images and VLRM captions.  │
└──────────────────────────────┴────────────────────────────────────────────────────────────┘
```

List models:

```python
xretrieval.list_models()
```

```text
                         Available Models                         
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Model ID                                         ┃ Model Input ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ transformers/Salesforce/blip2-itm-vit-g          │ text-image  │
│ transformers/Salesforce/blip2-itm-vit-g-text     │ text        │
│ transformers/Salesforce/blip2-itm-vit-g-image    │ image       │
│ xhluca/bm25s                                     │ text        │
│ sentence-transformers/paraphrase-MiniLM-L3-v2    │ text        │
│ sentence-transformers/paraphrase-albert-small-v2 │ text        │
│ sentence-transformers/multi-qa-distilbert-cos-v1 │ text        │
│ sentence-transformers/all-MiniLM-L12-v2          │ text        │
│ sentence-transformers/all-distilroberta-v1       │ text        │
│ sentence-transformers/multi-qa-mpnet-base-dot-v1 │ text        │
│ sentence-transformers/all-mpnet-base-v2          │ text        │
│ sentence-transformers/multi-qa-MiniLM-L6-cos-v1  │ text        │
│ sentence-transformers/all-MiniLM-L6-v2           │ text        │
│ timm/resnet18.a1_in1k                            │ image       │
└──────────────────────────────────────────────────┴─────────────┘
```
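
The "Model Input" column can help narrow down candidates for a given task. The sketch below filters a hand-copied subset of the table above by input type. It assumes nothing about what `list_models()` returns; `MODEL_INPUTS` and `models_accepting` are illustrative names, not part of the library.

```python
# Hand-copied subset of the table above (model ID -> model input).
MODEL_INPUTS = {
    "transformers/Salesforce/blip2-itm-vit-g": "text-image",
    "transformers/Salesforce/blip2-itm-vit-g-text": "text",
    "transformers/Salesforce/blip2-itm-vit-g-image": "image",
    "xhluca/bm25s": "text",
    "sentence-transformers/all-MiniLM-L6-v2": "text",
    "timm/resnet18.a1_in1k": "image",
}


def models_accepting(model_inputs, input_type):
    """Return the sorted model IDs whose input type matches exactly."""
    return sorted(
        model_id for model_id, kind in model_inputs.items() if kind == input_type
    )


image_models = models_accepting(MODEL_INPUTS, "image")
print(image_models)
```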


Run benchmarks:

```python
results, results_df = xretrieval.run_benchmark_bm25("coco-val-2017-blip2-captions")
```

Visualize retrieval results:

```python
xretrieval.visualize_retrieval(results_df)
```

![Retrieval results visualization, example 1](assets/viz1.png)
![Retrieval results visualization, example 2](assets/viz2.png)

Run hybrid search with Reciprocal Rank Fusion (RRF):

```python
results_df = xretrieval.run_rrf([results_df, results_df], "coco-val-2017")
```

See [RRF notebook](nbs/rrf.ipynb) for more details.
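
For intuition, the general RRF technique can be sketched in a few lines of plain Python: each item's fused score is the sum of `1 / (k + rank)` over every ranking in which it appears. This is a generic illustration, not the implementation behind `run_rrf`; the function name, the smoothing constant `k=60` (a commonly used default), and the toy rankings are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings from a lexical and a dense retriever.
bm25_ranking = ["img_2", "img_1", "img_3"]
dense_ranking = ["img_1", "img_3", "img_2"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)  # ['img_1', 'img_2', 'img_3']
```

Because RRF relies only on ranks, it can combine retrievers whose raw scores are not on comparable scales.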

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "xretrieval",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "evaluation, machine-learning, multi-modal, retrieval",
    "author": null,
    "author_email": "Dickson Neoh <dickson.neoh@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/e1/f2/6dc93eb8dc91c810044381a690cd0a9e1ef537117398961fc075ef3d1d22/xretrieval-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Retrieve and Evaluate with X(any) models",
    "version": "0.2.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/dnth/x.retrieval/issues",
        "Homepage": "https://github.com/dnth/x.retrieval"
    },
    "split_keywords": [
        "evaluation",
        " machine-learning",
        " multi-modal",
        " retrieval"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dc2d03fa308d979e8ed5739bf64211dda381c4fef0c1271e29831f1971a39e15",
                "md5": "7647a084edc6010d7f06ec3f63a77aa4",
                "sha256": "1811b0ba9cae943c7c7f3df55a080dad2d63c787769c51dc8f1a75451643acd6"
            },
            "downloads": -1,
            "filename": "xretrieval-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7647a084edc6010d7f06ec3f63a77aa4",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 17660,
            "upload_time": "2024-12-04T07:13:47",
            "upload_time_iso_8601": "2024-12-04T07:13:47.139509Z",
            "url": "https://files.pythonhosted.org/packages/dc/2d/03fa308d979e8ed5739bf64211dda381c4fef0c1271e29831f1971a39e15/xretrieval-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e1f26dc93eb8dc91c810044381a690cd0a9e1ef537117398961fc075ef3d1d22",
                "md5": "04e279423c0e7ab60d3dd631595916c9",
                "sha256": "459fcb26e01791482acbe65e94d2d7a40970d52a016e2cfb1eaaeb00e1481771"
            },
            "downloads": -1,
            "filename": "xretrieval-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "04e279423c0e7ab60d3dd631595916c9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 14170,
            "upload_time": "2024-12-04T07:13:48",
            "upload_time_iso_8601": "2024-12-04T07:13:48.718701Z",
            "url": "https://files.pythonhosted.org/packages/e1/f2/6dc93eb8dc91c810044381a690cd0a9e1ef537117398961fc075ef3d1d22/xretrieval-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-04 07:13:48",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "dnth",
    "github_project": "x.retrieval",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "xretrieval"
}
