# autoqrels
`autoqrels` is a tool for automatically inferring query relevance assessments (qrels).
Currently, it supports the one-shot labeling approach (1SL) presented in *[MacAvaney and
Soldaini, One-Shot Labeling for Automatic Relevance Estimation, SIGIR 2023](https://arxiv.org/pdf/2302.11266.pdf)*.
This package adheres to the [`ir-measures`](https://ir-measur.es/) API, which means it can
be directly used by various tools, such as [PyTerrier](https://pyterrier.readthedocs.io/).
## Getting started
You can install `autoqrels` using pip:
```bash
pip install autoqrels
```
You can also work with the repository locally:
```bash
git clone https://github.com/seanmacavaney/autoqrels.git
cd autoqrels
pip install -e .
```
## API
The primary interface in `autoqrels` is `autoqrels.Labeler`. A `Labeler` exposes a
method, `infer_qrels(run, qrels)`, which returns a new set of qrels that covers the
provided run:
- `run` is a Pandas DataFrame with the columns `query_id` (str), `doc_id` (str), and `score` (float)
- `qrels` is a Pandas DataFrame with the columns `query_id` (str), `doc_id` (str), and `relevance` (int)
- The return value is a Pandas DataFrame with the columns `query_id` (str), `doc_id` (str), and `relevance` (float)
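As a minimal sketch of the input shapes described above, the following builds a toy `run` and `qrels` with pandas. The IDs, scores, and the helper `missing_columns` are hypothetical and exist only for illustration; they are not part of `autoqrels`. The actual `infer_qrels` call is left as a comment because it needs a concrete labeler.

```python
import pandas as pd

# A toy run: the documents a system returned for each query, with scores.
run = pd.DataFrame({
    "query_id": ["q1", "q1", "q1"],
    "doc_id": ["d1", "d2", "d3"],
    "score": [12.5, 11.0, 9.75],
})

# Toy qrels: the relevance judgments that are already known.
qrels = pd.DataFrame({
    "query_id": ["q1"],
    "doc_id": ["d2"],
    "relevance": [1],
})


def missing_columns(frame, required):
    """Hypothetical helper: list the required column names absent from frame."""
    return [name for name in required if name not in frame.columns]


run_problems = missing_columns(run, ["query_id", "doc_id", "score"])
qrels_problems = missing_columns(qrels, ["query_id", "doc_id", "relevance"])

# With a concrete labeler, the call would then be:
# inferred = labeler.infer_qrels(run, qrels)
```

Checking the column names up front is optional, but it can make a malformed input easier to diagnose before any inference runs.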
`Labeler`s also expose several measure definitions compatible with `ir_measures`:
[`labeler.SDCG@k`](https://ir-measur.es/en/latest/measures.html#sdcg),
[`labeler.RBP(p=persistence)`](https://ir-measur.es/en/latest/measures.html#rbp),
[`labeler.P@k`](https://ir-measur.es/en/latest/measures.html#p).
These measures can be used to calculate the corresponding effectiveness, using the
provided qrels supplemented by the labeler's inferred qrels. See the [ir-measures documentation](https://ir-measur.es/)
for more details.
We'll now explore the available `Labeler` implementations.
### `autoqrels.oneshot`: 1SL (One-shot Labeling)
**Reproduction: See repro instructions in [`repro/oneshot`](repro/oneshot).**
One-shot labelers work over a single known relevant document per query. An error
is raised if multiple relevant documents are provided.
Example:
```python
import autoqrels
import ir_datasets
dataset = ir_datasets.load('msmarco-passage/trec-dl-2019')
duot5 = autoqrels.oneshot.DuoT5(dataset=dataset, cache_path='data/duot5.cache.json.gz')
# measures (compatible with ir_measures):
duot5.SDCG@10
duot5.P@10
duot5.RBP
```
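Because one-shot labelers raise an error when a query has multiple relevant documents, qrels with several positive judgments per query may need to be reduced first. The sketch below shows one possible way to do that with pandas. The helper `pick_one_relevant` and its `min_rel` threshold are hypothetical, not part of `autoqrels`, and it assumes that "relevant" means a `relevance` value of at least `min_rel`.

```python
import pandas as pd


def pick_one_relevant(qrels, min_rel=1):
    """Keep one relevant document per query (hypothetical helper).

    Assumes a document counts as relevant when relevance >= min_rel.
    The highest-graded document is kept; ties go to the earliest row.
    """
    relevant = qrels[qrels["relevance"] >= min_rel].reset_index(drop=True)
    # idxmax returns the label of the first maximum within each query group.
    best_rows = relevant.groupby("query_id")["relevance"].idxmax()
    return relevant.loc[best_rows].reset_index(drop=True)


# Toy graded qrels with more than one relevant document per query.
graded = pd.DataFrame({
    "query_id": ["q1", "q1", "q1", "q2", "q2"],
    "doc_id": ["d1", "d2", "d3", "d4", "d5"],
    "relevance": [0, 2, 3, 1, 1],
})

one_shot_qrels = pick_one_relevant(graded)
```

Keeping the highest-graded document is only one choice. Which document to keep per query is a design decision that depends on the experiment.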
## Citation
If you use this work, please cite:
```bibtex
@inproceedings{autoqrels,
author = {MacAvaney, Sean and Soldaini, Luca},
title = {One-Shot Labeling for Automatic Relevance Estimation},
booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval},
year = {2023},
url = {https://arxiv.org/abs/2302.11266}
}
```