| Field | Value |
| --- | --- |
| Name | chatnoir-pyterrier |
| Version | 3.1.2 |
| home_page | None |
| Summary | Use the ChatNoir search engine in PyTerrier. |
| upload_time | 2025-01-09 16:02:05 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.8 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# 🔍 chatnoir-pyterrier
Use the ChatNoir REST-API in PyTerrier for retrieval/re-ranking against large corpora such as ClueWeb09, ClueWeb12, ClueWeb22, or MS MARCO.
Powered by the [`chatnoir-api`](https://pypi.org/project/chatnoir-api/) package.
## Installation
Install the package from PyPI:
```shell
pip install chatnoir-pyterrier
```
## Usage
You can use the `ChatNoirRetrieve` PyTerrier module in any PyTerrier pipeline, just as you would use `BatchRetrieve`.
```python
from chatnoir_pyterrier import ChatNoirRetrieve, Feature
chatnoir = ChatNoirRetrieve(index="msmarco-document-v2.1", features=Feature.SNIPPET_TEXT)
chatnoir.search("python library")
```
### Features
ChatNoir can return extra features for each result, such as the full text or, for some indices, the page rank and spam rank.
These features can be included in the response data frame for use in subsequent PyTerrier re-ranking stages like so:
```python
from chatnoir_pyterrier import ChatNoirRetrieve, Feature
chatnoir_msmarco_snippet = ChatNoirRetrieve(index="msmarco-document-v2.1", features=Feature.SNIPPET_TEXT)
chatnoir_msmarco_snippet.search("python library")
chatnoir_cw09_page_spam_rank = ChatNoirRetrieve(index="clueweb09", features=Feature.PAGE_RANK | Feature.SPAM_RANK)
chatnoir_cw09_page_spam_rank.search("python library")
```
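The `|` operator in `Feature.PAGE_RANK | Feature.SPAM_RANK` follows Python's flag-enum pattern, where several options are combined into one value. The sketch below illustrates that pattern with the standard library's `enum.Flag`. Note that `DemoFeature` and the `requested` helper are hypothetical stand-ins for illustration only and are not the package's actual `Feature` definition.

```python
from enum import Flag, auto
from typing import List


class DemoFeature(Flag):
    """Hypothetical stand-in for a flag-style feature enum (not the package's class)."""

    SNIPPET_TEXT = auto()
    PAGE_RANK = auto()
    SPAM_RANK = auto()


def requested(features: DemoFeature) -> List[str]:
    """List the names of all members contained in a combined flag value."""
    return [member.name for member in DemoFeature if member in features]


# Combine two features into a single value, as in the example above.
combined = DemoFeature.PAGE_RANK | DemoFeature.SPAM_RANK
print(requested(combined))
```

Because the combined value is still a single object, it can be passed through one `features` argument, and membership checks such as `DemoFeature.PAGE_RANK in combined` tell you which features were requested.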
### Advanced usage
Please check out our [sample notebook](examples/search.ipynb) or [open it in Google Colab](https://colab.research.google.com/github/chatnoir-eu/chatnoir-pyterrier/blob/main/examples/search.ipynb).
We also provide a hands-on guide for the Touché 2023 shared tasks [here](examples/search_touche_2023.ipynb).
<!-- ## Citation
If you use this package, please cite the [paper](https://webis.de/publications.html#bevendorff_2018)
from the [ChatNoir](https://github.com/chatnoir-eu) authors.
You can use the following BibTeX information for citation:
```bibtex
@InProceedings{bevendorff:2018,
address = {Berlin Heidelberg New York},
author = {Janek Bevendorff and Benno Stein and Matthias Hagen and Martin Potthast},
booktitle = {Advances in Information Retrieval. 40th European Conference on IR Research (ECIR 2018)},
editor = {Leif Azzopardi and Allan Hanbury and Gabriella Pasi and Benjamin Piwowarski},
month = mar,
publisher = {Springer},
series = {Lecture Notes in Computer Science},
site = {Grenoble, France},
title = {{Elastic ChatNoir: Search Engine for the ClueWeb and the Common Crawl}},
year = 2018
}
``` -->
### Experiments
With chatnoir-pyterrier, it is easy to run benchmarks on a number of shared tasks that run on larger document collections.
We demonstrate this by running ChatNoir retrieval on all supported TREC, CLEF, and NTCIR shared tasks available in ir_datasets.
First install the experiment dependencies:
```shell
pip install -e .[experiment]
```
To run the experiments, first create the runs:
```shell
ray job submit --runtime-env examples/ray-runtime-env.yml --no-wait -- python examples/experiment.py
```
This will create the runs for all shared tasks in parallel and save them to a cache.
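The general idea of creating one run per shared task in parallel and caching the results can be sketched with the standard library alone. This is not the logic of `examples/experiment.py`, which is submitted as a Ray job. The `create_run` placeholder, the task names, and the JSON cache layout are all assumptions made for illustration.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, List

Run = List[Dict[str, object]]


def create_run(task: str) -> Run:
    """Hypothetical placeholder: a real run would query a retrieval system."""
    return [{"qid": "1", "docno": f"{task}-doc", "rank": 0}]


def cached_run(task: str, cache_dir: Path) -> Run:
    """Load the run for a task from the cache, or create and store it."""
    path = cache_dir / f"{task}.json"
    if path.exists():
        return json.loads(path.read_text())
    run = create_run(task)
    path.write_text(json.dumps(run))
    return run


def run_all(tasks: List[str], cache_dir: Path) -> Dict[str, Run]:
    """Create (or load) one run per task, processing the tasks in parallel."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor() as pool:
        runs = pool.map(lambda task: cached_run(task, cache_dir), tasks)
        return dict(zip(tasks, runs))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_all(["task-a", "task-b"], Path(tmp)))
```

With a cache like this, re-running the script only recomputes runs whose cache file is missing, which is useful when individual tasks fail or new tasks are added.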
After creating the runs, the [`experiment.ipynb`](examples/experiment.ipynb) notebook can be used to analyze the results.
## Development
To build this package and contribute to its development, you need to install the `build`, `setuptools`, and `wheel` packages:
```shell
pip install build setuptools wheel
```
(On most systems, these packages are already pre-installed.)
### Development installation
Install package and test dependencies:
```shell
pip install -e .[test]
```
### Testing
Configure the API key for testing:
```shell
export CHATNOIR_API_KEY="<API_KEY>"
```
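If the variable is missing, tests that call the API are likely to fail with a less obvious error. A minimal sketch of a fail-fast check is shown below. The `require_api_key` helper is hypothetical and not part of this package. It only reads the `CHATNOIR_API_KEY` variable exported above.

```python
import os


def require_api_key(name: str = "CHATNOIR_API_KEY") -> str:
    """Return the API key from the environment or raise a clear error."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {name} is not set.")
    return key
```

Such a helper could be called once at the start of a test session so that a missing key is reported immediately.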
Verify your changes against the test suite:
```shell
ruff check . # Code format and LINT
mypy . # Static typing
bandit -c pyproject.toml -r . # Security
pytest . # Unit tests
```
Please also add tests for your newly developed code.
### Build wheels
Wheels for this package can be built with:
```shell
python -m build
```
## Support
If you hit any problems using this package, please file an [issue](https://github.com/chatnoir-eu/chatnoir-pyterrier/issues/new).
We're happy to help!
## License
This repository is released under the [MIT license](LICENSE).
Raw data
{
"_id": null,
"home_page": null,
"name": "chatnoir-pyterrier",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "Jan Heinrich Merker <heinrich.merker@uni-jena.de>",
"download_url": "https://files.pythonhosted.org/packages/70/a1/2d674aa129c3b86760464ac30b8559a9e11b01c228d3ba4db2db3c6eae0f/chatnoir_pyterrier-3.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Use the ChatNoir search engine in PyTerrier.",
"version": "3.1.2",
"project_urls": {
"Bug Tracker": "https://github.com/chatnoir-eu/chatnoir-pyterrier/issues",
"Homepage": "https://github.com/chatnoir-eu/chatnoir-pyterrier"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5c96d19d379fc126714258b994ebeda07fcb3a0b4b053a225ba0a750762ef00e",
"md5": "132fca96209952da3ce0ed0b1c26a256",
"sha256": "845a0dbefb5d1507bc560183d55952f527550cf0b38073ad0f5549f8d184a444"
},
"downloads": -1,
"filename": "chatnoir_pyterrier-3.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "132fca96209952da3ce0ed0b1c26a256",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 31963,
"upload_time": "2025-01-09T16:02:01",
"upload_time_iso_8601": "2025-01-09T16:02:01.941000Z",
"url": "https://files.pythonhosted.org/packages/5c/96/d19d379fc126714258b994ebeda07fcb3a0b4b053a225ba0a750762ef00e/chatnoir_pyterrier-3.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "70a12d674aa129c3b86760464ac30b8559a9e11b01c228d3ba4db2db3c6eae0f",
"md5": "6d4dec5bcd79b10b6300b52ad8837005",
"sha256": "a6aadd5c62ab70746954f5379043b7024263233aa654de6b42f0f8f182487228"
},
"downloads": -1,
"filename": "chatnoir_pyterrier-3.1.2.tar.gz",
"has_sig": false,
"md5_digest": "6d4dec5bcd79b10b6300b52ad8837005",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 34611,
"upload_time": "2025-01-09T16:02:05",
"upload_time_iso_8601": "2025-01-09T16:02:05.021985Z",
"url": "https://files.pythonhosted.org/packages/70/a1/2d674aa129c3b86760464ac30b8559a9e11b01c228d3ba4db2db3c6eae0f/chatnoir_pyterrier-3.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-09 16:02:05",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "chatnoir-eu",
"github_project": "chatnoir-pyterrier",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "chatnoir-pyterrier"
}