# Coreference Resolution with ONNX (AllenNLP-based)
Lightweight, fast coreference resolution component using a distilled version of AllenNLP's coreference model, exported to ONNX.
## ✨ Features
- 🧠 Cross-lingual coreference resolution
- 🪶 Lightweight model distilled from AllenNLP’s coreference model
- ⚡ Fast inference via ONNX
- 🔌 Easy integration with spaCy
---
## 📦 Installation
```bash
$ pip install coref-onnx
```
## 🚀 Quickstart
Usage as a standalone component
```python
from coref_onnx import CoreferenceResolver, decode_clusters

# Load the distilled ONNX model by its model ID.
resolver = CoreferenceResolver.from_pretrained(
    "talmago/allennlp-coref-onnx-mMiniLMv2-L12-H384-distilled-from-XLMR-Large"
)

# Input is a list of pre-tokenized sentences.
sentences = [
    ["Barack", "Obama", "was", "the", "44th", "President", "of", "the", "United", "States", "."],
    ["He", "was", "born", "in", "Hawaii", "."],
]

pred = resolver(sentences)

print(decode_clusters(sentences, pred["clusters"][0]))

# Output:
# [['Barack Obama', 'He']]
```
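To make the decoding step more concrete, here is a minimal sketch of how token-index spans can be turned into mention strings. It assumes that each cluster is a list of `(start, end)` pairs with inclusive indices over the flattened document, which is the convention AllenNLP's coreference model uses; the README does not document the exact shape of `pred["clusters"]`. The helper `decode_spans` is hypothetical and is not part of `coref_onnx`, and the spans below are written by hand rather than produced by the model.

```python
from typing import List, Sequence, Tuple

# (start, end) token indices, both inclusive, over the flattened document.
# This layout is an assumption borrowed from AllenNLP's convention.
Span = Tuple[int, int]


def decode_spans(
    sentences: Sequence[Sequence[str]],
    clusters: Sequence[Sequence[Span]],
) -> List[List[str]]:
    """Hypothetical helper: turn token-index spans into mention strings."""
    # Flatten the sentences so span indices address one continuous token list.
    tokens = [tok for sent in sentences for tok in sent]
    return [
        [" ".join(tokens[start:end + 1]) for start, end in cluster]
        for cluster in clusters
    ]


sentences = [
    ["Barack", "Obama", "was", "the", "44th", "President", "of", "the", "United", "States", "."],
    ["He", "was", "born", "in", "Hawaii", "."],
]

# Hand-written spans for illustration: tokens 0-1 and token 11.
clusters = [[(0, 1), (11, 11)]]

print(decode_spans(sentences, clusters))
# [['Barack Obama', 'He']]
```

The first sentence has 11 tokens, so index 11 is the first token of the second sentence ("He").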
Usage with spaCy
```python
import spacy

# Imported for its side effect: this presumably registers the "coref_minilm"
# factory with spaCy so that add_pipe can find it by name.
from coref_onnx import create_coref_minilm_component  # noqa: F401

nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("coref_minilm")

doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
print(doc._.coref_clusters[0])
print(doc._.cluster_heads)
print(doc._.resolved_text)

# Output:
# [Barack Obama, He]
# {'Barack Obama': Barack Obama}
# Barack Obama was born in Hawaii. Barack Obama was elected president in 2008.
```
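The `resolved_text` output above replaces a pronoun with the mention it refers to. The sketch below illustrates that idea on plain token lists. It is not the library's implementation: `resolve_tokens` is a hypothetical helper, the spans are hand-written inclusive `(start, end)` pairs, and treating the first mention of each cluster as its head is an assumption made only for this example.

```python
from typing import Dict, List, Sequence, Tuple

Span = Tuple[int, int]  # (start, end), both inclusive (assumed layout)


def resolve_tokens(
    tokens: Sequence[str],
    clusters: Sequence[Sequence[Span]],
) -> List[str]:
    """Hypothetical helper: replace later mentions with the first mention."""
    # Map the start index of each non-head mention to (end index, head tokens).
    replacements: Dict[int, Tuple[int, List[str]]] = {}
    for cluster in clusters:
        head_start, head_end = cluster[0]  # assumption: first mention is the head
        head = list(tokens[head_start:head_end + 1])
        for start, end in cluster[1:]:
            replacements[start] = (end, head)

    resolved: List[str] = []
    i = 0
    while i < len(tokens):
        if i in replacements:
            end, head = replacements[i]
            resolved.extend(head)  # substitute the head mention
            i = end + 1            # skip past the replaced mention
        else:
            resolved.append(tokens[i])
            i += 1
    return resolved


tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii", ".",
          "He", "was", "elected", "president", "in", "2008", "."]

# Hand-written spans for illustration: "Barack Obama" (0-1) and "He" (7).
clusters = [[(0, 1), (7, 7)]]

print(" ".join(resolve_tokens(tokens, clusters)))
# Barack Obama was born in Hawaii . Barack Obama was elected president in 2008 .
```

Joining with single spaces leaves a space before punctuation; a spaCy-based version can restore the original spacing from each token's trailing whitespace.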
## 🛠️ Development
Set up virtualenv
```sh
$ make env
```
Set PYTHONPATH (run from the root of your local checkout)
```sh
$ export PYTHONPATH="$PYTHONPATH:$(pwd)/src"
```
Code formatting
```sh
$ make format
```
## 📄 Package metadata
- Name: `coref-onnx`
- Version: 0.1.2 (uploaded 2025-08-03)
- Summary: Lightweight cross-lingual coreference resolution using ONNX Runtime and distilled transformer models
- Author: Tal Almagor
- License: MIT
- Requires Python: `>=3.11,<3.13`
- Keywords: coreference, coreference resolution, onnx, onnxruntime, spacy, nlp, natural language processing, transformers, crosslingual, multilingual, huggingface
- Distributions: `coref_onnx-0.1.2-py3-none-any.whl` (wheel), `coref_onnx-0.1.2.tar.gz` (sdist)