# kogito
A Python NLP Commonsense Knowledge Inference Toolkit
System Description available here: https://arxiv.org/abs/2211.08451
## Installation
### Installation with pip
**kogito** can be installed using pip.
```sh
pip install kogito
```
It requires ``python`` version ``3.8`` or later; the package metadata for release ``0.6.1`` declares ``>=3.8,<3.11``.
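If you want to confirm that your interpreter is compatible before installing, a minimal sketch like the following can help. The bounds come from the ``0.6.1`` package metadata (``>=3.8,<3.11``); the helper name ``is_supported`` is hypothetical and not part of **kogito**.

```python
import sys

# Bounds taken from the 0.6.1 package metadata: requires_python ">=3.8,<3.11".
MIN_VERSION = (3, 8)
MAX_VERSION_EXCLUSIVE = (3, 11)


def is_supported(version=None):
    """Return True if the (major, minor) version falls inside the declared range.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    status = "supported" if is_supported() else "unsupported"
    print(status, sys.version.split()[0])
```

Other releases may declare a different range, so treat the constants as specific to ``0.6.1``.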
## Setup
### Inference
**kogito** uses [spacy](https://spacy.io) under the hood for text processing, so a [spacy](https://spacy.io) language package has to be installed before running the inference module.
```sh
python -m spacy download en_core_web_sm
```
By default, the ``CommonsenseInference`` module uses ``en_core_web_sm`` to initialize the ``spacy`` pipeline, but a different language pipeline can be specified as well.
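Since ``spacy`` language packages are installed as regular Python packages, you can check whether a pipeline is available before initializing the inference module. The sketch below uses only the standard library; the helper name ``pipeline_installed`` is hypothetical and not part of **kogito** or ``spacy``.

```python
import importlib.util


def pipeline_installed(name: str) -> bool:
    """Return True if a top-level package with this name can be imported.

    Hypothetical helper: it only checks that the package is present,
    not that it is compatible with the installed spacy version.
    """
    return importlib.util.find_spec(name) is not None


# Prints True once `python -m spacy download en_core_web_sm` has been run.
print(pipeline_installed("en_core_web_sm"))
```

The same check works for any other pipeline name you plan to pass to ``CommonsenseInference``.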
### Evaluation
If you would also like to evaluate knowledge models using the `METEOR` score, you need to download the following ``nltk`` resources:
```python
import nltk
nltk.download("punkt")
nltk.download("wordnet")
nltk.download("omw-1.4")
```
## Quickstart
**kogito** provides an easy interface to interact with knowledge inference or commonsense reasoning models such as [COMET](https://arxiv.org/abs/2010.05953) to generate inferences from a text input.
The following example initializes an inference module, loads a commonsense reasoning model, and generates a knowledge graph from text on the fly.
```python
from kogito.models.bart.comet import COMETBART
from kogito.inference import CommonsenseInference
# Load pre-trained model from HuggingFace
model = COMETBART.from_pretrained("mismayil/comet-bart-ai2")
# Initialize inference module with a spacy language pipeline
csi = CommonsenseInference(language="en_core_web_sm")
# Run inference
text = "PersonX becomes a great basketball player"
kgraph = csi.infer(text, model)
# Save output knowledge graph to a JSON Lines file (one JSON object per line)
kgraph.to_jsonl("kgraph.json")
```
Here is an excerpt from the result of the above code sample:
```json
{"head": "PersonX becomes a great basketball player", "relation": "Causes", "tails": [" PersonX practices every day.", " PersonX plays basketball every day", " PersonX practices every day"]}
{"head": "basketball", "relation": "ObjectUse", "tails": [" play with friends", " play basketball with", " play basketball"]}
{"head": "player", "relation": "CapableOf", "tails": [" play game", " win game", " play football"]}
{"head": "great basketball player", "relation": "HasProperty", "tails": [" good at basketball", " good at sports", " very good"]}
{"head": "become player", "relation": "isAfter", "tails": [" play game", " become coach", " play with"]}
```
This is just one way to generate commonsense inferences and **kogito** offers much more. For complete documentation, check out the [kogito docs](https://kogito.readthedocs.io).
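The file written in the example above holds one JSON object per line, each with ``head``, ``relation`` and ``tails`` keys, so it can be read back with the standard library alone. The following is a minimal sketch under that assumption; ``load_kgraph`` and ``tails_by_relation`` are hypothetical helpers, not part of the **kogito** API. Because the generated tails in the excerpt start with a space, the sketch strips them.

```python
import json
from pathlib import Path


def load_kgraph(path):
    """Read a JSON Lines file into a list of dicts, stripping whitespace from tails."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            record["tails"] = [tail.strip() for tail in record.get("tails", [])]
            records.append(record)
    return records


def tails_by_relation(records):
    """Group tails under each relation name, preserving file order."""
    grouped = {}
    for record in records:
        grouped.setdefault(record["relation"], []).extend(record["tails"])
    return grouped
```

For example, ``tails_by_relation(load_kgraph("kgraph.json"))`` returns a dictionary keyed by relation names such as ``ObjectUse`` or ``CapableOf``.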
## Development
### Setup
**kogito** uses [Poetry](https://python-poetry.org/) to manage its dependencies.
Install Poetry with the official installer first:
```sh
curl -sSL https://install.python-poetry.org | python3 -
```
Then run the following command to install package dependencies:
```sh
poetry install
```
## Data
If you need the ATOMIC or ATOMIC 2020 data to train your knowledge models, you can download it from AI2:
For ATOMIC:
```sh
wget https://storage.googleapis.com/ai2-mosaic/public/atomic/v1.0/atomic_data.tgz
```
For ATOMIC 2020:
```sh
wget https://ai2-atomic.s3-us-west-2.amazonaws.com/data/atomic2020_data-feb2021.zip
```
## Paper
If you want to learn more about the library design, models and data used for this toolkit, check out our [paper](https://arxiv.org/abs/2211.08451). The paper can be cited as:
```
@article{Ismayilzada2022kogito,
title={kogito: A Commonsense Knowledge Inference Toolkit},
author={Mete Ismayilzada and Antoine Bosselut},
journal={ArXiv},
volume={abs/2211.08451},
year={2022}
}
```
If you work with knowledge models, consider citing the following papers:
```
@inproceedings{Hwang2020COMETATOMIC,
author = {Jena D. Hwang and Chandra Bhagavatula and Ronan Le Bras and Jeff Da and Keisuke Sakaguchi and Antoine Bosselut and Yejin Choi},
booktitle = {Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI)},
title = {COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs},
year = {2021}
}
@inproceedings{Bosselut2019COMETCT,
author = {Antoine Bosselut and Hannah Rashkin and Maarten Sap and Chaitanya Malaviya and Asli Çelikyilmaz and Yejin Choi},
booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL)},
title = {COMET: Commonsense Transformers for Automatic Knowledge Graph Construction},
year = {2019}
}
```
## Acknowledgements
A significant portion of the model training and evaluation code has been adapted from the original [codebase](https://github.com/allenai/comet-atomic-2020) for the paper [(Comet-) Atomic 2020: On Symbolic and Neural Commonsense Knowledge Graphs](https://www.semanticscholar.org/paper/COMET-ATOMIC-2020%3A-On-Symbolic-and-Neural-Knowledge-Hwang-Bhagavatula/e39503e01ebb108c6773948a24ca798cd444eb62).