# AREkit 0.25.1
![](https://img.shields.io/badge/Python-3.9+-brightgreen.svg)
<p align="center">
<img src="logo.png"/>
</p>
**AREkit** (Attitude and Relation Extraction Toolkit) is a Python toolkit for document-level attitude and relation extraction between text objects in mass-media news.
## Description
This toolkit aims at memory-efficient data processing in tasks related to [Relation Extraction (RE)](https://nlpprogress.com/english/relationship_extraction.html).
<p align="center">
<img src="docs/arekit-pipeline-concept.png"/>
</p>
> Figure: AREkit pipeline design. For more details, see the paper
> **[ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction](https://link.springer.com/chapter/10.1007/978-3-031-56069-9_23)**.
In particular, the framework provides the following features:
* ➿ [pipelines](https://github.com/nicolay-r/AREkit/wiki/Pipelines:-Text-Opinion-Annotation) and iterators for serializing large-scale collections without out-of-memory issues,
* 🔗 EL (entity linking) API support for objects,
* ➰ avoidance of cyclic connections,
* 📏 distance constraints between relation participants (in `terms` or `sentences`),
* 📑 relation annotation and filtering rules,
* *️⃣ entity formatting or masking, and more.
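The features above can be illustrated with a minimal, library-independent sketch. It does not use the AREkit API: `Entity`, `iter_pairs`, `mask`, and `iter_contexts` are hypothetical names introduced only for this example, and "cyclic connections" is interpreted here as pairs whose participants belong to the same entity-linking group. Generators keep memory usage flat because each context is produced on demand instead of being collected into a list.

```python
from itertools import permutations
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class Entity(NamedTuple):
    """Hypothetical entity record; not an AREkit class."""
    value: str     # surface form in the text
    position: int  # index of the entity in the list of terms
    group: int     # entity-linking group id (synonyms share one id)


def iter_pairs(entities: List[Entity], max_terms: int) -> Iterator[Tuple[Entity, Entity]]:
    """Lazily yield ordered (source, target) candidate pairs."""
    for source, target in permutations(entities, 2):
        if source.group == target.group:
            continue  # same linked object: skip the cyclic connection
        if abs(source.position - target.position) > max_terms:
            continue  # participants are too far apart (distance in terms)
        yield source, target


def mask(terms: List[str], source: Entity, target: Entity) -> List[str]:
    """Return a copy of the terms with both participants masked."""
    masked = list(terms)
    masked[source.position] = "[SUBJECT]"
    masked[target.position] = "[OBJECT]"
    return masked


def iter_contexts(docs: Iterable[Tuple[List[str], List[Entity]]],
                  max_terms: int) -> Iterator[List[str]]:
    """Stream masked contexts one at a time over any iterable of documents."""
    for terms, entities in docs:
        for source, target in iter_pairs(entities, max_terms):
            yield mask(terms, source, target)


# Usage: "Moscow" and "Russia" are assumed to share one entity-linking group.
terms = ["Moscow", "criticized", "Washington", "while", "Russia", "waited"]
entities = [Entity("Moscow", 0, 0), Entity("Washington", 2, 1), Entity("Russia", 4, 0)]
for context in iter_contexts([(terms, entities)], max_terms=2):
    print(" ".join(context))
```

Because `docs` can be any iterable, the same code works when documents are read lazily from disk, which is the property that avoids out-of-memory issues on large collections.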
The core functionality includes:
* an API for document presentation with EL (Entity Linking, i.e. Object Synonymy) support, used to prepare sentence-level relations (dubbed contexts);
* an API for context extraction;
* transfer of relations from the sentence level to the document level, and more.
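As a rough illustration of transferring relations from the sentence level to the document level, the sketch below aggregates sentence-level labels per pair of entity-linking groups by majority vote. Majority voting is an assumption made for this example, not necessarily the strategy AREkit applies, and `to_document_level` is a hypothetical helper rather than part of the library.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Tuple

# A sentence-level relation: (source group, target group, label).
SentenceRelation = Tuple[Hashable, Hashable, str]


def to_document_level(relations: Iterable[SentenceRelation]) -> Dict[Tuple[Hashable, Hashable], str]:
    """Aggregate sentence-level labels into one label per ordered pair of groups."""
    votes: Dict[Tuple[Hashable, Hashable], Counter] = defaultdict(Counter)
    for source_group, target_group, label in relations:
        votes[(source_group, target_group)][label] += 1
    # Keep the most frequent label for every pair.
    return {pair: counter.most_common(1)[0][0] for pair, counter in votes.items()}


sentence_relations = [
    ("russia", "usa", "neg"),
    ("russia", "usa", "neg"),
    ("russia", "usa", "pos"),
    ("usa", "russia", "pos"),
]
print(to_document_level(sentence_relations))
```

Keying the votes by entity-linking group rather than by surface form is what lets mentions such as synonyms contribute to the same document-level relation.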
## Installation
```bash
pip install git+https://github.com/nicolay-r/AREkit.git@0.25.1-rc
```
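To confirm that the installation succeeded, one option is to query the installed distribution version from Python. This is a minimal sketch that relies only on the standard library; `installed_version` is a helper name introduced for this example, and `arekit` is the distribution name listed in the package metadata.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string when the package is installed, otherwise None.
print(installed_version("arekit"))
```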
## Usage
Please follow the **[tutorial section on the project Wiki](https://github.com/nicolay-r/AREkit/wiki/Tutorials)** for more details.
## How to cite
Great research is accompanied by faithful references.
If you use or extend our work, please cite it as follows:
```bibtex
@inproceedings{rusnachenko2024arelight,
title={ARElight: Context Sampling of Large Texts for Deep Learning Relation Extraction},
author={Rusnachenko, Nicolay and Liang, Huizhi and Kolomeets, Maxim and Shi, Lei},
booktitle={European Conference on Information Retrieval},
year={2024},
organization={Springer}
}
```
Raw data
{
"_id": null,
"home_page": "https://github.com/nicolay-r/AREkit",
"name": "arekit",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "natural language processing, relation extraction, sentiment analysis",
"author": "Nicolay Rusnachenko",
"author_email": "rusnicolay@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/5d/82/265c0545511705d686216d6c3d93a8eca5a16640ebe71315719d4faf8df0/arekit-0.25.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "Document level Attitude and Relation Extraction toolkit (AREkit) for sampling and prompting mass-media news into datasets for ML-model training",
"version": "0.25.1",
"project_urls": {
"Homepage": "https://github.com/nicolay-r/AREkit"
},
"split_keywords": [
"natural language processing",
" relation extraction",
" sentiment analysis"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "591cf707518c00e173635416c13e3ffb22a4b10acbf140345458c6c783636392",
"md5": "0530bedd07305baaa703a33348a01e45",
"sha256": "f6b336c1c00392f040a3dfbf98b32815c824ab3e6780df816fac7c2da5c168a8"
},
"downloads": -1,
"filename": "arekit-0.25.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0530bedd07305baaa703a33348a01e45",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 136186,
"upload_time": "2024-12-07T11:51:58",
"upload_time_iso_8601": "2024-12-07T11:51:58.343180Z",
"url": "https://files.pythonhosted.org/packages/59/1c/f707518c00e173635416c13e3ffb22a4b10acbf140345458c6c783636392/arekit-0.25.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5d82265c0545511705d686216d6c3d93a8eca5a16640ebe71315719d4faf8df0",
"md5": "59cf04a163a61aa37b4af5e56453628c",
"sha256": "b3f566191ebfd26a5bb28717d965c07c9d2f699c7a3bc399fb1cb7b6c1566049"
},
"downloads": -1,
"filename": "arekit-0.25.1.tar.gz",
"has_sig": false,
"md5_digest": "59cf04a163a61aa37b4af5e56453628c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 97126,
"upload_time": "2024-12-07T11:52:01",
"upload_time_iso_8601": "2024-12-07T11:52:01.056293Z",
"url": "https://files.pythonhosted.org/packages/5d/82/265c0545511705d686216d6c3d93a8eca5a16640ebe71315719d4faf8df0/arekit-0.25.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-07 11:52:01",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "nicolay-r",
"github_project": "AREkit",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "arekit"
}