<h1 align="center">Welcome to the Corpus Annotation Graph Builder <code>(CAG)</code> </h1>
<p align="center">
<a href="https://pypi.org/project/cag/"><img src="https://badge.fury.io/py/cag.svg" alt="Badge: PyPI version" height="18"></a>
<a href="https://img.shields.io/badge/Made%20with-Python-1f425f.svg">
<img src="https://img.shields.io/badge/Made%20with-Python-1f425f.svg" alt="Badge: Made with Python"/>
</a>
<a href="https://open.vscode.dev/DLR-SC/corpus-annotation-graph-builder">
<img alt="Badge: Open in VSCode" src="https://img.shields.io/static/v1?logo=visualstudiocode&label=&message=open%20in%20visual%20studio%20code&labelColor=2c2c32&color=007acc&logoColor=007acc" target="_blank" />
</a>
<a href="https://github.com/psf/black"><img src="https://img.shields.io/badge/code%20style-black-000000.svg" alt="Badge: Black" height="18"></a>
<a href="https://zenodo.org/badge/latestdoi/572124344"><img src="https://zenodo.org/badge/572124344.svg" alt="DOI"></a>
<a href="https://github.com/DLR-SC/corpus-annotation-graph-builder/blob/master/LICENSE">
<img alt="License: MIT" src="https://img.shields.io/badge/license-MIT-yellow.svg" target="_blank" />
</a>
<a href="https://twitter.com/dlr_software">
<img alt="Twitter: DLR Software" src="https://img.shields.io/twitter/follow/dlr_software.svg?style=social" target="_blank" />
</a>
</p>
> `cag` is a Python library that offers an architectural framework for applying the build-and-annotate pattern when building graphs.
---
[Official Documentation](https://cagraph.info/).
**Corpus Annotation Graph builder (CAG)** is an *architectural framework* that employs the *build-and-annotate* pattern for creating a graph. CAG is built on top of [ArangoDB](https://www.arangodb.com) and one of its Python drivers, [PyArango](https://pyarango.readthedocs.io/en/latest/). The *build-and-annotate* pattern consists of two phases (see figure below): (1) Data is collected from different sources (e.g., publication databases, online encyclopedias, news feeds, web portals, electronic libraries, repositories, media platforms) and preprocessed to build the core nodes, which we call *Objects of Interest* (OOIs). The component responsible for this phase is the **Graph-Creator**. (2) Annotations are extracted from the OOIs, and corresponding annotation nodes are created and linked to the core nodes. The component responsible for this phase is the **Graph-Annotator**.
![cag](https://github.com/DLR-SC/corpus-annotation-graph-builder/blob/main/docs/cag.png?raw=true)
This framework aims to offer researchers a flexible but unified and reproducible way of organizing and maintaining their interlinked document collections in a Corpus Annotation Graph.
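The two phases can be illustrated with a minimal, library-independent sketch. It deliberately does not use the `cag` API or ArangoDB: the `Graph` class, the functions `create_graph` and `annotate_graph`, the node ID scheme, and the keyword-matching "annotation" are all hypothetical stand-ins chosen only to show how core OOI nodes are built first and annotation nodes are linked to them afterwards.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Graph:
    """Tiny in-memory stand-in for the graph database (hypothetical, not the cag API)."""

    nodes: Dict[str, dict] = field(default_factory=dict)
    # Each edge is a (source_id, relation, target_id) triple.
    edges: List[Tuple[str, str, str]] = field(default_factory=list)


def create_graph(documents: Dict[str, str]) -> Graph:
    """Phase 1 (Graph-Creator role): turn collected documents into core OOI nodes."""
    graph = Graph()
    for doc_id, text in documents.items():
        graph.nodes[f"ooi/{doc_id}"] = {"type": "ooi", "text": text.strip()}
    return graph


def annotate_graph(graph: Graph, keywords: List[str]) -> Graph:
    """Phase 2 (Graph-Annotator role): extract annotations and link them to OOIs."""
    # Snapshot the OOI nodes first, because annotation nodes are added while looping.
    ooi_items = [(k, v) for k, v in graph.nodes.items() if v["type"] == "ooi"]
    for node_id, node in ooi_items:
        lowered = node["text"].lower()
        for keyword in keywords:
            if keyword in lowered:
                annotation_id = f"annotation/{keyword}"
                # Reuse an existing annotation node so several OOIs can share it.
                graph.nodes.setdefault(
                    annotation_id, {"type": "annotation", "value": keyword}
                )
                graph.edges.append((node_id, "has_annotation", annotation_id))
    return graph


if __name__ == "__main__":
    docs = {"a": "Graphs link documents.", "b": "News feeds are one data source."}
    g = annotate_graph(create_graph(docs), keywords=["graph", "news"])
    print(sorted(g.nodes))
    print(g.edges)
```

Keeping the two phases as separate functions mirrors the separation between Graph-Creator and Graph-Annotator: the annotation step only reads the OOI nodes and adds new nodes and edges, so it can be rerun or extended without rebuilding the core graph.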
## Installation
### Direct install via pip
The package can be installed directly via pip:
```
pip install cag
```
This lets you use the **`cag`** module from any local Python script. The two main packages are **`cag.framework`** and **`cag.view_wrapper`**.
### Manual cloning
Clone the repository, go to the root folder and then run:
```
pip install -e .
```
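Either installation route can be checked from Python with the standard library. The following is a small sketch, assuming only that the distribution is registered under the PyPI name `cag`; the helper name `installed_version` is made up for this example and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "cag") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"cag {found}" if found else "cag is not installed in this environment")
```

Querying the package metadata rather than importing the module keeps the check free of side effects and does not require a running ArangoDB instance.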
## Citation
If you use CAG, please cite us:

    @inproceedings{el-baff-etal-2023-corpus,
        title = "Corpus Annotation Graph Builder ({CAG}): An Architectural Framework to Create and Annotate a Multi-source Graph",
        author = "El Baff, Roxanne and
          Hecking, Tobias and
          Hamm, Andreas and
          Korte, Jasper W. and
          Bartsch, Sabine",
        booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations",
        month = may,
        year = "2023",
        address = "Dubrovnik, Croatia",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/2023.eacl-demo.28",
        pages = "248--255"
    }

## Usage
* After installation, a project scaffold can be created with the command `cag start-project`
* Graph Creation [[jupyter notebook](https://github.com/DLR-SC/corpus-annotation-graph-builder/blob/main/examples/1_create_graph.ipynb)]
* Graph Annotation [[jupyter notebook](https://github.com/DLR-SC/corpus-annotation-graph-builder/blob/main/examples/2_annotate_graph.ipynb)]
## Zenodo refs
* v1.5.0 [![DOI](https://zenodo.org/badge/572124344.svg)](https://zenodo.org/badge/latestdoi/572124344)
* v1.4.0 [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7701921.svg)](https://doi.org/10.5281/zenodo.7701921)
## Raw data
{
"_id": null,
"home_page": null,
"name": "cag",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "graph,architectural framework,graph creator,graph annotator",
"author": null,
"author_email": "Roxanne El Baff <roxanne.elbaff@dlr.de>, Tobias Hecking <tobias.hecking@dlr.de>",
"download_url": "https://files.pythonhosted.org/packages/f8/a6/57ccdb2e9ea7466ceab414c27b9d5321dcb789fede9b36d7d37e10fef080/cag-1.5.17.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "This is a general framework to create arango db graphs and annotate them.",
"version": "1.5.17",
"project_urls": {
"Homepage": "https://github.com/DLR-SC/corpus-annotation-graph-builder"
},
"split_keywords": [
"graph",
"architectural framework",
"graph creator",
"graph annotator"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "b4bbc904081b8c3c3fb8bd0638efeaf3af7a56261fdfbe243c4e344236679b96",
"md5": "d8404f120435d0ae3b0076c0d6e0df77",
"sha256": "1b92d177a5dafc132e8fdaa40d130af6226b0e70cec1629d8e17fd731ca192ab"
},
"downloads": -1,
"filename": "cag-1.5.17-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d8404f120435d0ae3b0076c0d6e0df77",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 81017,
"upload_time": "2023-06-22T14:51:37",
"upload_time_iso_8601": "2023-06-22T14:51:37.315213Z",
"url": "https://files.pythonhosted.org/packages/b4/bb/c904081b8c3c3fb8bd0638efeaf3af7a56261fdfbe243c4e344236679b96/cag-1.5.17-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "f8a657ccdb2e9ea7466ceab414c27b9d5321dcb789fede9b36d7d37e10fef080",
"md5": "f7862287594a6139f91ae3c9e93b0c27",
"sha256": "cb530c4b30bd6fc69a1b3c2504a2ddd1536dc311530b07a43751909b4b275711"
},
"downloads": -1,
"filename": "cag-1.5.17.tar.gz",
"has_sig": false,
"md5_digest": "f7862287594a6139f91ae3c9e93b0c27",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 3052227,
"upload_time": "2023-06-22T14:51:44",
"upload_time_iso_8601": "2023-06-22T14:51:44.617454Z",
"url": "https://files.pythonhosted.org/packages/f8/a6/57ccdb2e9ea7466ceab414c27b9d5321dcb789fede9b36d7d37e10fef080/cag-1.5.17.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-06-22 14:51:44",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "DLR-SC",
"github_project": "corpus-annotation-graph-builder",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "cag"
}