# Interactive Clustering GUI
[![ci](https://github.com/cognitivefactory/interactive-clustering-gui/workflows/ci/badge.svg)](https://github.com/cognitivefactory/interactive-clustering-gui/actions?query=workflow%3Aci)
[![documentation](https://img.shields.io/badge/docs-mkdocs%20material-blue.svg?style=flat)](https://cognitivefactory.github.io/interactive-clustering-gui/)
[![pypi version](https://img.shields.io/pypi/v/cognitivefactory-interactive-clustering-gui.svg)](https://pypi.org/project/cognitivefactory-interactive-clustering-gui/)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4775270.svg)](https://doi.org/10.5281/zenodo.4775270)
A web application designed for NLP data annotation using the Interactive Clustering methodology.
## <a name="Description"></a> Quick description
_Interactive clustering_ is a method intended to assist in the design of a training data set.
This iterative process begins with an unlabeled dataset and alternates between two substeps:
1. the user defines constraints on data sampled by the computer;
2. the computer performs data partitioning using a constrained clustering algorithm.
Thus, at each step of the process:
- the user corrects the clustering of the previous steps using constraints, and
- the computer offers a corrected and more relevant data partitioning for the next step.
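The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the application's implementation: the function names (`cluster`, `sample`, `interactive_clustering`), the simulated annotator and the naive "merge items joined by a MUST_LINK" clustering are all hypothetical and only serve to show how sampling, annotation and clustering alternate.

```python
from itertools import combinations, islice
from typing import Callable, Dict, List, Set, Tuple

Pair = Tuple[str, str]


def cluster(ids: List[str], must_link: Set[Pair]) -> Dict[str, int]:
    """Toy stand-in for constrained clustering: merge items joined by MUST_LINK."""
    label = {item: n for n, item in enumerate(ids)}  # one cluster per item at first
    for a, b in sorted(must_link):
        old, new = label[b], label[a]
        for item in ids:
            if label[item] == old:
                label[item] = new
    return label


def sample(ids: List[str], annotated: Set[Pair], size: int) -> List[Pair]:
    """Toy stand-in for constraints sampling: next pairs not annotated yet."""
    candidates = (p for p in combinations(ids, 2) if p not in annotated)
    return list(islice(candidates, size))


def interactive_clustering(
    ids: List[str],
    annotate: Callable[[str, str], bool],  # True means MUST_LINK, False means CANNOT_LINK
    iterations: int,
    batch_size: int,
) -> Dict[str, int]:
    must_link: Set[Pair] = set()
    cannot_link: Set[Pair] = set()
    labels = cluster(ids, must_link)
    for _ in range(iterations):
        # Substep 1: the "user" annotates pairs sampled by the computer.
        batch = sample(ids, must_link | cannot_link, batch_size)
        if not batch:
            break
        for pair in batch:
            (must_link if annotate(*pair) else cannot_link).add(pair)
        # Substep 2: the computer partitions the data again.
        labels = cluster(ids, must_link)
    return labels


# Simulated annotator that knows the (hypothetical) ground truth.
truth = {"q1": "card", "q2": "card", "q3": "loan", "q4": "loan"}
labels = interactive_clustering(
    list(truth), lambda a, b: truth[a] == truth[b], iterations=3, batch_size=2
)
print(labels)
```

In this sketch the CANNOT_LINK constraints are only recorded so that a pair is never sampled twice; a real constrained clustering algorithm would also use them to keep items apart.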
This web application implements this annotation methodology with several features:
- _data preprocessing and vectorization_ in order to reduce noise in data;
- _constrained clustering_ in order to automatically partition the data;
- _constraints sampling_ in order to select the most relevant data to annotate;
- _binary constraints annotation_ in order to correct clustering relevance;
- _annotation review and conflicts analysis_ in order to improve constraints consistency.
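To make the idea of a constraint conflict concrete, here is a minimal sketch, assuming binary constraints are stored as plain pairs of identifiers. It is not the application's code: `find` and `find_conflicts` are hypothetical helpers. A conflict is reported when two items are transitively tied together by MUST_LINK constraints while also being separated by a CANNOT_LINK constraint.

```python
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]


def find(parent: Dict[str, str], x: str) -> str:
    """Return the representative of x's MUST_LINK component (union-find)."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps lookups short
        x = parent[x]
    return x


def find_conflicts(must_link: Iterable[Pair], cannot_link: Iterable[Pair]) -> List[Pair]:
    """List CANNOT_LINK pairs whose two items are already tied by MUST_LINK."""
    parent: Dict[str, str] = {}
    for a, b in must_link:
        parent[find(parent, a)] = find(parent, b)  # merge the two components
    return [(a, b) for a, b in cannot_link if find(parent, a) == find(parent, b)]


conflicts = find_conflicts(
    must_link=[("a", "b"), ("b", "c")],
    cannot_link=[("a", "c"), ("a", "d")],
)
print(conflicts)  # [('a', 'c')]
```

Here `a` and `c` are linked through `b`, so the CANNOT_LINK between them contradicts the other annotations and should be reviewed.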
For more details, read the [Documentation](#Documentation) and the articles in the [References](#References) section.
## <a name="Documentation"></a> Documentation
- [Main documentation](https://cognitivefactory.github.io/interactive-clustering-gui/)
## <a name="Requirements"></a> Requirements
Interactive Clustering GUI requires Python 3.8 or above.
To install with [`pip`](https://github.com/pypa/pip):
```bash
# install package
python3 -m pip install cognitivefactory-interactive-clustering-gui
# install the spaCy language model you need (any model, in version "3.1.x")
python3 -m spacy download fr_core_news_md-3.1.0 --direct
```
To install with [`pipx`](https://github.com/pypa/pipx):
```bash
# install pipx
python3 -m pip install --user pipx
# install package
pipx install --python python3 cognitivefactory-interactive-clustering-gui
# install the spaCy language model you need (any model, in version "3.1.x")
python3 -m spacy download fr_core_news_md-3.1.0 --direct
```
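After installation, a quick sanity check can confirm that the requirements are met. The snippet below is a sketch that relies only on the Python standard library; `environment_report` is a hypothetical helper, and `fr_core_news_md` is just the model used in the commands above. Run it with the interpreter of the environment in which the package was installed.

```python
import importlib.util
import sys
from typing import Dict


def environment_report(model: str = "fr_core_news_md") -> Dict[str, bool]:
    """Check the Python version and whether spaCy and a language model are importable."""
    return {
        "python_ok": sys.version_info >= (3, 8),
        "spacy_installed": importlib.util.find_spec("spacy") is not None,
        "model_installed": importlib.util.find_spec(model) is not None,
    }


print(environment_report())
```

spaCy language models are installed as regular Python packages, which is why looking up the model name as a module is enough for this check.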
## <a name="Run"></a> Run
To display the help message:
```bash
cognitivefactory-interactive-clustering-gui --help
```
To launch the web application:
```bash
cognitivefactory-interactive-clustering-gui # launch on 127.0.0.1:8080
```
Then, go to one of the following pages in your browser:
- Welcome page (web application home): [http://localhost:8080](http://localhost:8080)
- Swagger (interactive documentation): [http://localhost:8080/docs](http://localhost:8080/docs)
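If you prefer to check from a script that the application is up, the following sketch polls the two pages listed above with the standard library. `is_reachable` is a hypothetical helper, and the URLs assume the default host and port shown in the launch command.

```python
import urllib.request


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a success or redirect status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:  # connection refused, timeout, HTTP error status, ...
        return False


for page in ("http://localhost:8080", "http://localhost:8080/docs"):
    print(page, "reachable" if is_reachable(page) else "not reachable")
```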
## <a name="Development"></a> Development
To work on this project or contribute to it, please read:
- the [Copier PDM](https://pawamoy.github.io/copier-pdm/) template documentation;
- the [Contributing](https://cognitivefactory.github.io/interactive-clustering-gui/contributing/) page for environment setup and development help;
- the [Code of Conduct](https://cognitivefactory.github.io/interactive-clustering-gui/code_of_conduct/) page for contribution rules.
## <a name="References"></a> References
- **Interactive Clustering**:
  - First presentation: `Schild, E., Durantin, G., Lamirel, J.C., & Miconi, F. (2021). Conception itérative et semi-supervisée d'assistants conversationnels par regroupement interactif des questions. In EGC 2021 - 21èmes Journées Francophones Extraction et Gestion des Connaissances. Edition RNTI. ⟨hal-03133007⟩.`
  - Theoretical study: `Schild, E., Durantin, G., Lamirel, J., & Miconi, F. (2022). Iterative and Semi-Supervised Design of Chatbots Using Interactive Clustering. International Journal of Data Warehousing and Mining (IJDWM), 18(2), 1-19. http://doi.org/10.4018/IJDWM.298007. ⟨hal-03648041⟩.`
  - Methodological discussion: `Schild, E., Durantin, G., & Lamirel, J.C. (2021). Concevoir un assistant conversationnel de manière itérative et semi-supervisée avec le clustering interactif. In Atelier - Fouille de Textes - Text Mine 2021 - En conjonction avec EGC 2021. ⟨hal-03133060⟩.`
  - Implementation: `Schild, E. (2021). cognitivefactory/interactive-clustering. Zenodo. https://doi.org/10.5281/zenodo.4775251.`
- **Web application**:
  - _FastAPI_: `https://fastapi.tiangolo.com/`
## <a name="How to cite"></a> How to cite
`Schild, E. (2021). cognitivefactory/interactive-clustering-gui. Zenodo. https://doi.org/10.5281/zenodo.4775270.`
Raw data
{
"_id": null,
"home_page": "",
"name": "cognitivefactory-interactive-clustering-gui",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "",
"keywords": "natural-language-processing,constraints,annotation-tool,interactive-clustering,constraints-annotation",
"author": "",
"author_email": "Erwan Schild <erwan.schild@e-i.com>",
"download_url": "https://files.pythonhosted.org/packages/5e/0c/ad394a913bbdfb818e9130926f19c6fcc438b7ce563c1d1a6f25df05e2d1/cognitivefactory-interactive-clustering-gui-0.4.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "CECILL-C",
"summary": "A web application designed for NLP data annotation using Interactive Clustering methodology.",
"version": "0.4.1",
"split_keywords": [
"natural-language-processing",
"constraints",
"annotation-tool",
"interactive-clustering",
"constraints-annotation"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "668c5d596a825b229159f6066633bfff738823f6de5b9f4f1cc3815d46c226e0",
"md5": "0fdbb7cc54cd12e5fdd18747dff172cc",
"sha256": "22b061a7b7e23d10c3139655f43916da6c42869bc5fe379152eb0ce50ce68781"
},
"downloads": -1,
"filename": "cognitivefactory_interactive_clustering_gui-0.4.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0fdbb7cc54cd12e5fdd18747dff172cc",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 527973,
"upload_time": "2023-04-27T10:44:01",
"upload_time_iso_8601": "2023-04-27T10:44:01.917847Z",
"url": "https://files.pythonhosted.org/packages/66/8c/5d596a825b229159f6066633bfff738823f6de5b9f4f1cc3815d46c226e0/cognitivefactory_interactive_clustering_gui-0.4.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5e0cad394a913bbdfb818e9130926f19c6fcc438b7ce563c1d1a6f25df05e2d1",
"md5": "36afaface0d43412fa612a53e858fe07",
"sha256": "91e8c1d6faf6b25fb465334cc75a5b23ebba7a3d739c509a64afb876eca929f2"
},
"downloads": -1,
"filename": "cognitivefactory-interactive-clustering-gui-0.4.1.tar.gz",
"has_sig": false,
"md5_digest": "36afaface0d43412fa612a53e858fe07",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 830364,
"upload_time": "2023-04-27T10:44:07",
"upload_time_iso_8601": "2023-04-27T10:44:07.825415Z",
"url": "https://files.pythonhosted.org/packages/5e/0c/ad394a913bbdfb818e9130926f19c6fcc438b7ce563c1d1a6f25df05e2d1/cognitivefactory-interactive-clustering-gui-0.4.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-04-27 10:44:07",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "cognitivefactory-interactive-clustering-gui"
}