cognitivefactory-interactive-clustering-gui

- Name: cognitivefactory-interactive-clustering-gui
- Version: 0.4.1
- Summary: A web application designed for NLP data annotation using Interactive Clustering methodology.
- Upload time: 2023-04-27 10:44:07
- Requires Python: >=3.8
- License: CECILL-C
- Keywords: natural-language-processing, constraints, annotation-tool, interactive-clustering, constraints-annotation

# Interactive Clustering GUI

[![ci](https://github.com/cognitivefactory/interactive-clustering-gui/workflows/ci/badge.svg)](https://github.com/cognitivefactory/interactive-clustering-gui/actions?query=workflow%3Aci)
[![documentation](https://img.shields.io/badge/docs-mkdocs%20material-blue.svg?style=flat)](https://cognitivefactory.github.io/interactive-clustering-gui/)
[![pypi version](https://img.shields.io/pypi/v/cognitivefactory-interactive-clustering-gui.svg)](https://pypi.org/project/cognitivefactory-interactive-clustering-gui/)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4775270.svg)](https://doi.org/10.5281/zenodo.4775270)

A web application designed for NLP data annotation using Interactive Clustering methodology.


## <a name="Description"></a> Quick description

_Interactive clustering_ is a method that assists in the design of a training dataset.

This iterative process begins with an unlabeled dataset and alternates between two substeps:

1. the user defines constraints on data sampled by the computer;
2. the computer performs data partitioning using a constrained clustering algorithm.

Thus, at each iteration of the process:

- the user corrects the clustering of the previous iterations using constraints, and
- the computer offers a corrected and more relevant data partitioning for the next iteration.
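
The loop described above can be illustrated with a small, self-contained sketch. It is not the code of this package: the function names (`cluster_with_must_links`, `interactive_clustering`) are hypothetical, the "user" is simulated from known labels, and the constrained clustering is replaced by a deliberately simple stand-in that groups items connected by `MUST_LINK` constraints.

```python
import itertools
import random
from typing import List, Sequence, Set, Tuple

Pair = Tuple[int, int]


def cluster_with_must_links(n_items: int, must_link: Set[Pair]) -> List[int]:
    """Toy stand-in for constrained clustering (union-find over MUST_LINK pairs)."""
    parent = list(range(n_items))

    def find(i: int) -> int:
        # Follow parent pointers to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in must_link:
        parent[find(a)] = find(b)
    # Items with the same representative belong to the same cluster.
    return [find(i) for i in range(n_items)]


def interactive_clustering(
    true_labels: Sequence[str], pairs_per_iteration: int = 5, seed: int = 0
) -> List[int]:
    """Alternate constraint annotation and re-clustering until no pair is left."""
    rng = random.Random(seed)
    n_items = len(true_labels)
    remaining = list(itertools.combinations(range(n_items), 2))
    rng.shuffle(remaining)  # assumption: random sampling of pairs to annotate
    must_link: Set[Pair] = set()
    cannot_link: Set[Pair] = set()  # recorded only; unlinked items stay apart here
    clusters = list(range(n_items))  # initially, every item is alone
    while remaining:
        # Substep 1: the computer samples pairs and the simulated user annotates them.
        batch = remaining[:pairs_per_iteration]
        remaining = remaining[pairs_per_iteration:]
        for a, b in batch:
            if true_labels[a] == true_labels[b]:
                must_link.add((a, b))
            else:
                cannot_link.add((a, b))
        # Substep 2: the computer partitions the data again under the constraints.
        clusters = cluster_with_must_links(n_items, must_link)
    return clusters


if __name__ == "__main__":
    labels = ["card", "card", "loan", "loan", "card", "loan"]
    print(interactive_clustering(labels))
```

In a real setting the annotation comes from a person and the clustering step also uses text similarity, so useful partitions appear long before every pair has been annotated.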

This web application implements this annotation methodology with several features:

- _data preprocessing and vectorization_ in order to reduce noise in data;
- _constrained clustering_ in order to automatically partition the data;
- _constraints sampling_ in order to select the most relevant data to annotate;
- _binary constraints annotation_ in order to correct clustering relevance;
- _annotation review and conflicts analysis_ in order to improve constraints consistency.
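
To make the idea of a constraint conflict concrete, here is a minimal sketch, assuming constraints are stored as pairs of item indices. The function `find_conflicts` is hypothetical and is not the conflict analysis implemented by this application. It flags every `CANNOT_LINK` pair whose two items are already tied together by a chain of `MUST_LINK` constraints.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[int, int]


def find_conflicts(
    n_items: int, must_link: Iterable[Pair], cannot_link: Iterable[Pair]
) -> List[Pair]:
    """Return the CANNOT_LINK pairs contradicted by transitive MUST_LINK chains."""
    parent = list(range(n_items))

    def find(i: int) -> int:
        # Representative of the MUST_LINK component containing item i.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the components of every MUST_LINK pair.
    for a, b in must_link:
        parent[find(a)] = find(b)

    # A CANNOT_LINK inside a single component is inconsistent.
    return [(a, b) for a, b in cannot_link if find(a) == find(b)]


if __name__ == "__main__":
    # Items 0-1 and 1-2 must be linked, so 0 and 2 cannot be separated.
    print(find_conflicts(4, [(0, 1), (1, 2)], [(0, 2), (2, 3)]))
```

Each reported pair points to a group of annotations that deserves a second look, since at least one of them must be wrong.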

For more details, read the [Documentation](#Documentation) and the articles in the [References](#References) section.


## <a name="Documentation"></a> Documentation

- [Main documentation](https://cognitivefactory.github.io/interactive-clustering-gui/)


## <a name="Requirements"></a> Requirements

Interactive Clustering GUI requires Python 3.8 or above.

To install with [`pip`](https://github.com/pypa/pip):

```bash
# install package
python3 -m pip install cognitivefactory-interactive-clustering-gui

# install the spaCy language model you need (any model, in version "3.1.x")
python3 -m spacy download fr_core_news_md-3.1.0 --direct
```
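
After installation, a quick sanity check can confirm that the environment matches the requirements above. The sketch below uses only the standard library; the helper name `check_environment` is hypothetical, and it assumes the script runs with the same interpreter that was used for `pip install` and that the spaCy model was installed as the importable package `fr_core_news_md`.

```python
import sys
from importlib import metadata, util
from typing import Dict

DISTRIBUTION = "cognitivefactory-interactive-clustering-gui"


def check_environment(model_package: str = "fr_core_news_md") -> Dict[str, bool]:
    """Report whether Python, the application and the spaCy model look usable."""
    try:
        metadata.version(DISTRIBUTION)  # raises if the distribution is absent
        app_installed = True
    except metadata.PackageNotFoundError:
        app_installed = False
    return {
        # The package requires Python 3.8 or above.
        "python_ok": sys.version_info >= (3, 8),
        "app_installed": app_installed,
        # spaCy models are regular Python packages, so find_spec can locate them.
        "model_installed": util.find_spec(model_package) is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```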

To install with [`pipx`](https://github.com/pypa/pipx):

```bash
# install pipx
python3 -m pip install --user pipx

# install package
pipx install --python python3 cognitivefactory-interactive-clustering-gui

# install the spaCy language model you need (any model, in version "3.1.x")
python3 -m spacy download fr_core_news_md-3.1.0 --direct
```


## <a name="Run"></a> Run

To display the help message:

```bash
cognitivefactory-interactive-clustering-gui --help
```

To launch the web application:

```bash
cognitivefactory-interactive-clustering-gui  # launch on 127.0.0.1:8080
```

Then, go to one of the following pages in your browser:

- Welcome page (web application home): [http://localhost:8080](http://localhost:8080)
- Swagger (interactive documentation): [http://localhost:8080/docs](http://localhost:8080/docs)
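
To check from a script that the application is up, the pages listed above can be probed with the standard library. This is a small sketch and not part of the package: `is_app_reachable` is a hypothetical helper, and it assumes the default address `localhost:8080` shown above.

```python
import urllib.request


def is_app_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET on `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers refused connections, timeouts and HTTP error statuses
        # (urllib.error.URLError and HTTPError both derive from OSError).
        return False


if __name__ == "__main__":
    for page in ("http://localhost:8080", "http://localhost:8080/docs"):
        print(page, "reachable" if is_app_reachable(page) else "not reachable")
```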


## <a name="Development"></a> Development


To work on this project or contribute to it, please read:

- the [Copier PDM](https://pawamoy.github.io/copier-pdm/) template documentation;
- the [Contributing](https://cognitivefactory.github.io/interactive-clustering-gui/contributing/) page for environment setup and development help;
- the [Code of Conduct](https://cognitivefactory.github.io/interactive-clustering-gui/code_of_conduct/) page for contribution rules.


## <a name="References"></a> References

- **Interactive Clustering**:
    - First presentation: `Schild, E., Durantin, G., Lamirel, J.C., & Miconi, F. (2021). Conception itérative et semi-supervisée d'assistants conversationnels par regroupement interactif des questions. In EGC 2021 - 21èmes Journées Francophones Extraction et Gestion des Connaissances. Edition RNTI. ⟨hal-03133007⟩.`
    - Theoretical study: `Schild, E., Durantin, G., Lamirel, J., & Miconi, F. (2022). Iterative and Semi-Supervised Design of Chatbots Using Interactive Clustering. International Journal of Data Warehousing and Mining (IJDWM), 18(2), 1-19. http://doi.org/10.4018/IJDWM.298007. ⟨hal-03648041⟩.`
    - Methodological discussion: `Schild, E., Durantin, G., & Lamirel, J.C. (2021). Concevoir un assistant conversationnel de manière itérative et semi-supervisée avec le clustering interactif. In Atelier - Fouille de Textes - Text Mine 2021 - En conjonction avec EGC 2021. ⟨hal-03133060⟩.`
    - Implementation: `Schild, E. (2021). cognitivefactory/interactive-clustering. Zenodo. https://doi.org/10.5281/zenodo.4775251.`

- **Web application**:
    - _FastAPI_: `https://fastapi.tiangolo.com/`


## <a name="How to cite"></a> How to cite

`Schild, E. (2021). cognitivefactory/interactive-clustering-gui. Zenodo. https://doi.org/10.5281/zenodo.4775270.`

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "cognitivefactory-interactive-clustering-gui",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "natural-language-processing,constraints,annotation-tool,interactive-clustering,constraints-annotation",
    "author": "",
    "author_email": "Erwan Schild <erwan.schild@e-i.com>",
    "download_url": "https://files.pythonhosted.org/packages/5e/0c/ad394a913bbdfb818e9130926f19c6fcc438b7ce563c1d1a6f25df05e2d1/cognitivefactory-interactive-clustering-gui-0.4.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "CECILL-C",
    "summary": "A web application designed for NLP data annotation using Interactive Clustering methodology.",
    "version": "0.4.1",
    "split_keywords": [
        "natural-language-processing",
        "constraints",
        "annotation-tool",
        "interactive-clustering",
        "constraints-annotation"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "668c5d596a825b229159f6066633bfff738823f6de5b9f4f1cc3815d46c226e0",
                "md5": "0fdbb7cc54cd12e5fdd18747dff172cc",
                "sha256": "22b061a7b7e23d10c3139655f43916da6c42869bc5fe379152eb0ce50ce68781"
            },
            "downloads": -1,
            "filename": "cognitivefactory_interactive_clustering_gui-0.4.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "0fdbb7cc54cd12e5fdd18747dff172cc",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 527973,
            "upload_time": "2023-04-27T10:44:01",
            "upload_time_iso_8601": "2023-04-27T10:44:01.917847Z",
            "url": "https://files.pythonhosted.org/packages/66/8c/5d596a825b229159f6066633bfff738823f6de5b9f4f1cc3815d46c226e0/cognitivefactory_interactive_clustering_gui-0.4.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5e0cad394a913bbdfb818e9130926f19c6fcc438b7ce563c1d1a6f25df05e2d1",
                "md5": "36afaface0d43412fa612a53e858fe07",
                "sha256": "91e8c1d6faf6b25fb465334cc75a5b23ebba7a3d739c509a64afb876eca929f2"
            },
            "downloads": -1,
            "filename": "cognitivefactory-interactive-clustering-gui-0.4.1.tar.gz",
            "has_sig": false,
            "md5_digest": "36afaface0d43412fa612a53e858fe07",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 830364,
            "upload_time": "2023-04-27T10:44:07",
            "upload_time_iso_8601": "2023-04-27T10:44:07.825415Z",
            "url": "https://files.pythonhosted.org/packages/5e/0c/ad394a913bbdfb818e9130926f19c6fcc438b7ce563c1d1a6f25df05e2d1/cognitivefactory-interactive-clustering-gui-0.4.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-27 10:44:07",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "cognitivefactory-interactive-clustering-gui"
}
        