estnltk


* Name: estnltk
* Version: 1.7.3
* Home page: https://github.com/estnltk/estnltk
* Summary: EstNLTK -- open source tools for Estonian natural language processing
* Upload time: 2024-06-10 13:38:43
* Author: University of Tartu
* Requires Python: >=3.9
* License: GPLv2
* Keywords: Estonian natural language processing, Estonian linguistic processing

EstNLTK -- Open source tools for Estonian natural language processing
=====================================================================

EstNLTK provides common natural language processing functionality such as paragraph, sentence and word tokenization,
morphological analysis, named entity recognition, etc. for the Estonian language.

The project is funded by EKT ([Eesti Keeletehnoloogia Riiklik Programm](https://www.keeletehnoloogia.ee/)).

This package contains EstNLTK's basic linguistic analysis, system and database tools:

* `Text` class with the Estonian NLP pipeline;
* tokenization tools: word, sentence and paragraph tokenization; clause segmentation;
* morphology tools: morphological analysis and disambiguation, spelling correction, morphological synthesis and syllabification, HFST-based analyser, GT and UD converters;
* information extraction tools: addresses tagger, named entity recognizer, temporal expression tagger; tools for rule-based and grammar-based fact extraction;
* experimental taggers: verb chain detector, noun phrase chunker, adjective phrase tagger;
* syntactic analysis tools: preprocessing for syntactic analysis, VislCG3-based and Maltparser-based syntactic parsers;
* Estonian Wordnet and Collocation-Net;
* web taggers -- such as BERT embeddings web tagger, neural named entity recognition web tagger, Stanza syntax web tagger and Stanza ensemble syntax web tagger;
* corpus importing tools -- tools for importing data from large Estonian corpora, such as the Reference Corpus or the National Corpus of Estonia;
* system taggers -- regex tagger, disambiguator, atomizer, merge tagger, etc.;
* utils for downloading additional resources (e.g. model files required by taggers);
* Postgres database tools.
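
As a rough illustration of how the `Text` class ties these tools together, the sketch below runs morphological analysis on a string and collects the lemmas proposed for each word. It assumes EstNLTK 1.7 is installed; the helper name `lemmas_per_word` is hypothetical, and the tutorials remain the authoritative reference for the API.

```python
def lemmas_per_word(raw_text):
    """Return, for each word, the list of lemmas proposed by morphological analysis."""
    # Imported lazily so that this sketch can be loaded without estnltk installed.
    from estnltk import Text

    text = Text(raw_text)
    # Tagging 'morph_analysis' also creates the layers it depends on
    # (tokenization layers such as words and sentences).
    text.tag_layer('morph_analysis')
    # A word may be morphologically ambiguous, so each span can carry
    # several annotations, each with its own lemma.
    return [
        [annotation['lemma'] for annotation in span.annotations]
        for span in text['morph_analysis']
    ]
```

Calling `lemmas_per_word("Tere, maailm!")` should return one inner list per token, punctuation included.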

## Version 1.7

### Installation

EstNLTK is available for macOS, 64-bit Windows, and 64-bit Linux, and for Python versions 3.9 to 3.12.
You can install the latest version from PyPI:

```
pip install estnltk==1.7.3
```
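
Before installing, it can help to confirm that the interpreter falls within the supported Python range (3.9 to 3.12). The following is a minimal sketch using only the standard library; the function name `is_supported` and the two constants are assumptions introduced for illustration, not part of EstNLTK.

```python
import sys

# Supported range as stated in this README: Python 3.9 to 3.12 inclusive.
MIN_SUPPORTED = (3, 9)
MAX_SUPPORTED = (3, 12)


def is_supported(version=None):
    """Return True if the (major, minor) version lies within the supported range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


print("Running Python %d.%d, supported: %s"
      % (sys.version_info[0], sys.version_info[1], is_supported()))
```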

Alternatively, you can install EstNLTK via [Anaconda](https://www.anaconda.com/download). Installation steps with conda:

1. [Create a conda environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) with Python 3.10, for instance:
```
conda create -n py310 python=3.10
```

2. [Activate the environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#activating-an-environment), for instance:
```
conda activate py310
```

3. Install EstNLTK with the command:
```
conda install -c estnltk -c conda-forge estnltk=1.7.3
```

_Note_: some of the tools in EstNLTK also require Java to be installed on your system. We recommend Oracle Java (http://www.oracle.com/technetwork/java/javase/downloads/index.html), although alternatives such as OpenJDK (http://openjdk.java.net/) should also work.
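
A quick way to check whether a Java runtime is reachable before running the Java-dependent tools is to look for the `java` executable on `PATH`. This is a minimal sketch using only the Python standard library; the helper name `find_java` is an assumption for illustration, and EstNLTK itself may locate Java differently.

```python
import shutil


def find_java():
    """Return the full path of the `java` executable found on PATH, or None."""
    return shutil.which("java")


java_path = find_java()
if java_path is None:
    print("Java was not found on PATH; Java-based tools will likely fail.")
else:
    print("Java found at:", java_path)
```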

### Using on Google Colab

You can install EstNLTK in a [Google Colab](https://colab.research.google.com) environment with the command:

```
!pip install estnltk==1.7.3
```
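
After installation (in Colab or elsewhere), the installed version can be confirmed without importing the package itself. The sketch below relies on the standard-library `importlib.metadata`; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package="estnltk"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print("estnltk version:", version if version is not None else "not installed")
```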

### Documentation

EstNLTK's tutorials come in the form of [Jupyter notebooks](http://jupyter.org).

  * [Starting point of tutorials](https://github.com/estnltk/estnltk/tree/main/tutorials)
  
Additional educational materials on EstNLTK are available on the web pages of an NLP course taught at the University of Tartu:

  * [https://github.com/d009/EstNLP](https://github.com/d009/EstNLP) (in Estonian)  

Note: if you have trouble viewing Jupyter notebooks on GitHub (you get the error message _Sorry, something went wrong. Reload?_ when loading a notebook), try opening the notebooks with [https://nbviewer.jupyter.org](https://nbviewer.jupyter.org) instead.
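
nbviewer accepts GitHub paths directly, so a GitHub notebook URL can be turned into an nbviewer URL by prefixing its path. The sketch below shows one way to do that; the helper name `to_nbviewer` and the notebook path `tutorials/example.ipynb` are hypothetical and used only for illustration.

```python
from urllib.parse import urlsplit

NBVIEWER_PREFIX = "https://nbviewer.jupyter.org/github"


def to_nbviewer(github_url):
    """Map a github.com URL (blob or tree) to the matching nbviewer URL."""
    parts = urlsplit(github_url)
    if parts.netloc != "github.com":
        raise ValueError("expected a github.com URL, got: " + github_url)
    # nbviewer mirrors the GitHub path: /<user>/<repo>/blob/<branch>/<file>
    return NBVIEWER_PREFIX + parts.path


# Hypothetical notebook path, for illustration only.
print(to_nbviewer("https://github.com/estnltk/estnltk/blob/main/tutorials/example.ipynb"))
```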

### Source

The source of the latest release is available on the [main branch](https://github.com/estnltk/estnltk/tree/main/estnltk).

The changelog is available [here](https://github.com/estnltk/estnltk/blob/main/CHANGELOG.md).

## Citation

If you use EstNLTK in your work, please cite it as follows:

    @InProceedings{laur-EtAl:2020:LREC,
      author    = {Laur, Sven  and  Orasmaa, Siim  and  Särg, Dage  and  Tammo, Paul},
      title     = {EstNLTK 1.6: Remastered Estonian NLP Pipeline},
      booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},
      month     = {May},
      year      = {2020},
      address   = {Marseille, France},
      publisher = {European Language Resources Association},
      pages     = {7154--7162},
      url       = {https://www.aclweb.org/anthology/2020.lrec-1.884}
    }

---

License: GNU General Public License v2.0

(C) University of Tartu  

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/estnltk/estnltk",
    "name": "estnltk",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "Estonian natural language processing, Estonian linguistic processing",
    "author": "University of Tartu",
    "author_email": "Siim Orasmaa <siim.orasmaa@ut.ee>, Sven Laur <swen@math.ut.ee>, Paul Tammo <paul.tammo@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/a8/9d/e86966b9f4047861dbb80920c8fd54cd80a252fb5ffd3ca607ee2dc732d7/estnltk-1.7.3.tar.gz",
    "platform": null,
    "description": "EstNLTK -- Open source tools for Estonian natural language processing\n=====================================================================\n\nEstNLTK provides common natural language processing functionality such as paragraph, sentence and word tokenization,\nmorphological analysis, named entity recognition, etc. for the Estonian language.\n\nThe project is funded by EKT ([Eesti Keeletehnoloogia Riiklik Programm](https://www.keeletehnoloogia.ee/)).\n\nThis package contains EstNLTK's basic linguistic analysis, system and database tools:\n\n* `Text` class with the Estonian NLP pipeline;\n* tokenization tools: word, sentence and paragraph tokenization; clause segmentation; \n* morphology tools: morphological analysis and disambiguation, spelling correction, morphological synthesis and syllabification, HFST based analyser, GT and UD converters;\n* information extraction tools: addresses tagger, named entity recognizer, temporal expression tagger; tools for rule based and grammar based fact extraction;\n* experimental taggers: verb chain detector, noun phrase chunker, adjective phrase tagger;\n* syntactic analysis tools: preprocessing for syntactic analysis, VislCG3 and Maltparser based syntactic parsers;\n* Estonian Wordnet and Collocation-Net;\n* web taggers -- such as bert embeddings web tagger, neural named entity recognition web tagger, stanza syntax web tagger and stanza ensemble syntax web tagger;\n* corpus importing tools -- tools for importing data from large Estonian corpora, such as the Reference Corpus or the National Corpus of Estonia;\n* system taggers -- regex tagger, disambiguator, atomizer, merge tagger etc;\n* utils for downloading additional resources (e.g. model files required by taggers); \n* Postgres database tools;\n\n## Version 1.7\n\n### Installation\n\nEstNLTK is available for osx, windows-64, and linux-64, and for python versions 3.9 to 3.12. 
\nYou can install the latest version via PyPI:\n\n```\npip install estnltk==1.7.3\n```\n\nAlternatively, you can install EstNLTK via [Anaconda](https://www.anaconda.com/download). Installation steps with conda:\n\n1. [create a conda environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) with python 3.10, for instance:\n```\nconda create -n py310 python=3.10\n```\n\n2. [activate the environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#activating-an-environment), for instance:\n```\nconda activate py310\n```\n\n3. install EstNLTK with the command:\n```\nconda install -c estnltk -c conda-forge estnltk=1.7.3\n```\n\n_Note_: for using some of the tools in estnltk, you also need to have Java installed in your system. We recommend using Oracle Java http://www.oracle.com/technetwork/java/javase/downloads/index.html, although alternatives such as OpenJDK (http://openjdk.java.net/) should also work.\n\n### Using on Google Colab\n\nYou can install EstNLTK on [Google Colab](https://colab.research.google.com) environment via command:\n\n```\n!pip install estnltk==1.7.3\n```\n\n### Documentation\n\nEstNLTK's tutorials come in the form of [jupyter notebooks](http://jupyter.org).\n\n  * [Starting point of tutorials](https://github.com/estnltk/estnltk/tree/main/tutorials)\n  \nAdditional educational materials on EstNLTK are available on web pages of an NLP course taught at the University of Tartu:\n\n  * [https://github.com/d009/EstNLP](https://github.com/d009/EstNLP) (in Estonian)  \n\nNote: if you have trouble viewing jupyter notebooks in github (you get an error message _Sorry, something went wrong. 
Reload?_ at loading a notebook), then try to open notebooks with the help of [https://nbviewer.jupyter.org](https://nbviewer.jupyter.org)\n\n### Source\n\nThe source of the last release is available at the [main branch](https://github.com/estnltk/estnltk/tree/main/estnltk).\n\nChangelog is available [here](https://github.com/estnltk/estnltk/blob/main/CHANGELOG.md).\n\n## Citation\n\nIn case you use EstNLTK in your work, please cite us as follows:\n\n    @InProceedings{laur-EtAl:2020:LREC,\n      author    = {Laur, Sven  and  Orasmaa, Siim  and  S\u00e4rg, Dage  and  Tammo, Paul},\n      title     = {EstNLTK 1.6: Remastered Estonian NLP Pipeline},\n      booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},\n      month     = {May},\n      year      = {2020},\n      address   = {Marseille, France},\n      publisher = {European Language Resources Association},\n      pages     = {7154--7162},\n      url       = {https://www.aclweb.org/anthology/2020.lrec-1.884}\n    }\n\n---\n\nLicense: GNU General Public License v2.0\n\n(C) University of Tartu  \n",
    "bugtrack_url": null,
    "license": "GPLv2",
    "summary": "EstNLTK \u2014 open source tools for Estonian natural language processing",
    "version": "1.7.3",
    "project_urls": {
        "Homepage": "https://github.com/estnltk/estnltk",
        "changelog": "https://github.com/estnltk/estnltk/blob/main/CHANGELOG.md",
        "documentation": "https://github.com/estnltk/estnltk/tree/main/tutorials",
        "repository": "https://github.com/estnltk/estnltk/tree/main/estnltk"
    },
    "split_keywords": [
        "estonian natural language processing",
        " estonian linguistic processing"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7dfa013b2fd3ef7fb5e7e804dcdba2c5009f9fa5fa645821be78292c6c47d9e5",
                "md5": "39aa38a766c1318ac9419af6a98a20e3",
                "sha256": "d52996cc3f822ddf6dc987139dc96e8dafbfc97ed414bcfd70ffabd6e56ef0e3"
            },
            "downloads": -1,
            "filename": "estnltk-1.7.3-cp39-cp39-macosx_10_9_x86_64.whl",
            "has_sig": false,
            "md5_digest": "39aa38a766c1318ac9419af6a98a20e3",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 59784145,
            "upload_time": "2024-06-10T13:38:27",
            "upload_time_iso_8601": "2024-06-10T13:38:27.104497Z",
            "url": "https://files.pythonhosted.org/packages/7d/fa/013b2fd3ef7fb5e7e804dcdba2c5009f9fa5fa645821be78292c6c47d9e5/estnltk-1.7.3-cp39-cp39-macosx_10_9_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "098f2726a7503eab8428edffb11460cb023bb7b5d40c8106ec2ff817d3d78ba4",
                "md5": "21f2167e7402d2d77a50a9e7ec96159a",
                "sha256": "47492f97cf53f1eba5d189daa2557ae5e44173093b5beee0963779fe469f0e8f"
            },
            "downloads": -1,
            "filename": "estnltk-1.7.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "21f2167e7402d2d77a50a9e7ec96159a",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 70810705,
            "upload_time": "2024-06-10T13:38:33",
            "upload_time_iso_8601": "2024-06-10T13:38:33.635433Z",
            "url": "https://files.pythonhosted.org/packages/09/8f/2726a7503eab8428edffb11460cb023bb7b5d40c8106ec2ff817d3d78ba4/estnltk-1.7.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "54542e70f72e3878837cd472096f56e001b156cbec0b525d34dda2c0ef3db2bf",
                "md5": "c2e9f835a49a1e22d2b74051815132a9",
                "sha256": "0c54cb01caf92a1bdfe3ad5cda0e874d9fe2374ca106890eff1147142a9de73c"
            },
            "downloads": -1,
            "filename": "estnltk-1.7.3-cp39-cp39-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "c2e9f835a49a1e22d2b74051815132a9",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 59658386,
            "upload_time": "2024-06-10T13:38:38",
            "upload_time_iso_8601": "2024-06-10T13:38:38.658868Z",
            "url": "https://files.pythonhosted.org/packages/54/54/2e70f72e3878837cd472096f56e001b156cbec0b525d34dda2c0ef3db2bf/estnltk-1.7.3-cp39-cp39-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a89de86966b9f4047861dbb80920c8fd54cd80a252fb5ffd3ca607ee2dc732d7",
                "md5": "795afb06c88ee7c8f851c7fd98f67ae6",
                "sha256": "a6755d3295bb0d8183a55076d17c71ab2d04be1c17db090aac71486674b0bda5"
            },
            "downloads": -1,
            "filename": "estnltk-1.7.3.tar.gz",
            "has_sig": false,
            "md5_digest": "795afb06c88ee7c8f851c7fd98f67ae6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 58987108,
            "upload_time": "2024-06-10T13:38:43",
            "upload_time_iso_8601": "2024-06-10T13:38:43.852115Z",
            "url": "https://files.pythonhosted.org/packages/a8/9d/e86966b9f4047861dbb80920c8fd54cd80a252fb5ffd3ca607ee2dc732d7/estnltk-1.7.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-10 13:38:43",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "estnltk",
    "github_project": "estnltk",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "estnltk"
}
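
The `sha256` values under `digests` in the raw data can be used to verify a downloaded release file. The sketch below is one way to do that with the Python standard library; the helper names `sha256_of` and `matches_digest` are assumptions for illustration, and the file is assumed to have been downloaded separately.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Return True if the file's SHA-256 digest equals the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# sha256 listed above for estnltk-1.7.3.tar.gz
EXPECTED_SDIST_SHA256 = "a6755d3295bb0d8183a55076d17c71ab2d04be1c17db090aac71486674b0bda5"
```

For example, `matches_digest("estnltk-1.7.3.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an intact copy of the source distribution.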
        