perke


- Name: perke
- Version: 0.4.4
- Home page: https://github.com/alirezatheh/perke
- Summary: A keyphrase extractor for Persian
- Upload time: 2023-06-25 09:52:07
- Author: Alireza Hosseini
- Requires Python: >=3.8
- Keywords: nlp, natural-language-processing, information-retrieval, computational-linguistics, persian-language, persian-nlp, persian, keyphrase-extraction, keyphrase-extractor, keyphrase, keyword-extraction, keyword-extractor, keyword, machine-learning, ml, unsupervised-learning
- Requirements: No requirements were recorded.

# Perke
[![tests](https://github.com/alirezatheh/perke/workflows/tests/badge.svg)](https://github.com/alirezatheh/perke/actions/workflows/tests.yaml)
[![pre-commit.ci](https://results.pre-commit.ci/badge/github/AlirezaTheH/perke/main.svg)](https://results.pre-commit.ci/latest/github/alirezatheh/perke/main)
[![PyPI Version](https://img.shields.io/pypi/v/perke)](https://pypi.python.org/pypi/perke)
[![Python Versions](https://img.shields.io/pypi/pyversions/perke)](https://pypi.org/project/perke)
[![Documentation Status](https://readthedocs.org/projects/perke/badge/?version=stable)](https://perke.readthedocs.io/en/stable/?badge=stable)

Perke is a Python keyphrase extraction package for the Persian language. It
provides an end-to-end keyphrase extraction pipeline in which each component
can be easily modified or extended to develop new models.

## Installation
- The easiest way to install is from PyPI:
  ```bash
  pip install perke
  ```
  Alternatively, you can install directly from GitHub:
  ```bash
  pip install git+https://github.com/alirezatheh/perke.git
  ```
- Perke also requires a trained POS tagger model. We use
  [Hazm's](https://github.com/roshan-research/hazm) POS tagger model. You can
  download the latest [Hazm](https://github.com/roshan-research/hazm) POS
  tagger with the following command:
  ```bash
  python -m perke download
  ```
  Alternatively, you can use another model with the same tag names and
  structure, and put it in the
  [`resources`](https://github.com/alirezatheh/perke/tree/main/perke/resources)
  directory.

## Simple Example
Perke provides a standardized API for extracting keyphrases from a text. The
four steps below show how to use the `TextRank` keyphrase extractor.


```python
from perke.unsupervised.graph_based import TextRank

# 1. Create a TextRank extractor.
extractor = TextRank()

# 2. Load the text.
extractor.load_text(input='text or path/to/input_file')

# 3. Build the graph representation of the text and weight the
#    words. Keyphrase candidates are composed of the 33 percent
#    highest weighted words.
extractor.weight_candidates(top_t_percent=0.33)

# 4. Get the 10 highest weighted candidates as keyphrases.
keyphrases = extractor.get_n_best(n=10)
```

For more in-depth examples, see the
[`examples`](https://github.com/alirezatheh/perke/tree/main/examples)
directory.
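
To make step 3 of the example more concrete, here is a minimal, self-contained
sketch of the general idea behind TextRank-style word weighting: link words
that occur next to each other, score the nodes with PageRank, and keep the top
fraction. It is not Perke's implementation. The helper names
(`build_cooccurrence_graph`, `rank_words`, `top_percent`), the window size, the
damping factor, and the toy token list are all assumptions made for
illustration, and the sketch skips the POS filtering that a real extractor
would apply.

```python
from collections import defaultdict


def build_cooccurrence_graph(words, window=2):
    """Link each word to the words that follow it within `window` positions."""
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def rank_words(graph, damping=0.85, iterations=50):
    """Weight nodes with an unweighted PageRank computed by power iteration."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        scores = {
            node: (1 - damping)
            + damping * sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            for node in graph
        }
    return scores


def top_percent(scores, top_t_percent=0.33):
    """Keep the highest-weighted fraction of words (at least one word)."""
    keep = max(1, int(len(scores) * top_t_percent))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep]


# Hypothetical, already-tokenized input; 'a' co-occurs with every other token.
words = ['a', 'b', 'a', 'c', 'a', 'd']
graph = build_cooccurrence_graph(words)
scores = rank_words(graph)
top_words = top_percent(scores, top_t_percent=0.33)
```

In this toy input the token `'a'` is adjacent to every other token, so it
receives the largest weight and is the only word kept at the 33 percent
threshold. In a real pipeline, adjacent top-weighted words would then be
merged into candidate phrases before the best `n` are returned.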

## Documentation
Documentation and references are available at
[Read The Docs](https://perke.readthedocs.io).

## Implemented Models
Perke currently implements the following keyphrase extraction models:

- Unsupervised models
    - Graph-based models
        - TextRank: [article](http://www.aclweb.org/anthology/W04-3252.pdf)
          by Mihalcea and Tarau, 2004
        - SingleRank: [article](https://www.aaai.org/Papers/AAAI/2008/AAAI08-136.pdf)
          by Wan and Xiao, 2008
        - TopicRank: [article](http://aclweb.org/anthology/I13-1062.pdf)
          by Bougouin, Boudin and Daille, 2013
        - PositionRank: [article](http://www.aclweb.org/anthology/P17-1102.pdf)
          by Florescu and Caragea, 2017
        - MultipartiteRank: [article](https://www.aclweb.org/anthology/N18-2105.pdf)
          by Boudin, 2018

## Acknowledgements
Perke is inspired by [pke](https://github.com/boudinfl/pke).

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/alirezatheh/perke",
    "name": "perke",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "nlp,natural-language-processing,information-retrieval,computational-linguistics,persian-language,persian-nlp,persian,keyphrase-extraction,keyphrase-extractor,keyphrase,keyword-extraction,keyword-extractor,keyword,machine-learning,ml,unsupervised-learning",
    "author": "Alireza Hosseini",
    "author_email": "alirezatheh@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/33/a3/49f2b59bed4f550b0275de5bfc3bbb6c8f143ba648fe187a881cf30bc0f9/perke-0.4.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "A keyphrase extractor for Persian",
    "version": "0.4.4",
    "project_urls": {
        "Bug Tracker": "https://github.com/alirezatheh/perke/issues",
        "Documentation": "https://perke.readthedocs.io",
        "Homepage": "https://github.com/alirezatheh/perke",
        "Source Code": "https://github.com/alirezatheh/perke"
    },
    "split_keywords": [
        "nlp",
        "natural-language-processing",
        "information-retrieval",
        "computational-linguistics",
        "persian-language",
        "persian-nlp",
        "persian",
        "keyphrase-extraction",
        "keyphrase-extractor",
        "keyphrase",
        "keyword-extraction",
        "keyword-extractor",
        "keyword",
        "machine-learning",
        "ml",
        "unsupervised-learning"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a25715d359c899837adfd6482b48dd6d7fdda46bd1e537859c5b136b5afb798d",
                "md5": "4552cd02e9c49d84a3966825d09413bb",
                "sha256": "dc8f0777079e77e0b09ed4842b2e5632316548c6a52858de9d992cd7936008a2"
            },
            "downloads": -1,
            "filename": "perke-0.4.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4552cd02e9c49d84a3966825d09413bb",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 25258,
            "upload_time": "2023-06-25T09:51:54",
            "upload_time_iso_8601": "2023-06-25T09:51:54.796793Z",
            "url": "https://files.pythonhosted.org/packages/a2/57/15d359c899837adfd6482b48dd6d7fdda46bd1e537859c5b136b5afb798d/perke-0.4.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "33a349f2b59bed4f550b0275de5bfc3bbb6c8f143ba648fe187a881cf30bc0f9",
                "md5": "ba7197beff7ae59a0253793e0b368f08",
                "sha256": "a2277223d68d51e4a70ebf1ed0d7b91f6804c05e66c623750e7cc2ecddcc8617"
            },
            "downloads": -1,
            "filename": "perke-0.4.4.tar.gz",
            "has_sig": false,
            "md5_digest": "ba7197beff7ae59a0253793e0b368f08",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 20086,
            "upload_time": "2023-06-25T09:52:07",
            "upload_time_iso_8601": "2023-06-25T09:52:07.502062Z",
            "url": "https://files.pythonhosted.org/packages/33/a3/49f2b59bed4f550b0275de5bfc3bbb6c8f143ba648fe187a881cf30bc0f9/perke-0.4.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-25 09:52:07",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "alirezatheh",
    "github_project": "perke",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "perke"
}
        