rhoknp


- Name: rhoknp
- Version: 1.7.0
- Home page: https://github.com/ku-nlp/rhoknp
- Summary: Yet another Python binding for Juman++/KNP/KWJA
- Upload time: 2024-01-16 12:33:15
- Maintainer: Hirokazu Kiyomaru
- Author: Hirokazu Kiyomaru
- Requires Python: >=3.8
- License: MIT
- Keywords: nlp, japanese, juman++, knp, kwja

            <p align="center">
<a href="https://rhoknp.readthedocs.io/en/latest/" rel="noopener" target="_blank">
<img width="150" src="https://raw.githubusercontent.com/ku-nlp/rhoknp/develop/docs/_static/logo.png" alt="rhoknp logo">
</a>
</p>

<h1 align="center">rhoknp: Yet another Python binding for Juman++/KNP/KWJA</h1>

<p align="center">
<a href="https://github.com/ku-nlp/rhoknp/actions/workflows/test.yml"><img alt="Test" src="https://img.shields.io/github/actions/workflow/status/ku-nlp/rhoknp/test.yml?branch=main&logo=github&label=test&style=flat-square"></a>
<a href="https://codecov.io/gh/ku-nlp/rhoknp"><img alt="Codecov" src="https://img.shields.io/codecov/c/github/ku-nlp/rhoknp?logo=codecov&style=flat-square"></a>
<a href="https://www.codefactor.io/repository/github/ku-nlp/rhoknp"><img alt="CodeFactor" src="https://img.shields.io/codefactor/grade/github/ku-nlp/rhoknp?style=flat-square"></a>
<a href="https://pypi.org/project/rhoknp/"><img alt="PyPI" src="https://img.shields.io/pypi/v/rhoknp?style=flat-square"></a>
<a href="https://pypi.org/project/rhoknp/"><img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/rhoknp?style=flat-square"></a>
<a href="https://rhoknp.readthedocs.io/en/latest/"><img alt="Documentation" src="https://img.shields.io/readthedocs/rhoknp?style=flat-square"></a>
<a href="https://github.com/astral-sh/ruff"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json" alt="Ruff" style="max-width:100%;"></a>
</p>

---

**Documentation**: [https://rhoknp.readthedocs.io/en/latest/](https://rhoknp.readthedocs.io/en/latest/)

**Source Code**: [https://github.com/ku-nlp/rhoknp](https://github.com/ku-nlp/rhoknp)

---

_rhoknp_ is a Python binding for [Juman++](https://github.com/ku-nlp/jumanpp), [KNP](https://github.com/ku-nlp/knp), and [KWJA](https://github.com/ku-nlp/kwja).[^1]

[^1]: The logo was generated by OpenAI DALL·E 2.

```python
import rhoknp

# Perform morphological analysis with Juman++
jumanpp = rhoknp.Jumanpp()
sentence = jumanpp.apply_to_sentence(
    "電気抵抗率は電気の通しにくさを表す物性値である。"
)

# Access the result
for morpheme in sentence.morphemes:  # a.k.a. keitai-so
    ...

# Save the result
with open("result.jumanpp", "wt") as f:
    f.write(sentence.to_jumanpp())

# Load the result
with open("result.jumanpp", "rt") as f:
    sentence = rhoknp.Sentence.from_jumanpp(f.read())
```

## Requirements

- Python 3.8+
- (Optional) [Juman++](https://github.com/ku-nlp/jumanpp) v2.0.0-rc3+
- (Optional) [KNP](https://github.com/ku-nlp/knp) 5.0+
- (Optional) [KWJA](https://github.com/ku-nlp/kwja) 1.0.0+
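
Because the three analyzers are optional, it can be useful to check which of them are installed before calling them. The following is a minimal standard-library sketch, not part of rhoknp: `find_analyzers` is a hypothetical helper, and the command names `jumanpp`, `knp`, and `kwja` are assumed to be the executable names on your `PATH`.

```python
import shutil

# Assumed executable names for Juman++, KNP, and KWJA.
ANALYZERS = ("jumanpp", "knp", "kwja")


def find_analyzers(names=ANALYZERS):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_analyzers().items():
    print(f"{name}: {path or 'not found'}")
```

An analyzer reported as `not found` only rules out running that tool locally; loading previously saved analysis results does not need the executable.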

## Installation

```shell
pip install rhoknp
```

## Quick tour

Let's begin with Juman++.
The following example runs morphological analysis on a single sentence.

```python
import rhoknp

# Perform morphological analysis with Juman++
jumanpp = rhoknp.Jumanpp()
sentence = jumanpp.apply_to_sentence("電気抵抗率は電気の通しにくさを表す物性値である。")
```

You can easily access the individual morphemes that make up the sentence.

```python
for morpheme in sentence.morphemes:  # a.k.a. keitai-so
    ...
```

Sentence objects can be saved in the JUMAN format.

```python
# Save the sentence in the JUMAN format
with open("sentence.jumanpp", "wt") as f:
    f.write(sentence.to_jumanpp())

# Load the sentence
with open("sentence.jumanpp", "rt") as f:
    sentence = rhoknp.Sentence.from_jumanpp(f.read())
```
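
To give a feel for what ends up in such a file, here is a rough standard-library sketch that reads morpheme lines by hand. It assumes the usual Juman++ layout of 12 space-separated columns per morpheme followed by an `EOS` line; the `SAMPLE` text is hand-written for illustration, and `MorphemeLine` and `parse_line` are hypothetical names, not rhoknp APIs. In practice `rhoknp.Sentence.from_jumanpp` does this work for you.

```python
from typing import List, NamedTuple


class MorphemeLine(NamedTuple):
    surface: str
    reading: str
    lemma: str
    pos: str
    subpos: str
    conjtype: str
    conjform: str
    semantics: str


def parse_line(line: str) -> MorphemeLine:
    # Assumed columns: surface, reading, lemma, pos, pos id, subpos, subpos id,
    # conjtype, conjtype id, conjform, conjform id, semantics.
    # The last column may contain spaces, so split at most 11 times.
    f = line.split(" ", 11)
    return MorphemeLine(f[0], f[1], f[2], f[3], f[5], f[7], f[9], f[11].strip('"'))


# Hand-written sample in the assumed layout (not actual analyzer output).
SAMPLE = (
    '電気 でんき 電気 名詞 6 普通名詞 1 * 0 * 0 "代表表記:電気/でんき カテゴリ:抽象物"\n'
    "は は は 助詞 9 副助詞 2 * 0 * 0 NIL\n"
    "EOS\n"
)

# Skip the end-of-sentence marker and comment lines.
morphemes: List[MorphemeLine] = [
    parse_line(line)
    for line in SAMPLE.splitlines()
    if line != "EOS" and not line.startswith("# ")
]

for m in morphemes:
    print(m.surface, m.pos, m.subpos)
```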

KNP offers nearly the same API.

```python
# Perform language analysis with KNP
knp = rhoknp.KNP()
sentence = knp.apply_to_sentence("電気抵抗率は電気の通しにくさを表す物性値である。")
```

KNP performs language analysis at multiple levels.

```python
for clause in sentence.clauses:  # a.k.a. setsu
    ...
for phrase in sentence.phrases:  # a.k.a. bunsetsu
    ...
for base_phrase in sentence.base_phrases:  # a.k.a. kihon-ku
    ...
for morpheme in sentence.morphemes:  # a.k.a. keitai-so
    ...
```

Sentence objects can be saved in the KNP format.

```python
# Save the sentence in the KNP format
with open("sentence.knp", "wt") as f:
    f.write(sentence.to_knp())

# Load the sentence
with open("sentence.knp", "rt") as f:
    sentence = rhoknp.Sentence.from_knp(f.read())
```
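
As a rough illustration of how the KNP format encodes several levels in one file, the sketch below counts units in a small hand-written sample. It assumes the common convention that lines starting with `* ` open a phrase (bunsetsu), lines starting with `+ ` open a base phrase (kihon-ku), other lines are morphemes, and `EOS` closes the sentence. The sample omits the feature strings that real output carries, and `count_units` is a hypothetical helper, not a rhoknp API.

```python
from collections import Counter

# Hand-written, abbreviated sample (features omitted; not actual KNP output).
SAMPLE_KNP = """\
# S-ID:1
* 1D
+ 1D
電気 でんき 電気 名詞 6 普通名詞 1 * 0 * 0 NIL
の の の 助詞 9 接続助詞 3 * 0 * 0 NIL
* -1D
+ -1D
抵抗 ていこう 抵抗 名詞 6 サ変名詞 2 * 0 * 0 NIL
EOS
"""


def count_units(knp_text: str) -> Counter:
    """Count sentences, phrases, base phrases, and morphemes by line prefix."""
    counts: Counter = Counter()
    for line in knp_text.splitlines():
        if line == "EOS":
            counts["sentences"] += 1
        elif line.startswith("# "):
            continue  # comment line such as the sentence ID
        elif line.startswith("* "):
            counts["phrases"] += 1
        elif line.startswith("+ "):
            counts["base_phrases"] += 1
        elif line:
            # Simplification: a morpheme whose surface is "*", "+", or "#"
            # would be misclassified by the prefix checks above.
            counts["morphemes"] += 1
    return counts


print(dict(count_units(SAMPLE_KNP)))
```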

Furthermore, rhoknp provides convenient APIs for document-level language analysis.

```python
document = rhoknp.Document.from_raw_text(
    "電気抵抗率は電気の通しにくさを表す物性値である。単に抵抗率とも呼ばれる。"
)
# If you know sentence boundaries, you can use `Document.from_sentences` instead.
document = rhoknp.Document.from_sentences(
    [
        "電気抵抗率は電気の通しにくさを表す物性値である。",
        "単に抵抗率とも呼ばれる。",
    ]
)
```
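
If your text comes with its own segmentation rules, one option is to split it yourself and pass the resulting list to `Document.from_sentences`. The splitter below is a deliberately naive sketch that cuts after `。`, `！`, or `？`; `split_sentences` is a hypothetical helper, and it does not claim to reproduce how `Document.from_raw_text` segments text.

```python
import re
from typing import List

# A run of non-terminator characters followed by any closing terminators.
_SENTENCE = re.compile(r"[^。！？]+[。！？]*")


def split_sentences(text: str) -> List[str]:
    """Naively split Japanese text after sentence-final punctuation."""
    pieces = (m.group().strip() for m in _SENTENCE.finditer(text))
    return [p for p in pieces if p]


sentences = split_sentences(
    "電気抵抗率は電気の通しにくさを表す物性値である。単に抵抗率とも呼ばれる。"
)
print(sentences)
# The list could then be passed on, e.g. rhoknp.Document.from_sentences(sentences)
```

A rule this simple will mis-split quoted speech or text containing parenthesised sentences, so treat it as a starting point only.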

Document objects can be handled in much the same way as Sentence objects.

```python
# Perform morphological analysis with Juman++
document = jumanpp.apply_to_document(document)

# Access language units in the document
for sentence in document.sentences:
    ...
for morpheme in document.morphemes:
    ...

# Save the Juman++ analysis result
with open("document.jumanpp", "wt") as f:
    f.write(document.to_jumanpp())

# Load the Juman++ analysis result
with open("document.jumanpp", "rt") as f:
    document = rhoknp.Document.from_jumanpp(f.read())
```

For more information, please refer to the [examples](./examples) and [documentation](https://rhoknp.readthedocs.io/en/latest/).

## Main differences from [pyknp](https://github.com/ku-nlp/pyknp/)

[_pyknp_](https://pypi.org/project/pyknp/) is the official Python binding for Juman++ and KNP.
In developing rhoknp, we redesigned the API with the current use cases of pyknp in mind.
The key differences between the two are as follows:

- **Support for document-level language analysis**: rhoknp allows you to load and instantiate the results of document-level language analysis, including cohesion analysis and discourse relation analysis.
- **Strict type-awareness**: rhoknp is annotated with type hints throughout, which enables strict static type checking and improves code clarity.
- **Comprehensive test suite**: rhoknp is covered by an extensive test suite. You can view the code coverage report on [Codecov](https://app.codecov.io/gh/ku-nlp/rhoknp).

## License

MIT

## Contributing

We warmly welcome contributions to rhoknp.
You can get started by reading the [contribution guide](https://rhoknp.readthedocs.io/en/latest/contributing/index.html).

## References

- [KNP FORMAT](http://cr.fvcrc.i.nagoya-u.ac.jp/~sasano/knp/format.html)
- [KNP - KUROHASHI-CHU-MURAWAKI LAB](https://nlp.ist.i.kyoto-u.ac.jp/?KNP)

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/ku-nlp/rhoknp",
    "name": "rhoknp",
    "maintainer": "Hirokazu Kiyomaru",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "h.kiyomaru@gmail.com",
    "keywords": "NLP,Japanese,Juman++,KNP,KWJA",
    "author": "Hirokazu Kiyomaru",
    "author_email": "h.kiyomaru@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/93/03/27d75ad51d5e947ca922ec9c18f2ea9c94ef83e08dc25162f6c9a48b7467/rhoknp-1.7.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Yet another Python binding for Juman++/KNP/KWJA",
    "version": "1.7.0",
    "project_urls": {
        "Documentation": "https://rhoknp.readthedocs.io/en/latest",
        "Homepage": "https://github.com/ku-nlp/rhoknp",
        "Repository": "https://github.com/ku-nlp/rhoknp"
    },
    "split_keywords": [
        "nlp",
        "japanese",
        "juman++",
        "knp",
        "kwja"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "05ad2bb730195660010505e3652abf333688e4e1dde2882d7ffc365490d83d6f",
                "md5": "943e62b0bd95f77933180c1f33c2de86",
                "sha256": "c1334b7d70fcc7cd7820e5786c4fe9b258ff4d11d4575ba6ae4c03035b00a32d"
            },
            "downloads": -1,
            "filename": "rhoknp-1.7.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "943e62b0bd95f77933180c1f33c2de86",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 93069,
            "upload_time": "2024-01-16T12:33:12",
            "upload_time_iso_8601": "2024-01-16T12:33:12.905277Z",
            "url": "https://files.pythonhosted.org/packages/05/ad/2bb730195660010505e3652abf333688e4e1dde2882d7ffc365490d83d6f/rhoknp-1.7.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "930327d75ad51d5e947ca922ec9c18f2ea9c94ef83e08dc25162f6c9a48b7467",
                "md5": "a21dcf472f2bb802487c5319a1c37080",
                "sha256": "e27faef312173dab60bba9803105dd3895bba53c5f2b6bf84d56fde7c6d51571"
            },
            "downloads": -1,
            "filename": "rhoknp-1.7.0.tar.gz",
            "has_sig": false,
            "md5_digest": "a21dcf472f2bb802487c5319a1c37080",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 69815,
            "upload_time": "2024-01-16T12:33:15",
            "upload_time_iso_8601": "2024-01-16T12:33:15.031842Z",
            "url": "https://files.pythonhosted.org/packages/93/03/27d75ad51d5e947ca922ec9c18f2ea9c94ef83e08dc25162f6c9a48b7467/rhoknp-1.7.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-16 12:33:15",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ku-nlp",
    "github_project": "rhoknp",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "rhoknp"
}
        