<p align="center">
<a href="https://rhoknp.readthedocs.io/en/latest/" rel="noopener" target="_blank">
<img width="150" src="https://raw.githubusercontent.com/ku-nlp/rhoknp/develop/docs/_static/logo.png" alt="rhoknp logo">
</a>
</p>
<h1 align="center">rhoknp: Yet another Python binding for Juman++/KNP/KWJA</h1>
<p align="center">
<a href="https://github.com/ku-nlp/rhoknp/actions/workflows/test.yml"><img alt="Test" src="https://img.shields.io/github/actions/workflow/status/ku-nlp/rhoknp/test.yml?branch=main&logo=github&label=test&style=flat-square"></a>
<a href="https://codecov.io/gh/ku-nlp/rhoknp"><img alt="Codecov" src="https://img.shields.io/codecov/c/github/ku-nlp/rhoknp?logo=codecov&style=flat-square"></a>
<a href="https://www.codefactor.io/repository/github/ku-nlp/rhoknp"><img alt="CodeFactor" src="https://img.shields.io/codefactor/grade/github/ku-nlp/rhoknp?style=flat-square"></a>
<a href="https://pypi.org/project/rhoknp/"><img alt="PyPI" src="https://img.shields.io/pypi/v/rhoknp?style=flat-square"></a>
<a href="https://pypi.org/project/rhoknp/"><img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/rhoknp?style=flat-square"></a>
<a href="https://rhoknp.readthedocs.io/en/latest/"><img alt="Documentation" src="https://img.shields.io/readthedocs/rhoknp?style=flat-square"></a>
<a href="https://github.com/astral-sh/ruff"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json" alt="Ruff" style="max-width:100%;"></a>
</p>

---

**Documentation**: [https://rhoknp.readthedocs.io/en/latest/](https://rhoknp.readthedocs.io/en/latest/)

**Source Code**: [https://github.com/ku-nlp/rhoknp](https://github.com/ku-nlp/rhoknp)

---

_rhoknp_ is a Python binding for [Juman++](https://github.com/ku-nlp/jumanpp), [KNP](https://github.com/ku-nlp/knp), and [KWJA](https://github.com/ku-nlp/kwja).[^1]

[^1]: The logo was generated by OpenAI DALL·E 2.

```python
import rhoknp

# Perform morphological analysis by Juman++
jumanpp = rhoknp.Jumanpp()
sentence = jumanpp.apply_to_sentence(
    "電気抵抗率は電気の通しにくさを表す物性値である。"
)

# Access the result
for morpheme in sentence.morphemes:  # a.k.a. keitai-so
    ...

# Save the result
with open("result.jumanpp", "wt") as f:
    f.write(sentence.to_jumanpp())

# Load the result
with open("result.jumanpp", "rt") as f:
    sentence = rhoknp.Sentence.from_jumanpp(f.read())
```
## Requirements
- Python 3.8+
- (Optional) [Juman++](https://github.com/ku-nlp/jumanpp) v2.0.0-rc3+
- (Optional) [KNP](https://github.com/ku-nlp/knp) 5.0+
- (Optional) [KWJA](https://github.com/ku-nlp/kwja) 1.0.0+
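
Because the analyzers are optional, it can be handy to check which ones are reachable before constructing a wrapper. The following is a minimal sketch using only the standard library; it assumes the tools are installed under the command names `jumanpp`, `knp`, and `kwja`, and `find_analyzers` is a hypothetical helper, not part of rhoknp.

```python
import shutil
from typing import Dict, Optional

# Assumed command names; adjust them if your installation uses different ones.
ANALYZER_COMMANDS = ("jumanpp", "knp", "kwja")


def find_analyzers(commands=ANALYZER_COMMANDS) -> Dict[str, Optional[str]]:
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {command: shutil.which(command) for command in commands}


for command, path in find_analyzers().items():
    print(f"{command}: {path or 'not found'}")
```

A `None` entry only means the command was not found on `PATH`; it does not check the installed version against the minimums listed above.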
## Installation
```shell
pip install rhoknp
```
## Quick tour
Let's start with Juman++.
The following example uses it to analyze a single sentence.
```python
import rhoknp

# Perform morphological analysis by Juman++
jumanpp = rhoknp.Jumanpp()
sentence = jumanpp.apply_to_sentence("電気抵抗率は電気の通しにくさを表す物性値である。")
```
You can easily access the individual morphemes that make up the sentence.
```python
for morpheme in sentence.morphemes:  # a.k.a. keitai-so
    ...
```
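
As an illustration of what the loop body might do, the sketch below collects a few fields from each morpheme. The attribute names `text`, `reading`, `lemma`, and `pos` are assumptions about the morpheme objects (check the documentation for the exact set), and `morpheme_rows` is a hypothetical helper rather than a rhoknp function.

```python
from typing import Iterable, List, Tuple


def morpheme_rows(morphemes: Iterable) -> List[Tuple[str, str, str, str]]:
    """Collect (surface text, reading, lemma, part of speech) for each morpheme."""
    # Assumes each morpheme exposes .text, .reading, .lemma and .pos.
    return [(m.text, m.reading, m.lemma, m.pos) for m in morphemes]
```

With the sentence analyzed above, `morpheme_rows(sentence.morphemes)` would yield one tuple per morpheme.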
Sentence objects can be saved in the JUMAN format.
```python
# Save the sentence in the JUMAN format
with open("sentence.jumanpp", "wt") as f:
    f.write(sentence.to_jumanpp())

# Load the sentence
with open("sentence.jumanpp", "rt") as f:
    sentence = rhoknp.Sentence.from_jumanpp(f.read())
```
Almost the same APIs are available for KNP.
```python
# Perform language analysis by KNP
knp = rhoknp.KNP()
sentence = knp.apply_to_sentence("電気抵抗率は電気の通しにくさを表す物性値である。")
```
KNP performs language analysis at multiple levels.
```python
for clause in sentence.clauses:  # a.k.a. setsu
    ...
for phrase in sentence.phrases:  # a.k.a. bunsetsu
    ...
for base_phrase in sentence.base_phrases:  # a.k.a. kihon-ku
    ...
for morpheme in sentence.morphemes:  # a.k.a. keitai-so
    ...
```
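
These levels nest: a clause contains phrases, a phrase contains base phrases, and a base phrase contains morphemes. The sketch below walks that hierarchy and returns an indented outline. It assumes that each unit exposes a `text` attribute and that the child units are reachable as `clause.phrases`, `phrase.base_phrases`, and `base_phrase.morphemes`; `outline` is a hypothetical helper.

```python
from typing import List


def outline(sentence) -> List[str]:
    """Return indented lines: clause > phrase > base phrase > morpheme."""
    lines = []
    for clause in sentence.clauses:
        lines.append(f"clause: {clause.text}")
        for phrase in clause.phrases:
            lines.append(f"  phrase: {phrase.text}")
            for base_phrase in phrase.base_phrases:
                lines.append(f"    base phrase: {base_phrase.text}")
                for morpheme in base_phrase.morphemes:
                    lines.append(f"      morpheme: {morpheme.text}")
    return lines
```

Printing `"\n".join(outline(sentence))` for a KNP-analyzed sentence gives a quick view of how the units are grouped.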
Sentence objects can be saved in the KNP format.
```python
# Save the sentence in the KNP format
with open("sentence.knp", "wt") as f:
    f.write(sentence.to_knp())

# Load the sentence
with open("sentence.knp", "rt") as f:
    sentence = rhoknp.Sentence.from_knp(f.read())
```
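
The save and load steps follow the same pattern for both formats, so they can be wrapped in one small function. The sketch below is one possible way to do this; `round_trip` is a hypothetical helper, and passing `encoding="utf-8"` explicitly is a design choice made here so that Japanese text is written the same way regardless of the platform's default encoding.

```python
from pathlib import Path
from typing import Callable, TypeVar, Union

T = TypeVar("T")


def round_trip(serialized: str, path: Union[str, Path], loader: Callable[[str], T]) -> T:
    """Write serialized analysis to path as UTF-8, then read it back through loader."""
    path = Path(path)
    path.write_text(serialized, encoding="utf-8")
    return loader(path.read_text(encoding="utf-8"))
```

For example, `round_trip(sentence.to_knp(), "sentence.knp", rhoknp.Sentence.from_knp)` would save the sentence and return a freshly loaded copy.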
Furthermore, rhoknp provides convenient APIs for document-level language analysis.
```python
document = rhoknp.Document.from_raw_text(
    "電気抵抗率は電気の通しにくさを表す物性値である。単に抵抗率とも呼ばれる。"
)
# If you know sentence boundaries, you can use `Document.from_sentences` instead.
document = rhoknp.Document.from_sentences(
    [
        "電気抵抗率は電気の通しにくさを表す物性値である。",
        "単に抵抗率とも呼ばれる。",
    ]
)
```
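
If you want to control sentence boundaries yourself before calling `Document.from_sentences`, a simple rule-based splitter may be enough for clean text. The sketch below is a naive illustration that treats 。, ！, and ？ as sentence ends; it is not how rhoknp segments text, it ignores quotes and brackets, and `split_sentences` is a hypothetical helper.

```python
import re
from typing import List

# Naive rule (an assumption for illustration): a sentence ends at 。, ！, or ？.
_SENTENCE = re.compile(r"[^。！？]+[。！？]*")


def split_sentences(text: str) -> List[str]:
    """Split raw Japanese text into sentences on sentence-final punctuation."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]


sentences = split_sentences(
    "電気抵抗率は電気の通しにくさを表す物性値である。単に抵抗率とも呼ばれる。"
)
print(sentences)
```

The resulting list can be passed to `Document.from_sentences` in place of the hand-written list above.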
Document objects can be handled in much the same way as Sentence objects.
```python
# Perform morphological analysis by Juman++
document = jumanpp.apply_to_document(document)

# Access language units in the document
for sentence in document.sentences:
    ...
for morpheme in document.morphemes:
    ...

# Save language analysis by Juman++
with open("document.jumanpp", "wt") as f:
    f.write(document.to_jumanpp())

# Load language analysis by Juman++
with open("document.jumanpp", "rt") as f:
    document = rhoknp.Document.from_jumanpp(f.read())
```
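
Since a document exposes its sentences and each sentence exposes its morphemes, simple per-sentence statistics take only a line or two. The sketch below pairs each sentence with its morpheme count; it assumes sentences expose a `text` attribute, and `morpheme_counts` is a hypothetical helper.

```python
from typing import List, Tuple


def morpheme_counts(document) -> List[Tuple[str, int]]:
    """Pair each sentence's text with the number of morphemes it contains."""
    # Assumes document.sentences is iterable and each sentence has .text and .morphemes.
    return [(sentence.text, len(sentence.morphemes)) for sentence in document.sentences]
```

Calling `morpheme_counts(document)` on the analyzed document above would return one `(text, count)` pair per sentence.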
For more information, please refer to the [examples](./examples) and [documentation](https://rhoknp.readthedocs.io/en/latest/).
## Main differences from [pyknp](https://github.com/ku-nlp/pyknp/)
[_pyknp_](https://pypi.org/project/pyknp/) serves as the official Python binding for Juman++ and KNP.
In developing rhoknp, we redesigned the API with the current use cases of pyknp in mind.
The key differences between the two are as follows:
- **Support for document-level language analysis**: rhoknp allows you to load and instantiate the results of document-level language analysis, including cohesion analysis and discourse relation analysis.
- **Strict type-awareness**: rhoknp is fully type-annotated, which enables strict type checking and improves code clarity.
- **Comprehensive test suite**: rhoknp is covered by an extensive test suite. You can view the code coverage report on [Codecov](https://app.codecov.io/gh/ku-nlp/rhoknp).
## License
MIT
## Contributing
We warmly welcome contributions to rhoknp.
You can get started by reading the [contribution guide](https://rhoknp.readthedocs.io/en/latest/contributing/index.html).
## Reference
- [KNP FORMAT](http://cr.fvcrc.i.nagoya-u.ac.jp/~sasano/knp/format.html)
- [KNP - KUROHASHI-CHU-MURAWAKI LAB](https://nlp.ist.i.kyoto-u.ac.jp/?KNP)
Raw data
{
"_id": null,
"home_page": "https://github.com/ku-nlp/rhoknp",
"name": "rhoknp",
"maintainer": "Hirokazu Kiyomaru",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "h.kiyomaru@gmail.com",
"keywords": "NLP,Japanese,Juman++,KNP,KWJA",
"author": "Hirokazu Kiyomaru",
"author_email": "h.kiyomaru@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/93/03/27d75ad51d5e947ca922ec9c18f2ea9c94ef83e08dc25162f6c9a48b7467/rhoknp-1.7.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Yet another Python binding for Juman++/KNP/KWJA",
"version": "1.7.0",
"project_urls": {
"Documentation": "https://rhoknp.readthedocs.io/en/latest",
"Homepage": "https://github.com/ku-nlp/rhoknp",
"Repository": "https://github.com/ku-nlp/rhoknp"
},
"split_keywords": [
"nlp",
"japanese",
"juman++",
"knp",
"kwja"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "05ad2bb730195660010505e3652abf333688e4e1dde2882d7ffc365490d83d6f",
"md5": "943e62b0bd95f77933180c1f33c2de86",
"sha256": "c1334b7d70fcc7cd7820e5786c4fe9b258ff4d11d4575ba6ae4c03035b00a32d"
},
"downloads": -1,
"filename": "rhoknp-1.7.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "943e62b0bd95f77933180c1f33c2de86",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 93069,
"upload_time": "2024-01-16T12:33:12",
"upload_time_iso_8601": "2024-01-16T12:33:12.905277Z",
"url": "https://files.pythonhosted.org/packages/05/ad/2bb730195660010505e3652abf333688e4e1dde2882d7ffc365490d83d6f/rhoknp-1.7.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "930327d75ad51d5e947ca922ec9c18f2ea9c94ef83e08dc25162f6c9a48b7467",
"md5": "a21dcf472f2bb802487c5319a1c37080",
"sha256": "e27faef312173dab60bba9803105dd3895bba53c5f2b6bf84d56fde7c6d51571"
},
"downloads": -1,
"filename": "rhoknp-1.7.0.tar.gz",
"has_sig": false,
"md5_digest": "a21dcf472f2bb802487c5319a1c37080",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 69815,
"upload_time": "2024-01-16T12:33:15",
"upload_time_iso_8601": "2024-01-16T12:33:15.031842Z",
"url": "https://files.pythonhosted.org/packages/93/03/27d75ad51d5e947ca922ec9c18f2ea9c94ef83e08dc25162f6c9a48b7467/rhoknp-1.7.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-16 12:33:15",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ku-nlp",
"github_project": "rhoknp",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "rhoknp"
}