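# Overview
A readability assessment tool, based on the variable replacement model, that works on both English and Japanese text.
It uses divide-char-type for character-type segmentation and count-syllable for syllable counting.
The return value provides readability scores for the whole text, for each paragraph, and for each sentence.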
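# Metrics of the variable replacement model
- jFRE = 206.835-(1.015×ASL)-(84.6×ASW)
- jFKG = (0.39×ASL)+(11.8×ASW)-15.59
- jARI = (4.71×ACW)+(0.5×ASL)-21.43
- jCLI = (5.88×ACW)-(29.6/ASL)-15.8
- jSMOG = 1.031√(30×PS)+3.1291

The variables are defined as follows:

- ASL = number of character-type segments / number of sentences
- ASW = (number of syllables or consecutive kanji) / number of character-type segments
- ACW = Shannon-information-based weight / number of character-type segments
- PS = number of character-type segments with 3 or more English syllables or 3 or more kanji / number of sentences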
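As a minimal sketch, the five formulas above can be evaluated directly once ASL, ASW, ACW, and PS are known. The function name `readability_scores` and the dictionary keys other than `jfre` are assumptions made for illustration; this is not the package's own implementation.

```python
import math


def readability_scores(asl, asw, acw, ps):
    """Evaluate the five metrics from precomputed averages.

    asl must be greater than 0 because jCLI divides by it.
    """
    return {
        "jfre": 206.835 - (1.015 * asl) - (84.6 * asw),
        "jfkg": (0.39 * asl) + (11.8 * asw) - 15.59,
        "jari": (4.71 * acw) + (0.5 * asl) - 21.43,
        "jcli": (5.88 * acw) - (29.6 / asl) - 15.8,
        "jsmog": 1.031 * math.sqrt(30 * ps) + 3.1291,
    }


# Hypothetical input values, chosen only to exercise the formulas.
scores = readability_scores(asl=10, asw=1.5, acw=2, ps=1.2)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

jFRE falls as ASL and ASW grow, while the four grade-level metrics rise, which matches the direction of the two scales in the evaluation table.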
The Shannon-information-based weight takes alphanumeric characters (61 types) as 1, and weights hiragana (88 types) by log(1/88)/log(1/61), katakana (141 types) by log(1/141)/log(1/61), and kanji (20,898 types) by log(1/20898)/log(1/61).
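The weights described above can be computed as follows. This is a sketch under the stated character-set sizes; the names `CHARSET_SIZES` and `shannon_weight` are hypothetical, and how the package combines the weights into ACW is not shown here.

```python
import math

# Character-set sizes as given in the text above.
CHARSET_SIZES = {
    "alphanumeric": 61,
    "hiragana": 88,
    "katakana": 141,
    "kanji": 20898,
}


def shannon_weight(n_types, base_types=61):
    """Ratio of log(1/n_types) to log(1/base_types)."""
    return math.log(1 / n_types) / math.log(1 / base_types)


weights = {name: shannon_weight(n) for name, n in CHARSET_SIZES.items()}
for name, weight in weights.items():
    print(f"{name}: {weight:.2f}")
```

The weights come out to roughly 1.09 for hiragana, 1.20 for katakana, and 2.42 for kanji, so a kanji character counts about 2.4 times as much as an alphanumeric one.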
# Evaluation table
Evaluate jFRE against the Reading Ease Score column.
Evaluate jFKG, jARI, jCLI, and jSMOG against the Estimated Reading Grade column.
| Reading Ease Score | Style Description | Estimated Reading Grade | Estimated Percent of U.S. Adults (1949) |
| :---: | :---: | :---: | :---: |
| 0 to 30: | Very Difficult | College graduate | 4.5 |
| 30 to 50: | Difficult | 13th to 16th grade | 33 |
| 50 to 60: | Fairly Difficult | 10th to 12th grade | 54 |
| 60 to 70: | Standard | 8th to 9th grade | 83 |
| 70 to 80: | Fairly Easy | 7th grade | 88 |
| 80 to 90: | Easy | 6th grade | 91 |
| 90 to 100: | Very Easy | 5th grade | 93 |
- William H. DuBay: The Principles of Readability, 2004
  - https://files.eric.ed.gov/fulltext/ED490073.pdf
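A jFRE value can be mapped to a row of the table with a simple lookup. This sketch is not part of the package. The name `describe_reading_ease` is hypothetical, and because adjacent ranges in the table share their boundary values, treating each lower bound as inclusive and clamping out-of-range scores are assumptions.

```python
# (lower bound, style description, estimated reading grade), highest band first.
READING_EASE_BANDS = [
    (90, "Very Easy", "5th grade"),
    (80, "Easy", "6th grade"),
    (70, "Fairly Easy", "7th grade"),
    (60, "Standard", "8th to 9th grade"),
    (50, "Fairly Difficult", "10th to 12th grade"),
    (30, "Difficult", "13th to 16th grade"),
    (0, "Very Difficult", "College graduate"),
]


def describe_reading_ease(score):
    """Return (style, grade) for a Reading Ease Score.

    Assumption: lower bounds are inclusive; scores below 0 fall into the
    lowest band and scores above 100 into the highest.
    """
    for lower, style, grade in READING_EASE_BANDS:
        if score >= lower:
            return style, grade
    return READING_EASE_BANDS[-1][1], READING_EASE_BANDS[-1][2]


print(describe_reading_ease(65))
```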
# Setup
```
pip install calculate-readability
```
# Uninstall
```
pip uninstall calculate-readability divide-char-type count-syllable nltk
```
# Usage
```
from calculate_readability import calculate_readability

data = calculate_readability("今日の天気は晴れです。明日は曇りです。\n明後日は雨です。")

# Whole text
print(data["raw_text"])
print(data["text"])
print(data["jfre"])

# First paragraph
print(data["break"][0]["text"])
print(data["break"][0]["jfre"])

# First sentence of the first paragraph
print(data["break"][0]["sentence"][0]["text"])
print(data["break"][0]["sentence"][0]["jfre"])
```
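To list the scores at every level rather than only the first entries, the nested result can be walked with a small generator. This sketch assumes that each entry of `"break"` is a paragraph with its own `"sentence"` list, as the example above suggests. The helper `iter_jfre` is hypothetical, and `sample` is a hand-written stand-in with placeholder scores, not real output from the package.

```python
def iter_jfre(result):
    """Yield (level, text, jfre) for the whole text, paragraphs, and sentences."""
    yield ("document", result["text"], result["jfre"])
    for paragraph in result["break"]:
        yield ("paragraph", paragraph["text"], paragraph["jfre"])
        for sentence in paragraph["sentence"]:
            yield ("sentence", sentence["text"], sentence["jfre"])


# Stand-in data shaped like the result above; the scores are placeholders.
sample = {
    "text": "A. B.\nC.",
    "jfre": 0.0,
    "break": [
        {
            "text": "A. B.",
            "jfre": 0.0,
            "sentence": [
                {"text": "A.", "jfre": 0.0},
                {"text": "B.", "jfre": 0.0},
            ],
        },
        {
            "text": "C.",
            "jfre": 0.0,
            "sentence": [{"text": "C.", "jfre": 0.0}],
        },
    ],
}

for level, text, jfre in iter_jfre(sample):
    print(level, repr(text), jfre)
```

With real output, pass the dictionary returned by `calculate_readability` instead of `sample`.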
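# Papers
- Shinya Akagi et al.: Readability Analysis of Medical Documents Using a Variable Replacement Model (変数置き換えモデルを用いた医療関連文書の可読性分析),
  - Journal of the Biomedical Fuzzy Systems Association (バイオメディカル・ファジィ・システム学会誌) 19 (1), 19-27, 2017
  - https://cir.nii.ac.jp/crid/1391975276374773248

A separate paper or conference presentation is planned.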
# License
- calculate-readability
  - Python Software Foundation License
  - Copyright (C) 2024 Shinya Akagi
- divide-char-type
  - Python Software Foundation License
  - Copyright (C) 2023-2024 Shinya Akagi
- count-syllable
  - Python Software Foundation License
  - Copyright (C) 2024 Shinya Akagi
- nltk
  - Apache License 2.0
  - Copyright (C) 2001-2023 NLTK Project
- cmudict
  - BSD License
  - Copyright (C) 1998 Carnegie Mellon University