fkscore
====
Flesch-Kincaid readability score for text.

A Python module implementing the Flesch-Kincaid readability score algorithm.

The source code is released under the MIT License.

### Installation ###
    pip3 install fkscore

### Usage ###
Pass the text to analyze as a Python string. Words can be on the same line or on different lines. The current version supports English only. Email for support.

    import fkscore
    text = '...blah blah blah...'
    f = fkscore.fkscore(text)
    print(f.stats)  # dictionary of word, syllable, and sentence counts
    print(f.score)  # dictionary of readability results

Alternatively, import `fkscore` from the module directly:

    from fkscore import fkscore
    text = '...blah blah blah...'
    f = fkscore(text)
    print(f.stats)
    print(f.score)

### Output ###
The output consists of 2 dictionaries, `stats` and `score`, as follows:
* stats:
    * stats['num_words']
    * stats['num_syllables']
    * stats['num_sentences']
* score:
    * score['readability']  # Calculated F-K Readability Score
    * score['read_grade']   # Permuted F-K Grade Reading Level
    * score['calc_grade']   # Calculated F-K Grade Level
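
A brief sketch of consuming the two dictionaries. The values are copied from the "Low" example in the Validation Text section so the snippet runs without the package installed. The `summarize` helper is hypothetical and not part of `fkscore`. It uses `str.format` rather than f-strings to stay compatible with Python 3.5.

```python
# Values copied from the "Low" validation example; with the package
# installed they would come from f.stats and f.score instead.
stats = {'num_sentences': 1, 'num_words': 10, 'num_syllables': 13}
score = {'readability': 86.705, 'read_grade': '6th Grade', 'calc_grade': 3.65}


def summarize(stats, score):
    """Hypothetical helper: one-line summary built from both dictionaries."""
    return (
        "{words} words, {sentences} sentence(s), {syllables} syllables: "
        "readability {readability} ({read_grade}), calculated grade {calc_grade}"
    ).format(
        words=stats['num_words'],
        sentences=stats['num_sentences'],
        syllables=stats['num_syllables'],
        readability=score['readability'],
        read_grade=score['read_grade'],
        calc_grade=score['calc_grade']
    )


print(summarize(stats, score))
```
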
### Releases ###
Releases and additions are pushed to PyPI as needed. If there is a feature in master that has not been built and pushed, and you want it to be, just ping me.

Note that the validation text and many examples for this algorithm use single lines of text. Single lines are not required; multi-line text works as well. One classic example is the text of Moby Dick, which evaluates to a readability score of 58.

This module is pure Python and works with all Python versions >= 3.5. It likely works with older versions, but this has not yet been tested.

### History ###
This is maintained as an implementation of the Flesch-Kincaid algorithm, which was initially developed in 1948 by Rudolf Flesch. It was later revised by J. Peter Kincaid and his team for the U.S. Navy in 1975. The F–K formula was first used by the Army for assessing the difficulty of technical manuals in 1978 and soon after became a United States Military Standard. The goal of the algorithm is to provide an empirical basis for assessing the difficulty of understanding text.

### Algorithm ###
There are 2 algorithms providing output, along with associated text statistics, as follows:
- **Flesch Reading Ease**:
    - In the Flesch reading-ease test, higher scores indicate material that is easier to read; lower numbers mark passages that are more difficult to read.
    - The formula for the Flesch reading-ease score (FRES) test is:
        - `206.835 - (1.015 * (total words / total sentences)) - (84.6 * (total syllables / total words))`
    - The score is a float rounded to 3 decimal places.
    - Grade level can be permuted (looked up) from the Flesch Reading Ease score:
        - 100.0–90.0 - 5th grade - Very easy to read. Easily understood by an average 11-year-old student.
        - 90.0–80.0 - 6th grade - Easy to read. Conversational English for consumers.
        - 80.0–70.0 - 7th grade - Fairly easy to read.
        - 70.0–60.0 - 8th & 9th grade - Plain English. Easily understood by 13- to 15-year-old students.
        - 60.0–50.0 - 10th to 12th grade - Fairly difficult to read.
        - 50.0–30.0 - College - Difficult to read.
        - 30.0–10.0 - College graduate - Very difficult to read. Best understood by university graduates.
        - 10.0–0.0 - Professional - Extremely difficult to read. Best understood by subject-matter experts.
- **Flesch Kincaid Grade Level**:
    - These readability tests are used extensively in the field of education. The "Flesch–Kincaid Grade Level Formula" presents a score as a U.S. grade level, making it easier to assess the audience.
    - The score can also be read as the number of years of education generally required to understand the text, which is most relevant when the formula results in a number greater than 10.
    - Use the calculated grade level rather than the permuted table when the text may score outside the table's minimum and maximum.
    - Note there is often a difference between the permuted grade level and the calculated grade level.
    - The grade level is calculated with the following formula:
        - `(0.39 * (total words / total sentences)) + (11.8 * (total syllables / total words)) - 15.59`
    - The calculated grade is a float rounded to 3 decimal places.
- **Text Statistics**:
    - Number of words
    - Number of syllables
    - Number of sentences
- For more info, see this [Wikipedia entry](https://en.wikipedia.org/wiki/Flesch–Kincaid_readability_tests)
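
The two formulas and the grade table above can be sketched in a few lines of Python. This is a minimal illustration written from the description in this section, not the module's actual source. The function names, the band labels (taken from the table, so the strings the module returns may differ, e.g. `'9th Grade'`), and the handling of scores that fall exactly on a band boundary or outside 0 to 100 are all assumptions.

```python
def reading_ease(num_words, num_sentences, num_syllables):
    """Flesch Reading Ease score, rounded to 3 decimal places."""
    score = (206.835
             - 1.015 * (num_words / num_sentences)
             - 84.6 * (num_syllables / num_words))
    return round(score, 3)


def grade_level(num_words, num_sentences, num_syllables):
    """Flesch-Kincaid Grade Level, rounded to 3 decimal places."""
    grade = (0.39 * (num_words / num_sentences)
             + 11.8 * (num_syllables / num_words)
             - 15.59)
    return round(grade, 3)


# (lower bound, label) pairs from the table above, easiest band first.
# Assumption: a score equal to a lower bound belongs to the easier band.
GRADE_BANDS = [
    (90.0, "5th grade"),
    (80.0, "6th grade"),
    (70.0, "7th grade"),
    (60.0, "8th & 9th grade"),
    (50.0, "10th to 12th grade"),
    (30.0, "College"),
    (10.0, "College graduate"),
]


def lookup_grade(score):
    """Map a reading-ease score to a band label from the table.

    Assumption: scores above 100 fall into the easiest band and scores
    below 10 (including negative scores) fall into "Professional".
    """
    for lower_bound, label in GRADE_BANDS:
        if score >= lower_bound:
            return label
    return "Professional"


# "The cat sat on the mat." has 6 words, 1 sentence, 6 syllables.
ease = reading_ease(6, 1, 6)
print(ease)                   # 116.145
print(lookup_grade(ease))     # 5th grade
print(grade_level(6, 1, 6))   # -1.45
```

The example shows why the calculated grade is useful: the sentence scores above the top of the table, so the lookup can only report the easiest band, while the formula yields a grade below zero.
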
### Validation Text ###
- **Easy:** `The cat sat on the mat.` scores 116 and is considered VERY easy to read, being a single sentence of single-syllable words.
    - {'num_sentences': 1, 'num_words': 6, 'num_syllables': 6} {'readability': 116.145, 'read_grade': '5th Grade', 'calc_grade': -1.45}
    - NOTE the very low calculated reading grade as compared to the permuted grade level.
- **Low:** `The quick red fox jumped over the lazy brown dog.` is a low-difficulty sentence scoring 86.705.
    - {'num_sentences': 1, 'num_words': 10, 'num_syllables': 13} {'readability': 86.705, 'read_grade': '6th Grade', 'calc_grade': 3.65}
- **Mid:** `This sentence, taken as a reading passage unto itself, is being used to prove a point.` has a readability of 69.
    - {'num_sentences': 1, 'num_words': 16, 'num_syllables': 23} {'readability': 68.983, 'read_grade': '9th Grade', 'calc_grade': 7.613}
- **Hard:** `The Australian platypus is seemingly a hybrid of a mammal and reptilian creature.` has a readability of 37.455.
    - {'num_sentences': 1, 'num_words': 13, 'num_syllables': 24} {'readability': 37.455, 'read_grade': 'College Level', 'calc_grade': 11.265}
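
The published numbers can be cross-checked against the formulas in the Algorithm section. The sketch below recomputes both scores from each `stats` dictionary and compares them with the values listed above. The helper names and the tolerance are assumptions; a tolerance of 0.001 is used because the published values are rounded to 3 decimal places.

```python
# (stats, published score) pairs copied from the validation examples above.
CASES = [
    ({'num_sentences': 1, 'num_words': 6, 'num_syllables': 6},
     {'readability': 116.145, 'calc_grade': -1.45}),
    ({'num_sentences': 1, 'num_words': 10, 'num_syllables': 13},
     {'readability': 86.705, 'calc_grade': 3.65}),
    ({'num_sentences': 1, 'num_words': 16, 'num_syllables': 23},
     {'readability': 68.983, 'calc_grade': 7.613}),
    ({'num_sentences': 1, 'num_words': 13, 'num_syllables': 24},
     {'readability': 37.455, 'calc_grade': 11.265}),
]


def recompute(stats):
    """Apply both documented formulas to a stats dictionary (unrounded)."""
    words_per_sentence = stats['num_words'] / stats['num_sentences']
    syllables_per_word = stats['num_syllables'] / stats['num_words']
    return {
        'readability': (206.835
                        - 1.015 * words_per_sentence
                        - 84.6 * syllables_per_word),
        'calc_grade': (0.39 * words_per_sentence
                       + 11.8 * syllables_per_word
                       - 15.59),
    }


def matches_published(stats, published, tolerance=0.001):
    """True if every recomputed value is within tolerance of the published one."""
    computed = recompute(stats)
    return all(abs(computed[key] - published[key]) <= tolerance
               for key in published)


for stats, published in CASES:
    print(stats['num_words'], 'words:', matches_published(stats, published))
```
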
### Questions ###
Feel free to contact me with questions, comments, or concerns, or interact directly via the GitHub repository.

Randall Shane PhD<br />
Randall@NumbersAndTech.com<br />
https://github.com/RandallShanePhD/fkscore<br />
Thank you!