<p align="center">
<a href="https://github.com/intsights/pywordfreq">
<img src="https://raw.githubusercontent.com/intsights/pywordfreq/master/images/logo.png" alt="Logo">
</a>
<h3 align="center">
Word frequency checker based on Wikipedia corpus written in Rust
</h3>
</p>
![license](https://img.shields.io/badge/MIT-License-blue)
![Python](https://img.shields.io/badge/Python-3.7%20%7C%203.8%20%7C%203.9%20%7C%203.10%20%7C%203.11-blue)
![OS](https://img.shields.io/badge/OS-Mac%20%7C%20Linux%20%7C%20Windows-blue)
![Build](https://github.com/intsights/pywordfreq/workflows/Build/badge.svg)
[![PyPi](https://img.shields.io/pypi/v/pywordfreq.svg)](https://pypi.org/project/pywordfreq/)
## Table of Contents
- [Table of Contents](#table-of-contents)
- [About The Project](#about-the-project)
  - [Built With](#built-with)
  - [Installation](#installation)
- [Usage](#usage)
- [License](#license)
- [Contact](#contact)
## About The Project
pywordfreq is a Python library, written in Rust, for looking up word frequencies in a corpus built from Wikipedia. The library is fast, memory efficient, and secure.
Exact-word lookups use a hash map. Sub-pattern lookups over the dictionary use a suffix array, provided by the [suffix](https://github.com/BurntSushi/suffix) crate.
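
To make the two lookup strategies concrete, here is a minimal pure-Python sketch of the idea. It is not the library's implementation: the toy word counts, the `toy_full_frequency` and `toy_partial_frequency` names, and the choice to sum the counts of every word containing the pattern are all assumptions made for illustration.

```python
import bisect

# Toy corpus: word -> occurrence count (made-up numbers, not Wikipedia data).
WORD_COUNTS = {"the": 1000, "internet": 40, "winter": 25, "interval": 10, "cat": 5}

# Join all words into one text so a single suffix array can index them.
SEPARATOR = "\n"
words = list(WORD_COUNTS)
text = SEPARATOR.join(words)

# word_starts[i] is the offset in `text` where words[i] begins.
word_starts = []
offset = 0
for w in words:
    word_starts.append(offset)
    offset += len(w) + len(SEPARATOR)

# Naive suffix array: every suffix start offset, sorted by the suffix text.
# Real implementations build this far more efficiently.
suffix_array = sorted(range(len(text)), key=lambda i: text[i:])


def toy_full_frequency(word):
    """Exact lookup: a single hash map access."""
    return WORD_COUNTS.get(word, 0)


def _first_index(pattern, strict):
    """Binary search over the suffix array, comparing only the first
    len(pattern) characters of each suffix."""
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        prefix = text[start:start + len(pattern)]
        if prefix < pattern or (strict and prefix == pattern):
            lo = mid + 1
        else:
            hi = mid
    return lo


def toy_partial_frequency(pattern):
    """Sum the counts of all words that contain `pattern` (assumed semantics).
    The pattern is assumed not to contain the separator character."""
    lower = _first_index(pattern, strict=False)
    upper = _first_index(pattern, strict=True)
    # Map each match offset back to its word; the set avoids double counting
    # a word in which the pattern occurs more than once.
    matched = {
        bisect.bisect_right(word_starts, suffix_array[i]) - 1
        for i in range(lower, upper)
    }
    return sum(WORD_COUNTS[words[i]] for i in matched)
```

In this sketch, `toy_full_frequency("inter")` finds nothing because "inter" is not a dictionary word, while `toy_partial_frequency("inter")` matches "internet", "winter", and "interval" through the suffix array.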
### Built With
* [pyo3](https://github.com/PyO3/pyo3)
* [suffix](https://github.com/BurntSushi/suffix)
* [ahash](https://github.com/tkaitchuck/ahash)
### Installation
```sh
pip3 install pywordfreq
```
## Usage
```python
import pywordfreq


# The engine loads the dictionary the first time the library is used.
# Note that the engine carries a significant amount of memory overhead.

# Check the frequency of the word "the" in the corpus.
pywordfreq.full_frequency(
    word="the",
)

# Check the frequency of "inter" as a pattern inside other words
# of the dictionary.
pywordfreq.partial_frequency(
    pattern="inter",
)
```
## License
Distributed under the MIT License. See `LICENSE` for more information.
## Contact
Gal Ben David - gal@intsights.com
Project Link: [https://github.com/intsights/pywordfreq](https://github.com/intsights/pywordfreq)
Raw data
{
"_id": null,
"home_page": "https://github.com/intsights/pywordfreq",
"name": "pywordfreq",
"maintainer": null,
"docs_url": null,
"requires_python": "",
"maintainer_email": null,
"keywords": "word,frequency,frequencies,rust,pyo3",
"author": "Gal Ben David <gal@intsights.com>",
"author_email": "Gal Ben David <gal@intsights.com>",
"download_url": null,
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Word frequency checker based on Wikipedia corpus written in Rust",
"version": "0.4.0",
"split_keywords": [
"word",
"frequency",
"frequencies",
"rust",
"pyo3"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "8c015b8e0dc84f6a2ac1732f18842b5495eade1820e5d65c487798cbfc5fbd71",
"md5": "79ad2d9c58bbd1a37c5b594149d73589",
"sha256": "4e6d6dbd73a661510ee87b3ff62ff378d282d6b7edf736faf6ff298007641c23"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp310-cp310-macosx_10_7_x86_64.whl",
"has_sig": false,
"md5_digest": "79ad2d9c58bbd1a37c5b594149d73589",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": null,
"size": 4729501,
"upload_time": "2023-01-18T08:00:06",
"upload_time_iso_8601": "2023-01-18T08:00:06.649908Z",
"url": "https://files.pythonhosted.org/packages/8c/01/5b8e0dc84f6a2ac1732f18842b5495eade1820e5d65c487798cbfc5fbd71/pywordfreq-0.4.0-cp310-cp310-macosx_10_7_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "3335e2d9a705f916e0ea67e8c7e96cc27a781e56215445766510163160e8780f",
"md5": "fc2f637a04eb634a02e2b097f231946f",
"sha256": "81eb46039fcb38e311eb43c6ca68cf14d59f97799e6c457c217b7695bb27c45a"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "fc2f637a04eb634a02e2b097f231946f",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": null,
"size": 4730327,
"upload_time": "2023-01-18T08:00:03",
"upload_time_iso_8601": "2023-01-18T08:00:03.265920Z",
"url": "https://files.pythonhosted.org/packages/33/35/e2d9a705f916e0ea67e8c7e96cc27a781e56215445766510163160e8780f/pywordfreq-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "6728ea5e701b114248ea59befcbcf6cd4f3ed8f5c37f45a999c2a6a273ade245",
"md5": "eb0707afb939c36196336d9858d67cba",
"sha256": "4dc6d5f0c35d240ecd77fbc11d371fa198af5f3f5e214cba5163fc918537685c"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp310-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "eb0707afb939c36196336d9858d67cba",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": null,
"size": 4658147,
"upload_time": "2023-01-18T08:01:45",
"upload_time_iso_8601": "2023-01-18T08:01:45.722614Z",
"url": "https://files.pythonhosted.org/packages/67/28/ea5e701b114248ea59befcbcf6cd4f3ed8f5c37f45a999c2a6a273ade245/pywordfreq-0.4.0-cp310-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "a7511fca0efd686447726e924cc581cdddbd729e2818f4608c42ead93e817725",
"md5": "ee782fdbcd7d30714a39bccc44301be3",
"sha256": "9ff338d5c49ce5ce8b79184917eec9a3f1d898c401f9dc708a3a0108b24b2016"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp311-cp311-macosx_10_7_x86_64.whl",
"has_sig": false,
"md5_digest": "ee782fdbcd7d30714a39bccc44301be3",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": null,
"size": 4729501,
"upload_time": "2023-01-18T08:00:08",
"upload_time_iso_8601": "2023-01-18T08:00:08.548592Z",
"url": "https://files.pythonhosted.org/packages/a7/51/1fca0efd686447726e924cc581cdddbd729e2818f4608c42ead93e817725/pywordfreq-0.4.0-cp311-cp311-macosx_10_7_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "a012885d8db38499e4b12da347b5c10639a51a587b62e828bf77a2fd52263b49",
"md5": "491dddc1b1d4d6df9eff680f442fe383",
"sha256": "4a9cdb82de25d7c14e62d04780d429cd1e3792e213041342a3e7ebf36991c2ae"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "491dddc1b1d4d6df9eff680f442fe383",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": null,
"size": 4730327,
"upload_time": "2023-01-18T08:00:09",
"upload_time_iso_8601": "2023-01-18T08:00:09.837703Z",
"url": "https://files.pythonhosted.org/packages/a0/12/885d8db38499e4b12da347b5c10639a51a587b62e828bf77a2fd52263b49/pywordfreq-0.4.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "871fddc8b309e706c4adaf78986585f53bfd8363d156715142f06a038104eef2",
"md5": "573ac3a24167ab2088ef75b6e176536c",
"sha256": "19479cb713fc75e367272c1ac4358cbb28706c4334fb9b1319dc96c2b7923eed"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp311-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "573ac3a24167ab2088ef75b6e176536c",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": null,
"size": 4658151,
"upload_time": "2023-01-18T08:02:36",
"upload_time_iso_8601": "2023-01-18T08:02:36.945262Z",
"url": "https://files.pythonhosted.org/packages/87/1f/ddc8b309e706c4adaf78986585f53bfd8363d156715142f06a038104eef2/pywordfreq-0.4.0-cp311-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "572415ec77541d07aa19840534d3d2a22ce42403f6b0629a3e876a62a095b769",
"md5": "1a4137e7894408ccbea53448a818e86f",
"sha256": "e7948dabb183c4e78ebbdb043b184ae69f7b9e07f1ddd5bef53794444688cc86"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp37-cp37m-macosx_10_7_x86_64.whl",
"has_sig": false,
"md5_digest": "1a4137e7894408ccbea53448a818e86f",
"packagetype": "bdist_wheel",
"python_version": "cp37",
"requires_python": null,
"size": 4729624,
"upload_time": "2023-01-18T08:00:22",
"upload_time_iso_8601": "2023-01-18T08:00:22.535985Z",
"url": "https://files.pythonhosted.org/packages/57/24/15ec77541d07aa19840534d3d2a22ce42403f6b0629a3e876a62a095b769/pywordfreq-0.4.0-cp37-cp37m-macosx_10_7_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d034b4a780f67a0a5c891b9a76c12a415fffcfb046e85ff37e32bfd3d06b838a",
"md5": "76ab8569917548e19e1cc83fad29c5f3",
"sha256": "c444df88cf0538538db44bc261295f3b70f213451a83cbbc0073c27e7684ee6d"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "76ab8569917548e19e1cc83fad29c5f3",
"packagetype": "bdist_wheel",
"python_version": "cp37",
"requires_python": null,
"size": 4730392,
"upload_time": "2023-01-18T08:00:14",
"upload_time_iso_8601": "2023-01-18T08:00:14.848870Z",
"url": "https://files.pythonhosted.org/packages/d0/34/b4a780f67a0a5c891b9a76c12a415fffcfb046e85ff37e32bfd3d06b838a/pywordfreq-0.4.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "174a018ddd3e544cf3e2c051258e0c0f68bfaabd52e0466307b727e674555df2",
"md5": "c0f87a7fa3c8cebe0a349ed6c5e614d8",
"sha256": "2c3ba0ba22812ac314185d362c0d51ffe365b28343496d137bba4dc9cd05188b"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp37-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "c0f87a7fa3c8cebe0a349ed6c5e614d8",
"packagetype": "bdist_wheel",
"python_version": "cp37",
"requires_python": null,
"size": 4657970,
"upload_time": "2023-01-18T08:01:08",
"upload_time_iso_8601": "2023-01-18T08:01:08.165385Z",
"url": "https://files.pythonhosted.org/packages/17/4a/018ddd3e544cf3e2c051258e0c0f68bfaabd52e0466307b727e674555df2/pywordfreq-0.4.0-cp37-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "e830592fb089fe4e0e5e73a86744bc1531b9658f0e9afc5359b5b6b9009df47b",
"md5": "ca6bb4fe9adea62ef0f5cc79e54e9dc8",
"sha256": "abc9ac4ac5d6b1a1623339645048628fb16636fb8ea6c50ee1226e9963bf9bd0"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp38-cp38-macosx_10_7_x86_64.whl",
"has_sig": false,
"md5_digest": "ca6bb4fe9adea62ef0f5cc79e54e9dc8",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": null,
"size": 4729547,
"upload_time": "2023-01-18T08:00:40",
"upload_time_iso_8601": "2023-01-18T08:00:40.638677Z",
"url": "https://files.pythonhosted.org/packages/e8/30/592fb089fe4e0e5e73a86744bc1531b9658f0e9afc5359b5b6b9009df47b/pywordfreq-0.4.0-cp38-cp38-macosx_10_7_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "a8555d81738ed194cacadc7706f6d934fbbd80cb51ce98b324f8c304c749f2ed",
"md5": "131a32e9585ec1a5e718d2ffa2625fa8",
"sha256": "d3a8df8de29b206de63f987f18068c1260329b29d154539bc048c7e926e4122e"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "131a32e9585ec1a5e718d2ffa2625fa8",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": null,
"size": 4730357,
"upload_time": "2023-01-18T08:00:10",
"upload_time_iso_8601": "2023-01-18T08:00:10.799633Z",
"url": "https://files.pythonhosted.org/packages/a8/55/5d81738ed194cacadc7706f6d934fbbd80cb51ce98b324f8c304c749f2ed/pywordfreq-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "0ecfea0274c1fb0420c02dee1a02fbcb830b9fe134d0b3af6a4eb0bb9107fdd2",
"md5": "d271ba092b1a453a6c17d2ab5f14c7c0",
"sha256": "5405e46cfed6fb567c1bdaf0fde3ec1979a128da7d0a3b694adc32d3a3a3d58b"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp38-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "d271ba092b1a453a6c17d2ab5f14c7c0",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": null,
"size": 4658135,
"upload_time": "2023-01-18T08:02:22",
"upload_time_iso_8601": "2023-01-18T08:02:22.351474Z",
"url": "https://files.pythonhosted.org/packages/0e/cf/ea0274c1fb0420c02dee1a02fbcb830b9fe134d0b3af6a4eb0bb9107fdd2/pywordfreq-0.4.0-cp38-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "8b35ab9ab3b795a8b7c571dcc15475d1989867ec5aeee290e9fb1d2f5c6ef483",
"md5": "607ed556441f44873a2c777a185cc2e3",
"sha256": "7a4d15f0a5c79b05a4f7bc222b4859221a7c000e5dbd361a5f73e8f2fc0ee225"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp39-cp39-macosx_10_7_x86_64.whl",
"has_sig": false,
"md5_digest": "607ed556441f44873a2c777a185cc2e3",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": null,
"size": 4729505,
"upload_time": "2023-01-18T08:01:57",
"upload_time_iso_8601": "2023-01-18T08:01:57.959502Z",
"url": "https://files.pythonhosted.org/packages/8b/35/ab9ab3b795a8b7c571dcc15475d1989867ec5aeee290e9fb1d2f5c6ef483/pywordfreq-0.4.0-cp39-cp39-macosx_10_7_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "abf3608f690de43921bf04807ddc4d2fbe9158f4b04287bcd62be369c39c1a6b",
"md5": "cb24a96420fe1fb5ba5758012bd6aa27",
"sha256": "662969b17335d72add0612c9f5b662b29a3717ebe0cf09f6f99c550533d75622"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "cb24a96420fe1fb5ba5758012bd6aa27",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": null,
"size": 4730327,
"upload_time": "2023-01-18T08:00:11",
"upload_time_iso_8601": "2023-01-18T08:00:11.886273Z",
"url": "https://files.pythonhosted.org/packages/ab/f3/608f690de43921bf04807ddc4d2fbe9158f4b04287bcd62be369c39c1a6b/pywordfreq-0.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "773d720ebd6e76db869a745ba1ab3bb5fb2431461d5e64a2003f34bea95718b5",
"md5": "a5fc4c7e5ebfe20e9811044f6ce937f7",
"sha256": "c8442db7cede3381d256f216f5db2119d18f29732d8d83a15881a812feb8b208"
},
"downloads": -1,
"filename": "pywordfreq-0.4.0-cp39-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "a5fc4c7e5ebfe20e9811044f6ce937f7",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": null,
"size": 4658148,
"upload_time": "2023-01-18T08:01:22",
"upload_time_iso_8601": "2023-01-18T08:01:22.692313Z",
"url": "https://files.pythonhosted.org/packages/77/3d/720ebd6e76db869a745ba1ab3bb5fb2431461d5e64a2003f34bea95718b5/pywordfreq-0.4.0-cp39-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-01-18 08:00:06",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "intsights",
"github_project": "pywordfreq",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "pywordfreq"
}