fastzy


- Name: fastzy
- Version: 0.5.0
- Home page: https://github.com/intsights/fastzy
- Summary: Python library for fast fuzzy search over a big file written in Rust
- Upload time: 2023-01-09 12:19:41
- Author: Gal Ben David <gal@intsights.com>
- License: MIT
- Keywords: fuzzy, levenshtein, rust
            <p align="center">
    <a href="https://github.com/Intsights/fastzy">
        <img src="https://raw.githubusercontent.com/Intsights/fastzy/master/images/logo.png" alt="Logo">
    </a>
    <h3 align="center">
        Python library for fast fuzzy search over a big file written in Rust
    </h3>
</p>

![license](https://img.shields.io/badge/MIT-License-blue)
![Python](https://img.shields.io/badge/Python-3.7%20%7C%203.8%20%7C%203.9%20%7C%203.10%20%7C%203.11-blue)
![Build](https://github.com/Intsights/fastzy/workflows/Build/badge.svg)
[![PyPi](https://img.shields.io/pypi/v/fastzy.svg)](https://pypi.org/project/fastzy/)

## Table of Contents

- [Table of Contents](#table-of-contents)
- [About The Project](#about-the-project)
  - [Built With](#built-with)
  - [Performance](#performance)
  - [Installation](#installation)
- [Usage](#usage)
- [License](#license)
- [Contact](#contact)


## About The Project

Fastzy is a library written in Rust that searches a file for lines within a given Levenshtein distance of a pattern. To measure the Levenshtein distance, the library uses the mbleven algorithm. When the requested distance exceeds 3, where mbleven becomes slower, Wagner-Fischer is used instead. The library loads the whole file into memory and builds a lightweight index keyed on line length. As a result, only lines whose length could allow a match are compared, rather than every line in the file.
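
The length-based index can be illustrated with a minimal pure-Python sketch. This is not fastzy's implementation: the class `LengthIndexedSearcher` and the function `levenshtein` are hypothetical names chosen for illustration, and the sketch always uses a plain Wagner-Fischer distance rather than switching between mbleven and Wagner-Fischer. It relies on the fact that the Levenshtein distance between two strings is at least the difference of their lengths, so lines whose length differs from the pattern's by more than `max_distance` can be skipped without being compared.

```python
from collections import defaultdict


def levenshtein(a: str, b: str) -> int:
    """Wagner-Fischer dynamic programme, keeping only two rows in memory."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


class LengthIndexedSearcher:
    """Hypothetical stand-in for illustration; not the fastzy API."""

    def __init__(self, lines):
        # Group lines by their length so whole groups can be skipped later.
        self.by_length = defaultdict(list)
        for line in lines:
            self.by_length[len(line)].append(line)

    def search(self, pattern, max_distance):
        matches = []
        low = max(0, len(pattern) - max_distance)
        high = len(pattern) + max_distance
        # Only lengths in [low, high] can be within max_distance.
        for length in range(low, high + 1):
            for candidate in self.by_length.get(length, []):
                if levenshtein(pattern, candidate) <= max_distance:
                    matches.append(candidate)
        return matches


searcher = LengthIndexedSearcher(['test', 'texts', 'next', 'context', 'a'])
print(searcher.search('text', 1))
```

In this sketch, `'context'` and `'a'` are never compared against the pattern because their lengths fall outside the range 3 to 5.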


### Built With

* [mbleven](https://github.com/fujimotos/mbleven)
* [Pyo3](https://github.com/PyO3/pyo3)


### Performance

| Library | Function | Time |
| ------------- | ------------- | ------------- |
| [polyleven](https://github.com/fujimotos/polyleven) | polyleven.levenshtein('text') | 8.48s |
| [fastzy](https://github.com/Intsights/fastzy) | fastzy.search('text') | 0.003s |


### Installation

```sh
pip3 install fastzy
```


## Usage

```python
import fastzy

# open a file and index it in memory
searcher = fastzy.Searcher(
    file_path='input_text_file.txt',
    separator='',
)

# search for lines within a maximum distance of 1 from the input text 'text'
results = searcher.search(
    pattern='text',
    max_distance=1,
)
print(results)
# example output (depends on the file contents):
# ['test', 'texts', 'next']
```


## License

Distributed under the MIT License. See `LICENSE` for more information.


## Contact

Gal Ben David - gal@intsights.com

Project Link: [https://github.com/Intsights/fastzy](https://github.com/Intsights/fastzy)


            
