.. image:: https://github.com/seatgeek/thefuzz/actions/workflows/ci.yml/badge.svg
    :target: https://github.com/seatgeek/thefuzz

TheFuzz
=======

Fuzzy string matching like a boss. It uses `Levenshtein Distance <https://en.wikipedia.org/wiki/Levenshtein_distance>`_ to calculate the differences between sequences in a simple-to-use package.

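For intuition, the edit distance mentioned above can be computed with a textbook dynamic-programming routine. The ``levenshtein`` function below is a hypothetical pure-Python sketch for illustration only; it is not part of this package, which relies on ``rapidfuzz`` for the actual scoring.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character insertions,
    deletions, and substitutions needed to turn ``a`` into ``b``."""
    # Keep the inner row as short as possible.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("this is a test", "this is a test!"))
```

A single trailing ``!`` costs one insertion, which is why the two test strings used throughout this document are considered very close.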
Requirements
============

- Python 3.8 or higher
- `rapidfuzz <https://github.com/maxbachmann/RapidFuzz/>`_

For testing
~~~~~~~~~~~

- pycodestyle
- hypothesis
- pytest

Installation
============

Using pip via PyPI

.. code:: bash

    pip install thefuzz

Using pip via GitHub (the unauthenticated ``git://`` protocol is no longer served by GitHub, so use HTTPS)

.. code:: bash

    pip install git+https://github.com/seatgeek/thefuzz.git@0.19.0#egg=thefuzz

Adding to your ``requirements.txt`` file (run ``pip install -r requirements.txt`` afterwards)

.. code:: text

    git+ssh://git@github.com/seatgeek/thefuzz.git@0.19.0#egg=thefuzz

Manually via Git

.. code:: bash

    git clone https://github.com/seatgeek/thefuzz.git thefuzz
    cd thefuzz
    pip install .

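After installing by any of the routes above, one way to confirm which version ended up in the environment is to ask the standard library's ``importlib.metadata``. The helper name ``installed_version`` is an assumption made for this sketch, not something the package provides.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of ``package``, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if thefuzz is installed, otherwise None.
print(installed_version("thefuzz"))
```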
Usage
=====

.. code:: python

    >>> from thefuzz import fuzz
    >>> from thefuzz import process

Simple Ratio
~~~~~~~~~~~~

.. code:: python

    >>> fuzz.ratio("this is a test", "this is a test!")
    97

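To see where a score like 97 can come from, here is a rough standard-library approximation. It uses ``difflib.SequenceMatcher``, which scores ``2 * matches / total_length``; this is an assumption chosen for illustration, and its results may differ from ``fuzz.ratio`` on other inputs. The name ``simple_ratio_sketch`` is hypothetical.

```python
from difflib import SequenceMatcher


def simple_ratio_sketch(a: str, b: str) -> int:
    """Approximate a 0-100 similarity score using only the standard library."""
    # ratio() returns 2 * M / T, where M is the number of matched characters
    # and T is the combined length of both strings.
    return round(100 * SequenceMatcher(None, a, b).ratio())


# 14 matching characters out of 14 + 15 total: 2 * 14 / 29 is about 0.966.
print(simple_ratio_sketch("this is a test", "this is a test!"))
```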
Partial Ratio
~~~~~~~~~~~~~

.. code:: python

    >>> fuzz.partial_ratio("this is a test", "this is a test!")
    100

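The idea behind a partial score can be sketched as sliding the shorter string across the longer one and keeping the best window. The function below is a simplified illustration under that assumption, built on ``difflib``; the library's own alignment strategy may differ, and ``partial_ratio_sketch`` is a hypothetical name.

```python
from difflib import SequenceMatcher


def partial_ratio_sketch(a: str, b: str) -> int:
    """Best 0-100 score of the shorter string against any equally long
    window of the longer string."""
    shorter, longer = sorted((a, b), key=len)
    if not shorter:
        # Two empty strings match trivially; an empty string matches nothing else.
        return 100 if not longer else 0

    width = len(shorter)
    best = 0
    for start in range(len(longer) - width + 1):
        window = longer[start:start + width]
        score = round(100 * SequenceMatcher(None, shorter, window).ratio())
        best = max(best, score)
    return best


# The shorter string appears verbatim at the start of the longer one.
print(partial_ratio_sketch("this is a test", "this is a test!"))
```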
Token Sort Ratio
~~~~~~~~~~~~~~~~

.. code:: python

    >>> fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    91
    >>> fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    100

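Word order stops mattering once both strings are split into tokens, sorted, and rejoined before comparison. The sketch below shows that step with a ``difflib``-based score; it is an illustration rather than the library's code, it skips any extra preprocessing the library may apply, and ``token_sort_ratio_sketch`` is a hypothetical name.

```python
from difflib import SequenceMatcher


def sort_tokens(text: str) -> str:
    """Lowercase, split on whitespace, sort the tokens, and rejoin them."""
    return " ".join(sorted(text.lower().split()))


def token_sort_ratio_sketch(a: str, b: str) -> int:
    """Compare two strings after putting their tokens in a canonical order."""
    left, right = sort_tokens(a), sort_tokens(b)
    return round(100 * SequenceMatcher(None, left, right).ratio())


# Both strings normalize to "a bear fuzzy was wuzzy".
print(sort_tokens("fuzzy wuzzy was a bear"))
print(token_sort_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
```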
Token Set Ratio
~~~~~~~~~~~~~~~

.. code:: python

    >>> fuzz.token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    84
    >>> fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    100

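A set-based comparison ignores repeated tokens, which is why the duplicated "fuzzy" no longer lowers the score. One common way to do this, sketched here as an assumption rather than the library's exact method, is to compare the shared tokens against each string's shared-plus-unique tokens and keep the best result. ``token_set_ratio_sketch`` is a hypothetical name.

```python
from difflib import SequenceMatcher


def _ratio(a: str, b: str) -> int:
    """0-100 similarity using the standard library only."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def token_set_ratio_sketch(a: str, b: str) -> int:
    """Compare two strings by their sets of tokens, ignoring duplicates."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())

    common = " ".join(sorted(tokens_a & tokens_b))
    only_a = " ".join(sorted(tokens_a - tokens_b))
    only_b = " ".join(sorted(tokens_b - tokens_a))

    # Shared tokens first, then whatever is unique to each side.
    combined_a = f"{common} {only_a}".strip()
    combined_b = f"{common} {only_b}".strip()

    return max(
        _ratio(common, combined_a),
        _ratio(common, combined_b),
        _ratio(combined_a, combined_b),
    )


# Both strings reduce to the same token set: {"a", "bear", "fuzzy", "was"}.
print(token_set_ratio_sketch("fuzzy was a bear", "fuzzy fuzzy was a bear"))
```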
Partial Token Sort Ratio
~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: python

    >>> fuzz.token_sort_ratio("fuzzy was a bear", "wuzzy fuzzy was a bear")
    84
    >>> fuzz.partial_token_sort_ratio("fuzzy was a bear", "wuzzy fuzzy was a bear")
    100

Process
~~~~~~~

.. code:: python

    >>> choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
    >>> process.extract("new york jets", choices, limit=2)
    [('New York Jets', 100), ('New York Giants', 78)]
    >>> process.extractOne("cowboys", choices)
    ('Dallas Cowboys', 90)

You can also pass additional parameters to the ``extractOne`` method to make it use a specific scorer. A typical use case is matching file paths (``songs`` below is a list of path strings):

.. code:: python

    >>> process.extractOne("System of a down - Hypnotize - Heroin", songs)
    ('/music/library/good/System of a Down/2005 - Hypnotize/01 - Attack.mp3', 86)
    >>> process.extractOne("System of a down - Hypnotize - Heroin", songs, scorer=fuzz.token_sort_ratio)
    ("/music/library/good/System of a Down/2005 - Hypnotize/10 - She's Like Heroin.mp3", 61)

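Conceptually, extraction scores every choice with a scorer function and returns the highest-ranked pairs, so swapping the scorer changes the ranking. The sketch below illustrates that idea with the standard library only; ``extract_sketch``, ``extract_one_sketch``, and ``default_scorer`` are hypothetical names, and the scores will not match those of the library's default scorer.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Optional, Tuple

Scorer = Callable[[str, str], int]


def default_scorer(query: str, choice: str) -> int:
    """Case-insensitive 0-100 similarity (a stand-in, not the library's scorer)."""
    return round(100 * SequenceMatcher(None, query.lower(), choice.lower()).ratio())


def extract_sketch(query: str, choices: List[str],
                   scorer: Scorer = default_scorer,
                   limit: int = 5) -> List[Tuple[str, int]]:
    """Score every choice and return the top ``limit`` (choice, score) pairs."""
    scored = [(choice, scorer(query, choice)) for choice in choices]
    # sorted() is stable, so ties keep their original order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


def extract_one_sketch(query: str, choices: List[str],
                       scorer: Scorer = default_scorer,
                       score_cutoff: int = 0) -> Optional[Tuple[str, int]]:
    """Return the single best match, or None if it scores below the cutoff."""
    best = extract_sketch(query, choices, scorer=scorer, limit=1)
    if best and best[0][1] >= score_cutoff:
        return best[0]
    return None


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
print(extract_sketch("new york jets", choices, limit=2))
print(extract_one_sketch("zzz", choices, score_cutoff=90))
```

Passing a different callable as ``scorer`` mirrors the ``scorer=fuzz.token_sort_ratio`` argument shown in the example above.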
Raw data
{
"_id": null,
"home_page": "https://github.com/seatgeek/thefuzz",
"name": "thefuzz",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "",
"keywords": "",
"author": "Adam Cohen",
"author_email": "adam@seatgeek.com",
"download_url": "https://files.pythonhosted.org/packages/81/4b/d3eb25831590d6d7d38c2f2e3561d3ba41d490dc89cd91d9e65e7c812508/thefuzz-0.22.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Fuzzy string matching in python",
"version": "0.22.1",
"project_urls": {
"Homepage": "https://github.com/seatgeek/thefuzz"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "824f1695e70ceb3604f19eda9908e289c687ea81c4fecef4d90a9d1d0f2f7ae9",
"md5": "f04e6215c71dde3e79d55b3911acd351",
"sha256": "59729b33556850b90e1093c4cf9e618af6f2e4c985df193fdf3c5b5cf02ca481"
},
"downloads": -1,
"filename": "thefuzz-0.22.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f04e6215c71dde3e79d55b3911acd351",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 8245,
"upload_time": "2024-01-19T19:18:20",
"upload_time_iso_8601": "2024-01-19T19:18:20.362755Z",
"url": "https://files.pythonhosted.org/packages/82/4f/1695e70ceb3604f19eda9908e289c687ea81c4fecef4d90a9d1d0f2f7ae9/thefuzz-0.22.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "814bd3eb25831590d6d7d38c2f2e3561d3ba41d490dc89cd91d9e65e7c812508",
"md5": "0b7ec0d80b46c90d113df62892d78395",
"sha256": "7138039a7ecf540da323792d8592ef9902b1d79eb78c147d4f20664de79f3680"
},
"downloads": -1,
"filename": "thefuzz-0.22.1.tar.gz",
"has_sig": false,
"md5_digest": "0b7ec0d80b46c90d113df62892d78395",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 19993,
"upload_time": "2024-01-19T19:18:23",
"upload_time_iso_8601": "2024-01-19T19:18:23.135879Z",
"url": "https://files.pythonhosted.org/packages/81/4b/d3eb25831590d6d7d38c2f2e3561d3ba41d490dc89cd91d9e65e7c812508/thefuzz-0.22.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-19 19:18:23",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "seatgeek",
"github_project": "thefuzz",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "rapidfuzz",
"specs": [
[
"==",
"3.4.0"
]
]
},
{
"name": "pycodestyle",
"specs": [
[
"==",
"2.11.1"
]
]
},
{
"name": "hypothesis",
"specs": [
[
"==",
"6.88.1"
]
]
},
{
"name": "pytest",
"specs": [
[
"==",
"7.4.3"
]
]
},
{
"name": "docutils",
"specs": [
[
"==",
"0.20.1"
]
]
},
{
"name": "Pygments",
"specs": [
[
"==",
"2.16.1"
]
]
},
{
"name": "wheel",
"specs": [
[
"==",
"0.41.3"
]
]
},
{
"name": "setuptools",
"specs": [
[
"==",
"68.2.2"
]
]
},
{
"name": "gitchangelog",
"specs": [
[
"==",
"3.0.4"
]
]
},
{
"name": "restructuredtext_lint",
"specs": [
[
"==",
"1.4.0"
]
]
}
],
"tox": true,
"lcname": "thefuzz"
}