:Name: fuzzywuzzy
:Version: 0.18.0
:Summary: Fuzzy string matching in Python
:Home page: https://github.com/seatgeek/fuzzywuzzy
:Author: Adam Cohen
:License: GPLv2
:Upload time: 2020-02-13 21:06:27

.. image:: https://travis-ci.org/seatgeek/fuzzywuzzy.svg?branch=master
    :target: https://travis-ci.org/seatgeek/fuzzywuzzy

FuzzyWuzzy
==========

Fuzzy string matching like a boss. It uses `Levenshtein Distance <https://en.wikipedia.org/wiki/Levenshtein_distance>`_ to calculate the differences between sequences in a simple-to-use package.

Requirements
============

-  Python 2.7 or higher
-  difflib
-  `python-Levenshtein <https://github.com/ztane/python-Levenshtein/>`_ (optional, provides a 4-10x speedup in string
   matching, though it may produce `differing results for certain cases <https://github.com/seatgeek/fuzzywuzzy/issues/128>`_)

For testing
~~~~~~~~~~~
-  pycodestyle
-  hypothesis
-  pytest

Installation
============

Using pip via PyPI

.. code:: bash

    pip install fuzzywuzzy

or the following to install ``python-Levenshtein`` as well

.. code:: bash

    pip install fuzzywuzzy[speedup]
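
To confirm whether the optional speedup is present in the current environment, a small check along these lines may help. It is a sketch that assumes the ``python-Levenshtein`` package exposes a top-level module named ``Levenshtein``; the helper name ``has_levenshtein_speedup`` is made up for illustration.

```python
import importlib.util


def has_levenshtein_speedup() -> bool:
    """Return True if a module named ``Levenshtein`` can be found.

    Assumption: python-Levenshtein installs a top-level ``Levenshtein`` module.
    """
    return importlib.util.find_spec("Levenshtein") is not None


print("speedup available:", has_levenshtein_speedup())
```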


Using pip via GitHub

.. code:: bash

    pip install git+https://github.com/seatgeek/fuzzywuzzy.git@0.18.0#egg=fuzzywuzzy

Adding to your ``requirements.txt`` file (run ``pip install -r requirements.txt`` afterwards)

.. code:: bash

    git+ssh://git@github.com/seatgeek/fuzzywuzzy.git@0.18.0#egg=fuzzywuzzy

Manually via Git

.. code:: bash

    git clone https://github.com/seatgeek/fuzzywuzzy.git fuzzywuzzy
    cd fuzzywuzzy
    python setup.py install


Usage
=====

.. code:: python

    >>> from fuzzywuzzy import fuzz
    >>> from fuzzywuzzy import process

Simple Ratio
~~~~~~~~~~~~

.. code:: python

    >>> fuzz.ratio("this is a test", "this is a test!")
    97
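
Because ``difflib`` is listed as a requirement, the idea behind the simple ratio can be approximated with the standard library alone. The following is a minimal sketch, not the library's actual code; the function name ``simple_ratio`` is hypothetical.

```python
from difflib import SequenceMatcher


def simple_ratio(s1: str, s2: str) -> int:
    """Similarity of two strings on a 0-100 scale.

    SequenceMatcher.ratio() returns 2*M/T, where M is the number of
    matching characters and T is the combined length of both strings.
    """
    matcher = SequenceMatcher(None, s1, s2)
    return int(round(100 * matcher.ratio()))


# 14 matching characters out of 14 + 15 = 29 total: 2 * 14 / 29 = 0.9655...
print(simple_ratio("this is a test", "this is a test!"))  # 97
```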

Partial Ratio
~~~~~~~~~~~~~

.. code:: python

    >>> fuzz.partial_ratio("this is a test", "this is a test!")
    100
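
One way to picture the partial ratio is as the best score of the shorter string against any same-length slice of the longer one. The sketch below uses a brute-force sliding window over ``difflib``; this is an assumption made for illustration, and the library may select candidate windows differently. ``partial_ratio_sketch`` is a hypothetical name.

```python
from difflib import SequenceMatcher


def partial_ratio_sketch(s1: str, s2: str) -> int:
    """Best 0-100 score of the shorter string against any window of the longer."""
    shorter, longer = (s1, s2) if len(s1) <= len(s2) else (s2, s1)
    if not shorter:
        return 0  # nothing to compare against
    window = len(shorter)
    best = 0.0
    # Try every slice of the longer string that has the shorter string's length.
    for start in range(len(longer) - window + 1):
        candidate = longer[start:start + window]
        best = max(best, SequenceMatcher(None, shorter, candidate).ratio())
    return int(round(100 * best))


print(partial_ratio_sketch("this is a test", "this is a test!"))  # 100
```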

Token Sort Ratio
~~~~~~~~~~~~~~~~

.. code:: python

    >>> fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    91
    >>> fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    100
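
The token sort ratio ignores word order. A rough way to reproduce the effect is to split each string into tokens, sort them, rejoin them, and then compare the results. The sketch below assumes simple whitespace tokenization and lowercasing; the library's own preprocessing may do more (for example, handling punctuation). ``token_sort_ratio_sketch`` is a hypothetical name.

```python
from difflib import SequenceMatcher


def token_sort_ratio_sketch(s1: str, s2: str) -> int:
    """Compare two strings after sorting their whitespace-separated tokens."""

    def normalize(text: str) -> str:
        # Lowercase, split on whitespace, sort the tokens, and rejoin.
        return " ".join(sorted(text.lower().split()))

    a, b = normalize(s1), normalize(s2)
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


# Both inputs normalize to "a bear fuzzy was wuzzy".
print(token_sort_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))  # 100
```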

Token Set Ratio
~~~~~~~~~~~~~~~

.. code:: python

    >>> fuzz.token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    84
    >>> fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    100
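
The token set ratio additionally ignores repeated words. One common formulation, sketched here under the assumption of whitespace tokenization, builds three strings (the sorted shared tokens, and the shared tokens followed by each side's leftover tokens) and takes the best pairwise score. This is an illustration, not the library's exact code; the names are hypothetical.

```python
from difflib import SequenceMatcher


def _ratio(a: str, b: str) -> int:
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def token_set_ratio_sketch(s1: str, s2: str) -> int:
    """Compare token sets, so duplicates and word order do not matter."""
    tokens1 = set(s1.lower().split())
    tokens2 = set(s2.lower().split())
    common = " ".join(sorted(tokens1 & tokens2))
    only1 = " ".join(sorted(tokens1 - tokens2))
    only2 = " ".join(sorted(tokens2 - tokens1))
    # Shared tokens followed by the tokens unique to each side.
    combined1 = (common + " " + only1).strip()
    combined2 = (common + " " + only2).strip()
    return max(
        _ratio(common, combined1),
        _ratio(common, combined2),
        _ratio(combined1, combined2),
    )


# The duplicate "fuzzy" collapses, so both sides share the same token set.
print(token_set_ratio_sketch("fuzzy was a bear", "fuzzy fuzzy was a bear"))  # 100
```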

Process
~~~~~~~

.. code:: python

    >>> choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
    >>> process.extract("new york jets", choices, limit=2)
    [('New York Jets', 100), ('New York Giants', 78)]
    >>> process.extractOne("cowboys", choices)
    ('Dallas Cowboys', 90)
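
Conceptually, extraction scores every choice against the query and returns the highest-scoring ones. The sketch below shows that idea with a plain case-insensitive ``difflib`` ratio as the scorer. The library's default scorer is likely more elaborate, so scores for non-exact matches may differ from those shown above. ``extract_sketch`` and its ``limit`` default are assumptions for illustration.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    """Case-insensitive 0-100 similarity based on difflib."""
    return int(round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()))


def extract_sketch(query, choices, limit=5):
    """Return up to ``limit`` (choice, score) pairs, best score first."""
    scored = [(choice, simple_ratio(query, choice)) for choice in choices]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
top = extract_sketch("new york jets", choices, limit=2)
print(top[0])  # ('New York Jets', 100)
```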

You can also pass additional parameters to the ``extractOne`` method, for example to make it use a specific scorer. A typical use case is matching file paths:

.. code:: python

    >>> # songs is assumed to be a list of file paths defined elsewhere
    >>> process.extractOne("System of a down - Hypnotize - Heroin", songs)
    ('/music/library/good/System of a Down/2005 - Hypnotize/01 - Attack.mp3', 86)
    >>> process.extractOne("System of a down - Hypnotize - Heroin", songs, scorer=fuzz.token_sort_ratio)
    ("/music/library/good/System of a Down/2005 - Hypnotize/10 - She's Like Heroin.mp3", 61)
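
To see why the choice of scorer matters, consider a standard-library sketch in which the scorer is just a callable passed to the lookup function. The function names and sample strings below are hypothetical and only illustrate the idea that different scorers can select different best matches.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def token_sort_ratio_sketch(a: str, b: str) -> int:
    # Sort whitespace-separated tokens before comparing, so word order is ignored.
    def normalize(text: str) -> str:
        return " ".join(sorted(text.lower().split()))

    return simple_ratio(normalize(a), normalize(b))


def extract_one_sketch(query, choices, scorer=simple_ratio):
    """Return the (choice, score) pair with the highest score."""
    return max(((c, scorer(query, c)) for c in choices), key=lambda pair: pair[1])


choices = ["bear a was fuzzy", "fuzzy was a cat"]
# Character-level comparison favors the choice that shares a long prefix.
print(extract_one_sketch("fuzzy was a bear", choices)[0])  # fuzzy was a cat
# Token sorting favors the choice made of the same words in another order.
print(extract_one_sketch("fuzzy was a bear", choices, scorer=token_sort_ratio_sketch)[0])  # bear a was fuzzy
```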

.. |Build Status| image:: https://api.travis-ci.org/seatgeek/fuzzywuzzy.png?branch=master
   :target: https://travis-ci.org/seatgeek/fuzzywuzzy

Known Ports
============

FuzzyWuzzy is being ported to other languages too! Here are a few ports we know about:

-  Java: `xpresso's fuzzywuzzy implementation <https://github.com/WantedTechnologies/xpresso/wiki/Approximate-string-comparison-and-pattern-matching-in-Java>`_
-  Java: `fuzzywuzzy (Java port) <https://github.com/xdrop/fuzzywuzzy>`_
-  Rust: `fuzzyrusty (Rust port) <https://github.com/logannc/fuzzyrusty>`_
-  JavaScript: `fuzzball.js (JavaScript port) <https://github.com/nol13/fuzzball.js>`_
-  C++: `Tmplt/fuzzywuzzy <https://github.com/Tmplt/fuzzywuzzy>`_
-  C#: `fuzzysharp (.NET port) <https://github.com/BoomTownRoi/BoomTown.FuzzySharp>`_
-  Go: `go-fuzzywuzzy (Go port) <https://github.com/paul-mannino/go-fuzzywuzzy>`_
-  Free Pascal: `FuzzyWuzzy.pas (Free Pascal port) <https://github.com/DavidMoraisFerreira/FuzzyWuzzy.pas>`_
-  Kotlin multiplatform: `FuzzyWuzzy-Kotlin <https://github.com/willowtreeapps/fuzzywuzzy-kotlin>`_
-  R: `fuzzywuzzyR (R port) <https://github.com/mlampros/fuzzywuzzyR>`_
