# datasette-jellyfish
[![PyPI](https://img.shields.io/pypi/v/datasette-jellyfish.svg)](https://pypi.org/project/datasette-jellyfish/)
[![Changelog](https://img.shields.io/github/v/release/simonw/datasette-jellyfish?include_prereleases&label=changelog)](https://github.com/simonw/datasette-jellyfish/releases)
[![Tests](https://github.com/simonw/datasette-jellyfish/workflows/Test/badge.svg)](https://github.com/simonw/datasette-jellyfish/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/datasette-jellyfish/blob/main/LICENSE)

Datasette plugin that adds custom SQL functions for fuzzy string matching, built on top of the [Jellyfish](https://github.com/jamesturk/jellyfish) Python library by James Turk and Michael Stephens.

Interactive demos:

* [soundex, metaphone, nysiis, match_rating_codex comparison](https://latest-with-plugins.datasette.io/fixtures?sql=SELECT%0D%0A++++soundex%28%3As%29%2C+%0D%0A++++metaphone%28%3As%29%2C+%0D%0A++++nysiis%28%3As%29%2C+%0D%0A++++match_rating_codex%28%3As%29&s=demo).
* [distance functions comparison](https://latest-with-plugins.datasette.io/fixtures?sql=SELECT%0D%0A++++levenshtein_distance%28%3As1%2C+%3As2%29%2C%0D%0A++++damerau_levenshtein_distance%28%3As1%2C+%3As2%29%2C%0D%0A++++hamming_distance%28%3As1%2C+%3As2%29%2C%0D%0A++++jaro_similarity%28%3As1%2C+%3As2%29%2C%0D%0A++++jaro_winkler_similarity%28%3As1%2C+%3As2%29%2C%0D%0A++++match_rating_comparison%28%3As1%2C+%3As2%29%3B&s1=barrack+obama&s2=barrack+h+obama)
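
The first demo compares four phonetic encodings. As a rough illustration of what such an encoding does, here is a minimal pure-Python sketch of American Soundex. It is not the plugin's or Jellyfish's code, and the names `CODES` and `soundex_sketch` are hypothetical; results may differ from Jellyfish on unusual input such as non-ASCII text.

```python
# Consonant groups used by American Soundex. Vowels, H, W and Y have no code.
CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for letter in letters:
        CODES[letter] = digit


def soundex_sketch(text):
    """Return a four-character Soundex code (illustrative only)."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            # H and W do not separate two letters that share a code.
            continue
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        # A vowel resets `previous`, so a repeated code after it counts again.
        previous = code
        if len(result) == 4:
            break
    # Pad short codes with zeros.
    return result.ljust(4, "0")


print(soundex_sketch("hello"))  # H400
```

Names that sound alike map to the same code under this scheme, for example `soundex_sketch("Robert")` and `soundex_sketch("Rupert")` both give `R163`.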

Examples:

    SELECT soundex("hello");
    -- Outputs H400
    SELECT metaphone("hello");
    -- Outputs HL
    SELECT nysiis("hello");
    -- Outputs HAL
    SELECT match_rating_codex("hello");
    -- Outputs HLL
    SELECT levenshtein_distance("hello", "hello world");
    -- Outputs 6
    SELECT damerau_levenshtein_distance("hello", "hello world");
    -- Outputs 6
    SELECT hamming_distance("hello", "hello world");
    -- Outputs 6
    SELECT jaro_similarity("hello", "hello world");
    -- Outputs 0.8181818181818182
    SELECT jaro_winkler_similarity("hello", "hello world");
    -- Outputs 0.890909090909091
    SELECT match_rating_comparison("hello", "helloo");
    -- Outputs 1

See [the Jellyfish documentation](https://jellyfish.readthedocs.io/en/latest/) for an explanation of each of these functions.
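
To make the idea of a custom SQL function concrete, the sketch below registers a pure-Python Levenshtein distance on a plain `sqlite3` connection and calls it from SQL. This is an illustration under assumptions, not the plugin's source: the plugin delegates to Jellyfish, and Datasette plugins usually perform this kind of registration from the `prepare_connection` plugin hook. The helper `register_functions` and the stand-in `levenshtein_distance` implementation are hypothetical.

```python
import sqlite3


def levenshtein_distance(s1, s2):
    """Classic dynamic-programming edit distance (stand-in for Jellyfish)."""
    # previous[j] is the distance between the processed prefix of s1 and s2[:j].
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(
                min(
                    previous[j] + 1,         # delete c1
                    current[j - 1] + 1,      # insert c2
                    previous[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


def register_functions(conn):
    # Expose the Python function to SQL under the same name, taking 2 arguments.
    conn.create_function("levenshtein_distance", 2, levenshtein_distance)


conn = sqlite3.connect(":memory:")
register_functions(conn)
result = conn.execute(
    "SELECT levenshtein_distance(?, ?)", ("hello", "hello world")
).fetchone()[0]
print(result)  # 6
```

The result is 6 because turning `hello` into `hello world` takes six insertions (a space plus the five letters of `world`), which matches the `levenshtein_distance` example above.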
Raw data
{
"_id": null,
"home_page": "https://datasette.io/plugins/datasette-jellyfish",
"name": "datasette-jellyfish",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "Simon Willison",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/62/cd/473c4f1ac7e3406afac8fafbfb3ec68f08a3305c3ce4780783225f9654e3/datasette-jellyfish-2.0.tar.gz",
"platform": null,
"description": "# datasette-jellyfish\n\n[![PyPI](https://img.shields.io/pypi/v/datasette-jellyfish.svg)](https://pypi.org/project/datasette-jellyfish/)\n[![Changelog](https://img.shields.io/github/v/release/simonw/datasette-jellyfish?include_prereleases&label=changelog)](https://github.com/simonw/datasette-jellyfish/releases)\n[![Tests](https://github.com/simonw/datasette-jellyfish/workflows/Test/badge.svg)](https://github.com/simonw/datasette-jellyfish/actions?query=workflow%3ATest)\n[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/datasette-jellyfish/blob/main/LICENSE)\n\nDatasette plugin that adds custom SQL functions for fuzzy string matching, built on top of the [Jellyfish](https://github.com/jamesturk/jellyfish) Python library by James Turk and Michael Stephens.\n\nInteractive demos:\n\n* [soundex, metaphone, nysiis, match_rating_codex comparison](https://latest-with-plugins.datasette.io/fixtures?sql=SELECT%0D%0A++++soundex%28%3As%29%2C+%0D%0A++++metaphone%28%3As%29%2C+%0D%0A++++nysiis%28%3As%29%2C+%0D%0A++++match_rating_codex%28%3As%29&s=demo).\n* [distance functions comparison](https://latest-with-plugins.datasette.io/fixtures?sql=SELECT%0D%0A++++levenshtein_distance%28%3As1%2C+%3As2%29%2C%0D%0A++++damerau_levenshtein_distance%28%3As1%2C+%3As2%29%2C%0D%0A++++hamming_distance%28%3As1%2C+%3As2%29%2C%0D%0A++++jaro_similarity%28%3As1%2C+%3As2%29%2C%0D%0A++++jaro_winkler_similarity%28%3As1%2C+%3As2%29%2C%0D%0A++++match_rating_comparison%28%3As1%2C+%3As2%29%3B&s1=barrack+obama&s2=barrack+h+obama)\n\nExamples:\n\n SELECT soundex(\"hello\");\n -- Outputs H400\n SELECT metaphone(\"hello\");\n -- Outputs HL\n SELECT nysiis(\"hello\");\n -- Outputs HAL\n SELECT match_rating_codex(\"hello\");\n -- Outputs HLL\n SELECT levenshtein_distance(\"hello\", \"hello world\");\n -- Outputs 6\n SELECT damerau_levenshtein_distance(\"hello\", \"hello world\");\n -- Outputs 6\n SELECT hamming_distance(\"hello\", \"hello world\");\n 
-- Outputs 6\n SELECT jaro_similarity(\"hello\", \"hello world\");\n -- Outputs 0.8181818181818182\n SELECT jaro_winkler_similarity(\"hello\", \"hello world\");\n -- Outputs 0.890909090909091\n SELECT match_rating_comparison(\"hello\", \"helloo\");\n -- Outputs 1\n\nSee [the Jellyfish documentation](https://jellyfish.readthedocs.io/en/latest/) for an explanation of each of these functions.\n",
"bugtrack_url": null,
"license": "Apache License, Version 2.0",
"summary": "Datasette plugin adding SQL functions for fuzzy text matching powered by Jellyfish",
"version": "2.0",
"project_urls": {
"CI": "https://github.com/simonw/datasette-jellyfish/actions",
"Changelog": "https://github.com/simonw/datasette-jellyfish/releases",
"Homepage": "https://datasette.io/plugins/datasette-jellyfish",
"Issues": "https://github.com/simonw/datasette-jellyfish/issues"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b4168aa72ac5e2f634ebebff4f5732ef867f8d6ddc4de024dcc2573505995d60",
"md5": "abead56a2868a022c61b136bd2124a6c",
"sha256": "057708a96fef725294708537ac9dae68dac308268eaf283b7f5c6945dd319dff"
},
"downloads": -1,
"filename": "datasette_jellyfish-2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "abead56a2868a022c61b136bd2124a6c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 6909,
"upload_time": "2023-08-24T21:46:20",
"upload_time_iso_8601": "2023-08-24T21:46:20.274553Z",
"url": "https://files.pythonhosted.org/packages/b4/16/8aa72ac5e2f634ebebff4f5732ef867f8d6ddc4de024dcc2573505995d60/datasette_jellyfish-2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "62cd473c4f1ac7e3406afac8fafbfb3ec68f08a3305c3ce4780783225f9654e3",
"md5": "6043f7448296719143d2ba892a9c5336",
"sha256": "4ca91fa7b09658a31b6942db2771884e46e7ead0dd731c06424adf5f4eb8965f"
},
"downloads": -1,
"filename": "datasette-jellyfish-2.0.tar.gz",
"has_sig": false,
"md5_digest": "6043f7448296719143d2ba892a9c5336",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 6743,
"upload_time": "2023-08-24T21:46:22",
"upload_time_iso_8601": "2023-08-24T21:46:22.006757Z",
"url": "https://files.pythonhosted.org/packages/62/cd/473c4f1ac7e3406afac8fafbfb3ec68f08a3305c3ce4780783225f9654e3/datasette-jellyfish-2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-08-24 21:46:22",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "simonw",
"github_project": "datasette-jellyfish",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "datasette-jellyfish"
}