# ElasticHash
## Introduction
ElasticHash implements efficient similarity search over binary hash codes in Elasticsearch, using a two-stage
method.
In the first stage, a coarse search based on short hash codes is performed using multi-index hashing and ES terms lookup
of neighboring hash codes. In the second stage, the list of results is re-ranked by computing the Hamming distance on
long hash codes.
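As a rough illustration of the two stages, the sketch below works on plain Python integers instead of Elasticsearch. Everything in it is an assumption made for the example: the function names, the 64-bit short code, the split into four 16-bit subcodes and the search radius are hypothetical and do not describe ElasticHash's actual implementation or API.

```python
from itertools import combinations
from typing import Dict, List, Set


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded codes."""
    return bin(a ^ b).count("1")


def split_subcodes(code: int, n_parts: int = 4, part_bits: int = 16) -> List[int]:
    """Split a code into n_parts subcodes of part_bits bits, most significant first."""
    mask = (1 << part_bits) - 1
    return [(code >> (part_bits * i)) & mask for i in reversed(range(n_parts))]


def neighbors(subcode: int, bits: int = 16, radius: int = 1) -> Set[int]:
    """All subcodes within the given Hamming radius of subcode (including itself)."""
    result = {subcode}
    for r in range(1, radius + 1):
        for positions in combinations(range(bits), r):
            flipped = subcode
            for p in positions:
                flipped ^= 1 << p
            result.add(flipped)
    return result


def coarse_filter(query_short: int, items: List[Dict], radius: int = 1) -> List[Dict]:
    """Stage 1: keep items whose subcode at some position is a neighbor of the query's."""
    query_sets = [neighbors(s, radius=radius) for s in split_subcodes(query_short)]
    return [
        item for item in items
        if any(s in qs for s, qs in zip(split_subcodes(item["short"]), query_sets))
    ]


def rerank(query_long: int, candidates: List[Dict]) -> List[Dict]:
    """Stage 2: sort candidates by Hamming distance on the long code."""
    return sorted(candidates, key=lambda item: hamming(query_long, item["long"]))


# Toy data: "short" and "long" are made-up codes, not real model outputs
items = [
    {"id": "a", "short": 0x0001_0000_0000_0000, "long": 0b111},
    {"id": "b", "short": 0xFFFF_FFFF_FFFF_FFFF, "long": 0b0},
    {"id": "c", "short": 0x0000_0000_0000_0000, "long": 0b1},
]
candidates = coarse_filter(0, items)
ranked_ids = [item["id"] for item in rerank(0, candidates)]
print(ranked_ids)  # ['c', 'a']
```

Item `b` has the closest long code but is dropped in the coarse stage because none of its subcodes is near the query's. This is the trade-off of a coarse filter: it shrinks the candidate list before the more expensive re-ranking, at the cost of possibly missing some matches.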
The only requirement is that the binary codes to be indexed are 256 bits long, since only 256-bit codes are
currently supported.
For a complete image similarity search system, including model training and model serving,
see https://github.com/umr-ds/ElasticHash.
## Install
`pip install elastichash`
## Usage
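- Create an Elasticsearch client and pass it to ElasticHash
```
from elasticsearch import Elasticsearch
from elastichash import ElasticHash  # import path assumed from the package name
es = Elasticsearch(elasticsearch_endpoint)
eh = ElasticHash(es)
```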
- New items can be added by calling `add(code)`, where `code` can be a list, a string or a numpy array. Additional
fields can be stored together with the code
```
eh.add(code, additional_fields={"image_path": "/path/to/an/image"})
```
- After adding a sufficiently large number of codes (e.g. 10,000), `decorrelate()` needs to be called to rearrange the
binary hash code permutations
- To search documents by their hash code, use `search(code)`
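
The steps above take a 256-bit code as a list, string or numpy array. A minimal sketch of preparing such a code in plain Python follows. The helper `validate_code` is hypothetical, and the string form (one `0` or `1` character per bit) is an assumption that should be checked against the ElasticHash documentation.

```python
import random

CODE_LENGTH = 256  # only 256-bit codes are currently supported


def validate_code(bits):
    """Check that bits is a sequence of exactly 256 zeros and ones (hypothetical helper)."""
    if len(bits) != CODE_LENGTH:
        raise ValueError(f"expected {CODE_LENGTH} bits, got {len(bits)}")
    if any(b not in (0, 1) for b in bits):
        raise ValueError("code may only contain 0 and 1")
    return bits


rng = random.Random(0)  # fixed seed so the example is reproducible
code_list = validate_code([rng.randint(0, 1) for _ in range(CODE_LENGTH)])

# Assumed string form: one '0' or '1' character per bit
code_string = "".join(str(b) for b in code_list)

# Converting the string back yields the original list
round_trip = [int(c) for c in code_string]
```

With numpy installed, `numpy.asarray(code_list, dtype=numpy.uint8)` gives the same code as an array. In a real system the bits would come from a hashing model rather than a random generator.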
Raw data
{
"_id": null,
"home_page": "https://nik-ko.github.io/elastichash/",
"name": "elastichash",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "",
"author": "Nikolaus Korfhage",
"author_email": "nko+py@posteo.ru",
"download_url": "https://files.pythonhosted.org/packages/32/f2/baf772ed702b751505d726aeead76b799eeb032ea4eaed5745ffd8e3ff51/elastichash-0.1.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "ElasticHash enables efficient similarity search for binary hash codes using Elasticsearch",
"version": "0.1.5",
"project_urls": {
"Bug Tracker": "https://github.com/nik-ko/elastichash/issues",
"Documentation": "https://nik-ko.github.io/elastichash/",
"Homepage": "https://nik-ko.github.io/elastichash/",
"Source": "https://github.com/nik-ko/elastichash"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "fb030838c599778f68c1bf24542d75b10752473af39bd856778a203bcffcef3f",
"md5": "0184fae063837cb4d219bf4005107aef",
"sha256": "ef166ae79441b0dac646f7ce1c0c143e88fafa40136c6fbf5e9c82ae3278ca4d"
},
"downloads": -1,
"filename": "elastichash-0.1.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0184fae063837cb4d219bf4005107aef",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 11752,
"upload_time": "2023-11-20T13:46:36",
"upload_time_iso_8601": "2023-11-20T13:46:36.614318Z",
"url": "https://files.pythonhosted.org/packages/fb/03/0838c599778f68c1bf24542d75b10752473af39bd856778a203bcffcef3f/elastichash-0.1.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "32f2baf772ed702b751505d726aeead76b799eeb032ea4eaed5745ffd8e3ff51",
"md5": "dcfdce05426b7dbbf152f9ee68f17009",
"sha256": "4df3d50f06121620e73f37d26b72a619e3c52679f2924be4f6a8d2e6829c26b3"
},
"downloads": -1,
"filename": "elastichash-0.1.5.tar.gz",
"has_sig": false,
"md5_digest": "dcfdce05426b7dbbf152f9ee68f17009",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 13374,
"upload_time": "2023-11-20T13:46:37",
"upload_time_iso_8601": "2023-11-20T13:46:37.727798Z",
"url": "https://files.pythonhosted.org/packages/32/f2/baf772ed702b751505d726aeead76b799eeb032ea4eaed5745ffd8e3ff51/elastichash-0.1.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-11-20 13:46:37",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "nik-ko",
"github_project": "elastichash",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "elasticsearch",
"specs": [
[
"~=",
"8.10.1"
]
]
},
{
"name": "numpy",
"specs": [
[
"~=",
"1.24.3"
]
]
},
{
"name": "networkx",
"specs": [
[
"~=",
"3.1"
]
]
},
{
"name": "bitstring",
"specs": [
[
"~=",
"4.1.1"
]
]
},
{
"name": "matplotlib",
"specs": [
[
"~=",
"3.7.1"
]
]
},
{
"name": "urllib3",
"specs": [
[
"~=",
"1.26.16"
]
]
},
{
"name": "seaborn",
"specs": [
[
"~=",
"0.12.2"
]
]
},
{
"name": "pillow",
"specs": [
[
">=",
"10.0.1"
]
]
}
],
"lcname": "elastichash"
}