elastichash


Name: elastichash
Version: 0.1.5
home_page: https://nik-ko.github.io/elastichash/
Summary: ElasticHash enables efficient similarity search for binary hash codes using Elasticsearch
upload_time: 2023-11-20 13:46:37
docs_url: None
author: Nikolaus Korfhage
requires_python: >=3.7
license: MIT
requirements: elasticsearch, numpy, networkx, bitstring, matplotlib, urllib3, seaborn, pillow
Travis-CI: none
Coveralls test coverage: none

[![build](https://github.com/nik-ko/elastichash/actions/workflows/CI.yml/badge.svg)](https://github.com/nik-ko/elastichash/actions/workflows/CI.yml)
[![doc](https://github.com/nik-ko/elastichash/actions/workflows/documentation.yml/badge.svg)](https://github.com/nik-ko/elastichash/actions/workflows/documentation.yml)
[![PyPI version](https://img.shields.io/pypi/v/elastichash.svg)](https://pypi.python.org/pypi/elastichash)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

# ElasticHash

## Introduction

ElasticHash implements a two-stage method for efficient similarity search over binary hash codes in Elasticsearch.
In the first stage, a coarse search on short hash codes is performed using multi-index hashing and an Elasticsearch
terms lookup of neighboring hash codes. In the second stage, the resulting candidates are re-ranked by computing the
Hamming distance on long hash codes.
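
The two stages can be illustrated with a small in-memory sketch. It is not the library's implementation: `TinyIndex` is a hypothetical stand-in for the Elasticsearch index, and the short-code length (64 bits), the number of substrings (4), and the choice of taking the leading bits as the short code are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from typing import List

CODE_BITS = 256   # long code length stated in the README
SHORT_BITS = 64   # assumed short-code length for the coarse stage
NUM_PARTS = 4     # assumed number of substrings for multi-index hashing
PART_BITS = SHORT_BITS // NUM_PARTS


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two codes stored as ints."""
    return bin(a ^ b).count("1")


def short_parts(code: int) -> List[int]:
    """Split the leading SHORT_BITS bits of a long code into NUM_PARTS substrings."""
    short = code >> (CODE_BITS - SHORT_BITS)  # assumption: short code = leading bits
    mask = (1 << PART_BITS) - 1
    return [(short >> (i * PART_BITS)) & mask for i in range(NUM_PARTS)]


def neighbors(value: int, radius: int) -> List[int]:
    """All PART_BITS-wide values within the given Hamming radius of value."""
    out = [value]
    for r in range(1, radius + 1):
        for positions in combinations(range(PART_BITS), r):
            flipped = value
            for p in positions:
                flipped ^= 1 << p
            out.append(flipped)
    return out


class TinyIndex:
    """Hypothetical in-memory index: one lookup table per substring position."""

    def __init__(self):
        self.tables = [defaultdict(set) for _ in range(NUM_PARTS)]
        self.codes = {}

    def add(self, doc_id, code: int) -> None:
        self.codes[doc_id] = code
        for table, part in zip(self.tables, short_parts(code)):
            table[part].add(doc_id)

    def search(self, query: int, radius: int = 1, size: int = 10):
        # Stage 1: coarse candidate lookup via neighboring substrings.
        candidates = set()
        for table, part in zip(self.tables, short_parts(query)):
            for n in neighbors(part, radius):
                candidates |= table.get(n, set())
        # Stage 2: re-rank candidates by Hamming distance on the long codes.
        ranked = sorted(candidates, key=lambda d: (hamming(self.codes[d], query), d))
        return [(d, hamming(self.codes[d], query)) for d in ranked[:size]]
```

A document becomes a candidate as soon as one of its substrings lies within the radius of the corresponding query substring, so the coarse stage only needs exact-match lookups; the full Hamming distance is computed for the candidates alone.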

The only requirement is that the binary codes to be indexed must be 256 bits long; other code lengths are currently
not supported.
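
Because only 256-bit codes are accepted, it can help to check a code's length before indexing it. The following is a minimal sketch, assuming codes arrive as a sequence of 0/1 values or as a string of `0`/`1` characters; `to_bitstring` is a hypothetical helper and not part of the elastichash API.

```python
def to_bitstring(code, n_bits=256):
    """Normalize a code to a string of '0'/'1' characters and check its length.

    Hypothetical helper: accepts a str, list, tuple, or 1-D array of 0/1 values.
    """
    # int() handles '0'/'1' characters, Python ints/bools, and NumPy scalars.
    values = [int(b) for b in code]
    if len(values) != n_bits:
        raise ValueError(f"expected {n_bits} bits, got {len(values)}")
    if any(v not in (0, 1) for v in values):
        raise ValueError("code must contain only 0 and 1")
    return "".join(str(v) for v in values)
```

Calling `to_bitstring(code)` before indexing turns a wrong-length code into an immediate `ValueError` instead of a failure later on.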

For a whole image similarity search system, including model training and model serving, 
see https://github.com/umr-ds/ElasticHash.

## Install

`pip install elastichash`

## Usage

- Create an Elasticsearch client and pass it to ElasticHash
  ```
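  from elasticsearch import Elasticsearch
  from elastichash import ElasticHash  # import path assumed from the package name
  es = Elasticsearch(elasticsearch_endpoint)
  eh = ElasticHash(es)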
  ```
- Add new items by calling `add(code)`, where `code` can be a list, a string, or a NumPy array. Additional fields can
  be stored alongside the code
  ```
  eh.add(code, additional_fields={"image_path": "/path/to/an/image"})
  ```
- After adding a sufficiently large number of codes (e.g. 10,000), call `decorrelate()` to rearrange the
  binary hash code permutations
- To search documents by their hash code, use `search(code)`
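
The steps above can be combined into one small helper. This is a sketch under assumptions: `index_and_search` and its `min_codes_for_decorrelate` parameter are hypothetical, and it only uses the `add`, `decorrelate` and `search` calls named in the list. The return value of `search` is passed through unchanged, since its format is not described here.

```python
def index_and_search(eh, items, query_code, min_codes_for_decorrelate=10_000):
    """Index (code, fields) pairs, optionally decorrelate, then run one search.

    eh is expected to be an ElasticHash instance; any object that offers
    add(), decorrelate() and search() works as well.
    """
    count = 0
    for code, fields in items:
        eh.add(code, additional_fields=fields)
        count += 1
    # Only decorrelate once enough codes are indexed (10,000 is the README's example).
    if count >= min_codes_for_decorrelate:
        eh.decorrelate()
    return eh.search(query_code)
```

With a real client this would be called as `index_and_search(eh, items, query_code)`, where `items` yields pairs such as `(code, {"image_path": "/path/to/an/image"})`.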

            

Raw data

            {
    "_id": null,
    "home_page": "https://nik-ko.github.io/elastichash/",
    "name": "elastichash",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "",
    "author": "Nikolaus Korfhage",
    "author_email": "nko+py@posteo.ru",
    "download_url": "https://files.pythonhosted.org/packages/32/f2/baf772ed702b751505d726aeead76b799eeb032ea4eaed5745ffd8e3ff51/elastichash-0.1.5.tar.gz",
    "platform": null,
    "description": "[![build](https://github.com/nik-ko/elastichash/actions/workflows/CI.yml/badge.svg)](https://github.com/nik-ko/elastichash/actions/workflows/CI.yml) \n[![doc](https://github.com/nik-ko/elastichash/actions/workflows/documentation.yml/badge.svg)](https://github.com/nik-ko/elastichash/actions/workflows/documentation.yml)\n[![PyPI version](https://img.shields.io/pypi/v/elastichash.svg)](https://pypi.python.org/pypi/elastichash)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n# ElasticHash\n\n## Introduction\n\nElasticHash implements efficient similarity search by using a two-stage method for efficiently searching binary hash \ncodes using Elasticsearch. \nIn the first stage, a coarse search based on short hash codes is performed using multi-index hashing and ES terms lookup \nof neighboring hash codes. In the second stage, the list of results is re-ranked by computing the Hamming distance on \nlong hash codes.\n\nThe only requirement ist that binary codes to be indexed need to be 256 bits long as currently only 256 bit codes are \nsupported.\n\nFor a whole image similarity search system, including model training and model serving, \nsee https://github.com/umr-ds/ElasticHash.\n\n## Install\n\n`pip install elastichash`\n\n## Usage\n\n- Create an Elastisearch client to use it with ElasticHash\n  ```\n  es = Elasticsearch(elasticsearch_endpoint)\n  eh = ElasticHash(es)\n  ```\n- New items can be added by calling `add(code)` where `code` can be a list, string or numpy array together with\n  additional fields\n  ```\n  eh.add(code, additional_fields={\"image_path\": \"/path/to/an/image\"})\n  ```\n- After adding a suffiently large amount of codes (e.g. 10,000), `decorrelate()` needs to be called to rearrange the\n  binary hashcode permutations\n- To search documents by their hash code use `search(code)` \n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "ElasticHash enables efficient similarity search for binary hash codes using Elasticsearch",
    "version": "0.1.5",
    "project_urls": {
        "Bug Tracker": "https://github.com/nik-ko/elastichash/issues",
        "Documentation": "https://nik-ko.github.io/elastichash/",
        "Homepage": "https://nik-ko.github.io/elastichash/",
        "Source": "https://github.com/nik-ko/elastichash"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fb030838c599778f68c1bf24542d75b10752473af39bd856778a203bcffcef3f",
                "md5": "0184fae063837cb4d219bf4005107aef",
                "sha256": "ef166ae79441b0dac646f7ce1c0c143e88fafa40136c6fbf5e9c82ae3278ca4d"
            },
            "downloads": -1,
            "filename": "elastichash-0.1.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "0184fae063837cb4d219bf4005107aef",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 11752,
            "upload_time": "2023-11-20T13:46:36",
            "upload_time_iso_8601": "2023-11-20T13:46:36.614318Z",
            "url": "https://files.pythonhosted.org/packages/fb/03/0838c599778f68c1bf24542d75b10752473af39bd856778a203bcffcef3f/elastichash-0.1.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "32f2baf772ed702b751505d726aeead76b799eeb032ea4eaed5745ffd8e3ff51",
                "md5": "dcfdce05426b7dbbf152f9ee68f17009",
                "sha256": "4df3d50f06121620e73f37d26b72a619e3c52679f2924be4f6a8d2e6829c26b3"
            },
            "downloads": -1,
            "filename": "elastichash-0.1.5.tar.gz",
            "has_sig": false,
            "md5_digest": "dcfdce05426b7dbbf152f9ee68f17009",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 13374,
            "upload_time": "2023-11-20T13:46:37",
            "upload_time_iso_8601": "2023-11-20T13:46:37.727798Z",
            "url": "https://files.pythonhosted.org/packages/32/f2/baf772ed702b751505d726aeead76b799eeb032ea4eaed5745ffd8e3ff51/elastichash-0.1.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-20 13:46:37",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "nik-ko",
    "github_project": "elastichash",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "elasticsearch",
            "specs": [
                [
                    "~=",
                    "8.10.1"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "~=",
                    "1.24.3"
                ]
            ]
        },
        {
            "name": "networkx",
            "specs": [
                [
                    "~=",
                    "3.1"
                ]
            ]
        },
        {
            "name": "bitstring",
            "specs": [
                [
                    "~=",
                    "4.1.1"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    "~=",
                    "3.7.1"
                ]
            ]
        },
        {
            "name": "urllib3",
            "specs": [
                [
                    "~=",
                    "1.26.16"
                ]
            ]
        },
        {
            "name": "seaborn",
            "specs": [
                [
                    "~=",
                    "0.12.2"
                ]
            ]
        },
        {
            "name": "pillow",
            "specs": [
                [
                    ">=",
                    "10.0.1"
                ]
            ]
        }
    ],
    "lcname": "elastichash"
}
        