vecsim


Name: vecsim
Version: 0.0.62
Home page: https://github.com/argmaxml/vecsim
Summary: Vector Similarity Search Engine
Upload time: 2023-11-30 13:41:14
Docs URL: None
Author: ArgmaxML
Keywords: vector-similarity, faiss, hnsw, redis, matching, ranking, elasticsearch, search, embedding
Requirements: No requirements were recorded.
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.

# VecSim - A unified interface for similarity servers
A standard, lightweight interface to popular vector similarity servers.

## The problems we are trying to solve:
1. **Standard API** - Different vector similarity servers expose different APIs, so switching between them is not trivial.
1. **Identifiers** - Some vector similarity servers support string IDs and some do not; we keep track of the mapping.
1. **Partitions** - In most cases, pre-filtering is needed prior to querying; we abstract this concept away.
1. **Aggregations** - In some cases, one item is indexed as multiple vectors.
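
To make the identifier and partition points concrete, here is a minimal brute-force sketch in plain Python. It is not VecSim's implementation: `TinyIndex` and `cosine_distance` are hypothetical names invented for this illustration, and the method names merely mirror the QuickStart. It only shows the general idea of keeping a row-to-string-ID mapping and pre-filtering by partition before ranking.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


class TinyIndex:
    """Toy brute-force index (hypothetical, for illustration only)."""

    def __init__(self):
        self.vectors = []      # row number -> vector
        self.row_to_id = []    # row number -> string ID
        self.row_to_part = []  # row number -> partition name

    def add_items(self, vectors, ids, partition=None):
        # Rows are stored as integers internally; the string ID is kept alongside.
        for vec, item_id in zip(vectors, ids):
            self.vectors.append(list(vec))
            self.row_to_id.append(item_id)
            self.row_to_part.append(partition)

    def search(self, query, k=10, partition=None):
        # Pre-filter: keep only rows in the requested partition (or all rows).
        rows = [r for r, p in enumerate(self.row_to_part)
                if partition is None or p == partition]
        # Rank the surviving rows by distance and keep the k closest.
        scored = sorted((cosine_distance(query, self.vectors[r]), r) for r in rows)[:k]
        dists = [d for d, _ in scored]
        ids = [self.row_to_id[r] for _, r in scored]  # map rows back to string IDs
        return dists, ids


index = TinyIndex()
index.add_items([[1.0, 0.0], [0.0, 1.0]], ["user_1", "user_2"], partition="users")
index.add_items([[1.0, 0.1], [0.2, 1.0]], ["item_101", "item_102"], partition="items")

all_dists, all_ids = index.search([1.0, 0.0], k=2)                       # both partitions
user_dists, user_ids = index.search([1.0, 0.0], k=2, partition="users")  # users only
```

A real similarity server would replace the brute-force scan with its own index; the bookkeeping around IDs and partitions is the part a unified wrapper has to standardize.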

## Supported engines:
1. Scikit-learn, via [NearestNeighbors](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html)
1. [RediSearch](https://redis.io/docs/stack/search/reference/vectors/)
1. [Faiss](https://github.com/facebookresearch/faiss)
1. [ElasticSearch](https://www.elastic.co)
1. [Pinecone](https://www.pinecone.io)


## QuickStart example
```python
import numpy as np
# Import a similarity server of your choice:
# scikit-learn (best for small datasets or testing)
from vecsim import SciKitIndex

sim = SciKitIndex(metric='cosine', dim=32)

# 100 users and 100 items, each with a random 32-dimensional vector
user_ids = ["user_" + str(1 + i) for i in range(100)]
user_data = np.random.random((100, 32))
item_ids = ["item_" + str(101 + i) for i in range(100)]
item_data = np.random.random((100, 32))
# Add each group to its own partition
sim.add_items(user_data, user_ids, partition="users")
sim.add_items(item_data, item_ids, partition="items")
# Index the data
sim.init()
# Run nearest neighbor vector search
query = np.random.random(32)
dists, items = sim.search(query, k=10)  # nearest neighbors among users and items
dists, items = sim.search(query, k=10, partition="users")  # nearest neighbors among users only
```
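
The "Aggregations" problem listed earlier (one item indexed as several vectors) means that raw search hits may contain the same item more than once. The README does not describe how VecSim resolves this, so the sketch below is only one plausible approach, independent of VecSim's API: collapse vector-level hits to item-level hits by keeping each item's smallest distance. The function `aggregate_hits`, the `vector_to_item` mapping, and the `#title` / `#image` ID suffixes are all hypothetical.

```python
def aggregate_hits(dists, vector_ids, vector_to_item, k):
    """Collapse vector-level hits to item-level hits.

    Each item keeps its best (smallest) distance; min-distance is an
    assumption here, other aggregations (mean, sum) are equally possible.
    """
    best = {}
    for dist, vec_id in zip(dists, vector_ids):
        item = vector_to_item[vec_id]
        if item not in best or dist < best[item]:
            best[item] = dist
    # Rank items by their best distance and keep the top k.
    ranked = sorted(best.items(), key=lambda pair: pair[1])[:k]
    return [d for _, d in ranked], [item for item, _ in ranked]


# Hypothetical data: item_101 was indexed twice (title and image vectors).
vector_to_item = {
    "item_101#title": "item_101",
    "item_101#image": "item_101",
    "item_102#title": "item_102",
}
raw_dists = [0.05, 0.10, 0.30]
raw_ids = ["item_101#title", "item_102#title", "item_101#image"]

agg_dists, agg_items = aggregate_hits(raw_dists, raw_ids, vector_to_item, k=2)
```

When aggregating this way, it usually helps to request more than `k` raw hits from the underlying search, since duplicates shrink the list after collapsing.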

For more examples, please read our [documentation](https://vecsim.readthedocs.io/).

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/argmaxml/vecsim",
    "name": "vecsim",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "vector-similarity,faiss,hnsw,redis,matching,ranking,elasticsearch,search,embedding",
    "author": "ArgmaxML",
    "author_email": "ugoren@argmax.ml",
    "download_url": "https://files.pythonhosted.org/packages/de/14/d37e75e0f05caa74fea3c15270cca699f1ac1c10c199453c2f0009f4f90e/vecsim-0.0.62.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Vector Similarity Search Engine",
    "version": "0.0.62",
    "project_urls": {
        "Homepage": "https://github.com/argmaxml/vecsim"
    },
    "split_keywords": [
        "vector-similarity",
        "faiss",
        "hnsw",
        "redis",
        "matching",
        "ranking",
        "elasticsearch",
        "search",
        "embedding"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b26b64cb7ae88a1abcefda84f4bfbd340926b35c42f86c367b29d087a3cf7b16",
                "md5": "7ac302bfcf457ceddf58134e70b662d3",
                "sha256": "0864cd2cc3e1483d117e00858a2c00b8ab37f9ad4ebd4693d39215be67d60cb6"
            },
            "downloads": -1,
            "filename": "vecsim-0.0.62-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7ac302bfcf457ceddf58134e70b662d3",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 8953,
            "upload_time": "2023-11-30T13:41:12",
            "upload_time_iso_8601": "2023-11-30T13:41:12.909781Z",
            "url": "https://files.pythonhosted.org/packages/b2/6b/64cb7ae88a1abcefda84f4bfbd340926b35c42f86c367b29d087a3cf7b16/vecsim-0.0.62-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "de14d37e75e0f05caa74fea3c15270cca699f1ac1c10c199453c2f0009f4f90e",
                "md5": "375ead8da1c22f7d5f52bc1b8273ea37",
                "sha256": "067f5ca5e573abcf423e1f10b8682c9e8081eb5a0ba533c27cbabfdc8454212f"
            },
            "downloads": -1,
            "filename": "vecsim-0.0.62.tar.gz",
            "has_sig": false,
            "md5_digest": "375ead8da1c22f7d5f52bc1b8273ea37",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 9962,
            "upload_time": "2023-11-30T13:41:14",
            "upload_time_iso_8601": "2023-11-30T13:41:14.638871Z",
            "url": "https://files.pythonhosted.org/packages/de/14/d37e75e0f05caa74fea3c15270cca699f1ac1c10c199453c2f0009f4f90e/vecsim-0.0.62.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-30 13:41:14",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "argmaxml",
    "github_project": "vecsim",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "vecsim"
}