vectorium


- Name: vectorium
- Version: 0.1.0
- Home page: https://github.com/silvaan/vectorium
- Summary: Tools for storing embeddings in a database and querying them
- Upload time: 2023-06-15 13:52:06
- Author: Silvan Ferreira
- Requirements: none recorded

# Vectorium

Vectorium is a Python package that provides a simple interface for creating, manipulating, and querying vector databases. It lets you add or remove vectors, compare a query vector against the stored vectors using several similarity metrics, and aggregate the results.

## Features

- **Flexible Vector Database**: Store and manage high-dimensional vectors, grouped by key.
- **Multiple Compare Functions**: Cosine similarity, Euclidean distance, and dot product.
- **Various Aggregate Functions**: Aggregate comparison results using mean, sum, max, min, or no aggregation.
- **Vector Operations**: Add or remove vectors, save or load a collection, and update the database as needed.
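
For reference, the three compare functions named above can be written in a few lines of NumPy. This is a minimal sketch of the standard definitions, not the package's own code; the function names are chosen here for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a, b):
    """Straight-line distance between a and b; 0.0 means identical."""
    return float(np.linalg.norm(a - b))

def dot_product(a, b):
    """Unnormalized similarity; grows with both alignment and magnitude."""
    return float(np.dot(a, b))

a = np.array([1.0, 0.0])
b = np.array([0.0, 2.0])

print(cosine_similarity(a, b))   # orthogonal vectors give 0.0
print(euclidean_distance(a, b))  # sqrt(1 + 4)
print(dot_product(a, b))         # 0.0
```

Note that cosine similarity and dot product are higher for closer matches, while Euclidean distance is lower, which matters when ranking results.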

## Installation

Version 0.1.0 is published on PyPI and can be installed with `pip install vectorium`. Alternatively, clone this repository to your local machine and import the `VectorDatabase` class.

## Usage

This is a brief example of how to use the `VectorDatabase` class:

```python
from vectorium import VectorDatabase
import numpy as np

# Create a new database named 'my_collection'
db = VectorDatabase('my_collection', dim=128)

# Add a new vector associated with the key 'my_key'
db.add('my_key', np.random.randn(128))

# Compare an input vector with the database
results = db.compare(np.random.randn(128), func='cosine', aggregate='mean')

# Remove a key from the database
db.remove('my_key')

# Save the database
db.save()

# Load the database
db.load('my_collection_path')
```

## Class: VectorDatabase

### Parameters
- `collection` - The name of the database. It is used as the filename when saving and loading.
- `dim` (optional) - The dimensionality of the vectors. If `None`, it is inferred from the first vector added.
- `collection_path` (optional) - The path where the database file is stored.

### Methods

#### `add(key, vec)`
Adds a new vector under the given key. If the key already exists, the vector is appended to those already stored for that key.
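
The append behaviour, together with the `dim` inference described under Parameters, can be pictured with a small stand-in class. `TinyStore` is hypothetical and is not the package's implementation; in particular, raising `ValueError` on a size mismatch is an assumption made for this sketch.

```python
import numpy as np

class TinyStore:
    """Hypothetical stand-in for VectorDatabase; not the package's code."""

    def __init__(self, dim=None):
        self.dim = dim
        self.vectors = {}  # key -> list of 1-D arrays

    def add(self, key, vec):
        vec = np.asarray(vec, dtype=float)
        if self.dim is None:
            # Infer the dimensionality from the first vector added.
            self.dim = vec.shape[0]
        elif vec.shape[0] != self.dim:
            # Assumption: reject mismatched sizes; the real class may differ.
            raise ValueError(f"expected {self.dim} dimensions, got {vec.shape[0]}")
        # An existing key keeps its vectors; the new one is appended.
        self.vectors.setdefault(key, []).append(vec)

store = TinyStore()
store.add('cat', [1.0, 0.0, 0.0])
store.add('cat', [0.9, 0.1, 0.0])
store.add('dog', [0.0, 1.0, 0.0])

print(store.dim)                                                  # 3
print({key: len(vecs) for key, vecs in store.vectors.items()})    # {'cat': 2, 'dog': 1}
```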

#### `remove(key)`
Removes the vectors associated with the given key from the database.

#### `compare(input_vector, func='cosine', aggregate='mean')`
Compares an input vector with the vectors in the database using the compare function `func` (default `'cosine'`) and returns the results aggregated with `aggregate` (default `'mean'`).
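
To make the aggregation step concrete, here is a minimal sketch of how per-key cosine scores could be reduced to one value per key. The function `compare_sketch`, the dictionary return shape, and the use of `None` for "no aggregation" are all assumptions for illustration; the actual return type of `compare` is not documented here.

```python
import numpy as np

# Reducers matching the aggregate names listed in this README.
AGGREGATES = {'mean': np.mean, 'sum': np.sum, 'max': np.max, 'min': np.min}

def compare_sketch(store, query, aggregate='mean'):
    """Score `query` against every key's vectors with cosine similarity."""
    query = np.asarray(query, dtype=float)
    results = {}
    for key, vectors in store.items():
        mat = np.stack(vectors)  # shape: (n_vectors_for_key, dim)
        scores = mat @ query / (np.linalg.norm(mat, axis=1) * np.linalg.norm(query))
        # Assumption: None keeps the raw per-vector scores.
        results[key] = scores if aggregate is None else float(AGGREGATES[aggregate](scores))
    return results

store = {
    'x_axis': [np.array([1.0, 0.0]), np.array([2.0, 0.0])],
    'y_axis': [np.array([0.0, 1.0])],
}
scores = compare_sketch(store, [1.0, 0.0], aggregate='mean')
print(scores)
```

Both `x_axis` vectors point in the query's direction, so their mean score is 1.0 (up to floating-point rounding), while the orthogonal `y_axis` vector scores 0.0.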

#### `topk(input_vector, k=10, func='cosine', aggregate='mean', reverse=False)`
Similar to `compare`, but only returns the top `k` results.
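
Selecting the top `k` entries from a set of aggregated scores amounts to a sort and a slice. The sketch below is a hypothetical helper, not the package's code, and its `largest_first` flag is introduced here for illustration; it is not a description of what the `reverse` parameter does.

```python
def topk_sketch(scores, k=10, largest_first=True):
    """Return the k best (key, score) pairs from a dict of scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=largest_first)
    return ranked[:k]

scores = {'a': 0.91, 'b': 0.15, 'c': 0.78, 'd': 0.40}

# Similarity scores: larger is better.
print(topk_sketch(scores, k=2))                       # [('a', 0.91), ('c', 0.78)]

# Distance scores: smaller is better, so rank in ascending order.
print(topk_sketch(scores, k=2, largest_first=False))  # [('b', 0.15), ('d', 0.4)]
```

The sort direction has to match the metric: cosine similarity and dot product rank best-first in descending order, Euclidean distance in ascending order.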

#### `reset()`
Empties the database.

#### `save()`
Saves the database to a `.npz` file with the name of the collection.

#### `load(collection_path)`
Loads the database from a `.npz` file located at the given path.
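
For readers unfamiliar with the `.npz` format, the round trip below shows how NumPy stores several named arrays in one file with `np.savez` and reads them back with `np.load`. Storing one array per key is an assumption about the layout made for this sketch; the file layout Vectorium actually writes is not documented here.

```python
import os
import tempfile
import numpy as np

# Assumed layout: one 2-D array per key, one row per stored vector.
store = {
    'cat': np.array([[1.0, 0.0], [0.9, 0.1]]),
    'dog': np.array([[0.0, 1.0]]),
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'my_collection.npz')
    np.savez(path, **store)  # each keyword becomes a named array in the archive
    with np.load(path) as data:
        restored = {key: data[key] for key in data.files}

print(sorted(restored))          # ['cat', 'dog']
print(restored['cat'].shape)     # (2, 2)
```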

#### `update()`
Updates the internal list representation of the database. This is called automatically after each `add`, `remove`, `reset`, `load` and `save` operation.

## Requirements

- Python 3.6+
- NumPy
- PyTorch

## Contribution

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

## License

[MIT](https://choosealicense.com/licenses/mit/)
