# lab-1806-vec-db

- Name: lab-1806-vec-db
- Version: 0.2.3
- Upload time: 2024-10-20 07:10:46
- Requires Python: `<3.11,>=3.10`

Lab 1806 Vector Database.

## Usage with Python

```bash
# See https://pypi.org/project/lab-1806-vec-db/
pip install lab-1806-vec-db
```

```py
from lab_1806_vec_db import RagVecDB

# Create a database for 4-dimensional vectors.
db = RagVecDB(dim=4)

# Add vectors in three small clusters, each with a metadata dict.
db.add([1.0, 0.0, 0.0, 0.0], {"content": "a"})
db.add([1.0, 0.0, 0.0, 0.1], {"content": "aa"})

db.add([0.0, 1.0, 0.0, 0.0], {"content": "b"})
db.add([0.0, 1.0, 0.0, 0.1], {"content": "bb"})

db.add([0.0, 0.0, 1.0, 0.0], {"content": "c"})
db.add([0.0, 0.0, 1.0, 0.1], {"content": "cc"})

# Save the database to disk, then load it back.
db.save("test_db.local.bin")

loaded_db = RagVecDB.load("test_db.local.bin")

# Search with a query vector; the second argument is presumably the number of results.
for idx, metadata in enumerate(loaded_db.search([1.0, 0.0, 0.0, 0.0], 2)):
    print(idx, metadata["content"])
```
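To make the expected result of the `search` call concrete, here is a minimal pure-Python reference over the same six vectors. It is a sketch, not the library's implementation: it assumes Euclidean distance and assumes that `search(query, k)` returns the metadata of the `k` nearest vectors. `brute_force_search` is a hypothetical helper that does not exist in `lab_1806_vec_db`.

```python
import math


def brute_force_search(vectors, metadata, query, k):
    """Return the metadata of the k vectors closest to query.

    Assumption: Euclidean distance. The metric used by RagVecDB is not
    stated in this README.
    """
    # Rank the stored positions by their distance to the query vector.
    ranked = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))
    return [metadata[i] for i in ranked[:k]]


vectors = [
    [1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.1],
    [0.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.1],
    [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.1],
]
metadata = [{"content": c} for c in ["a", "aa", "b", "bb", "c", "cc"]]

results = brute_force_search(vectors, metadata, [1.0, 0.0, 0.0, 0.0], 2)
for idx, item in enumerate(results):
    print(idx, item["content"])
```

Under these assumptions the two closest entries are `a` and `aa`, which is a handy sanity check when comparing against the output of the real database.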

## Development with Rust

```bash
# Install Rustup
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
. "$HOME/.cargo/env"

# Then install the rust-analyzer extension in VSCode.
# You may need to set "rust-analyzer.runnables.extraEnv" in the VSCode Machine settings.
# The value should look like {"PATH":""}; make sure that `/home/YOUR_NAME/.cargo/bin` is in it.
# Otherwise, pressing the `Run Test` button may fail.

# Run tests
# Add `-r` to test in release mode
cargo test
# Or you can click the 'Run Test' button in VSCode to show output.
# Our GitHub Actions will also run the tests.
```

Test the Python binding with `test-pyo3.py`.

```bash
# Install Python 3.10
brew install python@3.10
# or on Windows
scoop bucket add versions
scoop install python310

# Install uv.
# See https://github.com/astral-sh/uv for alternatives.
pip install uv
# or on Windows
scoop install uv

# Run the Python test
uv sync --reinstall-package lab_1806_vec_db
uv run ./test-pyo3.py

# Build the Python wheel for release
# This will be automatically run in GitHub Actions.
uv build
```

### Examples and Binaries

See also the binaries in `src/bin/` and the examples in `examples/`.

- `src/bin/convert_fvecs.rs`: Convert the fvecs format to the binary format.
- `src/bin/gen_ground_truth.rs`: Generate the ground truth for the query.
- `examples/bench.rs`: The benchmark for index algorithms.

Check the comments at the end of each source file for usage instructions.
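For orientation, the fvecs input format stores each vector as a little-endian 32-bit integer dimension `d` followed by `d` 32-bit floats. The sketch below reads and writes that layout in Python. It only covers the fvecs side: the project's own binary format is not described here, so it is not reproduced. `read_fvecs` and `write_fvecs` are hypothetical helper names, not part of this repository.

```python
import os
import struct
import tempfile


def read_fvecs(path):
    """Read all vectors from an .fvecs file into a list of float lists."""
    vectors = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) < 4:
                raise ValueError("truncated dimension header")
            (dim,) = struct.unpack("<i", header)
            if dim <= 0:
                raise ValueError(f"invalid dimension: {dim}")
            payload = f.read(4 * dim)
            if len(payload) != 4 * dim:
                raise ValueError("truncated vector record")
            vectors.append(list(struct.unpack(f"<{dim}f", payload)))
    return vectors


def write_fvecs(path, vectors):
    """Write vectors in .fvecs layout (int32 dimension, then float32 values)."""
    with open(path, "wb") as f:
        for vec in vectors:
            f.write(struct.pack("<i", len(vec)))
            f.write(struct.pack(f"<{len(vec)}f", *vec))


# Round-trip a tiny sample; the values are exactly representable as float32.
sample = [[1.0, 0.0, 0.5], [0.25, 2.0, -1.0]]
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.fvecs")
    write_fvecs(sample_path, sample)
    loaded = read_fvecs(sample_path)
print(loaded)
```

Reading record by record keeps the sketch simple; for a file the size of Gist1M, a memory-mapped or array-based reader would be a more practical choice.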

### Dataset

Download the Gist1M dataset from either source:

- Official: <http://corpus-texmex.irisa.fr/>
- Ours (**recommended**): faster to download and already converted to the binary format. We also provide a pre-built config file, ground truth, and HNSW index.

  <https://huggingface.co/datasets/pku-lab-1806-llm/gist-for-lab-1806-vec-db>

Then, you can run the examples to test the database.
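One common way to use ground truth when testing an approximate index is to compute recall@k, the share of exact nearest neighbors that the index also returns. The sketch below is an illustration under assumptions: it treats the ground truth as a list of exact neighbor ids per query, which may not match the layout of the provided files, and it does not claim to be what `examples/bench.rs` measures. `recall_at_k` is a hypothetical helper.

```python
def recall_at_k(approx_ids, true_ids, k):
    """Average fraction of the true top-k ids found in the approximate top-k."""
    if len(approx_ids) != len(true_ids):
        raise ValueError("both inputs must cover the same queries")
    if not true_ids:
        raise ValueError("at least one query is required")
    total = 0.0
    for approx, truth in zip(approx_ids, true_ids):
        # Count the overlap between the two top-k sets for this query.
        total += len(set(approx[:k]) & set(truth[:k])) / k
    return total / len(true_ids)


# Made-up ids for two queries, for illustration only.
ground_truth = [[0, 1, 2], [5, 4, 3]]
index_result = [[0, 2, 7], [5, 4, 3]]

score = recall_at_k(index_result, ground_truth, 3)
print(f"recall@3 = {score:.4f}")
```

In this made-up example the first query recovers two of three true neighbors and the second recovers all three, so the average is 5/6.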


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "lab-1806-vec-db",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.11,>=3.10",
    "maintainer_email": null,
    "keywords": null,
    "author": null,
    "author_email": null,
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": null,
    "version": "0.2.3",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b59555bec63e48e0f569bd3d1ae802dcc51e82e0c7b4fa9b101831ace196ee5f",
                "md5": "ea0a749732d39305580fa6cd99104e7d",
                "sha256": "4393b08215e08f34d2801f13feb612babb54d228120f9d8f1e60f554549425a1"
            },
            "downloads": -1,
            "filename": "lab_1806_vec_db-0.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "ea0a749732d39305580fa6cd99104e7d",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": "<3.11,>=3.10",
            "size": 399922,
            "upload_time": "2024-10-20T07:10:46",
            "upload_time_iso_8601": "2024-10-20T07:10:46.614357Z",
            "url": "https://files.pythonhosted.org/packages/b5/95/55bec63e48e0f569bd3d1ae802dcc51e82e0c7b4fa9b101831ace196ee5f/lab_1806_vec_db-0.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "afcebc0630f46823433a5aef7f06622fb14a202fbd95474254bc0931765bc2d5",
                "md5": "a72b9b06716d8c9301f05deaf5c1f154",
                "sha256": "bf312e494bdd00bc8c387dabc28009f332c460b9f9b3e4a098686215fec7ac9e"
            },
            "downloads": -1,
            "filename": "lab_1806_vec_db-0.2.3-cp310-none-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "a72b9b06716d8c9301f05deaf5c1f154",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": "<3.11,>=3.10",
            "size": 257845,
            "upload_time": "2024-10-20T07:10:48",
            "upload_time_iso_8601": "2024-10-20T07:10:48.430990Z",
            "url": "https://files.pythonhosted.org/packages/af/ce/bc0630f46823433a5aef7f06622fb14a202fbd95474254bc0931765bc2d5/lab_1806_vec_db-0.2.3-cp310-none-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-20 07:10:46",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "lab-1806-vec-db"
}
        