| Name | lab-1806-vec-db |
| Version | 0.2.3 |
| home_page | None |
| Summary | None |
| upload_time | 2024-10-20 07:10:46 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | <3.11,>=3.10 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# lab-1806-vec-db
Lab 1806 Vector Database.
## Usage with Python
```bash
# See https://pypi.org/project/lab-1806-vec-db/
pip install lab-1806-vec-db
```
```py
from lab_1806_vec_db import RagVecDB

# Create a database for 4-dimensional vectors.
db = RagVecDB(dim=4)

# Add vectors together with their metadata.
db.add([1.0, 0.0, 0.0, 0.0], {"content": "a"})
db.add([1.0, 0.0, 0.0, 0.1], {"content": "aa"})

db.add([0.0, 1.0, 0.0, 0.0], {"content": "b"})
db.add([0.0, 1.0, 0.0, 0.1], {"content": "bb"})

db.add([0.0, 0.0, 1.0, 0.0], {"content": "c"})
db.add([0.0, 0.0, 1.0, 0.1], {"content": "cc"})

# Save the database to disk, then load it back.
db.save("test_db.local.bin")
loaded_db = RagVecDB.load("test_db.local.bin")

# Query with a vector; each result is the metadata dict of a match.
for idx, metadata in enumerate(loaded_db.search([1.0, 0.0, 0.0, 0.0], 2)):
    print(idx, metadata["content"])
```
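To make the expected behavior of the search above easier to reason about, here is a minimal brute-force sketch in pure Python. It is not the library's implementation: the README does not state which distance metric or index `RagVecDB` uses, so the Euclidean distance and the helper name `brute_force_search` are assumptions made for illustration only.

```python
import math


def brute_force_search(vectors, metadata, query, k):
    """Return the metadata of the k stored vectors closest to `query`.

    Hypothetical helper: uses Euclidean distance, which is an assumption,
    not necessarily the metric used by lab-1806-vec-db.
    """
    if len(vectors) != len(metadata):
        raise ValueError("vectors and metadata must have the same length")
    # Rank every stored vector by its distance to the query.
    ranked = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))
    return [metadata[i] for i in ranked[:k]]


# The same six vectors as in the usage example.
vectors = [
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.1],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.1],
]
metadata = [{"content": c} for c in ["a", "aa", "b", "bb", "c", "cc"]]

results = brute_force_search(vectors, metadata, [1.0, 0.0, 0.0, 0.0], 2)
for idx, item in enumerate(results):
    print(idx, item["content"])
```

Under this assumption the two closest entries to the query are `a` and `aa`, which gives a simple reference point when checking the output of the real database.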
## Development with Rust
```bash
# Install Rustup
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
. "$HOME/.cargo/env"
# Then install the rust-analyzer extension in VSCode.
# You may need to set "rust-analyzer.runnables.extraEnv" in the VSCode Machine settings.
# The value should look like {"PATH":""}; make sure that `/home/YOUR_NAME/.cargo/bin` is included in it.
# Otherwise the `Run Test` button may fail.
# Run tests
# Add `-r` to test in release mode
cargo test
# Or click the `Run Test` button in VSCode to show the output.
# Our GitHub Actions also run the tests.
```
Test the Python bindings with `test-pyo3.py`.
```bash
# Install Python 3.10 (macOS, via Homebrew)
brew install python@3.10
# or on Windows
scoop bucket add versions
scoop install python310
# Install uv.
# See https://github.com/astral-sh/uv for alternatives.
pip install uv
# or on Windows
scoop install uv
# Run the Python test
uv sync --reinstall-package lab_1806_vec_db
uv run ./test-pyo3.py
# Build the Python wheel for release
# This also runs automatically in GitHub Actions.
uv build
```
### Examples and Binaries
See also the binaries in `src/bin/` and the examples in `examples/`.
- `src/bin/convert_fvecs.rs`: Convert the fvecs format to the binary format.
- `src/bin/gen_ground_truth.rs`: Generate the ground truth for the queries.
- `examples/bench.rs`: Benchmark the index algorithms.
Usage is documented in the comments at the end of each source file.
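For readers unfamiliar with the fvecs input that `convert_fvecs.rs` consumes, the sketch below reads it in pure Python. In fvecs, each vector is stored as a little-endian 32-bit integer giving the dimension, followed by that many little-endian 32-bit floats. The function names `read_fvecs` and `write_fvecs` are hypothetical, and this sketch says nothing about the project's own binary output format, which the README does not describe.

```python
import struct


def read_fvecs(data: bytes) -> list[list[float]]:
    """Parse fvecs-encoded bytes into a list of float vectors (illustrative helper)."""
    vectors = []
    offset = 0
    while offset < len(data):
        # Each record starts with the dimension as a little-endian int32.
        (dim,) = struct.unpack_from("<i", data, offset)
        offset += 4
        if dim <= 0:
            raise ValueError(f"invalid dimension {dim}")
        if offset + 4 * dim > len(data):
            raise ValueError("truncated fvecs data")
        # The dimension is followed by `dim` little-endian float32 values.
        vectors.append(list(struct.unpack_from(f"<{dim}f", data, offset)))
        offset += 4 * dim
    return vectors


def write_fvecs(vectors: list[list[float]]) -> bytes:
    """Encode vectors as fvecs bytes; used here only to build a test input."""
    out = bytearray()
    for vec in vectors:
        out += struct.pack("<i", len(vec))
        out += struct.pack(f"<{len(vec)}f", *vec)
    return bytes(out)


# Round-trip a small sample whose values are exactly representable in float32.
sample = [[1.0, 0.0, 0.5], [0.25, 2.0, -1.0]]
encoded = write_fvecs(sample)
decoded = read_fvecs(encoded)
print(decoded)
```

For a real dataset file you would pass the contents of the file, for example `read_fvecs(open(path, "rb").read())`, although loading a file as large as GIST1M this way uses far more memory than a streaming or NumPy-based reader.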
### Dataset
Download the GIST1M dataset from:
- Official: <http://corpus-texmex.irisa.fr/>
- Ours (**recommended**): faster to download and already converted to the binary format. We also provide a pre-built config file, ground truth, and HNSW index.
  <https://huggingface.co/datasets/pku-lab-1806-llm/gist-for-lab-1806-vec-db>
Then, you can run the examples to test the database.
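Ground truth is typically used to score an approximate index such as HNSW by recall. The README does not say how `examples/bench.rs` reports quality, so the following is only a sketch of the common recall@k metric; the function name `recall_at_k` and the sample IDs are made up for illustration.

```python
def recall_at_k(approx_ids, truth_ids, k):
    """Fraction of the true top-k neighbors that the approximate search returned.

    `approx_ids` and `truth_ids` each hold one list of neighbor IDs per query.
    Hypothetical helper, not part of lab-1806-vec-db.
    """
    if not truth_ids:
        raise ValueError("at least one query is required")
    if len(approx_ids) != len(truth_ids):
        raise ValueError("both inputs must cover the same queries")
    hits = 0
    for approx, truth in zip(approx_ids, truth_ids):
        # Count the overlap between the two top-k sets, ignoring order.
        hits += len(set(approx[:k]) & set(truth[:k]))
    return hits / (k * len(truth_ids))


# Two queries: the first result list misses one true neighbor, the second is exact.
truth = [[0, 1, 2], [3, 4, 5]]
approx = [[0, 2, 7], [3, 4, 5]]
score = recall_at_k(approx, truth, 3)
print(f"recall@3 = {score:.3f}")
```

Here five of the six true neighbors are recovered, so the recall is 5/6.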
Raw data
{
"_id": null,
"home_page": null,
"name": "lab-1806-vec-db",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.11,>=3.10",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": null,
"download_url": null,
"platform": null,
"description": "# lab-1806-vec-db\n\nLab 1806 Vector Database.\n\n## Usage with Python\n\n```bash\n# See https://pypi.org/project/lab-1806-vec-db/\npip install lab-1806-vec-db\n```\n\n```py\nfrom lab_1806_vec_db import RagVecDB\n\ndb = RagVecDB(dim=4)\n\ndb.add([1.0, 0.0, 0.0, 0.0], {\"content\": \"a\"})\ndb.add([1.0, 0.0, 0.0, 0.1], {\"content\": \"aa\"})\n\ndb.add([0.0, 1.0, 0.0, 0.0], {\"content\": \"b\"})\ndb.add([0.0, 1.0, 0.0, 0.1], {\"content\": \"bb\"})\n\ndb.add([0.0, 0.0, 1.0, 0.0], {\"content\": \"c\"})\ndb.add([0.0, 0.0, 1.0, 0.1], {\"content\": \"cc\"})\n\ndb.save(\"test_db.local.bin\")\n\nloaded_db = RagVecDB.load(\"test_db.local.bin\")\n\nfor idx, metadata in enumerate(loaded_db.search([1.0, 0.0, 0.0, 0.0], 2)):\n print(idx, metadata[\"content\"])\n```\n\n## Development with Rust\n\n```bash\n# Install Rustup\ncurl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh\n. \"$HOME/.cargo/env\"\n\n# Then install the rust-analyzer extension in VSCode.\n# You may need to set \"rust-analyzer.runnables.extraEnv\" in VSCode Machine settings.\n# The value should be like {\"PATH\":\"\"} and make sure that `/home/YOUR_NAME/.cargo/bin` is in it.\n# Otherwise you may fail when press the `Run test` button.\n\n# Run tests\n# Add `-r` to test with release mode\ncargo test\n# Or you can click the 'Run Test' button in VSCode to show output.\n# Our GitHub Actions will also run the tests.\n```\n\nTest the python binding with `test-pyo3.py`.\n\n```bash\n# Install Python 3.10\nbrew install python@3.10\n# or on Windows\nscoop bucket add versions\nscoop install python310\n\n# Install uv.\n# See https://github.com/astral-sh/uv for alternatives.\npip install uv\n# or on Windows\nscoop install uv\n\n# Run the Python test\nuv sync --reinstall-package lab_1806_vec_db\nuv run ./test-pyo3.py\n\n# Build the Python Wheel Release\n# This will be automatically run in GitHub Actions.\nuv build\n```\n\n### Examples Binaries\n\nSee also the Binaries at `src/bin/`, and the Examples at 
`examples/`.\n\n- `src/bin/convert_fvecs.rs`: Convert the fvecs format to the binary format.\n- `src/bin/gen_ground_truth.rs`: Generate the ground truth for the query.\n- `examples/bench.rs`: The benchmark for index algorithms.\n\nCheck the comments at the end of the source files for the usage.\n\n### Dataset\n\nDownload Gist1M dataset from:\n\n- Official: <http://corpus-texmex.irisa.fr/>\n- Ours: **Recommended** faster, and already converted to the binary format. We also provide pre-built config file & ground truth & HNSW index.\n\n <https://huggingface.co/datasets/pku-lab-1806-llm/gist-for-lab-1806-vec-db>\n\nThen, you can run the examples to test the database.\n\n",
"bugtrack_url": null,
"license": null,
"summary": null,
"version": "0.2.3",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "b59555bec63e48e0f569bd3d1ae802dcc51e82e0c7b4fa9b101831ace196ee5f",
"md5": "ea0a749732d39305580fa6cd99104e7d",
"sha256": "4393b08215e08f34d2801f13feb612babb54d228120f9d8f1e60f554549425a1"
},
"downloads": -1,
"filename": "lab_1806_vec_db-0.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "ea0a749732d39305580fa6cd99104e7d",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": "<3.11,>=3.10",
"size": 399922,
"upload_time": "2024-10-20T07:10:46",
"upload_time_iso_8601": "2024-10-20T07:10:46.614357Z",
"url": "https://files.pythonhosted.org/packages/b5/95/55bec63e48e0f569bd3d1ae802dcc51e82e0c7b4fa9b101831ace196ee5f/lab_1806_vec_db-0.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "afcebc0630f46823433a5aef7f06622fb14a202fbd95474254bc0931765bc2d5",
"md5": "a72b9b06716d8c9301f05deaf5c1f154",
"sha256": "bf312e494bdd00bc8c387dabc28009f332c460b9f9b3e4a098686215fec7ac9e"
},
"downloads": -1,
"filename": "lab_1806_vec_db-0.2.3-cp310-none-win_amd64.whl",
"has_sig": false,
"md5_digest": "a72b9b06716d8c9301f05deaf5c1f154",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": "<3.11,>=3.10",
"size": 257845,
"upload_time": "2024-10-20T07:10:48",
"upload_time_iso_8601": "2024-10-20T07:10:48.430990Z",
"url": "https://files.pythonhosted.org/packages/af/ce/bc0630f46823433a5aef7f06622fb14a202fbd95474254bc0931765bc2d5/lab_1806_vec_db-0.2.3-cp310-none-win_amd64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-20 07:10:46",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "lab-1806-vec-db"
}