# NSeekFS
**High-Performance Exact Vector Search with Rust Backend**
Fast and exact cosine similarity search for Python. Built with Rust for performance, designed for production use.
```bash
pip install nseekfs
```
## Quick Start
```python
import nseekfs
import numpy as np
# Create some test vectors
embeddings = np.random.randn(10000, 384).astype(np.float32)
query = np.random.randn(384).astype(np.float32)
# Build index and run a search
index = nseekfs.from_embeddings(embeddings, normalized=True)
results = index.query(query, top_k=10)
print(f"Found {len(results)} results")
print(f"Best match: idx={results[0]['idx']} score={results[0]['score']:.3f}")
```
## Core Features
### Exact Search
```python
# Basic query
results = index.query(query, top_k=10)
# Access results
for item in results:
    print(f"Vector {item['idx']}: {item['score']:.6f}")
```
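For intuition about what "exact" means here, the sketch below is a plain NumPy brute-force search that scores the query against every vector and returns the best matches. It is a reference illustration only, not NSeekFS's implementation; the function name `brute_force_cosine_top_k` is hypothetical, and the result shape simply mirrors the `idx`/`score` dictionaries shown above.

```python
import numpy as np

def brute_force_cosine_top_k(embeddings, query, top_k=10):
    """Reference exact search: cosine similarity against every row."""
    # Normalize rows and query so that cosine similarity becomes a dot product.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = emb @ q
    top_k = min(top_k, len(scores))
    # argpartition selects the top_k largest scores without a full sort ...
    candidates = np.argpartition(-scores, top_k - 1)[:top_k]
    # ... and only those candidates are sorted, best first.
    order = candidates[np.argsort(-scores[candidates])]
    return [{"idx": int(i), "score": float(scores[i])} for i in order]

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 64)).astype(np.float32)
query = embeddings[42].copy()  # a query identical to row 42
results = brute_force_cosine_top_k(embeddings, query, top_k=5)
print(results[0])
```

Because every vector is scored, the result is exact rather than approximate; a baseline like this can be handy for sanity-checking results on small datasets.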
### Batch Queries
```python
queries = np.random.randn(50, 384).astype(np.float32)
batch_results = index.query_batch(queries, top_k=5)
print(f"Processed {len(batch_results)} queries")
```
### Query Options
```python
# Simple query (alias for query with format="simple")
results = index.query_simple(query, top_k=10)
# Detailed query with timing and diagnostics
result = index.query_detailed(query, top_k=10)
print(f"Query took {result.query_time_ms:.2f} ms, top1 idx={result.results[0]['idx']}")
```
### Index Persistence
```python
# Build and save index
index = nseekfs.from_embeddings(embeddings, normalized=True)
print("Index saved at:", index.index_path)
# Later, reload from file
index2 = nseekfs.from_bin(index.index_path)
print(f"Reloaded index: {index2.rows} vectors x {index2.dims} dims")
```
### Performance Metrics
```python
metrics = index.get_performance_metrics()
print(f"Total queries: {metrics['total_queries']}")
print(f"Average time: {metrics['avg_query_time_ms']:.2f} ms")
```
### Built-in Benchmark
```python
nseekfs.benchmark(vectors=1000, dims=384, queries=100, verbose=True)
```
## API Reference
### Index
* `from_embeddings(embeddings, normalized=True, verbose=False)`
* `from_bin(path)`
### Queries
* `query(query_vector, top_k=10)`
* `query_simple(query_vector, top_k=10)`
* `query_detailed(query_vector, top_k=10)`
* `query_batch(queries, top_k=10)`
### Properties
* `index.rows`
* `index.dims`
* `index.config`
### Utilities
* `get_performance_metrics()`
* `benchmark(vectors=..., dims=..., queries=...)`
## Architecture Highlights
### SIMD Optimizations
- AVX2 support: each 256-bit instruction processes 8 float32 values on compatible CPUs
- Automatic fallback to scalar operations on older hardware
- Runtime detection of CPU capabilities
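
To illustrate where the factor of 8 comes from, here is a small NumPy sketch that mimics the structure of an AVX2 dot-product loop: eight independent accumulator lanes, a final horizontal sum, and a scalar tail for leftover elements. It only models the access pattern in Python and says nothing about how the Rust backend is actually written; `dot_8_lanes` is a hypothetical name.

```python
import numpy as np

LANES = 8  # a 256-bit AVX2 register holds 8 float32 values

def dot_8_lanes(a, b):
    """Dot product that accumulates in 8 independent lanes, like an AVX2 loop."""
    n = len(a) - len(a) % LANES          # largest multiple of 8 that fits
    acc = np.zeros(LANES, dtype=np.float32)
    for start in range(0, n, LANES):
        # Models one vector step: 8 multiplies and 8 adds together.
        acc += a[start:start + LANES] * b[start:start + LANES]
    total = float(acc.sum())             # horizontal sum of the 8 lanes
    for i in range(n, len(a)):           # scalar tail for leftover elements
        total += float(a[i]) * float(b[i])
    return total

rng = np.random.default_rng(1)
a = rng.standard_normal(387).astype(np.float32)  # 387 is not a multiple of 8
b = rng.standard_normal(387).astype(np.float32)
print(dot_8_lanes(a, b), float(np.dot(a, b)))
```

The two printed values agree up to float32 rounding, since the lane-wise loop only changes the order in which the products are summed.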
### Memory Management
- Memory mapping for efficient data access
- Thread-local buffers for zero-allocation queries
- Cache-aligned data structures for optimal performance
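
As a rough picture of what memory mapping buys you, the sketch below writes float32 vectors to a file and maps them back with `np.memmap`, so that only the pages actually accessed are loaded. The raw, header-less file layout and the file name are assumptions made for this example and are not the NSeekFS index format.

```python
import os
import tempfile
import numpy as np

rows, dims = 1000, 384
vectors = np.random.default_rng(2).standard_normal((rows, dims)).astype(np.float32)

# Write raw float32 data to a file (hypothetical layout: no header, row-major).
path = os.path.join(tempfile.mkdtemp(), "vectors.f32")
vectors.tofile(path)

# Map the file instead of reading it: pages are loaded lazily on first access.
mapped = np.memmap(path, dtype=np.float32, mode="r", shape=(rows, dims))

# Only the pages backing row 123 need to be touched here.
row = np.asarray(mapped[123])
print(row.shape, bool(np.array_equal(row, vectors[123])))
```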
### Batch Processing
- Intelligent batching strategies based on query size
- SIMD vectorization across multiple queries
- Optimized memory access patterns
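
One way to see why batching helps: scoring many queries at once turns a series of matrix-vector products into a single matrix-matrix product, which reuses each loaded vector across all queries. The NumPy sketch below shows that idea under the assumption that all rows are already unit length; `batch_top_k` is a hypothetical helper, not part of the NSeekFS API.

```python
import numpy as np

def batch_top_k(embeddings, queries, top_k=5):
    """Score all queries at once; assumes rows of both arrays are unit length."""
    # One matrix multiplication yields a (num_queries, num_vectors) score table.
    scores = queries @ embeddings.T
    top_k = min(top_k, scores.shape[1])
    # Partition each row to find its top_k columns, then sort just those.
    part = np.argpartition(-scores, top_k - 1, axis=1)[:, :top_k]
    part_scores = np.take_along_axis(scores, part, axis=1)
    order = np.argsort(-part_scores, axis=1)
    idx = np.take_along_axis(part, order, axis=1)
    return idx, np.take_along_axis(scores, idx, axis=1)

rng = np.random.default_rng(3)
embeddings = rng.standard_normal((500, 32)).astype(np.float32)
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)
queries = embeddings[:3]  # three queries that match rows 0, 1 and 2
idx, scores = batch_top_k(embeddings, queries, top_k=5)
print(idx[:, 0])
```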
## Installation
```bash
# From PyPI
pip install nseekfs
# Verify installation
python -c "import nseekfs; print('NSeekFS installed successfully')"
```
## Technical Details
- **Precision**: Float32 optimized for standard ML embeddings
- **Memory**: Efficient memory usage with optimized data structures
- **Performance**: Rust backend with SIMD optimizations where available
- **Compatibility**: Python 3.8+ on Windows, macOS, and Linux
- **Thread Safety**: Safe concurrent access from multiple threads
## Performance Tips
```python
# Use float32 before building the index
embeddings = embeddings.astype(np.float32)
# Pre-normalize vectors when using cosine similarity
embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
index = nseekfs.from_embeddings(embeddings, normalized=True)
# Request only as many results as you need
results = index.query(query, top_k=10)  # rather than top_k=1000
# Use batch processing for multiple queries
batch_results = index.query_batch(queries, top_k=10)
```
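Dividing by the row norm, as in the tips above, produces NaN for any all-zero row. A minimal sketch of a safer helper follows; `normalize_rows` and its `eps` parameter are hypothetical names introduced for this example. It also checks that the dot product of unit vectors equals the cosine similarity of the originals, which is why pre-normalizing is useful.

```python
import numpy as np

def normalize_rows(x, eps=1e-12):
    """Return a float32 copy of x with unit-length rows; zero rows stay zero."""
    x = np.asarray(x, dtype=np.float32)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    # Clamp the divisor so all-zero rows divide by eps instead of 0.
    return x / np.maximum(norms, eps)

a = np.array([[3.0, 4.0], [0.0, 0.0], [1.0, 1.0]])
unit = normalize_rows(a)

# Cosine similarity of the raw rows versus dot product of the unit rows.
cosine = (a[0] @ a[2]) / (np.linalg.norm(a[0]) * np.linalg.norm(a[2]))
print(float(unit[0] @ unit[2]), float(cosine))
```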
## License
MIT License. See the LICENSE file for details.
---
**Fast, exact cosine similarity search for Python.**
*Built with Rust for performance, designed for Python developers.*
Raw data
{
"_id": null,
"home_page": null,
"name": "nseekfs",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "Diogo Novo <contact@nseek.io>",
"keywords": "vector, similarity, search, rust, machine-learning, embeddings",
"author": null,
"author_email": "Diogo Novo <contact@nseek.io>",
"download_url": "https://files.pythonhosted.org/packages/51/4e/e12a0e45869035336583808990a7f0abb992915a48432dcefee1ee7dfd1c/nseekfs-1.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "High-performance exact vector similarity search with Rust backend",
"version": "1.0.2",
"project_urls": {
"Documentation": "https://github.com/NSeek-AI/nseekfs/wiki",
"Homepage": "https://github.com/NSeek-AI/nseekfs",
"Repository": "https://github.com/NSeek-AI/nseekfs.git"
},
"split_keywords": [
"vector",
" similarity",
" search",
" rust",
" machine-learning",
" embeddings"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "985c656d8bd385de5a8134a1fdfccaaf87144d0cb5918050e8d161dfb4c9d48e",
"md5": "7aebee3b00e023b10da3bd94515aaa5e",
"sha256": "2abe662f51357063a12e5e308ff9604c736bab1aca020a2170088a53431fc317"
},
"downloads": -1,
"filename": "nseekfs-1.0.2-cp38-abi3-macosx_10_12_x86_64.whl",
"has_sig": false,
"md5_digest": "7aebee3b00e023b10da3bd94515aaa5e",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.8",
"size": 301925,
"upload_time": "2025-09-07T09:44:03",
"upload_time_iso_8601": "2025-09-07T09:44:03.365472Z",
"url": "https://files.pythonhosted.org/packages/98/5c/656d8bd385de5a8134a1fdfccaaf87144d0cb5918050e8d161dfb4c9d48e/nseekfs-1.0.2-cp38-abi3-macosx_10_12_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "ef1b70eb20852059779e31b803a08429d6fef4e85850e9d6617fbbcb922fbccd",
"md5": "09b980c68e927c34a77000129747042b",
"sha256": "543afe5e6479241399618c456a51014a60c7262296fdabb29594201d37039c6d"
},
"downloads": -1,
"filename": "nseekfs-1.0.2-cp38-abi3-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "09b980c68e927c34a77000129747042b",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.8",
"size": 278390,
"upload_time": "2025-09-07T09:44:05",
"upload_time_iso_8601": "2025-09-07T09:44:05.110884Z",
"url": "https://files.pythonhosted.org/packages/ef/1b/70eb20852059779e31b803a08429d6fef4e85850e9d6617fbbcb922fbccd/nseekfs-1.0.2-cp38-abi3-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d285e267b195b654fa2e278f03fcfbfeaa6a401682b8fc31ad6855e86dc2adeb",
"md5": "cd2c3ab4eb55718ef4c7ed7178093273",
"sha256": "58f2eb7c2c8dab8da71b5ded71ea207065da302ff71c70d592aa39f470e221d6"
},
"downloads": -1,
"filename": "nseekfs-1.0.2-cp38-abi3-manylinux_2_34_x86_64.whl",
"has_sig": false,
"md5_digest": "cd2c3ab4eb55718ef4c7ed7178093273",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.8",
"size": 351905,
"upload_time": "2025-09-07T09:44:06",
"upload_time_iso_8601": "2025-09-07T09:44:06.141976Z",
"url": "https://files.pythonhosted.org/packages/d2/85/e267b195b654fa2e278f03fcfbfeaa6a401682b8fc31ad6855e86dc2adeb/nseekfs-1.0.2-cp38-abi3-manylinux_2_34_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "0ce0c06d62ffec0bdae5c51c17308c6d1609692afa467117b616188959ad1ebc",
"md5": "1bb438f2318204328adb81c2f2e8798d",
"sha256": "cc8fd265f7e1f0d65cde68402ac6381ac57ec3eccb5cdfd92de0e1108ad53458"
},
"downloads": -1,
"filename": "nseekfs-1.0.2-cp38-abi3-win_amd64.whl",
"has_sig": false,
"md5_digest": "1bb438f2318204328adb81c2f2e8798d",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.8",
"size": 230851,
"upload_time": "2025-09-07T09:44:07",
"upload_time_iso_8601": "2025-09-07T09:44:07.661548Z",
"url": "https://files.pythonhosted.org/packages/0c/e0/c06d62ffec0bdae5c51c17308c6d1609692afa467117b616188959ad1ebc/nseekfs-1.0.2-cp38-abi3-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "514ee12a0e45869035336583808990a7f0abb992915a48432dcefee1ee7dfd1c",
"md5": "15dbead451b0a9e2bedf4e8cf10cfcdc",
"sha256": "82c2664545a19c191a75000fc18205bea6f142f7f7f43be6dc089e0b144e2d06"
},
"downloads": -1,
"filename": "nseekfs-1.0.2.tar.gz",
"has_sig": false,
"md5_digest": "15dbead451b0a9e2bedf4e8cf10cfcdc",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 86845,
"upload_time": "2025-09-07T09:44:08",
"upload_time_iso_8601": "2025-09-07T09:44:08.576051Z",
"url": "https://files.pythonhosted.org/packages/51/4e/e12a0e45869035336583808990a7f0abb992915a48432dcefee1ee7dfd1c/nseekfs-1.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-07 09:44:08",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "NSeek-AI",
"github_project": "nseekfs",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "nseekfs"
}