# PyNSG
[](https://github.com/twuebker/nsg/blob/master/LICENSE.lesser)
[](https://www.python.org/downloads/)
Python bindings for **Navigating Spreading-Out Graph (NSG)** - a fast and memory-efficient approximate nearest neighbor search algorithm.
## About NSG
NSG is a graph-based approximate nearest neighbor search algorithm that provides excellent search performance with low memory overhead. This library provides Python bindings for the original C++ implementation.
**Original Paper**: *Fast Approximate Nearest Neighbor Search with Navigating Spreading-out Graph* by Cong Fu, Chao Xiang, Changxu Wang, and Deng Cai.
## Credits
This package provides Python bindings for the original NSG implementation:
- **Original NSG Repository**: [ZJULearning/nsg](https://github.com/ZJULearning/nsg)
- **Original Authors**: Cong Fu, Chao Xiang, Changxu Wang, Deng Cai
- **Python Bindings**: Created to enable easy integration with Python-based machine learning workflows (such as the ANN benchmarks)
## Hardware Requirements
The underlying CPP implementation of NSG requires both OpenMP and AVX2.
## Installation
The knn extension requires faiss but provides an easy means of generating a knn graph in python.
```bash
pip install pynsg
pip install pynsg[knn]
```
## Quick Start
```python
from pynsg import NSG, Metric
nsg = NSG(dimension=128, num_points=1000, metric=Metric.L2)
# Build the index (requires a k-NN graph file - see below)
nsg.build_index(data, "knn_graph.graph", L=40, R=50, C=500)
k = 10
results = nsg.search(queries, data, k, search_L=100)
# Save and load index
nsg.save_index("my_index.nsg")
nsg2 = NSG(128, 1000, Metric.L2)
nsg2.load_index("my_index.nsg")
```
## Optimized Search
The normal search functions above are recommended for low memory scenarios. The latter search yields better performance.
```python
import numpy as np
from pynsg import NSG, Metric
nsg = NSG(dimension=base_data.shape[1],
num_points=len(base_data),
metric=Metric.L2)
nsg.build_index(base_data, "knn_graph.graph", L=40, R=50, C=500)
nsg.optimize_graph(base_data)
k = 10
results = nsg.search_opt(queries, k, search_L=100)
```
## API Reference
### NSG Class
```python
NSG(dimension, num_points, metric)
```
**Parameters:**
- `dimension` (int): Dimensionality of the vectors
- `num_points` (int): Number of points in the dataset
- `metric` (Metric): Distance metric (Currently, only L2 is supported. Note that cosine similarity produces ranking identical to L2 on normalized vectors)
**Methods:**
- `build_index(data, knn_graph_path, L, R, C)`: Build the NSG index
- `search(queries, data, k, search_L)`: Search for k nearest neighbors
- `search_opt(queries, k, search_L)`: Optimized search (requires optimize_graph)
- `optimize_graph(data)`: Optimize the graph structure for faster search
- `save_index(path)`: Save the index to disk
- `load_index(path)`: Load an index from disk
### Metrics
Available distance metrics:
- `Metric.L2`: Standard L2 (Euclidean) distance
> [!NOTE]
> While cosine similarity is not directly exposed, it can be computed by first normalizing the vectors and then using L2.
## Requirements
- Python 3.6+
- NumPy >= 1.16.0
- A k-NN graph file (can be generated using tools like FAISS or other ANN libraries)
## Generating k-NN Graphs
NSG requires an approximate k-NN graph written to a file in a specific format as input for building the index. There are a number of ways to obtain such a graph, for example using efanna_graph (recommended by the authors of the paper), which is only available in cpp. In python, you can use faiss, or another algorithm of your choice.
For convenience, if you install the extension of this package using `pip install pynsg[knn]`, faiss will be installed and you can use the function `create_graph_file` that uses faiss' hnsw index to quickly build an approximate knn graph. It uses OpenMP.
This function is also exposed as a command line utility `nsg-build-knn`.
```python
from pynsg import create_graph_file
create_graph_file(filename="test200.graph", x=X, k=200, hnsw_efConstruction=200, hnsw_efSearch=128, hnsw_M=32, use_omp=True)
```
## License
PyNSG is MIT-licensed.
The original NSG algorithm and implementation are credited to the authors of the ZJULearning/nsg repository.
## Citation
If you use this library in your research, please cite the original NSG paper:
```bibtex
@article{FuNSG17,
author = {Cong Fu and Chao Xiang and Changxu Wang and Deng Cai},
title = {Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graphs},
journal = {{PVLDB}},
volume = {12},
number = {5},
pages = {461 - 474},
year = {2019},
url = {http://www.vldb.org/pvldb/vol12/p461-fu.pdf},
doi = {10.14778/3303753.3303754}
}
```
Raw data
{
"_id": null,
"home_page": "https://github.com/twuebker/nsg",
"name": "pynsg",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "nsg graph nearest neighbor search ann",
"author": "Theodor Wuebker",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/48/b3/be1c5d6d52b03a47a88fcd0375e9859747fee4f78eabadb4210ab7cc4c26/pynsg-0.1.3.tar.gz",
"platform": null,
"description": "# PyNSG\n\n[](https://github.com/twuebker/nsg/blob/master/LICENSE.lesser)\n[](https://www.python.org/downloads/)\n\nPython bindings for **Navigating Spreading-Out Graph (NSG)** - a fast and memory-efficient approximate nearest neighbor search algorithm.\n\n## About NSG\n\nNSG is a graph-based approximate nearest neighbor search algorithm that provides excellent search performance with low memory overhead. This library provides Python bindings for the original C++ implementation.\n\n**Original Paper**: *Fast Approximate Nearest Neighbor Search with Navigating Spreading-out Graph* by Cong Fu, Chao Xiang, Changxu Wang, and Deng Cai.\n\n## Credits\n\nThis package provides Python bindings for the original NSG implementation:\n- **Original NSG Repository**: [ZJULearning/nsg](https://github.com/ZJULearning/nsg)\n- **Original Authors**: Cong Fu, Chao Xiang, Changxu Wang, Deng Cai\n- **Python Bindings**: Created to enable easy integration with Python-based machine learning workflows (such as the ANN benchmarks)\n\n## Hardware Requirements\n\nThe underlying CPP implementation of NSG requires both OpenMP and AVX2.\n\n## Installation\n\nThe knn extension requires faiss but provides an easy means of generating a knn graph in python.\n\n```bash\npip install pynsg\npip install pynsg[knn]\n```\n\n## Quick Start\n\n```python\nfrom pynsg import NSG, Metric\n\nnsg = NSG(dimension=128, num_points=1000, metric=Metric.L2)\n\n# Build the index (requires a k-NN graph file - see below)\nnsg.build_index(data, \"knn_graph.graph\", L=40, R=50, C=500)\nk = 10\nresults = nsg.search(queries, data, k, search_L=100)\n\n# Save and load index\nnsg.save_index(\"my_index.nsg\")\nnsg2 = NSG(128, 1000, Metric.L2)\nnsg2.load_index(\"my_index.nsg\")\n```\n\n## Optimized Search \nThe normal search functions above are recommended for low memory scenarios. The latter search yields better performance.\n\n```python\nimport numpy as np\nfrom pynsg import NSG, Metric\n\nnsg = NSG(dimension=base_data.shape[1], \n num_points=len(base_data), \n metric=Metric.L2)\n\nnsg.build_index(base_data, \"knn_graph.graph\", L=40, R=50, C=500)\nnsg.optimize_graph(base_data)\nk = 10\nresults = nsg.search_opt(queries, k, search_L=100)\n```\n## API Reference\n\n### NSG Class\n\n```python\nNSG(dimension, num_points, metric)\n```\n\n**Parameters:**\n- `dimension` (int): Dimensionality of the vectors\n- `num_points` (int): Number of points in the dataset\n- `metric` (Metric): Distance metric (Currently, only L2 is supported. Note that cosine similarity produces ranking identical to L2 on normalized vectors)\n\n**Methods:**\n- `build_index(data, knn_graph_path, L, R, C)`: Build the NSG index\n- `search(queries, data, k, search_L)`: Search for k nearest neighbors\n- `search_opt(queries, k, search_L)`: Optimized search (requires optimize_graph)\n- `optimize_graph(data)`: Optimize the graph structure for faster search\n- `save_index(path)`: Save the index to disk\n- `load_index(path)`: Load an index from disk\n\n### Metrics\n\nAvailable distance metrics:\n- `Metric.L2`: Standard L2 (Euclidean) distance\n> [!NOTE]\n> While cosine similarity is not directly exposed, it can be computed by first normalizing the vectors and then using L2.\n\n## Requirements\n\n- Python 3.6+\n- NumPy >= 1.16.0\n- A k-NN graph file (can be generated using tools like FAISS or other ANN libraries)\n\n## Generating k-NN Graphs\n\nNSG requires an approximate k-NN graph written to a file in a specific format as input for building the index. There are a number of ways to obtain such a graph, for example using efanna_graph (recommended by the authors of the paper), which is only available in cpp. In python, you can use faiss, or another algorithm of your choice.\nFor convenience, if you install the extension of this package using `pip install pynsg[knn]`, faiss will be installed and you can use the function `create_graph_file` that uses faiss' hnsw index to quickly build an approximate knn graph. It uses OpenMP. \n\nThis function is also exposed as a command line utility `nsg-build-knn`.\n\n```python\nfrom pynsg import create_graph_file\n\ncreate_graph_file(filename=\"test200.graph\", x=X, k=200, hnsw_efConstruction=200, hnsw_efSearch=128, hnsw_M=32, use_omp=True) \n```\n\n## License\n\nPyNSG is MIT-licensed.\nThe original NSG algorithm and implementation are credited to the authors of the ZJULearning/nsg repository.\n\n## Citation\n\nIf you use this library in your research, please cite the original NSG paper:\n\n```bibtex\n@article{FuNSG17,\n author = {Cong Fu and Chao Xiang and Changxu Wang and Deng Cai},\n title = {Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graphs},\n journal = {{PVLDB}},\n volume = {12},\n number = {5},\n pages = {461 - 474},\n year = {2019},\n url = {http://www.vldb.org/pvldb/vol12/p461-fu.pdf},\n doi = {10.14778/3303753.3303754}\n}\n```\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Python bindings for Navigating Spreading-Out Graph (NSG)",
"version": "0.1.3",
"project_urls": {
"Bug Reports": "https://github.com/twuebker/nsg/issues",
"Homepage": "https://github.com/twuebker/nsg",
"Original NSG": "https://github.com/ZJULearning/nsg",
"Source": "https://github.com/twuebker/nsg"
},
"split_keywords": [
"nsg",
"graph",
"nearest",
"neighbor",
"search",
"ann"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "48b3be1c5d6d52b03a47a88fcd0375e9859747fee4f78eabadb4210ab7cc4c26",
"md5": "e10023a2001bcff532e8aa70c00262c1",
"sha256": "d1c05f532285a561c12a9da9420279aa61c834eaabc38c127d56036c319ae8a6"
},
"downloads": -1,
"filename": "pynsg-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "e10023a2001bcff532e8aa70c00262c1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 24480,
"upload_time": "2025-09-01T17:39:56",
"upload_time_iso_8601": "2025-09-01T17:39:56.669610Z",
"url": "https://files.pythonhosted.org/packages/48/b3/be1c5d6d52b03a47a88fcd0375e9859747fee4f78eabadb4210ab7cc4c26/pynsg-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-01 17:39:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "twuebker",
"github_project": "nsg",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "pynsg"
}