# d-vecDB Python Client
A comprehensive Python client library for [d-vecDB](https://github.com/rdmurugan/d-vecDB), providing both synchronous and asynchronous interfaces for vector database operations.
## 🚀 **Features**
### **Multi-Protocol Support**
- **REST API** via HTTP/HTTPS with connection pooling
- **gRPC** for high-performance binary protocol communication
- **Auto-detection** with intelligent fallback
### **Synchronous & Asynchronous**
- **Sync client** for traditional blocking operations
- **Async client** for high-concurrency applications
- **Connection pooling** and concurrent batch operations
### **Type Safety & Validation**
- **Pydantic models** for data validation
- **Type hints** throughout the codebase
- **Comprehensive error handling**
### **Developer Experience**
- **Intuitive API** with simple and advanced methods
- **NumPy integration** for seamless array handling
- **Rich documentation** and examples
## 📦 **Installation**
### **Recommended: Install d-vecDB with Python Client**
```bash
# Install d-vecDB (includes Python client)
pip install d-vecdb
# Install with development dependencies (quoted so shells like zsh do not expand the brackets)
pip install "d-vecdb[dev]"
# Install with example dependencies
pip install "d-vecdb[examples]"
```
### **Alternative: Install Python Client Only**
```bash
# Install just the Python client library
pip install vectordb-client
# Install with development dependencies (quoted so shells like zsh do not expand the brackets)
pip install "vectordb-client[dev]"
# Install with example dependencies
pip install "vectordb-client[examples]"
```
### **From Source**
```bash
git clone https://github.com/rdmurugan/d-vecDB.git
cd d-vecDB/python-client
pip install -e .
```
## 🚀 **Getting Started After Installation**
### **Step 1: Build and Start the d-vecDB Server**
**Important**: The PyPI package (`pip install d-vecdb`) includes only the Python client library, not the server. Build the server separately:
```bash
# Clone the repository and build the server
git clone https://github.com/rdmurugan/d-vecDB.git
cd d-vecDB
# Build the server (requires Rust)
cargo build --release
# Start the server
./target/release/vectordb-server --config config.toml
```
### **Step 2: Use the Python Client**
Once the d-vecDB server is running, use the Python client (installed via pip) to interact with it:
```python
import numpy as np
from vectordb_client import VectorDBClient
# Connect to your d-vecDB server
client = VectorDBClient(host="localhost", port=8080)
# Create a collection
client.create_collection_simple("my_collection", 128, "cosine")
# Insert some vectors
vector = np.random.random(128)
client.insert_simple("my_collection", "vector_1", vector)
# Search for similar vectors
query = np.random.random(128)
results = client.search_simple("my_collection", query, limit=5)
print(f"Found {len(results)} similar vectors")
for result in results:
    print(f" - ID: {result.id}, Distance: {result.distance:.4f}")
client.close()
```
## 🏃 **Quick Start**
### **Synchronous Client**
```python
import numpy as np
from vectordb_client import VectorDBClient
# Connect to d-vecDB server
client = VectorDBClient(host="localhost", port=8080)
# Create a collection
client.create_collection_simple(
    name="documents",
    dimension=128,
    distance_metric="cosine"
)
# Insert vectors
vectors = np.random.random((100, 128))
for i, vector in enumerate(vectors):
    client.insert_simple(
        collection_name="documents",
        vector_id=f"doc_{i}",
        vector_data=vector,
        metadata={"title": f"Document {i}", "category": "example"}
    )
# Search for similar vectors
query_vector = np.random.random(128)
results = client.search_simple("documents", query_vector, limit=5)
for result in results:
    print(f"ID: {result.id}, Distance: {result.distance:.4f}")
# Clean up
client.close()
```
### **Asynchronous Client**
```python
import asyncio
import numpy as np
from vectordb_client import AsyncVectorDBClient
async def main():
    # Connect to d-vecDB server
    async with AsyncVectorDBClient(host="localhost", port=8080) as client:
        # Create collection
        await client.create_collection_simple(
            name="embeddings",
            dimension=384,
            distance_metric="cosine"
        )
        # Prepare batch data
        batch_data = [
            (f"item_{i}", np.random.random(384), {"category": "test"})
            for i in range(1000)
        ]
        # Concurrent batch insertion
        await client.batch_insert_concurrent(
            collection_name="embeddings",
            vectors_data=batch_data,
            batch_size=50,
            max_concurrent_batches=10
        )
        # Search
        query_vector = np.random.random(384)
        results = await client.search_simple("embeddings", query_vector, limit=10)
        print(f"Found {len(results)} similar vectors")
# Run the async example
asyncio.run(main())
```
## 📖 **API Reference**
### **Client Initialization**
```python
from vectordb_client import VectorDBClient, AsyncVectorDBClient
# Synchronous client
client = VectorDBClient(
    host="localhost",
    port=8080,        # REST port
    grpc_port=9090,   # gRPC port
    protocol="rest",  # "rest", "grpc", or "auto"
    ssl=False,        # Use HTTPS/secure gRPC
    timeout=30.0,     # Request timeout
)
# Asynchronous client
async_client = AsyncVectorDBClient(
    host="localhost",
    port=8080,
    connection_pool_size=10,  # HTTP connection pool size
    protocol="rest",
    ssl=False,
    timeout=30.0,
)
```
### **Collection Management**
```python
from vectordb_client.types import CollectionConfig, DistanceMetric, IndexConfig
# Advanced collection configuration
config = CollectionConfig(
    name="my_collection",
    dimension=768,
    distance_metric=DistanceMetric.COSINE,
    index_config=IndexConfig(
        max_connections=32,
        ef_construction=400,
        ef_search=100,
        max_layer=16
    )
)
# Create collection
response = client.create_collection(config)
# List all collections
collections = client.list_collections()
print("Collections:", collections.collections)
# Get collection info and stats
collection_info = client.get_collection("my_collection")
stats = client.get_collection_stats("my_collection")
print(f"Vectors: {stats.vector_count}, Memory: {stats.memory_usage} bytes")
# Delete collection
client.delete_collection("my_collection")
```
### **Vector Operations**
```python
from vectordb_client.types import Vector
import numpy as np
# Create vectors with metadata
vectors = [
    Vector(
        id="vec_1",
        data=np.random.random(128).tolist(),
        metadata={"category": "A", "score": 0.95}
    ),
    Vector(
        id="vec_2",
        data=np.random.random(128).tolist(),
        metadata={"category": "B", "score": 0.87}
    )
]
# Insert single vector
response = client.insert_vector("my_collection", vectors[0])
# Batch insert
response = client.insert_vectors("my_collection", vectors)
print(f"Inserted {response.inserted_count} vectors")
# Get vector by ID
vector = client.get_vector("my_collection", "vec_1")
print(f"Retrieved vector: {vector.id}")
# Update vector
vectors[0].metadata["updated"] = True
client.update_vector("my_collection", vectors[0])
# Delete vector
client.delete_vector("my_collection", "vec_1")
```
### **Vector Search**
```python
from vectordb_client.types import SearchRequest
import numpy as np
# Simple search
query_vector = np.random.random(128)
results = client.search_simple("my_collection", query_vector, limit=10)
# Advanced search with parameters
search_request = SearchRequest(
    query_vector=query_vector.tolist(),
    limit=20,
    ef_search=150,            # Higher value = better accuracy, slower search
    filter={"category": "A"}  # Metadata filtering
)
response = client.search(
    "my_collection",
    search_request.query_vector,
    search_request.limit,
    search_request.ef_search,
    search_request.filter
)
# Process results
for result in response.results:
    print(f"ID: {result.id}")
    print(f"Distance: {result.distance:.6f}")
    print(f"Metadata: {result.metadata}")
    print("---")
print(f"Search took {response.query_time_ms}ms")
```
### **Server Information**
```python
# Health check
health = client.health_check()
print(f"Server healthy: {health.healthy}")
# Server statistics
stats = client.get_server_stats()
print(f"Total vectors: {stats.total_vectors}")
print(f"Collections: {stats.total_collections}")
print(f"Memory usage: {stats.memory_usage} bytes")
print(f"Uptime: {stats.uptime_seconds}s")
# Quick connectivity test
is_reachable = client.ping()
print(f"Server reachable: {is_reachable}")
# Comprehensive info
info = client.get_info()
print("Client info:", info["client"])
print("Server info:", info["server"])
```
## 🧪 **Advanced Examples**
### **Working with NumPy Arrays**
```python
import numpy as np
from vectordb_client import VectorDBClient
from vectordb_client.types import Vector
client = VectorDBClient()
# Create collection for embeddings
client.create_collection_simple("embeddings", 384, "cosine")
# Work directly with NumPy arrays
embeddings = np.random.random((1000, 384))
ids = [f"embedding_{i}" for i in range(1000)]
metadata_list = [{"index": i, "batch": i // 100} for i in range(1000)]
# Batch insert using NumPy
vectors = [
    Vector.from_numpy(id=ids[i], data=embeddings[i], metadata=metadata_list[i])
    for i in range(len(embeddings))
]
# Insert in batches
batch_size = 100
for i in range(0, len(vectors), batch_size):
    batch = vectors[i:i + batch_size]
    response = client.insert_vectors("embeddings", batch)
    print(f"Inserted batch {i // batch_size + 1}: {response.inserted_count} vectors")
# Search with NumPy array
query_embedding = np.random.random(384)
results = client.search_simple("embeddings", query_embedding, limit=5)
# Convert results back to NumPy if needed
for result in results:
    vector = client.get_vector("embeddings", result.id)
    vector_array = vector.to_numpy()  # Convert to NumPy array
    print(f"Vector {result.id} shape: {vector_array.shape}")
```
### **Async Batch Processing**
```python
import asyncio
import numpy as np
from vectordb_client import AsyncVectorDBClient
async def process_large_dataset():
    async with AsyncVectorDBClient() as client:
        # Create collection
        await client.create_collection_simple("large_dataset", 512, "euclidean")
        # Generate large dataset
        num_vectors = 10000
        dimension = 512
        dataset = np.random.random((num_vectors, dimension))
        # Prepare batch data
        batch_data = [
            (f"vec_{i}", dataset[i], {"batch": i // 1000, "index": i})
            for i in range(num_vectors)
        ]
        # Concurrent insertion, timed with the running loop's monotonic clock
        batch_size = 200
        max_concurrent = 20
        loop = asyncio.get_running_loop()
        start_time = loop.time()
        responses = await client.batch_insert_concurrent(
            collection_name="large_dataset",
            vectors_data=batch_data,
            batch_size=batch_size,
            max_concurrent_batches=max_concurrent
        )
        end_time = loop.time()
        total_inserted = sum(r.inserted_count or 0 for r in responses)
        duration = end_time - start_time
        rate = total_inserted / duration
        print(f"Inserted {total_inserted} vectors in {duration:.2f}s")
        print(f"Rate: {rate:.2f} vectors/second")
        # Verify with search
        query_vector = np.random.random(512)
        results = await client.search_simple("large_dataset", query_vector, limit=10)
        print(f"Search found {len(results)} results")
# Run the async processing
asyncio.run(process_large_dataset())
```
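The `batch_size` and `max_concurrent_batches` parameters above describe a bounded-concurrency pattern. The client's internals are not shown in this README, so the following is only a sketch of how such a pattern is commonly built with `asyncio.Semaphore`. The names `run_batches_bounded` and `insert_batch` are hypothetical and are not part of `vectordb_client`.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_batches_bounded(
    items: Sequence[T],
    insert_batch: Callable[[Sequence[T]], Awaitable[R]],  # hypothetical: sends one batch
    batch_size: int = 50,
    max_concurrent_batches: int = 10,
) -> List[R]:
    """Split items into batches and run at most max_concurrent_batches at once."""
    if batch_size < 1 or max_concurrent_batches < 1:
        raise ValueError("batch_size and max_concurrent_batches must be >= 1")
    semaphore = asyncio.Semaphore(max_concurrent_batches)

    async def guarded(batch: Sequence[T]) -> R:
        # Only max_concurrent_batches coroutines can hold the semaphore at a time
        async with semaphore:
            return await insert_batch(batch)

    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    # gather() returns results in the order the batches were submitted
    return await asyncio.gather(*(guarded(b) for b in batches))


async def _demo() -> None:
    async def fake_insert(batch):  # stand-in for a real network call
        await asyncio.sleep(0)
        return len(batch)

    counts = await run_batches_bounded(list(range(120)), fake_insert, batch_size=50)
    print(counts)  # [50, 50, 20]


if __name__ == "__main__":
    asyncio.run(_demo())
```

Raising `max_concurrent_batches` increases the number of in-flight requests, which helps only while the server and network still have spare capacity.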
### **Error Handling and Retries**
```python
import time
from vectordb_client import VectorDBClient
from vectordb_client.types import Vector
from vectordb_client.exceptions import (
    VectorDBError, ConnectionError, CollectionNotFoundError,
    VectorNotFoundError, RateLimitError
)
def robust_insert_with_retry(client, collection_name, vectors, max_retries=3):
    """Insert vectors with automatic retry on failure."""
    for attempt in range(max_retries):
        try:
            response = client.insert_vectors(collection_name, vectors)
            print(f"Successfully inserted {response.inserted_count} vectors")
            return response
        except RateLimitError:
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff
                print(f"Rate limited, waiting {wait_time}s before retry...")
                time.sleep(wait_time)
            else:
                raise  # Out of attempts: re-raise with the original traceback
        except ConnectionError:
            if attempt < max_retries - 1:
                print(f"Connection failed, retrying... ({attempt + 1}/{max_retries})")
                time.sleep(1)
            else:
                raise
        except CollectionNotFoundError:
            print(f"Collection '{collection_name}' not found, creating...")
            client.create_collection_simple(collection_name, 128, "cosine")
            # Retry the insertion (this counts as one attempt)
            continue
    raise VectorDBError(f"Failed to insert after {max_retries} attempts")
# Usage
client = VectorDBClient()
vectors = [Vector(id=f"test_{i}", data=[0.1] * 128) for i in range(10)]
try:
    robust_insert_with_retry(client, "test_collection", vectors)
except VectorDBError as e:
    print(f"Final error: {e}")
```
### **Configuration and Connection Management**
```python
from vectordb_client import VectorDBClient
import os
# Configuration from environment variables
client = VectorDBClient(
    host=os.getenv("VECTORDB_HOST", "localhost"),
    port=int(os.getenv("VECTORDB_PORT", "8080")),
    ssl=os.getenv("VECTORDB_SSL", "false").lower() == "true",
    timeout=float(os.getenv("VECTORDB_TIMEOUT", "30.0"))
)
# Connection testing and fallback
def get_client_with_fallback():
    """Try multiple connection options."""
    # Try primary server
    try:
        primary_client = VectorDBClient(host="primary.vectordb.com", port=8080)
        if primary_client.ping():
            return primary_client
        primary_client.close()
    except Exception:
        pass
    # Try secondary server
    try:
        secondary_client = VectorDBClient(host="secondary.vectordb.com", port=8080)
        if secondary_client.ping():
            return secondary_client
        secondary_client.close()
    except Exception:
        pass
    # Fall back to localhost
    return VectorDBClient(host="localhost", port=8080)
# Context managers for resource cleanup
with get_client_with_fallback() as client:
    # Use client here - automatically closed when leaving context
    collections = client.list_collections()
    print(f"Available collections: {collections.collections}")
```
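The environment-variable pattern above can be factored into a small settings object that validates values before a client is constructed. This is a sketch under assumptions: `ClientSettings` and `from_env` are hypothetical names and are not part of `vectordb_client`. Only the `VECTORDB_*` variable names and their defaults come from the example above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_TRUE_VALUES = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class ClientSettings:
    """Hypothetical container for connection settings (not part of vectordb_client)."""
    host: str = "localhost"
    port: int = 8080
    ssl: bool = False
    timeout: float = 30.0

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "ClientSettings":
        # Accept any mapping so the parser can be exercised without touching os.environ
        env = os.environ if env is None else env
        port = int(env.get("VECTORDB_PORT", "8080"))
        if not 0 < port < 65536:
            raise ValueError(f"VECTORDB_PORT out of range: {port}")
        timeout = float(env.get("VECTORDB_TIMEOUT", "30.0"))
        if timeout <= 0:
            raise ValueError(f"VECTORDB_TIMEOUT must be positive: {timeout}")
        return cls(
            host=env.get("VECTORDB_HOST", "localhost"),
            port=port,
            ssl=env.get("VECTORDB_SSL", "false").strip().lower() in _TRUE_VALUES,
            timeout=timeout,
        )


settings = ClientSettings.from_env({"VECTORDB_HOST": "db.internal", "VECTORDB_SSL": "TRUE"})
print(settings)  # ClientSettings(host='db.internal', port=8080, ssl=True, timeout=30.0)
```

The fields map onto the keyword arguments used above, for example `VectorDBClient(host=settings.host, port=settings.port, ssl=settings.ssl, timeout=settings.timeout)`. Unlike the inline version, this sketch also treats `1`, `yes`, and `on` as true, which is a design choice rather than library behavior.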
## 🧪 **Testing**
```bash
# Run unit tests
python -m pytest tests/
# Run with coverage
python -m pytest tests/ --cov=vectordb_client --cov-report=html
# Run integration tests (requires running d-vecDB server)
python -m pytest tests/integration/ -v
# Run performance benchmarks
python -m pytest tests/benchmarks/ -v
```
## 🔧 **Development**
```bash
# Setup development environment
git clone https://github.com/rdmurugan/d-vecDB.git
cd d-vecDB/python-client
# Install in development mode
pip install -e ".[dev]"
# Run code formatting
black vectordb_client/
isort vectordb_client/
# Run type checking
mypy vectordb_client/
# Run linting
flake8 vectordb_client/
```
## 📊 **Performance Tips**
### **Batch Operations**
- Use `insert_vectors()` instead of multiple `insert_vector()` calls
- For async clients, use `batch_insert_concurrent()` for maximum throughput
- Optimal batch size is typically 100-1000 vectors depending on dimension
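The batching advice above can be sketched as a small helper that slices any iterable into fixed-size lists and issues one `insert_vectors()` call per slice. `chunked` and `insert_in_batches` are hypothetical helper names, not part of the client. The only library call assumed is `client.insert_vectors(collection_name, batch)` as used elsewhere in this README.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def insert_in_batches(client, collection_name: str, vectors: Iterable, batch_size: int = 500) -> int:
    """Send one insert_vectors() call per batch and return the number of calls made."""
    calls = 0
    for batch in chunked(vectors, batch_size):
        client.insert_vectors(collection_name, batch)
        calls += 1
    return calls


print([len(b) for b in chunked(range(1050), 500)])  # [500, 500, 50]
```

Because `chunked` accepts any iterable, the vectors can come from a generator, so only one batch needs to be held in memory at a time.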
### **Connection Pooling**
- Async clients automatically pool HTTP connections
- Increase `connection_pool_size` for high-concurrency applications
- Reuse client instances instead of creating new ones
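One way to follow the "reuse client instances" tip is a cached factory that hands back the same object for the same connection parameters. The sketch below is an assumption-laden illustration: `get_shared_client` is a hypothetical helper, and `_FakeClient` is a stand-in so the snippet runs without a server.

```python
from functools import lru_cache
from typing import Any, Callable


@lru_cache(maxsize=None)
def get_shared_client(factory: Callable[..., Any], host: str = "localhost", port: int = 8080) -> Any:
    """Create one client per (factory, host, port) and return the same object afterwards.

    Note: lru_cache keys on how arguments are passed, so call it consistently
    (all positional, as below) to get a cache hit.
    """
    return factory(host=host, port=port)


class _FakeClient:  # stand-in so the sketch runs without a server
    def __init__(self, host: str, port: int) -> None:
        self.host, self.port = host, port


a = get_shared_client(_FakeClient, "localhost", 8080)
b = get_shared_client(_FakeClient, "localhost", 8080)
print(a is b)  # True
```

With the real client this would be called as `get_shared_client(VectorDBClient, "localhost", 8080)`. Callers should not close a shared client while other code still holds a reference to it.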
### **Search Optimization**
- Lower `ef_search` values for faster but less accurate search
- Use metadata filtering to reduce search space
- Consider the trade-off between speed and recall
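To make the speed/recall trade-off measurable, approximate results can be compared against an exact brute-force search on a small sample. The sketch below is pure Python and independent of the client. It assumes cosine distance is defined as `1 - cosine similarity`, which is a common convention but not confirmed here as d-vecDB's exact formula. All function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity (assumed definition, see the note above)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / norm


def exact_top_k(query: Sequence[float], vectors: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Brute-force ground truth: IDs of the k nearest vectors by cosine distance."""
    ranked = sorted(vectors, key=lambda vid: cosine_distance(query, vectors[vid]))
    return ranked[:k]


def recall_at_k(approx_ids: Sequence[str], exact_ids: Sequence[str]) -> float:
    """Fraction of the true top-k that the approximate search returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
truth = exact_top_k([1.0, 0.0], vectors, k=2)
print(truth)                            # ['a', 'b']
print(recall_at_k(["a", "c"], truth))   # 0.5
```

In practice `approx_ids` would be `[r.id for r in results]` from a search call, collected at several `ef_search` settings so that recall can be weighed against query time.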
### **Memory Management**
- Use NumPy arrays for large vector datasets
- Close clients explicitly or use context managers
- Monitor memory usage with large batch operations
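A quick back-of-the-envelope estimate helps when planning large batches. The sketch below computes the raw size of the vector values only. `estimate_vector_bytes` is a hypothetical helper, and the element width is left as a parameter because this README does not state what the server or wire format uses. Note that `np.random.random` returns `float64` arrays, which take 8 bytes per value.

```python
def estimate_vector_bytes(num_vectors: int, dimension: int, bytes_per_value: int = 4) -> int:
    """Raw size of the vector values only (no index, metadata, or Python object overhead)."""
    if num_vectors < 0 or dimension < 1 or bytes_per_value < 1:
        raise ValueError("num_vectors must be >= 0; dimension and bytes_per_value must be >= 1")
    return num_vectors * dimension * bytes_per_value


for label, width in (("float32", 4), ("float64", 8)):
    size = estimate_vector_bytes(10_000, 512, width)
    print(f"{label}: {size / 1_000_000:.2f} MB")
# float32: 20.48 MB
# float64: 40.96 MB
```

Converting a `float64` array with `astype(np.float32)` before building batches halves the client-side footprint, at the cost of reduced precision.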
## 🤝 **Contributing**
We welcome contributions! Please see our [Contributing Guide](../CONTRIBUTING.md) for details.
### **Development Setup**
1. Fork the repository
2. Create a feature branch
3. Install development dependencies: `pip install -e ".[dev]"`
4. Make changes and add tests
5. Run tests: `pytest`
6. Submit a pull request
## 📄 **License**
This project is licensed under the d-vecDB Enterprise License - see the [LICENSE](../LICENSE) file for details.
**For Enterprise Use**: Commercial usage requires a separate enterprise license. Contact durai@infinidatum.com for licensing terms.
## 🆘 **Support**
- **Documentation**: [docs.d-vecdb.com](https://docs.d-vecdb.com)
- **Issues**: [GitHub Issues](https://github.com/rdmurugan/d-vecDB/issues)
- **Discussions**: [GitHub Discussions](https://github.com/rdmurugan/d-vecDB/discussions)
- **Discord**: [d-vecDB Community](https://discord.gg/d-vecdb)
---
**Built with ❤️ by the d-vecDB team**