vectordb-client


- **Name**: vectordb-client
- **Version**: 0.1.0
- **Home page**: https://github.com/your-org/d-vecDB
- **Summary**: Python client library for d-vecDB vector database
- **Upload time**: 2025-09-03 01:17:42
- **Author**: d-vecDB Team
- **Requires Python**: >=3.8
- **Keywords**: vector database, similarity search, machine learning, embeddings, hnsw

# d-vecDB Python Client

[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: Enterprise](https://img.shields.io/badge/License-Enterprise-red.svg)](../LICENSE)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

A comprehensive Python client library for [d-vecDB](https://github.com/rdmurugan/d-vecDB), providing both synchronous and asynchronous interfaces for vector database operations.

## 🚀 **Features**

### **Multi-Protocol Support**
- **REST API** via HTTP/HTTPS with connection pooling
- **gRPC** for high-performance binary protocol communication
- **Auto-detection** with intelligent fallback

### **Synchronous & Asynchronous**
- **Sync client** for traditional blocking operations
- **Async client** for high-concurrency applications  
- **Connection pooling** and concurrent batch operations

### **Type Safety & Validation**
- **Pydantic models** for data validation
- **Type hints** throughout the codebase
- **Comprehensive error handling**

### **Developer Experience**
- **Intuitive API** with simple and advanced methods
- **NumPy integration** for seamless array handling
- **Rich documentation** and examples

## 📦 **Installation**

### **Recommended: Install d-vecDB with Python Client**

```bash
# Install d-vecDB (includes Python client)
pip install d-vecdb

# Install with development dependencies (quoted so shells such as zsh do not expand the brackets)
pip install "d-vecdb[dev]"

# Install with example dependencies
pip install "d-vecdb[examples]"
```

### **Alternative: Install Python Client Only**

```bash
# Install just the Python client library
pip install vectordb-client

# Install with development dependencies (quoted so shells such as zsh do not expand the brackets)
pip install "vectordb-client[dev]"

# Install with example dependencies
pip install "vectordb-client[examples]"
```

### **From Source**

```bash
git clone https://github.com/rdmurugan/d-vecDB.git
cd d-vecDB/python-client
pip install -e .
```

## 🚀 **Getting Started After Installation**

### **Step 1: Build and Start the d-vecDB Server**

**Important**: The PyPI package (`pip install d-vecdb`) contains only the Python client library, not the server. Build the server from source before using the client:

```bash
# Clone the repository and build the server
git clone https://github.com/rdmurugan/d-vecDB.git
cd d-vecDB

# Build the server (requires Rust)
cargo build --release

# Start the server
./target/release/vectordb-server --config config.toml
```

### **Step 2: Use the Python Client**

Once you have a running d-vecDB server, you can use the Python client (installed via pip) to interact with it:

```python
import numpy as np
from vectordb_client import VectorDBClient

# Connect to your d-vecDB server
client = VectorDBClient(host="localhost", port=8080)

# Create a collection
client.create_collection_simple("my_collection", 128, "cosine")

# Insert some vectors
vector = np.random.random(128)
client.insert_simple("my_collection", "vector_1", vector)

# Search for similar vectors
query = np.random.random(128)
results = client.search_simple("my_collection", query, limit=5)

print(f"Found {len(results)} similar vectors")
for result in results:
    print(f"  - ID: {result.id}, Distance: {result.distance:.4f}")

client.close()
```

## 🏃 **Quick Start**

### **Synchronous Client**

```python
import numpy as np
from vectordb_client import VectorDBClient

# Connect to d-vecDB server
client = VectorDBClient(host="localhost", port=8080)

# Create a collection
client.create_collection_simple(
    name="documents", 
    dimension=128, 
    distance_metric="cosine"
)

# Insert vectors
vectors = np.random.random((100, 128))
for i, vector in enumerate(vectors):
    client.insert_simple(
        collection_name="documents",
        vector_id=f"doc_{i}",
        vector_data=vector,
        metadata={"title": f"Document {i}", "category": "example"}
    )

# Search for similar vectors
query_vector = np.random.random(128)
results = client.search_simple("documents", query_vector, limit=5)

for result in results:
    print(f"ID: {result.id}, Distance: {result.distance:.4f}")

# Clean up
client.close()
```

### **Asynchronous Client**

```python
import asyncio
import numpy as np
from vectordb_client import AsyncVectorDBClient

async def main():
    # Connect to d-vecDB server
    async with AsyncVectorDBClient(host="localhost", port=8080) as client:
        
        # Create collection
        await client.create_collection_simple(
            name="embeddings", 
            dimension=384, 
            distance_metric="cosine"
        )
        
        # Prepare batch data
        batch_data = [
            (f"item_{i}", np.random.random(384), {"category": "test"})
            for i in range(1000)
        ]
        
        # Concurrent batch insertion
        await client.batch_insert_concurrent(
            collection_name="embeddings",
            vectors_data=batch_data,
            batch_size=50,
            max_concurrent_batches=10
        )
        
        # Search
        query_vector = np.random.random(384)
        results = await client.search_simple("embeddings", query_vector, limit=10)
        
        print(f"Found {len(results)} similar vectors")

# Run the async example
asyncio.run(main())
```

## 📖 **API Reference**

### **Client Initialization**

```python
from vectordb_client import VectorDBClient, AsyncVectorDBClient

# Synchronous client
client = VectorDBClient(
    host="localhost",
    port=8080,              # REST port
    grpc_port=9090,         # gRPC port  
    protocol="rest",        # "rest", "grpc", or "auto"
    ssl=False,              # Use HTTPS/secure gRPC
    timeout=30.0,           # Request timeout
)

# Asynchronous client
async_client = AsyncVectorDBClient(
    host="localhost",
    port=8080,
    connection_pool_size=10,  # HTTP connection pool size
    protocol="rest",
    ssl=False,
    timeout=30.0,
)
```

### **Collection Management**

```python
from vectordb_client.types import CollectionConfig, DistanceMetric, IndexConfig

# Advanced collection configuration
config = CollectionConfig(
    name="my_collection",
    dimension=768,
    distance_metric=DistanceMetric.COSINE,
    index_config=IndexConfig(
        max_connections=32,
        ef_construction=400,
        ef_search=100,
        max_layer=16
    )
)

# Create collection
response = client.create_collection(config)

# List all collections
collections = client.list_collections()
print("Collections:", collections.collections)

# Get collection info and stats
collection_info = client.get_collection("my_collection")
stats = client.get_collection_stats("my_collection")
print(f"Vectors: {stats.vector_count}, Memory: {stats.memory_usage} bytes")

# Delete collection
client.delete_collection("my_collection")
```

### **Vector Operations**

```python
from vectordb_client.types import Vector
import numpy as np

# Create vectors with metadata
vectors = [
    Vector(
        id="vec_1",
        data=np.random.random(128).tolist(),
        metadata={"category": "A", "score": 0.95}
    ),
    Vector(
        id="vec_2", 
        data=np.random.random(128).tolist(),
        metadata={"category": "B", "score": 0.87}
    )
]

# Insert single vector
response = client.insert_vector("my_collection", vectors[0])

# Batch insert
response = client.insert_vectors("my_collection", vectors)
print(f"Inserted {response.inserted_count} vectors")

# Get vector by ID
vector = client.get_vector("my_collection", "vec_1")
print(f"Retrieved vector: {vector.id}")

# Update vector
vectors[0].metadata["updated"] = True
client.update_vector("my_collection", vectors[0])

# Delete vector  
client.delete_vector("my_collection", "vec_1")
```
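
A vector whose length differs from the collection's dimension, or that contains NaN or infinite values, is best rejected before it is sent. The sketch below shows one way to run that check on the client side. `validate_vector` is an illustrative helper, not part of `vectordb_client`, and the library's own Pydantic models may already perform similar checks.

```python
import math
from typing import List, Sequence


def validate_vector(data: Sequence[float], dimension: int) -> List[float]:
    """Return `data` as a list of floats, or raise ValueError if it is unusable."""
    if len(data) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(data)}")
    values = [float(x) for x in data]
    if not all(math.isfinite(x) for x in values):
        raise ValueError("vector contains NaN or infinite values")
    return values


clean = validate_vector([1, 2, 3], dimension=3)
print(clean)
```

The returned list can be passed as the `data` argument of `Vector(...)`.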

### **Vector Search**

```python
from vectordb_client.types import SearchRequest
import numpy as np

# Simple search
query_vector = np.random.random(128)
results = client.search_simple("my_collection", query_vector, limit=10)

# Advanced search with parameters
search_request = SearchRequest(
    query_vector=query_vector.tolist(),
    limit=20,
    ef_search=150,  # Higher value = better accuracy, slower search
    filter={"category": "A"}  # Metadata filtering
)

response = client.search("my_collection", 
                        search_request.query_vector,
                        search_request.limit,
                        search_request.ef_search,
                        search_request.filter)

# Process results
for result in response.results:
    print(f"ID: {result.id}")
    print(f"Distance: {result.distance:.6f}")  
    print(f"Metadata: {result.metadata}")
    print("---")

print(f"Search took {response.query_time_ms}ms")
```
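
The examples print `result.distance` without defining it. As a point of reference, and assuming d-vecDB follows the conventional definition of cosine distance as one minus cosine similarity (this README does not confirm it), the value could be reproduced locally as sketched below. `cosine_distance` is an illustrative helper, not part of `vectordb_client`.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Conventional cosine distance: 1 - (a . b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])  # unrelated directions
parallel = cosine_distance([1.0, 2.0], [2.0, 4.0])    # same direction
print(orthogonal, parallel)
```

Under this definition, smaller values mean more similar vectors: parallel vectors score close to 0 and orthogonal vectors score 1.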

### **Server Information**

```python
# Health check
health = client.health_check()
print(f"Server healthy: {health.healthy}")

# Server statistics
stats = client.get_server_stats()
print(f"Total vectors: {stats.total_vectors}")
print(f"Collections: {stats.total_collections}")
print(f"Memory usage: {stats.memory_usage} bytes")
print(f"Uptime: {stats.uptime_seconds}s")

# Quick connectivity test
is_reachable = client.ping()
print(f"Server reachable: {is_reachable}")

# Comprehensive info
info = client.get_info()
print("Client info:", info["client"])
print("Server info:", info["server"])
```
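
A server that was just started may take a moment before it answers. The sketch below polls a ping callable until it succeeds or a timeout expires. `wait_until_ready` is a hypothetical helper. Because this README does not say whether `client.ping()` returns `False` or raises on failure, the helper handles both.

```python
import time
from typing import Callable


def wait_until_ready(
    ping: Callable[[], bool],
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Poll `ping` until it returns True; give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if ping():
                return True
        except Exception:
            # A failed connection attempt counts as "not ready yet".
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated server that becomes reachable on the third attempt.
calls = {"n": 0}


def fake_ping() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


ready = wait_until_ready(fake_ping, timeout=5.0, interval=0.0)
print(ready, calls["n"])
```

With a real client, `wait_until_ready(client.ping)` would be called before the first request.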

## 🧪 **Advanced Examples**

### **Working with NumPy Arrays**

```python
import numpy as np
from vectordb_client import VectorDBClient
from vectordb_client.types import Vector

client = VectorDBClient()

# Create collection for embeddings
client.create_collection_simple("embeddings", 384, "cosine")

# Work directly with NumPy arrays
embeddings = np.random.random((1000, 384))
ids = [f"embedding_{i}" for i in range(1000)]
metadata_list = [{"index": i, "batch": i // 100} for i in range(1000)]

# Batch insert using NumPy
vectors = [
    Vector.from_numpy(id=ids[i], data=embeddings[i], metadata=metadata_list[i])
    for i in range(len(embeddings))
]

# Insert in batches
batch_size = 100
for i in range(0, len(vectors), batch_size):
    batch = vectors[i:i + batch_size]
    response = client.insert_vectors("embeddings", batch)
    print(f"Inserted batch {i // batch_size + 1}: {response.inserted_count} vectors")

# Search with NumPy array
query_embedding = np.random.random(384)
results = client.search_simple("embeddings", query_embedding, limit=5)

# Convert results back to NumPy if needed
for result in results:
    vector = client.get_vector("embeddings", result.id)
    vector_array = vector.to_numpy()  # Convert to NumPy array
    print(f"Vector {result.id} shape: {vector_array.shape}")
```

### **Async Batch Processing**

```python
import asyncio
import numpy as np
from vectordb_client import AsyncVectorDBClient

async def process_large_dataset():
    async with AsyncVectorDBClient() as client:
        # Create collection
        await client.create_collection_simple("large_dataset", 512, "euclidean")
        
        # Generate large dataset
        num_vectors = 10000
        dimension = 512
        dataset = np.random.random((num_vectors, dimension))
        
        # Prepare batch data
        batch_data = [
            (f"vec_{i}", dataset[i], {"batch": i // 1000, "index": i})
            for i in range(num_vectors)
        ]
        
        # Concurrent insertion with progress tracking
        batch_size = 200
        max_concurrent = 20
        
        # Inside a coroutine, get_running_loop() is the preferred way to reach the loop clock
        start_time = asyncio.get_running_loop().time()
        
        responses = await client.batch_insert_concurrent(
            collection_name="large_dataset",
            vectors_data=batch_data,
            batch_size=batch_size,
            max_concurrent_batches=max_concurrent
        )
        
        end_time = asyncio.get_running_loop().time()
        
        total_inserted = sum(r.inserted_count or 0 for r in responses)
        duration = end_time - start_time
        rate = total_inserted / duration
        
        print(f"Inserted {total_inserted} vectors in {duration:.2f}s")
        print(f"Rate: {rate:.2f} vectors/second")
        
        # Verify with search
        query_vector = np.random.random(512)
        results = await client.search_simple("large_dataset", query_vector, limit=10)
        print(f"Search found {len(results)} results")

# Run the async processing
asyncio.run(process_large_dataset())
```
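
`batch_insert_concurrent()` accepts a `max_concurrent_batches` limit, but the client's internals are not shown in this README. The sketch below illustrates one common way to bound concurrency with `asyncio.Semaphore`. It is an illustration of the technique rather than the library's implementation, and `run_batches_bounded` and `fake_send` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def run_batches_bounded(
    batches: Sequence[list],
    send_batch: Callable[[list], Awaitable[int]],
    max_concurrent: int,
) -> List[int]:
    """Send all batches, keeping at most `max_concurrent` requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(batch: list) -> int:
        async with semaphore:
            return await send_batch(batch)

    return await asyncio.gather(*(guarded(b) for b in batches))


in_flight = 0
peak_in_flight = 0


async def fake_send(batch: list) -> int:
    """Stand-in for a network call; records how many calls overlap."""
    global in_flight, peak_in_flight
    in_flight += 1
    peak_in_flight = max(peak_in_flight, in_flight)
    await asyncio.sleep(0.01)
    in_flight -= 1
    return len(batch)


batches = [list(range(i, i + 10)) for i in range(0, 100, 10)]
counts = asyncio.run(run_batches_bounded(batches, fake_send, max_concurrent=3))
print(sum(counts), peak_in_flight)
```

The semaphore caps the number of simultaneous requests, so a large dataset does not open more connections than the pool or the server can handle.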

### **Error Handling and Retries**

```python
import time
from vectordb_client import VectorDBClient
from vectordb_client.types import Vector  # needed for the usage example below
from vectordb_client.exceptions import (
    VectorDBError, ConnectionError, CollectionNotFoundError,
    VectorNotFoundError, RateLimitError
)

def robust_insert_with_retry(client, collection_name, vectors, max_retries=3):
    """Insert vectors with automatic retry on failure."""
    for attempt in range(max_retries):
        try:
            response = client.insert_vectors(collection_name, vectors)
            print(f"Successfully inserted {response.inserted_count} vectors")
            return response
            
        except RateLimitError as e:
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff
                print(f"Rate limited, waiting {wait_time}s before retry...")
                time.sleep(wait_time)
            else:
                raise e
                
        except ConnectionError as e:
            if attempt < max_retries - 1:
                print(f"Connection failed, retrying... ({attempt + 1}/{max_retries})")
                time.sleep(1)
            else:
                raise e
                
        except CollectionNotFoundError:
            print(f"Collection '{collection_name}' not found, creating...")
            client.create_collection_simple(collection_name, 128, "cosine")
            # Retry the insertion
            continue
            
    raise VectorDBError(f"Failed to insert after {max_retries} attempts")

# Usage
client = VectorDBClient()
vectors = [Vector(id=f"test_{i}", data=[0.1] * 128) for i in range(10)]

try:
    robust_insert_with_retry(client, "test_collection", vectors)
except VectorDBError as e:
    print(f"Final error: {e}")
```
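
The retry loop waits `2 ** attempt` seconds, which grows without bound as `max_retries` increases. A capped schedule with optional random jitter is a common refinement, sketched below. `backoff_delays` and its parameters are hypothetical and not part of `vectordb_client`.

```python
import random
from typing import List, Optional


def backoff_delays(
    attempts: int,
    base: float = 1.0,
    cap: float = 30.0,
    jitter: float = 0.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Delay in seconds before each retry: min(cap, base * 2**n), plus jitter.

    `jitter` is the maximum extra wait as a fraction of the delay (0.5 = up to +50%).
    """
    if rng is None:
        rng = random.Random()
    delays = []
    for n in range(attempts):
        delay = min(cap, base * (2 ** n))
        delay += delay * jitter * rng.random()
        delays.append(delay)
    return delays


schedule = backoff_delays(6, base=1.0, cap=10.0)
print(schedule)
```

Jitter spreads out retries from many clients that were rate limited at the same moment, so they do not all hit the server again simultaneously.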

### **Configuration and Connection Management**

```python
from vectordb_client import VectorDBClient
import os

# Configuration from environment variables
client = VectorDBClient(
    host=os.getenv("VECTORDB_HOST", "localhost"),
    port=int(os.getenv("VECTORDB_PORT", "8080")),
    ssl=os.getenv("VECTORDB_SSL", "false").lower() == "true",
    timeout=float(os.getenv("VECTORDB_TIMEOUT", "30.0"))
)

# Connection testing and fallback
def get_client_with_fallback():
    """Try multiple connection options."""
    
    # Try primary server
    try:
        primary_client = VectorDBClient(host="primary.vectordb.com", port=8080)
        if primary_client.ping():
            return primary_client
        primary_client.close()
    except Exception:
        pass
    
    # Try secondary server
    try:
        secondary_client = VectorDBClient(host="secondary.vectordb.com", port=8080)
        if secondary_client.ping():
            return secondary_client
        secondary_client.close()
    except Exception:
        pass
    
    # Fall back to localhost
    return VectorDBClient(host="localhost", port=8080)

# Context managers for resource cleanup
with get_client_with_fallback() as client:
    # Use client here - automatically closed when leaving context
    collections = client.list_collections()
    print(f"Available collections: {collections.collections}")
```
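
Reading settings inline, as in the example, fails late and with an unhelpful error when a variable is malformed. The sketch below gathers the same four environment variables into a validated settings object first. `ClientSettings` and `settings_from_env` are hypothetical names. Only the `VECTORDB_*` variable names and their defaults come from the example.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ClientSettings:
    host: str = "localhost"
    port: int = 8080
    ssl: bool = False
    timeout: float = 30.0


def settings_from_env(env: Mapping[str, str]) -> ClientSettings:
    """Parse and validate connection settings from an environment mapping."""
    truthy = {"1", "true", "yes", "on"}
    port = int(env.get("VECTORDB_PORT", "8080"))
    if not 0 < port < 65536:
        raise ValueError(f"VECTORDB_PORT out of range: {port}")
    timeout = float(env.get("VECTORDB_TIMEOUT", "30.0"))
    if timeout <= 0:
        raise ValueError(f"VECTORDB_TIMEOUT must be positive: {timeout}")
    return ClientSettings(
        host=env.get("VECTORDB_HOST", "localhost"),
        port=port,
        ssl=env.get("VECTORDB_SSL", "false").strip().lower() in truthy,
        timeout=timeout,
    )


# A plain dict stands in for os.environ so the example is reproducible.
settings = settings_from_env({"VECTORDB_HOST": "db.internal", "VECTORDB_SSL": "TRUE"})
print(settings)
```

In an application, `settings_from_env(os.environ)` would supply the `host`, `port`, `ssl`, and `timeout` arguments shown in the client constructor.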

## 🧪 **Testing**

```bash
# Run unit tests
python -m pytest tests/

# Run with coverage
python -m pytest tests/ --cov=vectordb_client --cov-report=html

# Run integration tests (requires running d-vecDB server)
python -m pytest tests/integration/ -v

# Run performance benchmarks
python -m pytest tests/benchmarks/ -v
```

## 🔧 **Development**

```bash
# Setup development environment
git clone https://github.com/rdmurugan/d-vecDB.git
cd d-vecDB/python-client

# Install in development mode
pip install -e ".[dev]"

# Run code formatting
black vectordb_client/
isort vectordb_client/

# Run type checking  
mypy vectordb_client/

# Run linting
flake8 vectordb_client/
```

## 📊 **Performance Tips**

### **Batch Operations**
- Use `insert_vectors()` instead of multiple `insert_vector()` calls
- For async clients, use `batch_insert_concurrent()` for maximum throughput
- Optimal batch size is typically 100-1000 vectors depending on dimension
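
Splitting a large list into fixed-size batches, as recommended here, can be done with a small generator. `chunked` is an illustrative helper, not part of `vectordb_client`.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


batches = list(chunked(list(range(250)), 100))
print([len(b) for b in batches])
```

Each yielded batch can then be passed to `insert_vectors()` in turn.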

### **Connection Pooling**
- Async clients automatically pool HTTP connections
- Increase `connection_pool_size` for high-concurrency applications
- Reuse client instances instead of creating new ones

### **Search Optimization**
- Lower `ef_search` values for faster but less accurate search
- Use metadata filtering to reduce search space
- Consider the trade-off between speed and recall
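
To put a number on the speed and recall trade-off, compare the IDs returned by an approximate search against a trusted reference result, for example one obtained with a much higher `ef_search`. The sketch below computes recall@k from two ID lists. `recall_at_k` is a hypothetical helper and the IDs are made-up sample data.

```python
from typing import Sequence


def recall_at_k(approx_ids: Sequence[str], exact_ids: Sequence[str]) -> float:
    """Fraction of the reference neighbours that the approximate search found."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


exact = ["a", "b", "c", "d", "e"]   # reference top-5 (sample data)
approx = ["a", "b", "c", "x", "e"]  # approximate top-5 (sample data)
print(recall_at_k(approx, exact))
```

Measuring this for a few `ef_search` values on representative queries shows where extra search effort stops improving results.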

### **Memory Management**
- Use NumPy arrays for large vector datasets
- Close clients explicitly or use context managers
- Monitor memory usage with large batch operations
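
A quick estimate of the raw payload size helps when planning large batch operations. The sketch below multiplies vector count, dimension, and bytes per value. It assumes 4-byte `float32` storage by default, which this README does not confirm for d-vecDB, and it ignores index and metadata overhead. `raw_vector_bytes` is an illustrative helper.

```python
def raw_vector_bytes(num_vectors: int, dimension: int, bytes_per_value: int = 4) -> int:
    """Size of the raw vector values only, excluding index and metadata overhead."""
    return num_vectors * dimension * bytes_per_value


size_f32 = raw_vector_bytes(10_000, 512)                      # float32 assumption
size_f64 = raw_vector_bytes(10_000, 512, bytes_per_value=8)   # NumPy float64 arrays
print(f"{size_f32} bytes = {size_f32 / 2**20:.1f} MiB")
print(f"{size_f64} bytes = {size_f64 / 2**20:.1f} MiB")
```

Arrays created with `np.random.random`, as in the examples, are `float64`, so they occupy 8 bytes per value on the client side.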

## 🤝 **Contributing**

We welcome contributions! Please see our [Contributing Guide](../CONTRIBUTING.md) for details.

### **Development Setup**
1. Fork the repository
2. Create a feature branch
3. Install development dependencies: `pip install -e ".[dev]"`
4. Make changes and add tests
5. Run tests: `pytest`
6. Submit a pull request

## 📄 **License**

This project is licensed under the d-vecDB Enterprise License - see the [LICENSE](../LICENSE) file for details.

**For Enterprise Use**: Commercial usage requires a separate enterprise license. Contact durai@infinidatum.com for licensing terms.

## 🆘 **Support**

- **Documentation**: [docs.d-vecdb.com](https://docs.d-vecdb.com)
- **Issues**: [GitHub Issues](https://github.com/rdmurugan/d-vecDB/issues)
- **Discussions**: [GitHub Discussions](https://github.com/rdmurugan/d-vecDB/discussions)
- **Discord**: [d-vecDB Community](https://discord.gg/d-vecdb)

---

**Built with ❤️ by the d-vecDB team**

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/your-org/d-vecDB",
    "name": "vectordb-client",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "vector database, similarity search, machine learning, embeddings, HNSW",
    "author": "d-vecDB Team",
    "author_email": "durai@infinidatum.com",
    "download_url": "https://files.pythonhosted.org/packages/3c/31/5939b717cd15c81e8220ee36117bed672db6c562f86d286256f80cb4a53b/vectordb_client-0.1.0.tar.gz",
    "platform": null,
    "description": "# d-vecDB Python Client\n\n[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)\n[![License: Enterprise](https://img.shields.io/badge/License-Enterprise-red.svg)](../LICENSE)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n\nA comprehensive Python client library for [d-vecDB](https://github.com/rdmurugan/d-vecDB), providing both synchronous and asynchronous interfaces for vector database operations.\n\n## \ud83d\ude80 **Features**\n\n### **Multi-Protocol Support**\n- **REST API** via HTTP/HTTPS with connection pooling\n- **gRPC** for high-performance binary protocol communication\n- **Auto-detection** with intelligent fallback\n\n### **Synchronous & Asynchronous**\n- **Sync client** for traditional blocking operations\n- **Async client** for high-concurrency applications  \n- **Connection pooling** and concurrent batch operations\n\n### **Type Safety & Validation**\n- **Pydantic models** for data validation\n- **Type hints** throughout the codebase\n- **Comprehensive error handling**\n\n### **Developer Experience**\n- **Intuitive API** with simple and advanced methods\n- **NumPy integration** for seamless array handling\n- **Rich documentation** and examples\n\n## \ud83d\udce6 **Installation**\n\n### **Recommended: Install d-vecDB with Python Client**\n\n```bash\n# Install d-vecDB (includes Python client)\npip install d-vecdb\n\n# Install with development dependencies\npip install d-vecdb[dev]\n\n# Install with example dependencies  \npip install d-vecdb[examples]\n```\n\n### **Alternative: Install Python Client Only**\n\n```bash\n# Install just the Python client library\npip install vectordb-client\n\n# Install with development dependencies\npip install vectordb-client[dev]\n\n# Install with example dependencies\npip install vectordb-client[examples]\n```\n\n### **From Source**\n\n```bash\ngit clone 
https://github.com/rdmurugan/d-vecDB.git\ncd d-vecDB/python-client\npip install -e .\n```\n\n## \ud83d\ude80 **Getting Started After Installation**\n\n### **Step 1: Build and Start the d-vecDB Server**\n\n**Important**: The PyPI package (`pip install d-vecdb`) only includes the Python client library. You need to build the server separately:\n\n```bash\n# Clone the repository and build the server\ngit clone https://github.com/rdmurugan/d-vecDB.git\ncd d-vecDB\n\n# Build the server (requires Rust)\ncargo build --release\n\n# Start the server\n./target/release/vectordb-server --config config.toml\n```\n\n### **Step 2: Use the Python Client**\n\nOnce you have a running d-vecDB server, you can use the Python client (installed via pip) to interact with it:\n\n```python\nimport numpy as np\nfrom vectordb_client import VectorDBClient\n\n# Connect to your d-vecDB server\nclient = VectorDBClient(host=\"localhost\", port=8080)\n\n# Create a collection\nclient.create_collection_simple(\"my_collection\", 128, \"cosine\")\n\n# Insert some vectors\nvector = np.random.random(128)\nclient.insert_simple(\"my_collection\", \"vector_1\", vector)\n\n# Search for similar vectors\nquery = np.random.random(128)\nresults = client.search_simple(\"my_collection\", query, limit=5)\n\nprint(f\"Found {len(results)} similar vectors\")\nfor result in results:\n    print(f\"  - ID: {result.id}, Distance: {result.distance:.4f}\")\n\nclient.close()\n```\n\n## \ud83c\udfc3 **Quick Start**\n\n### **Synchronous Client**\n\n```python\nimport numpy as np\nfrom vectordb_client import VectorDBClient\n\n# Connect to d-vecDB server\nclient = VectorDBClient(host=\"localhost\", port=8080)\n\n# Create a collection\nclient.create_collection_simple(\n    name=\"documents\", \n    dimension=128, \n    distance_metric=\"cosine\"\n)\n\n# Insert vectors\nvectors = np.random.random((100, 128))\nfor i, vector in enumerate(vectors):\n    client.insert_simple(\n        collection_name=\"documents\",\n        
vector_id=f\"doc_{i}\",\n        vector_data=vector,\n        metadata={\"title\": f\"Document {i}\", \"category\": \"example\"}\n    )\n\n# Search for similar vectors\nquery_vector = np.random.random(128)\nresults = client.search_simple(\"documents\", query_vector, limit=5)\n\nfor result in results:\n    print(f\"ID: {result.id}, Distance: {result.distance:.4f}\")\n\n# Clean up\nclient.close()\n```\n\n### **Asynchronous Client**\n\n```python\nimport asyncio\nimport numpy as np\nfrom vectordb_client import AsyncVectorDBClient\n\nasync def main():\n    # Connect to d-vecDB server\n    async with AsyncVectorDBClient(host=\"localhost\", port=8080) as client:\n        \n        # Create collection\n        await client.create_collection_simple(\n            name=\"embeddings\", \n            dimension=384, \n            distance_metric=\"cosine\"\n        )\n        \n        # Prepare batch data\n        batch_data = [\n            (f\"item_{i}\", np.random.random(384), {\"category\": \"test\"})\n            for i in range(1000)\n        ]\n        \n        # Concurrent batch insertion\n        await client.batch_insert_concurrent(\n            collection_name=\"embeddings\",\n            vectors_data=batch_data,\n            batch_size=50,\n            max_concurrent_batches=10\n        )\n        \n        # Search\n        query_vector = np.random.random(384)\n        results = await client.search_simple(\"embeddings\", query_vector, limit=10)\n        \n        print(f\"Found {len(results)} similar vectors\")\n\n# Run the async example\nasyncio.run(main())\n```\n\n## \ud83d\udcd6 **API Reference**\n\n### **Client Initialization**\n\n```python\nfrom vectordb_client import VectorDBClient, AsyncVectorDBClient\n\n# Synchronous client\nclient = VectorDBClient(\n    host=\"localhost\",\n    port=8080,              # REST port\n    grpc_port=9090,         # gRPC port  \n    protocol=\"rest\",        # \"rest\", \"grpc\", or \"auto\"\n    ssl=False,              # Use 
HTTPS/secure gRPC\n    timeout=30.0,           # Request timeout\n)\n\n# Asynchronous client\nasync_client = AsyncVectorDBClient(\n    host=\"localhost\",\n    port=8080,\n    connection_pool_size=10,  # HTTP connection pool size\n    protocol=\"rest\",\n    ssl=False,\n    timeout=30.0,\n)\n```\n\n### **Collection Management**\n\n```python\nfrom vectordb_client.types import CollectionConfig, DistanceMetric, IndexConfig\n\n# Advanced collection configuration\nconfig = CollectionConfig(\n    name=\"my_collection\",\n    dimension=768,\n    distance_metric=DistanceMetric.COSINE,\n    index_config=IndexConfig(\n        max_connections=32,\n        ef_construction=400,\n        ef_search=100,\n        max_layer=16\n    )\n)\n\n# Create collection\nresponse = client.create_collection(config)\n\n# List all collections\ncollections = client.list_collections()\nprint(\"Collections:\", collections.collections)\n\n# Get collection info and stats\ncollection_info = client.get_collection(\"my_collection\")\nstats = client.get_collection_stats(\"my_collection\")\nprint(f\"Vectors: {stats.vector_count}, Memory: {stats.memory_usage} bytes\")\n\n# Delete collection\nclient.delete_collection(\"my_collection\")\n```\n\n### **Vector Operations**\n\n```python\nfrom vectordb_client.types import Vector\nimport numpy as np\n\n# Create vectors with metadata\nvectors = [\n    Vector(\n        id=\"vec_1\",\n        data=np.random.random(128).tolist(),\n        metadata={\"category\": \"A\", \"score\": 0.95}\n    ),\n    Vector(\n        id=\"vec_2\", \n        data=np.random.random(128).tolist(),\n        metadata={\"category\": \"B\", \"score\": 0.87}\n    )\n]\n\n# Insert single vector\nresponse = client.insert_vector(\"my_collection\", vectors[0])\n\n# Batch insert\nresponse = client.insert_vectors(\"my_collection\", vectors)\nprint(f\"Inserted {response.inserted_count} vectors\")\n\n# Get vector by ID\nvector = client.get_vector(\"my_collection\", \"vec_1\")\nprint(f\"Retrieved vector: 
{vector.id}\")\n\n# Update vector\nvectors[0].metadata[\"updated\"] = True\nclient.update_vector(\"my_collection\", vectors[0])\n\n# Delete vector  \nclient.delete_vector(\"my_collection\", \"vec_1\")\n```\n\n### **Vector Search**\n\n```python\nfrom vectordb_client.types import SearchRequest\nimport numpy as np\n\n# Simple search\nquery_vector = np.random.random(128)\nresults = client.search_simple(\"my_collection\", query_vector, limit=10)\n\n# Advanced search with parameters\nsearch_request = SearchRequest(\n    query_vector=query_vector.tolist(),\n    limit=20,\n    ef_search=150,  # Higher value = better accuracy, slower search\n    filter={\"category\": \"A\"}  # Metadata filtering\n)\n\nresponse = client.search(\"my_collection\", \n                        search_request.query_vector,\n                        search_request.limit,\n                        search_request.ef_search,\n                        search_request.filter)\n\n# Process results\nfor result in response.results:\n    print(f\"ID: {result.id}\")\n    print(f\"Distance: {result.distance:.6f}\")  \n    print(f\"Metadata: {result.metadata}\")\n    print(\"---\")\n\nprint(f\"Search took {response.query_time_ms}ms\")\n```\n\n### **Server Information**\n\n```python\n# Health check\nhealth = client.health_check()\nprint(f\"Server healthy: {health.healthy}\")\n\n# Server statistics\nstats = client.get_server_stats()\nprint(f\"Total vectors: {stats.total_vectors}\")\nprint(f\"Collections: {stats.total_collections}\")\nprint(f\"Memory usage: {stats.memory_usage} bytes\")\nprint(f\"Uptime: {stats.uptime_seconds}s\")\n\n# Quick connectivity test\nis_reachable = client.ping()\nprint(f\"Server reachable: {is_reachable}\")\n\n# Comprehensive info\ninfo = client.get_info()\nprint(\"Client info:\", info[\"client\"])\nprint(\"Server info:\", info[\"server\"])\n```\n\n## \ud83e\uddea **Advanced Examples**\n\n### **Working with NumPy Arrays**\n\n```python\nimport numpy as np\nfrom vectordb_client import 
VectorDBClient\nfrom vectordb_client.types import Vector\n\nclient = VectorDBClient()\n\n# Create collection for embeddings\nclient.create_collection_simple(\"embeddings\", 384, \"cosine\")\n\n# Work directly with NumPy arrays\nembeddings = np.random.random((1000, 384))\nids = [f\"embedding_{i}\" for i in range(1000)]\nmetadata_list = [{\"index\": i, \"batch\": i // 100} for i in range(1000)]\n\n# Batch insert using NumPy\nvectors = [\n    Vector.from_numpy(id=ids[i], data=embeddings[i], metadata=metadata_list[i])\n    for i in range(len(embeddings))\n]\n\n# Insert in batches\nbatch_size = 100\nfor i in range(0, len(vectors), batch_size):\n    batch = vectors[i:i + batch_size]\n    response = client.insert_vectors(\"embeddings\", batch)\n    print(f\"Inserted batch {i // batch_size + 1}: {response.inserted_count} vectors\")\n\n# Search with NumPy array\nquery_embedding = np.random.random(384)\nresults = client.search_simple(\"embeddings\", query_embedding, limit=5)\n\n# Convert results back to NumPy if needed\nfor result in results:\n    vector = client.get_vector(\"embeddings\", result.id)\n    vector_array = vector.to_numpy()  # Convert to NumPy array\n    print(f\"Vector {result.id} shape: {vector_array.shape}\")\n```\n\n### **Async Batch Processing**\n\n```python\nimport asyncio\nimport numpy as np\nfrom vectordb_client import AsyncVectorDBClient\n\nasync def process_large_dataset():\n    async with AsyncVectorDBClient() as client:\n        # Create collection\n        await client.create_collection_simple(\"large_dataset\", 512, \"euclidean\")\n        \n        # Generate large dataset\n        num_vectors = 10000\n        dimension = 512\n        dataset = np.random.random((num_vectors, dimension))\n        \n        # Prepare batch data\n        batch_data = [\n            (f\"vec_{i}\", dataset[i], {\"batch\": i // 1000, \"index\": i})\n            for i in range(num_vectors)\n        ]\n        \n        # Concurrent insertion with progress tracking\n     
   batch_size = 200\n        max_concurrent = 20\n        \n        start_time = asyncio.get_event_loop().time()\n        \n        responses = await client.batch_insert_concurrent(\n            collection_name=\"large_dataset\",\n            vectors_data=batch_data,\n            batch_size=batch_size,\n            max_concurrent_batches=max_concurrent\n        )\n        \n        end_time = asyncio.get_event_loop().time()\n        \n        total_inserted = sum(r.inserted_count or 0 for r in responses)\n        duration = end_time - start_time\n        rate = total_inserted / duration\n        \n        print(f\"Inserted {total_inserted} vectors in {duration:.2f}s\")\n        print(f\"Rate: {rate:.2f} vectors/second\")\n        \n        # Verify with search\n        query_vector = np.random.random(512)\n        results = await client.search_simple(\"large_dataset\", query_vector, limit=10)\n        print(f\"Search found {len(results)} results\")\n\n# Run the async processing\nasyncio.run(process_large_dataset())\n```\n\n### **Error Handling and Retries**\n\n```python\nimport time\nfrom vectordb_client import VectorDBClient\nfrom vectordb_client.exceptions import (\n    VectorDBError, ConnectionError, CollectionNotFoundError,\n    VectorNotFoundError, RateLimitError\n)\n\ndef robust_insert_with_retry(client, collection_name, vectors, max_retries=3):\n    \"\"\"Insert vectors with automatic retry on failure.\"\"\"\n    for attempt in range(max_retries):\n        try:\n            response = client.insert_vectors(collection_name, vectors)\n            print(f\"Successfully inserted {response.inserted_count} vectors\")\n            return response\n            \n        except RateLimitError as e:\n            if attempt < max_retries - 1:\n                wait_time = 2 ** attempt  # Exponential backoff\n                print(f\"Rate limited, waiting {wait_time}s before retry...\")\n                time.sleep(wait_time)\n            else:\n                raise e\n  
              \n        except ConnectionError as e:\n            if attempt < max_retries - 1:\n                print(f\"Connection failed, retrying... ({attempt + 1}/{max_retries})\")\n                time.sleep(1)\n            else:\n                raise e\n                \n        except CollectionNotFoundError:\n            print(f\"Collection '{collection_name}' not found, creating...\")\n            client.create_collection_simple(collection_name, 128, \"cosine\")\n            # Retry the insertion (note: this uses up one of the max_retries attempts)\n            continue\n            \n    raise VectorDBError(f\"Failed to insert after {max_retries} attempts\")\n\n# Usage\n# Assumption: Vector is importable from vectordb_client.types; adjust the path if it lives elsewhere\nfrom vectordb_client.types import Vector\n\nclient = VectorDBClient()\nvectors = [Vector(id=f\"test_{i}\", data=[0.1] * 128) for i in range(10)]\n\ntry:\n    robust_insert_with_retry(client, \"test_collection\", vectors)\nexcept VectorDBError as e:\n    print(f\"Final error: {e}\")\n```\n\n### **Configuration and Connection Management**\n\n```python\nfrom vectordb_client import VectorDBClient\nimport os\n\n# Configuration from environment variables\nclient = VectorDBClient(\n    host=os.getenv(\"VECTORDB_HOST\", \"localhost\"),\n    port=int(os.getenv(\"VECTORDB_PORT\", \"8080\")),\n    ssl=os.getenv(\"VECTORDB_SSL\", \"false\").lower() == \"true\",\n    timeout=float(os.getenv(\"VECTORDB_TIMEOUT\", \"30.0\"))\n)\n\n# Connection testing and fallback\ndef get_client_with_fallback():\n    \"\"\"Try multiple connection options.\"\"\"\n    \n    # Try primary server\n    try:\n        primary_client = VectorDBClient(host=\"primary.vectordb.com\", port=8080)\n        if primary_client.ping():\n            return primary_client\n        primary_client.close()\n    except Exception:\n        pass\n    \n    # Try secondary server\n    try:\n        secondary_client = VectorDBClient(host=\"secondary.vectordb.com\", port=8080)\n        if secondary_client.ping():\n            return secondary_client\n        secondary_client.close()\n    except Exception:\n        pass\n    \n    # Fall back 
to localhost\n    return VectorDBClient(host=\"localhost\", port=8080)\n\n# Context managers for resource cleanup\nwith get_client_with_fallback() as client:\n    # Use client here - automatically closed when leaving context\n    collections = client.list_collections()\n    print(f\"Available collections: {collections.collections}\")\n```\n\n## \ud83e\uddea **Testing**\n\n```bash\n# Run unit tests\npython -m pytest tests/\n\n# Run with coverage\npython -m pytest tests/ --cov=vectordb_client --cov-report=html\n\n# Run integration tests (requires a running d-vecDB server)\npython -m pytest tests/integration/ -v\n\n# Run performance benchmarks\npython -m pytest tests/benchmarks/ -v\n```\n\n## \ud83d\udd27 **Development**\n\n```bash\n# Set up the development environment\ngit clone https://github.com/rdmurugan/d-vecDB.git\ncd d-vecDB/python-client\n\n# Install in development mode (quoted so shells such as zsh do not expand the brackets)\npip install -e \".[dev]\"\n\n# Run code formatting\nblack vectordb_client/\nisort vectordb_client/\n\n# Run type checking\nmypy vectordb_client/\n\n# Run linting\nflake8 vectordb_client/\n```\n\n## \ud83d\udcca **Performance Tips**\n\n### **Batch Operations**\n- Use `insert_vectors()` instead of multiple `insert_vector()` calls\n- For async clients, use `batch_insert_concurrent()` for maximum throughput\n- A batch size of 100-1000 vectors usually works well; the best value depends on vector dimension\n\n### **Connection Pooling**\n- Async clients automatically pool HTTP connections\n- Increase `connection_pool_size` for high-concurrency applications\n- Reuse client instances instead of creating new ones\n\n### **Search Optimization**\n- Lower `ef_search` values give faster but less accurate search\n- Use metadata filtering to reduce the search space\n- Consider the trade-off between speed and recall\n\n### **Memory Management**\n- Use NumPy arrays for large vector datasets\n- Close clients explicitly or use context managers\n- Monitor memory usage during large batch operations\n\n## \ud83e\udd1d **Contributing**\n\nWe welcome 
contributions! Please see our [Contributing Guide](../CONTRIBUTING.md) for details.\n\n### **Development Setup**\n1. Fork the repository\n2. Create a feature branch\n3. Install development dependencies: `pip install -e \".[dev]\"`\n4. Make changes and add tests\n5. Run tests: `pytest`\n6. Submit a pull request\n\n## \ud83d\udcc4 **License**\n\nThis project is licensed under the d-vecDB Enterprise License; see the [LICENSE](../LICENSE) file for details.\n\n**For Enterprise Use**: Commercial use requires a separate enterprise license. Contact durai@infinidatum.com for licensing terms.\n\n## \ud83c\udd98 **Support**\n\n- **Documentation**: [docs.d-vecdb.com](https://docs.d-vecdb.com)\n- **Issues**: [GitHub Issues](https://github.com/rdmurugan/d-vecDB/issues)\n- **Discussions**: [GitHub Discussions](https://github.com/rdmurugan/d-vecDB/discussions)\n- **Discord**: [d-vecDB Community](https://discord.gg/d-vecdb)\n\n---\n\n**Built with \u2764\ufe0f by the d-vecDB team**\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Python client library for d-vecDB vector database",
    "version": "0.1.0",
    "project_urls": {
        "Bug Reports": "https://github.com/your-org/d-vecDB/issues",
        "Documentation": "https://docs.d-vecdb.com",
        "Homepage": "https://github.com/your-org/d-vecDB",
        "Source": "https://github.com/your-org/d-vecDB"
    },
    "split_keywords": [
        "vector database",
        " similarity search",
        " machine learning",
        " embeddings",
        " hnsw"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "053d43b195b696f57dc3ca9e8fe3b38858ba9f8656697e13de9de2722a6e1ee3",
                "md5": "8414816707cfc52bf3038b53dc8ac1a8",
                "sha256": "05435e5b555d84e390ad039ffd542d670a86b16b32a645210c061cb2e3602d88"
            },
            "downloads": -1,
            "filename": "vectordb_client-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "8414816707cfc52bf3038b53dc8ac1a8",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 59982,
            "upload_time": "2025-09-03T01:17:41",
            "upload_time_iso_8601": "2025-09-03T01:17:41.169930Z",
            "url": "https://files.pythonhosted.org/packages/05/3d/43b195b696f57dc3ca9e8fe3b38858ba9f8656697e13de9de2722a6e1ee3/vectordb_client-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "3c315939b717cd15c81e8220ee36117bed672db6c562f86d286256f80cb4a53b",
                "md5": "78d1dc1e9010ade7bceefc9ca20bec44",
                "sha256": "407fe4fa0d130e680004dfe58e9347cf58ef018e7314c5e698e0cd868ebab886"
            },
            "downloads": -1,
            "filename": "vectordb_client-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "78d1dc1e9010ade7bceefc9ca20bec44",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 50586,
            "upload_time": "2025-09-03T01:17:42",
            "upload_time_iso_8601": "2025-09-03T01:17:42.631638Z",
            "url": "https://files.pythonhosted.org/packages/3c/31/5939b717cd15c81e8220ee36117bed672db6c562f86d286256f80cb4a53b/vectordb_client-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-03 01:17:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "your-org",
    "github_project": "d-vecDB",
    "github_not_found": true,
    "lcname": "vectordb-client"
}
        