| Field | Value |
|-------|-------|
| Name | nusterdb |
| Version | 2.1.3 |
| Home page | https://github.com/NusterAI/nusterdb |
| Summary | High-performance, government-grade vector database with advanced indexing algorithms |
| Upload time | 2025-08-04 01:12:17 |
| Author | NusterAI Team |
| Requires Python | >=3.8 |
| Keywords | vector database similarity search machine learning ai government-grade security fips quantum-resistant |

# NusterDB - High-Performance Vector Database with Enterprise Security

[![PyPI version](https://badge.fury.io/py/nusterdb.svg)](https://badge.fury.io/py/nusterdb)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## The Complete Vector Database Solution

NusterDB is a high-performance vector database designed for production workloads, with enterprise-grade security, persistence, and full CRUD support. It is built for AI/ML applications that need fast, reliable, and secure similarity search.

### Core Features

| Feature | Description |
|---------|-------------|
| **Advanced Algorithms** | Multiple search algorithms: IVF, PQ, LSH, SQ, Flat, HNSW |
| **Enterprise Security** | FIPS 140-2 compliance, quantum-resistant encryption |
| **Production APIs** | Complete REST APIs and Python SDK |
| **Data Persistence** | Built-in durable storage with transaction support |
| **Full CRUD Operations** | Complete database operations: Create, Read, Update, Delete |
| **Multiple Storage Modes** | Memory, Persistent, Cache, and API modes |

## Quick Start

### Installation

```bash
pip install nusterdb
```

### Simple Usage

```python
import numpy as np
import nusterdb

# Create database - choose your mode
db = nusterdb.NusterDB(mode="memory", dimension=128)

# Add vectors (random data stands in for real embeddings)
db.add(1, np.random.random(128))  # Single vector
vectors = np.random.random((3, 128))
metadata = [{"category": "A"}, {"category": "B"}, {"category": "C"}]
db.bulk_add([2, 3, 4], vectors, metadata)  # Multiple vectors

# Search
results = db.search(np.random.random(128), k=5)
for result in results:
    print(f"ID: {result['id']}, Distance: {result['distance']}")
```

## Complete API Reference

### Core Classes

#### `NusterDB`

The main database class supporting all storage modes and algorithms.

```python
class NusterDB:
    """
    Unified NusterDB interface supporting all storage modes and algorithms.
    """
    
    def __init__(
        self,
        mode: Union[str, StorageMode] = "memory",
        dimension: Optional[int] = None,
        path: Optional[str] = None,
        url: Optional[str] = None,
        algorithm: Union[str, Algorithm] = "flat",
        security_level: Union[str, SecurityLevel] = "none",
        distance_metric: Union[str, DistanceMetric] = "l2",
        use_simd: bool = True,
        use_gpu: bool = True,
        parallel_processing: bool = True,
        cache_size: str = "512MB",
        compression: bool = False,
        **kwargs
    ):
```

**Parameters:**
- `mode`: Storage mode ("memory", "persistent", "cache", "api")
- `dimension`: Vector dimension (required for new databases)
- `path`: Path for persistent storage
- `url`: Server URL for API mode
- `algorithm`: Indexing algorithm ("flat", "ivf", "pq", "lsh", "sq", "hnsw")
- `security_level`: Security level ("none", "basic", "enterprise", "government")
- `distance_metric`: Distance metric ("l2", "cosine", "inner_product", "l1")
- `use_simd`: Enable SIMD optimizations
- `use_gpu`: Enable GPU acceleration
- `parallel_processing`: Enable parallel processing
- `cache_size`: Cache size (e.g., "1GB", "512MB")
- `compression`: Enable compression
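
Size strings such as `"512MB"` or `"1GB"` have to be converted into a byte count at some point. The sketch below shows one plausible way to do that. `parse_size` is a hypothetical helper, not part of the nusterdb API, and the use of binary (1024-based) units is an assumption.

```python
import re

# Hypothetical helper: not part of nusterdb. Binary multiples are assumed.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size string such as '512MB' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)\s*", text.upper())
    if match is None:
        raise ValueError(f"unrecognized size string: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit])


print(parse_size("512MB"))
print(parse_size("1GB"))
```

Rejecting unknown strings early, rather than silently falling back to a default, makes a mistyped `cache_size` easier to spot.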

#### Core Database Operations

##### `add(id, vector, metadata=None)`

Add a single vector to the database.

```python
def add(
    self, 
    id: Union[int, str], 
    vector: Union[List[float], np.ndarray],
    metadata: Optional[Dict[str, Any]] = None
) -> bool:
```

**Parameters:**
- `id`: Unique identifier
- `vector`: Vector data (list or numpy array)
- `metadata`: Optional metadata dictionary

**Returns:** `bool` - Success status

**Example:**
```python
# Add vector with metadata
success = db.add(1, [0.1, 0.2, 0.3], {"category": "document", "type": "text"})

# Add numpy vector
import numpy as np
vector = np.random.random(128)
db.add("doc_001", vector, {"source": "research_paper"})
```

##### `bulk_add(ids, vectors, metadata=None)`

Add multiple vectors efficiently.

```python
def bulk_add(
    self,
    ids: List[Union[int, str]],
    vectors: Union[List[List[float]], np.ndarray],
    metadata: Optional[List[Dict[str, Any]]] = None
) -> int:
```

**Parameters:**
- `ids`: List of unique identifiers
- `vectors`: List of vectors or 2D numpy array
- `metadata`: Optional list of metadata dictionaries

**Returns:** `int` - Number of vectors successfully added

**Example:**
```python
# Bulk add with metadata
ids = [1, 2, 3, 4, 5]
vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
metadata = [
    {"category": "A", "score": 0.95},
    {"category": "B", "score": 0.87},
    {"category": "A", "score": 0.92},
    {"category": "C", "score": 0.78},
    {"category": "B", "score": 0.89}
]
added_count = db.bulk_add(ids, vectors, metadata)
print(f"Added {added_count} vectors")

# Bulk add numpy arrays
import numpy as np
vectors = np.random.random((1000, 128))
ids = [f"vec_{i}" for i in range(1000)]
db.bulk_add(ids, vectors)
```

##### `search(query, k=10, filters=None, include_metadata=True, include_distances=True)`

Search for similar vectors.

```python
def search(
    self,
    query: Union[List[float], np.ndarray],
    k: int = 10,
    filters: Optional[Dict[str, Any]] = None,
    include_metadata: bool = True,
    include_distances: bool = True
) -> List[Dict[str, Any]]:
```

**Parameters:**
- `query`: Query vector
- `k`: Number of results to return
- `filters`: Optional metadata filters
- `include_metadata`: Include metadata in results
- `include_distances`: Include distances in results

**Returns:** `List[Dict[str, Any]]` - List of search results

**Example:**
```python
# Basic search
results = db.search([0.1, 0.2, 0.3], k=5)
for result in results:
    print(f"ID: {result['id']}, Distance: {result['distance']}")

# Search with filters
results = db.search(
    query=[0.1, 0.2, 0.3],
    k=10,
    filters={"category": "document"},
    include_metadata=True,
    include_distances=True
)

# Process results
for result in results:
    print(f"ID: {result['id']}")
    print(f"Distance: {result['distance']:.4f}")
    print(f"Metadata: {result['metadata']}")
```
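
The `filters` argument is described only as optional metadata filters. The sketch below illustrates one common interpretation, an exact match on every key, applied to result dictionaries shaped like the ones in the examples above. `matches_filters` is a hypothetical helper and the sample results are made up; the actual matching rules in nusterdb may differ.

```python
from typing import Any, Dict, List


def matches_filters(metadata: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Return True when every filter key is present with an equal value."""
    return all(key in metadata and metadata[key] == value
               for key, value in filters.items())


# Result dictionaries shaped like the search() examples (made-up data).
results: List[Dict[str, Any]] = [
    {"id": 1, "distance": 0.12, "metadata": {"category": "document", "type": "text"}},
    {"id": 2, "distance": 0.20, "metadata": {"category": "image"}},
    {"id": 3, "distance": 0.31, "metadata": {"category": "document"}},
]

filtered = [r for r in results
            if matches_filters(r["metadata"], {"category": "document"})]
print([r["id"] for r in filtered])
```

Under this interpretation an empty filter dictionary matches everything, and a result whose metadata lacks a filtered key is excluded.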

##### `update(id, vector=None, metadata=None)`

Update an existing vector.

```python
def update(
    self,
    id: Union[int, str],
    vector: Optional[Union[List[float], np.ndarray]] = None,
    metadata: Optional[Dict[str, Any]] = None
) -> bool:
```

**Parameters:**
- `id`: Vector ID to update
- `vector`: New vector data (optional)
- `metadata`: New metadata (optional)

**Returns:** `bool` - Success status

**Example:**
```python
# Update vector only
db.update(1, [0.2, 0.3, 0.4])

# Update metadata only
db.update(1, metadata={"category": "updated", "version": 2})

# Update both
db.update(1, [0.2, 0.3, 0.4], {"category": "updated", "version": 2})
```

##### `delete(id)`

Delete a vector by ID.

```python
def delete(self, id: Union[int, str]) -> bool:
```

**Parameters:**
- `id`: Vector ID to delete

**Returns:** `bool` - Success status

**Example:**
```python
# Delete by ID
success = db.delete(1)
if success:
    print("Vector deleted successfully")

# Delete multiple vectors
ids_to_delete = [1, 2, 3, 4, 5]
for vec_id in ids_to_delete:
    db.delete(vec_id)
```

##### `get(id, include_metadata=True)`

Get a vector by ID.

```python
def get(
    self, 
    id: Union[int, str],
    include_metadata: bool = True
) -> Optional[Dict[str, Any]]:
```

**Parameters:**
- `id`: Vector ID
- `include_metadata`: Include metadata in result

**Returns:** `Optional[Dict[str, Any]]` - Vector data or None if not found

**Example:**
```python
# Get vector with metadata
vector_data = db.get(1)
if vector_data:
    print(f"Vector: {vector_data['vector']}")
    print(f"Metadata: {vector_data['metadata']}")

# Get vector without metadata
vector_data = db.get(1, include_metadata=False)
```

##### `batch_search(queries, k=10, filters=None)`

Search with multiple queries efficiently.

```python
def batch_search(
    self,
    queries: Union[List[List[float]], np.ndarray],
    k: int = 10,
    filters: Optional[List[Dict[str, Any]]] = None
) -> List[List[Dict[str, Any]]]:
```

**Parameters:**
- `queries`: List of query vectors
- `k`: Number of results per query
- `filters`: Optional filters per query

**Returns:** `List[List[Dict[str, Any]]]` - List of result lists

**Example:**
```python
# Batch search multiple queries
queries = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9]
]
all_results = db.batch_search(queries, k=5)

for i, results in enumerate(all_results):
    print(f"Query {i} results:")
    for result in results:
        print(f"  ID: {result['id']}, Distance: {result['distance']}")
```

#### Management Operations

##### `train(training_vectors=None)`

Train the index (required for some algorithms).

```python
def train(self, training_vectors: Optional[Union[List[List[float]], np.ndarray]] = None) -> bool:
```

**Parameters:**
- `training_vectors`: Training data (optional, uses existing data if None)

**Returns:** `bool` - Success status

**Example:**
```python
# Train with existing data
db.train()

# Train with specific training data
import numpy as np
training_data = np.random.random((10000, 128))
db.train(training_data)
```

##### `optimize()`

Optimize the index for better performance.

```python
def optimize(self) -> bool:
```

**Returns:** `bool` - Success status

**Example:**
```python
# Optimize after bulk inserts
db.bulk_add(ids, vectors)
db.optimize()  # Rebuild index for better performance
```

##### `save(path=None)`

Save the database to disk.

```python
def save(self, path: Optional[str] = None) -> bool:
```

**Parameters:**
- `path`: Save path (uses default if None)

**Returns:** `bool` - Success status

##### `load(path)`

Load database from disk.

```python
def load(self, path: str) -> bool:
```

**Parameters:**
- `path`: Load path

**Returns:** `bool` - Success status

##### `clear()`

Clear all vectors from the database.

```python
def clear(self) -> bool:
```

**Returns:** `bool` - Success status

#### Information and Statistics

##### `count()` / `size()`

Get total number of vectors.

```python
def count(self) -> int:
def size(self) -> int:  # Alias for count()
```

**Returns:** `int` - Number of vectors

**Example:**
```python
total_vectors = db.count()
print(f"Database contains {total_vectors} vectors")
```

##### `stats()`

Get database statistics.

```python
def stats(self) -> Dict[str, Any]:
```

**Returns:** `Dict[str, Any]` - Statistics dictionary

**Example:**
```python
stats = db.stats()
print(f"Vector count: {stats['vector_count']}")
print(f"Average query time: {stats['avg_query_time']:.3f}ms")
print(f"Algorithm: {stats['algorithm']}")
print(f"Security level: {stats['security_level']}")
```

##### `info()`

Get database information.

```python
def info(self) -> Dict[str, Any]:
```

**Returns:** `Dict[str, Any]` - Information dictionary

##### `health_check()`

Perform health check.

```python
def health_check(self) -> Dict[str, Any]:
```

**Returns:** `Dict[str, Any]` - Health status

**Example:**
```python
health = db.health_check()
if health['healthy']:
    print("Database is healthy")
    print(f"Vector count: {health['vector_count']}")
else:
    print(f"Database issue: {health.get('error')}")
```

#### Context Manager Support

```python
# Automatic resource management
with nusterdb.NusterDB(mode="persistent", path="./vectors") as db:
    db.add(1, [0.1, 0.2, 0.3])
    results = db.search([0.1, 0.2, 0.3])
    # Database automatically saved and closed
```

#### Iterator Support

```python
# Iterate over all vectors (if supported by backend)
for vector_data in db:
    print(f"ID: {vector_data['id']}")
    print(f"Vector: {vector_data['vector']}")

# Check if ID exists
if 1 in db:
    print("Vector with ID 1 exists")

# Get length
print(f"Database has {len(db)} vectors")
```

### Configuration Classes

#### `NusterConfig`

Complete configuration for all NusterDB aspects.

```python
@dataclass
class NusterConfig:
    # Core settings
    algorithm: Algorithm = Algorithm.FLAT
    security_level: SecurityLevel = SecurityLevel.NONE
    distance_metric: DistanceMetric = DistanceMetric.L2
    
    # Performance settings
    use_simd: bool = True
    use_gpu: bool = True
    parallel_processing: bool = True
    cache_size: str = "512MB"
    compression: bool = False
```

**Methods:**
- `to_dict()` - Convert to dictionary
- `to_json()` - Convert to JSON string
- `from_dict(config_dict)` - Create from dictionary
- `from_json(json_str)` - Create from JSON
- `update(**kwargs)` - Create updated configuration
- `optimize_for_speed()` - Speed-optimized configuration
- `optimize_for_accuracy()` - Accuracy-optimized configuration
- `optimize_for_memory()` - Memory-optimized configuration
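
The serialization and update methods listed above can be pictured with the standard `dataclasses` module. The sketch below is a stand-in, not the real `NusterConfig`: the class name `SketchConfig` is hypothetical, plain strings replace the enum fields, and immutability (`frozen=True`) is an assumption based on `update` being described as creating a new configuration.

```python
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class SketchConfig:
    """Stand-in for NusterConfig; plain strings replace the enum fields."""
    algorithm: str = "flat"
    security_level: str = "none"
    distance_metric: str = "l2"
    use_simd: bool = True
    use_gpu: bool = True
    parallel_processing: bool = True
    cache_size: str = "512MB"
    compression: bool = False

    def to_dict(self) -> dict:
        return asdict(self)

    def to_json(self) -> str:
        return json.dumps(self.to_dict())

    @classmethod
    def from_dict(cls, config_dict: dict) -> "SketchConfig":
        return cls(**config_dict)

    @classmethod
    def from_json(cls, json_str: str) -> "SketchConfig":
        return cls.from_dict(json.loads(json_str))

    def update(self, **kwargs) -> "SketchConfig":
        # Returns a new object; the original is left untouched.
        return replace(self, **kwargs)


base = SketchConfig()
tuned = base.update(algorithm="ivf", cache_size="4GB")
print(base.algorithm, tuned.algorithm)
```

Returning a new object from `update` keeps a shared base configuration safe to reuse across several databases.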

#### Configuration Enums

```python
class Algorithm(Enum):
    FLAT = "flat"              # Exact search
    IVF = "ivf"               # Inverted File Index
    PQ = "pq"                 # Product Quantization
    LSH = "lsh"               # Locality Sensitive Hashing
    SQ = "sq"                 # Scalar Quantization
    HNSW = "hnsw"             # Hierarchical NSW
    HYBRID = "hybrid"         # Multi-algorithm approach

class SecurityLevel(Enum):
    NONE = "none"                   # No special security
    BASIC = "basic"                 # Basic encryption
    ENTERPRISE = "enterprise"       # Enterprise-grade security
    GOVERNMENT = "government"       # Government-grade (FIPS 140-2)

class StorageMode(Enum):
    MEMORY = "memory"               # In-memory only (fastest)
    PERSISTENT = "persistent"       # Disk-based storage (durable)
    CACHE = "cache"                # Memory + disk caching
    API = "api"                    # Remote API connection

class DistanceMetric(Enum):
    L2 = "l2"                      # Euclidean distance
    COSINE = "cosine"              # Cosine similarity
    INNER_PRODUCT = "inner_product" # Inner product
    L1 = "l1"                      # Manhattan distance
    HAMMING = "hamming"            # Hamming distance
```
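
Because the constructor parameters are typed as `Union[str, Algorithm]` and similar, a string has to be mapped onto an enum member somewhere. The sketch below shows how that lookup could work with a local copy of the `Algorithm` enum. `coerce_algorithm` is a hypothetical helper, and case-insensitive matching is an assumption.

```python
from enum import Enum
from typing import Union


class Algorithm(Enum):
    # Values copied from the enum listing above.
    FLAT = "flat"
    IVF = "ivf"
    PQ = "pq"
    LSH = "lsh"
    SQ = "sq"
    HNSW = "hnsw"
    HYBRID = "hybrid"


def coerce_algorithm(value: Union[str, Algorithm]) -> Algorithm:
    """Hypothetical helper: accept an enum member or its string value."""
    if isinstance(value, Algorithm):
        return value
    try:
        return Algorithm(value.lower())
    except ValueError:
        valid = ", ".join(member.value for member in Algorithm)
        raise ValueError(
            f"unknown algorithm {value!r}; expected one of: {valid}"
        ) from None


print(coerce_algorithm("HNSW"))
print(coerce_algorithm(Algorithm.IVF))
```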

### Client Class

#### `NusterClient`

Client for connecting to NusterDB server instances.

```python
class NusterClient:
    def __init__(
        self,
        url: str = "http://localhost:7878",
        timeout: int = 30,
        retry_attempts: int = 3,
        api_key: Optional[str] = None,
        verify_ssl: bool = True
    ):
```

**Parameters:**
- `url`: Server URL
- `timeout`: Request timeout in seconds
- `retry_attempts`: Number of retry attempts
- `api_key`: Optional API key for authentication
- `verify_ssl`: Verify SSL certificates

**Methods:** Same interface as `NusterDB`, but every call is sent to the server over the HTTP/REST API.

**Example:**
```python
# Connect to server
client = nusterdb.NusterClient("http://localhost:7878", api_key="your-key")

# Use same interface as local database
client.add(1, [0.1, 0.2, 0.3])
results = client.search([0.1, 0.2, 0.3], k=5)
```

### Utility Functions

#### `create_random_vectors(count, dimension, distribution="normal", seed=None)`

Create random vectors for testing.

```python
def create_random_vectors(
    count: int, 
    dimension: int, 
    distribution: str = "normal",
    seed: Optional[int] = None
) -> np.ndarray:
```

**Parameters:**
- `count`: Number of vectors
- `dimension`: Vector dimension
- `distribution`: Distribution type ("normal", "uniform", "clustered")
- `seed`: Random seed for reproducibility

**Example:**
```python
# Create test vectors
vectors = nusterdb.create_random_vectors(1000, 128, distribution="normal", seed=42)

# Create clustered data
clustered = nusterdb.create_random_vectors(500, 64, distribution="clustered")
```

#### `benchmark_performance(db, num_vectors=1000, dimension=128, num_queries=100, k=10)`

Benchmark database performance.

```python
def benchmark_performance(
    db,
    num_vectors: int = 1000,
    dimension: int = 128,
    num_queries: int = 100,
    k: int = 10
) -> Dict[str, float]:
```

**Returns:** Performance metrics dictionary

**Example:**
```python
# Benchmark your database
metrics = nusterdb.benchmark_performance(db, num_vectors=5000, dimension=768)
print(f"Insert rate: {metrics['insert_rate_per_sec']:.0f} vectors/sec")
print(f"Search rate: {metrics['search_rate_qps']:.0f} QPS")
print(f"Average search time: {metrics['avg_search_time_ms']:.2f} ms")
```
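
The throughput and latency figures in the example above are two views of the same measurement. The sketch below derives both from a single timed loop using `time.perf_counter`. `time_queries` and `fake_search` are hypothetical stand-ins so the code runs without a database; how `benchmark_performance` measures internally is not documented here.

```python
import time
from typing import Callable, Dict, List, Sequence


def time_queries(search: Callable[[Sequence[float]], object],
                 queries: List[Sequence[float]]) -> Dict[str, float]:
    """Run each query once and derive throughput and mean latency."""
    if not queries:
        raise ValueError("need at least one query")
    start = time.perf_counter()
    for query in queries:
        search(query)
    # Guard against a zero reading on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-12)
    return {
        "search_rate_qps": len(queries) / elapsed,
        "avg_search_time_ms": 1000.0 * elapsed / len(queries),
    }


# Stand-in search function so the sketch runs without a database.
def fake_search(query):
    return sorted(query)[:3]


metrics = time_queries(fake_search, [[0.3, 0.1, 0.2]] * 1000)
print(f"Search rate: {metrics['search_rate_qps']:.0f} QPS")
print(f"Average search time: {metrics['avg_search_time_ms']:.4f} ms")
```

For sequential queries, QPS multiplied by the average latency in milliseconds is always 1000, so reporting both is a convenience rather than extra information.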

#### `validate_vectors(vectors, expected_dimension=None)`

Validate and normalize vector data.

```python
def validate_vectors(
    vectors: Union[List[List[float]], np.ndarray], 
    expected_dimension: Optional[int] = None
) -> np.ndarray:
```

**Example:**
```python
# Validate vectors before adding
vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
validated = nusterdb.validate_vectors(vectors, expected_dimension=2)
```
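
The checks such a validator performs can be sketched in pure Python. The version below is a hypothetical stand-in, not the library's implementation: it assumes "normalize" means converting the input to a uniform float representation (not scaling to unit length), and it returns nested lists instead of an `np.ndarray` to stay dependency-free.

```python
import math
from typing import List, Optional, Sequence


def validate_vectors_sketch(
    vectors: Sequence[Sequence[float]],
    expected_dimension: Optional[int] = None,
) -> List[List[float]]:
    """Check dimensions and values, then return the data as lists of floats."""
    if len(vectors) == 0:
        raise ValueError("no vectors given")
    if expected_dimension is not None:
        dimension = expected_dimension
    else:
        dimension = len(vectors[0])  # Infer from the first vector
    cleaned = []
    for index, vector in enumerate(vectors):
        if len(vector) != dimension:
            raise ValueError(
                f"vector {index} has dimension {len(vector)}, expected {dimension}"
            )
        row = [float(x) for x in vector]
        if not all(math.isfinite(x) for x in row):
            raise ValueError(f"vector {index} contains NaN or infinity")
        cleaned.append(row)
    return cleaned


print(validate_vectors_sketch([[1, 2], [3, 4]], expected_dimension=2))
```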

#### `load_vectors_from_file(file_path, format="auto")` / `save_vectors_to_file(...)`

Load vectors from, and save vectors to, several file formats.

```python
# Load from file
ids, vectors, metadata = nusterdb.load_vectors_from_file("data.json")
db.bulk_add(ids, vectors, metadata)

# Save to file
ids = list(range(100))
vectors = nusterdb.create_random_vectors(100, 128)
nusterdb.save_vectors_to_file("backup.json", ids, vectors)
```

**Supported formats:** JSON, NumPy (.npy), CSV, HDF5 (.h5)

### Exception Classes

```python
class NusterDBError(Exception):
    """Base exception for all NusterDB errors."""
    
class ConnectionError(NusterDBError):
    """Connection to server failed."""
    
class SecurityError(NusterDBError):
    """Security validation failed."""
    
class IndexError(NusterDBError):
    """Index operations failed."""
    
class ConfigurationError(NusterDBError):
    """Configuration is invalid."""
    
class ValidationError(NusterDBError):
    """Input validation failed."""
```
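
Two names in this hierarchy, `ConnectionError` and `IndexError`, are also names of Python built-ins. The sketch below uses local stand-in classes (not the real package) to show why a bare `except IndexError` would miss the package's exception, and why catching the base class or using a qualified name is safer.

```python
import builtins
import types

# Stand-in namespace mimicking the hierarchy above; not the real package.
sketch = types.SimpleNamespace()


class NusterDBError(Exception):
    """Base exception for all NusterDB errors."""


sketch.NusterDBError = NusterDBError
sketch.IndexError = type("IndexError", (NusterDBError,), {})
sketch.ConnectionError = type("ConnectionError", (NusterDBError,), {})


def failing_operation():
    raise sketch.IndexError("index operation failed")


caught_by = None
try:
    try:
        failing_operation()
    except builtins.IndexError:
        # The built-in IndexError is unrelated, so this branch is skipped.
        caught_by = "builtin"
except sketch.NusterDBError:
    caught_by = "package base class"

print(caught_by)
```

In application code this means avoiding `from nusterdb import IndexError`, which would shadow the built-in for the rest of the module.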

### Module-Level Convenience Functions

#### `quick_start(dimension, mode="memory")`

Quick start helper for common use cases.

```python
# Quick setup
db = nusterdb.quick_start(128, "memory")
db.add(1, [0.1, 0.2, ...])
```

#### `connect(url="http://localhost:7878", **kwargs)`

Connect to NusterDB server.

```python
# Connect to server
client = nusterdb.connect("http://localhost:7878", api_key="key")
```

#### `create_database(path, dimension, **kwargs)`

Create a new persistent database.

```python
# Create persistent database
db = nusterdb.create_database("./vectors", 128, algorithm="ivf")
```

#### Configuration Helpers

```python
# Get optimized configurations
speed_config = nusterdb.optimize_for_speed()
accuracy_config = nusterdb.optimize_for_accuracy()
memory_config = nusterdb.optimize_for_memory()

# Use with database
db = nusterdb.NusterDB(dimension=128, **speed_config)
```

#### `info()`

Get package information.

```python
package_info = nusterdb.info()
print(f"Version: {package_info['version']}")
print(f"Algorithms: {package_info['algorithms']}")
print(f"Features: {package_info['features']}")
```

## Storage Modes

### Memory Mode (Fastest)
```python
# For temporary, high-speed operations
db = nusterdb.NusterDB(mode="memory", dimension=128)
# Best for: Testing, temporary data, maximum speed
```

### Persistent Mode (Production Ready)
```python
# For production with data persistence
db = nusterdb.NusterDB(
    mode="persistent", 
    path="./my_vectors",
    dimension=768
)
# Best for: Production data, long-term storage, reliability
```

### Cache Mode (Balanced Performance)
```python
# For large datasets with intelligent caching
db = nusterdb.NusterDB(
    mode="cache",
    cache_size="2GB",
    dimension=512
)
# Best for: Large datasets, memory optimization, balanced performance
```

### API Mode (Distributed)
```python
# Connect to NusterDB server
db = nusterdb.NusterDB(mode="api", url="http://localhost:7878")
# Best for: Microservices, distributed systems, scalability
```

## Enterprise Security Features

### Standard Security
```python
# Basic encryption and security
db = nusterdb.NusterDB(
    mode="persistent",
    path="./secure_vectors",
    security_level="basic",           # Standard security
    encryption_at_rest=True          # Data encryption
)
```

### Advanced Security (Enterprise)
```python
# Maximum security for sensitive data
db = nusterdb.NusterDB(
    mode="persistent",
    path="./classified_vectors",
    security_level="enterprise",      # Enhanced security
    encryption_at_rest=True,         # AES-256 encryption
    audit_logging=True,              # Security event tracking
    access_control=True,             # Role-based permissions
    quantum_resistant=True           # Future-proof encryption
)
```

### Security Features Available
- **FIPS 140-2 Ready** - Federal cryptographic standards compliance
- **AES-256 Encryption** - Industry-standard data protection
- **Quantum-Resistant** - Post-quantum cryptography algorithms
- **Audit Logging** - Comprehensive security event tracking
- **Access Control** - Multi-level security permissions
- **Key Management** - Secure key derivation and rotation

## Vector Search Algorithms

### Algorithm Details

- **Flat**: Exact brute-force search for highest accuracy
- **IVF**: Inverted file structure for balanced performance
- **LSH**: Locality-sensitive hashing for speed
- **PQ**: Product quantization for memory efficiency
- **HNSW**: Hierarchical navigable small world graphs
- **SQ**: Scalar quantization for reduced memory usage

```python
# Algorithm-specific configuration
db = nusterdb.NusterDB(
    dimension=768,
    algorithm="ivf",
    # IVF-specific parameters
    ivf_clusters=256,
    ivf_probe_lists=32
)

# PQ configuration
db = nusterdb.NusterDB(
    dimension=768,
    algorithm="pq",
    pq_subvectors=8,
    pq_centroids=256
)
```
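
Of the algorithms listed above, Flat is the simplest to picture: score every stored vector against the query and keep the `k` closest. The sketch below is a pure-Python illustration of that idea with L2 distance. `flat_search` is a hypothetical function and says nothing about how nusterdb implements its Flat index.

```python
import heapq
import math
from typing import Dict, Hashable, List, Sequence, Tuple


def flat_search(index: Dict[Hashable, Sequence[float]],
                query: Sequence[float],
                k: int = 10) -> List[Tuple[Hashable, float]]:
    """Exact k-nearest-neighbour search: score every vector, keep the k closest."""
    scored = ((vec_id, math.dist(query, vector))
              for vec_id, vector in index.items())
    # Compare on distance only, so IDs of mixed types never get compared.
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


index = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [3.0, 4.0]}
for vec_id, distance in flat_search(index, [0.9, 0.0], k=2):
    print(f"ID: {vec_id}, Distance: {distance:.4f}")
```

The cost grows linearly with the number of stored vectors, which is the trade-off the approximate algorithms (IVF, LSH, PQ, HNSW, SQ) are designed to avoid.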

## Performance Benchmarks

Based on internal benchmarking on enterprise hardware:

### NusterDB Performance
- **Memory Mode**: 15K-30K queries per second
- **Persistent Mode**: 8K-15K QPS with full durability
- **Insertion Rate**: 10K+ vectors/sec with persistence
- **Memory Efficiency**: Zero-copy access, optimized storage
- **Latency**: Sub-millisecond response times for most queries

### Distance Metrics Supported
- **L2 (Euclidean)**: Standard Euclidean distance
- **Cosine**: Cosine similarity for normalized vectors
- **Inner Product**: Dot product similarity
- **L1 (Manhattan)**: Manhattan distance
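
For reference, the four metrics above can be written out in a few lines of plain Python. These are the textbook definitions, not nusterdb's implementation. Whether the library reports cosine as a similarity or as a distance (`1 - similarity`) is not stated in this section.

```python
import math
from typing import Sequence


def l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def l1(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan distance."""
    return sum(abs(x - y) for x, y in zip(a, b))


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of the two vectors after scaling each to unit length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return inner_product(a, b) / (norm_a * norm_b)


print(l2([0, 0], [3, 4]))
print(l1([0, 0], [3, 4]))
print(inner_product([1, 2], [3, 4]))
print(cosine_similarity([1, 0], [0, 1]))
```

For vectors that are already unit length, cosine similarity and inner product give the same value, so the cheaper inner product can be used.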

## Configuration Management

### Predefined Configurations

```python
import nusterdb
from nusterdb import create_config

# Predefined configurations for common use cases
config = create_config("production_speed")     # Speed-optimized
config = create_config("production_accuracy")  # Accuracy-optimized  
config = create_config("memory_constrained")   # Memory-optimized
config = create_config("secure")              # Security-focused

db = nusterdb.NusterDB(config=config, dimension=768)
```

### Available Presets
- `"development"` / `"dev"` - Development and testing
- `"production_speed"` / `"prod_speed"` - Speed-optimized production
- `"production_accuracy"` / `"prod_accuracy"` - Accuracy-optimized production
- `"government"` / `"secure"` - Government-grade security
- `"memory_constrained"` / `"low_memory"` - Memory-optimized
- `"high_throughput"` / `"throughput"` - High-throughput applications

### Custom Configurations

```python
# Create custom configuration
config = nusterdb.NusterConfig(
    algorithm=nusterdb.Algorithm.IVF,
    security_level=nusterdb.SecurityLevel.ENTERPRISE,
    distance_metric=nusterdb.DistanceMetric.COSINE,
    use_gpu=True,
    cache_size="4GB"
)

# Use configuration
db = nusterdb.NusterDB(config=config, dimension=768)

# Update configuration
updated_config = config.update(use_simd=False, parallel_processing=False)
```

## Advanced Examples

### Large-Scale Production Setup

```python
import nusterdb
import numpy as np

# Production configuration with security
db = nusterdb.NusterDB(
    mode="persistent",
    path="/secure/vectors",
    dimension=1536,                    # OpenAI embeddings
    algorithm="ivf",
    security_level="enterprise",
    distance_metric="cosine",
    use_gpu=True,
    parallel_processing=True,
    cache_size="8GB",
    
    # IVF-specific tuning
    ivf_clusters=1024,
    ivf_probe_lists=64,
    
    # Security settings
    encryption_at_rest=True,
    audit_logging=True,
    access_control=True
)

# Bulk data loading with progress tracking
def load_embeddings(file_path, batch_size=1000):
    ids, vectors, metadata = nusterdb.load_vectors_from_file(file_path)
    
    total_batches = (len(ids) + batch_size - 1) // batch_size  # Ceiling division
    for i in range(0, len(ids), batch_size):
        batch_ids = ids[i:i+batch_size]
        batch_vectors = vectors[i:i+batch_size]
        batch_metadata = metadata[i:i+batch_size] if metadata else None
        
        added = db.bulk_add(batch_ids, batch_vectors, batch_metadata)
        print(f"Batch {i//batch_size + 1}/{total_batches}: Added {added} vectors")
    
    # Optimize after bulk loading
    print("Optimizing index...")
    db.optimize()

# Advanced search with multiple filters
def semantic_search(query_text, filters=None, k=10):
    # Convert text to embedding. get_text_embedding is a placeholder for your
    # own embedding model; it is not part of nusterdb.
    query_embedding = get_text_embedding(query_text)
    
    results = db.search(
        query=query_embedding,
        k=k,
        filters=filters or {},
        include_metadata=True,
        include_distances=True
    )
    
    # Post-process results
    processed_results = []
    for result in results:
        processed_results.append({
            'id': result['id'],
            'similarity': 1 - result['distance'],  # Convert distance to similarity
            'metadata': result['metadata'],
            'confidence': result['distance'] < 0.5  # Confidence threshold
        })
    
    return processed_results

# Usage
results = semantic_search(
    "machine learning algorithms",
    filters={"category": "research", "year": 2023},
    k=20
)
```
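
The number of batches in a loader like `load_embeddings` is a ceiling division: 2,000 items in batches of 1,000 is two batches, while 2,001 items is three. The sketch below isolates that logic in a small generator. `iter_batches` is a hypothetical helper, not part of nusterdb.

```python
import math
from typing import Iterator, Sequence, Tuple, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T],
                 batch_size: int) -> Iterator[Tuple[int, int, Sequence[T]]]:
    """Yield (batch_number, total_batches, batch) with 1-based numbering."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    total = math.ceil(len(items) / batch_size)
    starts = range(0, len(items), batch_size)
    for number, start in enumerate(starts, start=1):
        yield number, total, items[start:start + batch_size]


for number, total, batch in iter_batches(list(range(2500)), 1000):
    print(f"Batch {number}/{total}: {len(batch)} items")
```

The same generator can be applied to IDs, vectors, and metadata in parallel, since all three are sliced with identical offsets.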

### Multi-Modal Search System

```python
import nusterdb

class MultiModalSearchSystem:
    def __init__(self, base_path: str):
        # Separate databases for different modalities
        self.text_db = nusterdb.NusterDB(
            mode="persistent",
            path=f"{base_path}/text",
            dimension=768,
            algorithm="hnsw",
            distance_metric="cosine"
        )
        
        self.image_db = nusterdb.NusterDB(
            mode="persistent", 
            path=f"{base_path}/images",
            dimension=2048,
            algorithm="ivf",
            distance_metric="l2"
        )
        
        self.audio_db = nusterdb.NusterDB(
            mode="persistent",
            path=f"{base_path}/audio", 
            dimension=512,
            algorithm="lsh",
            distance_metric="cosine"
        )
    
    def add_content(self, content_id: str, embeddings: dict, metadata: dict):
        """Add multi-modal content."""
        if 'text' in embeddings:
            self.text_db.add(content_id, embeddings['text'], metadata)
        
        if 'image' in embeddings:
            self.image_db.add(content_id, embeddings['image'], metadata)
            
        if 'audio' in embeddings:
            self.audio_db.add(content_id, embeddings['audio'], metadata)
    
    def search_all_modalities(self, query_embeddings: dict, k: int = 10):
        """Search across all modalities and combine results."""
        all_results = {}
        
        if 'text' in query_embeddings:
            text_results = self.text_db.search(query_embeddings['text'], k=k)
            all_results['text'] = text_results
            
        if 'image' in query_embeddings:
            image_results = self.image_db.search(query_embeddings['image'], k=k)
            all_results['image'] = image_results
            
        if 'audio' in query_embeddings:
            audio_results = self.audio_db.search(query_embeddings['audio'], k=k)
            all_results['audio'] = audio_results
        
        return self._combine_results(all_results)
    
    def _combine_results(self, results_by_modality):
        """Combine and rank results from multiple modalities."""
        # Implementation depends on your fusion strategy
        combined = {}
        for modality, results in results_by_modality.items():
            for result in results:
                content_id = result['id']
                if content_id not in combined:
                    combined[content_id] = {
                        'id': content_id,
                        'metadata': result['metadata'],
                        'scores': {}
                    }
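                # Note: 1 - distance is a similarity only for cosine distance;
                # L2 distances (image_db) need a different mapping.
                combined[content_id]['scores'][modality] = 1 - result['distance']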
        
        # Sort by combined score
        for item in combined.values():
            item['combined_score'] = sum(item['scores'].values()) / len(item['scores'])
        
        return sorted(combined.values(), key=lambda x: x['combined_score'], reverse=True)

# Usage
search_system = MultiModalSearchSystem("/data/multimodal")

# Add content
search_system.add_content(
    "doc_001",
    embeddings={
        'text': text_embedding,
        'image': image_embedding
    },
    metadata={'title': 'Research Paper', 'type': 'academic'}
)

# Search
results = search_system.search_all_modalities({
    'text': query_text_embedding,
    'image': query_image_embedding
})
```
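
Averaging `1 - distance` across modalities mixes scores on different scales when the databases use different metrics (cosine for text and audio, L2 for images). One alternative, swapped in here for illustration and not taken from the original example, is reciprocal rank fusion, which uses only each result's rank within its modality. The function name, the constant `c = 60` (a commonly used default), and the sample data are assumptions.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(results_by_modality: Dict[str, List[dict]],
                           c: int = 60) -> List[dict]:
    """Fuse ranked lists: each hit contributes 1 / (c + rank) to its ID."""
    scores = defaultdict(float)
    for results in results_by_modality.values():
        # Smaller distance means better rank, whatever the metric's scale.
        ranked = sorted(results, key=lambda r: r["distance"])
        for rank, result in enumerate(ranked, start=1):
            scores[result["id"]] += 1.0 / (c + rank)
    fused = [{"id": content_id, "rrf_score": score}
             for content_id, score in scores.items()]
    return sorted(fused, key=lambda item: item["rrf_score"], reverse=True)


# Made-up results: cosine distances for text, L2 distances for images.
sample = {
    "text": [{"id": "doc_001", "distance": 0.1},
             {"id": "doc_002", "distance": 0.3}],
    "image": [{"id": "doc_002", "distance": 12.5},
              {"id": "doc_003", "distance": 20.0}],
}
for item in reciprocal_rank_fusion(sample):
    print(item["id"], round(item["rrf_score"], 4))
```

Items that appear in several modalities accumulate score from each list, so `doc_002` ranks first here even though it is not the top hit for text.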

### Real-Time Recommendation System

```python
import nusterdb
import numpy as np
from collections import defaultdict
import time

class RecommendationSystem:
    def __init__(self):
        self.user_db = nusterdb.NusterDB(
            mode="cache",
            dimension=256,
            algorithm="lsh",
            cache_size="1GB"
        )
        
        self.item_db = nusterdb.NusterDB(
            mode="persistent",
            path="./items",
            dimension=256,
            algorithm="ivf"
        )
        
        # Track user interactions
        self.user_interactions = defaultdict(list)
    
    def add_user_profile(self, user_id: str, profile_vector: list, metadata: dict):
        """Add or update user profile."""
        self.user_db.add(user_id, profile_vector, metadata)
    
    def add_item(self, item_id: str, feature_vector: list, metadata: dict):
        """Add item to catalog."""
        self.item_db.add(item_id, feature_vector, metadata)
    
    def record_interaction(self, user_id: str, item_id: str, interaction_type: str, rating: float = None):
        """Record user-item interaction."""
        interaction = {
            'item_id': item_id,
            'type': interaction_type,
            'rating': rating,
            'timestamp': time.time()
        }
        self.user_interactions[user_id].append(interaction)
        
        # Update user profile based on interaction
        self._update_user_profile(user_id, item_id, interaction_type, rating)
    
    def get_recommendations(self, user_id: str, k: int = 10, exclude_seen: bool = True):
        """Get personalized recommendations."""
        # Get user profile
        user_profile = self.user_db.get(user_id)
        if not user_profile:
            return self._get_popular_items(k)
        
        # Find similar items
        recommendations = self.item_db.search(
            user_profile['vector'],
            k=k * 2,  # Get more to account for filtering
            include_metadata=True
        )
        
        # Filter out already seen items
        if exclude_seen:
            seen_items = {interaction['item_id'] for interaction in self.user_interactions[user_id]}
            recommendations = [r for r in recommendations if r['id'] not in seen_items]
        
        return recommendations[:k]
    
    def get_similar_users(self, user_id: str, k: int = 5):
        """Find similar users for collaborative filtering."""
        user_profile = self.user_db.get(user_id)
        if not user_profile:
            return []
        
        similar_users = self.user_db.search(
            user_profile['vector'],
            k=k + 1,  # +1 to exclude self
            include_metadata=True
        )
        
        # Remove self from results and cap at k (self may be absent from approximate results)
        return [u for u in similar_users if u['id'] != user_id][:k]
    
    def _update_user_profile(self, user_id: str, item_id: str, interaction_type: str, rating: float):
        """Update user profile based on interaction."""
        # Get current profile and item features
        user_profile = self.user_db.get(user_id)
        item_data = self.item_db.get(item_id)
        
        if not user_profile or not item_data:
            return
        
        # Simple profile update (weighted average)
        weight = self._get_interaction_weight(interaction_type, rating)
        current_vector = np.array(user_profile['vector'])
        item_vector = np.array(item_data['vector'])
        
        # Move the profile toward the item vector (EMA-style step, scaled by the interaction weight)
        alpha = 0.1  # Learning rate
        updated_vector = (1 - alpha) * current_vector + alpha * weight * item_vector
        
        # Update user profile
        self.user_db.update(user_id, updated_vector.tolist())
    
    def _get_interaction_weight(self, interaction_type: str, rating: float = None) -> float:
        """Convert interaction type to weight."""
        weights = {
            'view': 0.1,
            'click': 0.3,
            'like': 0.7,
            'purchase': 1.0,
            'rating': rating if rating is not None else 0.5  # Keep an explicit 0.0 rating
        }
        return weights.get(interaction_type, 0.1)
    
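    def _get_popular_items(self, k: int):
        """Fallback for new users: return the first k items (placeholder, not ranked by popularity)."""
        # Simple implementation - could be enhanced with actual popularity metrics
        items = []
        for item in self.item_db:  # Stop early instead of materializing the whole database
            if len(items) >= k:
                break
            items.append(item)
        return items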

# Usage
rec_system = RecommendationSystem()

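# Add items (random placeholder vector; use real item embeddings in practice)
feature_vector = np.random.random(256).tolist()
rec_system.add_item("item_1", feature_vector, {"category": "electronics", "price": 299.99})

# Add users (profile vectors must share the item dimension, since they are used to query item_db)
profile_vector = np.random.random(256).tolist()
rec_system.add_user_profile("user_1", profile_vector, {"age": 25, "location": "NY"})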

# Record interactions
rec_system.record_interaction("user_1", "item_1", "purchase", rating=4.5)

# Get recommendations
recommendations = rec_system.get_recommendations("user_1", k=10)
for rec in recommendations:
    # item_db uses the default L2 metric, so report the raw distance (lower is closer)
    print(f"Recommended: {rec['id']} (distance: {rec['distance']:.3f})")
```
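
The `_get_popular_items` fallback above is marked as a placeholder. One way to back it with an actual popularity signal is to aggregate the interaction log, sketched below in plain Python. The `popular_item_ids` helper and the `INTERACTION_WEIGHTS` table are illustrative names, not part of NusterDB; the weights simply mirror those in `_get_interaction_weight`.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical weights, mirroring the interaction types used in the example above.
INTERACTION_WEIGHTS = {"view": 0.1, "click": 0.3, "like": 0.7, "purchase": 1.0}


def popular_item_ids(interactions: Dict[str, List[dict]], k: int) -> List[str]:
    """Rank item IDs by their total weighted interactions across all users."""
    scores: Counter = Counter()
    for user_events in interactions.values():
        for event in user_events:
            # Unknown interaction types fall back to the weakest weight.
            scores[event["item_id"]] += INTERACTION_WEIGHTS.get(event["type"], 0.1)
    return [item_id for item_id, _ in scores.most_common(k)]


# Same shape as RecommendationSystem.user_interactions: user_id -> list of events.
log = {
    "user_1": [{"item_id": "item_1", "type": "purchase"}],
    "user_2": [
        {"item_id": "item_2", "type": "view"},
        {"item_id": "item_1", "type": "click"},
    ],
}
top_items = popular_item_ids(log, k=2)
print(top_items)
```

The returned IDs could then be passed to `item_db.get(...)` to build the fallback list for users without a profile.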

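Search results report a `distance`, and how that maps to a similarity score depends on the metric the database was created with. The helper below is a hypothetical sketch, not a NusterDB function: it assumes cosine distance is reported as `1 - cosine_similarity`, and it uses `1 / (1 + d)` as one common monotone mapping for the unbounded L2 and L1 distances.

```python
def distance_to_similarity(distance: float, metric: str = "l2") -> float:
    """Map a distance to a similarity score where higher means closer.

    Assumption: "cosine" distances are 1 - cosine similarity. For "l2" and
    "l1" the mapping 1 / (1 + d) is only one possible choice; it is monotone
    and bounded in (0, 1], but it is not a calibrated probability.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if metric == "cosine":
        return 1.0 - distance
    if metric in ("l2", "l1"):
        return 1.0 / (1.0 + distance)
    raise ValueError(f"no similarity mapping defined for metric {metric!r}")


print(distance_to_similarity(0.25, "cosine"))
print(distance_to_similarity(1.0, "l2"))
```

Ranking by raw distance is always safe; a conversion like this is only needed when a score has to be displayed or thresholded.
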
## Why Choose NusterDB?

### Complete Database Solution
- Full CRUD operations with transaction support
- Built-in persistence and data durability
- Comprehensive APIs for production use
- Multiple storage modes for different use cases

### Enterprise Security
- Four selectable security levels: `none`, `basic`, `enterprise`, and `government`
- FIPS 140-2 ready
- Quantum-resistant cryptography
- Comprehensive audit logging and access control

### High Performance
- Advanced algorithms optimized for different workloads
- Hardware acceleration (SIMD, multi-threading)
- Memory-efficient with zero-copy access
- Intelligent caching for large datasets

### Developer Friendly
- Single unified API for all storage modes
- Simple installation with pip
- Extensive documentation and examples
- Type hints and comprehensive error handling

## Use Cases

### Recommended For:
- **AI/ML Applications** requiring fast similarity search
- **Production Systems** needing reliability and persistence  
- **Enterprise Environments** with security requirements
- **Large-Scale Deployments** requiring monitoring and ops tools
- **Sensitive Data** needing encryption and compliance
- **Microservices** architecture with API-first design

### Common Applications:
- Semantic search and document retrieval
- Image and video similarity search
- Recommendation systems
- Anomaly detection
- Content-based filtering
- Knowledge base search

## Links & Resources

- [Documentation](https://docs.nusterai.com/nusterdb)
- [Issues](https://github.com/NusterAI/nusterdb/issues)
- [Discussions](https://github.com/NusterAI/nusterdb/discussions)
- [Performance Benchmarks](https://github.com/NusterAI/nusterdb/blob/main/docs/PERFORMANCE_ANALYSIS.md)
- [Security Guide](https://github.com/NusterAI/nusterdb/blob/main/docs/SECURITY.md)

## License

MIT License. See the [LICENSE](LICENSE) file for details.

---

**Ready to build with high-performance vector search?**

```bash
pip install nusterdb
```

Get an enterprise-grade vector database with security, persistence, and production features.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/NusterAI/nusterdb",
    "name": "nusterdb",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "NusterAI Team <info@nusterai.com>",
    "keywords": "vector database, similarity search, machine learning, AI, government-grade, security, FIPS, quantum-resistant",
    "author": "NusterAI Team",
    "author_email": "NusterAI Team <info@nusterai.com>",
    "download_url": null,
    "platform": null,
    "description": "# NusterDB - High-Performance Vector Database with Enterprise Security\n\n[![PyPI version](https://badge.fury.io/py/nusterdb.svg)](https://badge.fury.io/py/nusterdb)\n[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n## The Complete Vector Database Solution\n\nNusterDB is a high-performance vector database designed for production workloads with enterprise-grade security, persistence, and comprehensive features. Built for AI/ML applications requiring fast similarity search with reliability and security.\n\n### Core Features\n\n| Feature | Description |\n|---------|-------------|\n| **Advanced Algorithms** | Multiple search algorithms: IVF, PQ, LSH, SQ, Flat, HNSW |\n| **Enterprise Security** | FIPS 140-2 compliance, quantum-resistant encryption |\n| **Production APIs** | Complete REST APIs and Python SDK |\n| **Data Persistence** | Built-in durable storage with transaction support |\n| **Full CRUD Operations** | Complete database operations: Create, Read, Update, Delete |\n| **Multiple Storage Modes** | Memory, Persistent, Cache, and API modes |\n\n## Quick Start\n\n### Installation\n\n```bash\npip install nusterdb\n```\n\n### Simple Usage\n\n```python\nimport nusterdb\n\n# Create database - choose your mode\ndb = nusterdb.NusterDB(mode=\"memory\", dimension=128)\n\n# Add vectors\ndb.add(1, [0.1, 0.2, 0.3, ...])  # Single vector\ndb.bulk_add([1,2,3], vectors, metadata)  # Multiple vectors\n\n# Search\nresults = db.search([0.1, 0.2, 0.3, ...], k=5)\nfor result in results:\n    print(f\"ID: {result['id']}, Distance: {result['distance']}\")\n```\n\n## Complete API Reference\n\n### Core Classes\n\n#### `NusterDB`\n\nThe main database class supporting all storage modes and algorithms.\n\n```python\nclass NusterDB:\n    \"\"\"\n    Unified NusterDB interface supporting all storage modes and 
algorithms.\n    \"\"\"\n    \n    def __init__(\n        self,\n        mode: Union[str, StorageMode] = \"memory\",\n        dimension: Optional[int] = None,\n        path: Optional[str] = None,\n        url: Optional[str] = None,\n        algorithm: Union[str, Algorithm] = \"flat\",\n        security_level: Union[str, SecurityLevel] = \"none\",\n        distance_metric: Union[str, DistanceMetric] = \"l2\",\n        use_simd: bool = True,\n        use_gpu: bool = True,\n        parallel_processing: bool = True,\n        cache_size: str = \"512MB\",\n        compression: bool = False,\n        **kwargs\n    ):\n```\n\n**Parameters:**\n- `mode`: Storage mode (\"memory\", \"persistent\", \"cache\", \"api\")\n- `dimension`: Vector dimension (required for new databases)\n- `path`: Path for persistent storage\n- `url`: Server URL for API mode\n- `algorithm`: Indexing algorithm (\"flat\", \"ivf\", \"pq\", \"lsh\", \"sq\", \"hnsw\")\n- `security_level`: Security level (\"none\", \"basic\", \"enterprise\", \"government\")\n- `distance_metric`: Distance metric (\"l2\", \"cosine\", \"inner_product\", \"l1\")\n- `use_simd`: Enable SIMD optimizations\n- `use_gpu`: Enable GPU acceleration\n- `parallel_processing`: Enable parallel processing\n- `cache_size`: Cache size (e.g., \"1GB\", \"512MB\")\n- `compression`: Enable compression\n\n#### Core Database Operations\n\n##### `add(id, vector, metadata=None)`\n\nAdd a single vector to the database.\n\n```python\ndef add(\n    self, \n    id: Union[int, str], \n    vector: Union[List[float], np.ndarray],\n    metadata: Optional[Dict[str, Any]] = None\n) -> bool:\n```\n\n**Parameters:**\n- `id`: Unique identifier\n- `vector`: Vector data (list or numpy array)\n- `metadata`: Optional metadata dictionary\n\n**Returns:** `bool` - Success status\n\n**Example:**\n```python\n# Add vector with metadata\nsuccess = db.add(1, [0.1, 0.2, 0.3], {\"category\": \"document\", \"type\": \"text\"})\n\n# Add numpy vector\nimport numpy as np\nvector = 
np.random.random(128)\ndb.add(\"doc_001\", vector, {\"source\": \"research_paper\"})\n```\n\n##### `bulk_add(ids, vectors, metadata=None)`\n\nAdd multiple vectors efficiently.\n\n```python\ndef bulk_add(\n    self,\n    ids: List[Union[int, str]],\n    vectors: Union[List[List[float]], np.ndarray],\n    metadata: Optional[List[Dict[str, Any]]] = None\n) -> int:\n```\n\n**Parameters:**\n- `ids`: List of unique identifiers\n- `vectors`: List of vectors or 2D numpy array\n- `metadata`: Optional list of metadata dictionaries\n\n**Returns:** `int` - Number of vectors successfully added\n\n**Example:**\n```python\n# Bulk add with metadata\nids = [1, 2, 3, 4, 5]\nvectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]\nmetdata = [\n    {\"category\": \"A\", \"score\": 0.95},\n    {\"category\": \"B\", \"score\": 0.87},\n    {\"category\": \"A\", \"score\": 0.92},\n    {\"category\": \"C\", \"score\": 0.78},\n    {\"category\": \"B\", \"score\": 0.89}\n]\nadded_count = db.bulk_add(ids, vectors, metadata)\nprint(f\"Added {added_count} vectors\")\n\n# Bulk add numpy arrays\nimport numpy as np\nvectors = np.random.random((1000, 128))\nids = [f\"vec_{i}\" for i in range(1000)]\ndb.bulk_add(ids, vectors)\n```\n\n##### `search(query, k=10, filters=None, include_metadata=True, include_distances=True)`\n\nSearch for similar vectors.\n\n```python\ndef search(\n    self,\n    query: Union[List[float], np.ndarray],\n    k: int = 10,\n    filters: Optional[Dict[str, Any]] = None,\n    include_metadata: bool = True,\n    include_distances: bool = True\n) -> List[Dict[str, Any]]:\n```\n\n**Parameters:**\n- `query`: Query vector\n- `k`: Number of results to return\n- `filters`: Optional metadata filters\n- `include_metadata`: Include metadata in results\n- `include_distances`: Include distances in results\n\n**Returns:** `List[Dict[str, Any]]` - List of search results\n\n**Example:**\n```python\n# Basic search\nresults = db.search([0.1, 0.2, 0.3], k=5)\nfor result in 
results:\n    print(f\"ID: {result['id']}, Distance: {result['distance']}\")\n\n# Search with filters\nresults = db.search(\n    query=[0.1, 0.2, 0.3],\n    k=10,\n    filters={\"category\": \"document\"},\n    include_metadata=True,\n    include_distances=True\n)\n\n# Process results\nfor result in results:\n    print(f\"ID: {result['id']}\")\n    print(f\"Distance: {result['distance']:.4f}\")\n    print(f\"Metadata: {result['metadata']}\")\n```\n\n##### `update(id, vector=None, metadata=None)`\n\nUpdate an existing vector.\n\n```python\ndef update(\n    self,\n    id: Union[int, str],\n    vector: Optional[Union[List[float], np.ndarray]] = None,\n    metadata: Optional[Dict[str, Any]] = None\n) -> bool:\n```\n\n**Parameters:**\n- `id`: Vector ID to update\n- `vector`: New vector data (optional)\n- `metadata`: New metadata (optional)\n\n**Returns:** `bool` - Success status\n\n**Example:**\n```python\n# Update vector only\ndb.update(1, [0.2, 0.3, 0.4])\n\n# Update metadata only\ndb.update(1, metadata={\"category\": \"updated\", \"version\": 2})\n\n# Update both\ndb.update(1, [0.2, 0.3, 0.4], {\"category\": \"updated\", \"version\": 2})\n```\n\n##### `delete(id)`\n\nDelete a vector by ID.\n\n```python\ndef delete(self, id: Union[int, str]) -> bool:\n```\n\n**Parameters:**\n- `id`: Vector ID to delete\n\n**Returns:** `bool` - Success status\n\n**Example:**\n```python\n# Delete by ID\nsuccess = db.delete(1)\nif success:\n    print(\"Vector deleted successfully\")\n\n# Delete multiple vectors\nids_to_delete = [1, 2, 3, 4, 5]\nfor vec_id in ids_to_delete:\n    db.delete(vec_id)\n```\n\n##### `get(id, include_metadata=True)`\n\nGet a vector by ID.\n\n```python\ndef get(\n    self, \n    id: Union[int, str],\n    include_metadata: bool = True\n) -> Optional[Dict[str, Any]]:\n```\n\n**Parameters:**\n- `id`: Vector ID\n- `include_metadata`: Include metadata in result\n\n**Returns:** `Optional[Dict[str, Any]]` - Vector data or None if not found\n\n**Example:**\n```python\n# 
Get vector with metadata\nvector_data = db.get(1)\nif vector_data:\n    print(f\"Vector: {vector_data['vector']}\")\n    print(f\"Metadata: {vector_data['metadata']}\")\n\n# Get vector without metadata\nvector_data = db.get(1, include_metadata=False)\n```\n\n##### `batch_search(queries, k=10, filters=None)`\n\nSearch with multiple queries efficiently.\n\n```python\ndef batch_search(\n    self,\n    queries: Union[List[List[float]], np.ndarray],\n    k: int = 10,\n    filters: Optional[List[Dict[str, Any]]] = None\n) -> List[List[Dict[str, Any]]]:\n```\n\n**Parameters:**\n- `queries`: List of query vectors\n- `k`: Number of results per query\n- `filters`: Optional filters per query\n\n**Returns:** `List[List[Dict[str, Any]]]` - List of result lists\n\n**Example:**\n```python\n# Batch search multiple queries\nqueries = [\n    [0.1, 0.2, 0.3],\n    [0.4, 0.5, 0.6],\n    [0.7, 0.8, 0.9]\n]\nall_results = db.batch_search(queries, k=5)\n\nfor i, results in enumerate(all_results):\n    print(f\"Query {i} results:\")\n    for result in results:\n        print(f\"  ID: {result['id']}, Distance: {result['distance']}\")\n```\n\n#### Management Operations\n\n##### `train(training_vectors=None)`\n\nTrain the index (required for some algorithms).\n\n```python\ndef train(self, training_vectors: Optional[Union[List[List[float]], np.ndarray]] = None) -> bool:\n```\n\n**Parameters:**\n- `training_vectors`: Training data (optional, uses existing data if None)\n\n**Returns:** `bool` - Success status\n\n**Example:**\n```python\n# Train with existing data\ndb.train()\n\n# Train with specific training data\ntraining_data = np.random.random((10000, 128))\ndb.train(training_data)\n```\n\n##### `optimize()`\n\nOptimize the index for better performance.\n\n```python\ndef optimize(self) -> bool:\n```\n\n**Returns:** `bool` - Success status\n\n**Example:**\n```python\n# Optimize after bulk inserts\ndb.bulk_add(ids, vectors)\ndb.optimize()  # Rebuild index for better performance\n```\n\n##### 
`save(path=None)`\n\nSave the database to disk.\n\n```python\ndef save(self, path: Optional[str] = None) -> bool:\n```\n\n**Parameters:**\n- `path`: Save path (uses default if None)\n\n**Returns:** `bool` - Success status\n\n##### `load(path)`\n\nLoad database from disk.\n\n```python\ndef load(self, path: str) -> bool:\n```\n\n**Parameters:**\n- `path`: Load path\n\n**Returns:** `bool` - Success status\n\n##### `clear()`\n\nClear all vectors from the database.\n\n```python\ndef clear(self) -> bool:\n```\n\n**Returns:** `bool` - Success status\n\n#### Information and Statistics\n\n##### `count()` / `size()`\n\nGet total number of vectors.\n\n```python\ndef count(self) -> int:\ndef size(self) -> int:  # Alias for count()\n```\n\n**Returns:** `int` - Number of vectors\n\n**Example:**\n```python\ntotal_vectors = db.count()\nprint(f\"Database contains {total_vectors} vectors\")\n```\n\n##### `stats()`\n\nGet database statistics.\n\n```python\ndef stats(self) -> Dict[str, Any]:\n```\n\n**Returns:** `Dict[str, Any]` - Statistics dictionary\n\n**Example:**\n```python\nstats = db.stats()\nprint(f\"Vector count: {stats['vector_count']}\")\nprint(f\"Average query time: {stats['avg_query_time']:.3f}ms\")\nprint(f\"Algorithm: {stats['algorithm']}\")\nprint(f\"Security level: {stats['security_level']}\")\n```\n\n##### `info()`\n\nGet database information.\n\n```python\ndef info(self) -> Dict[str, Any]:\n```\n\n**Returns:** `Dict[str, Any]` - Information dictionary\n\n##### `health_check()`\n\nPerform health check.\n\n```python\ndef health_check(self) -> Dict[str, Any]:\n```\n\n**Returns:** `Dict[str, Any]` - Health status\n\n**Example:**\n```python\nhealth = db.health_check()\nif health['healthy']:\n    print(\"Database is healthy\")\n    print(f\"Vector count: {health['vector_count']}\")\nelse:\n    print(f\"Database issue: {health.get('error')}\")\n```\n\n#### Context Manager Support\n\n```python\n# Automatic resource management\nwith nusterdb.NusterDB(mode=\"persistent\", 
path=\"./vectors\") as db:\n    db.add(1, [0.1, 0.2, 0.3])\n    results = db.search([0.1, 0.2, 0.3])\n    # Database automatically saved and closed\n```\n\n#### Iterator Support\n\n```python\n# Iterate over all vectors (if supported by backend)\nfor vector_data in db:\n    print(f\"ID: {vector_data['id']}\")\n    print(f\"Vector: {vector_data['vector']}\")\n\n# Check if ID exists\nif 1 in db:\n    print(\"Vector with ID 1 exists\")\n\n# Get length\nprint(f\"Database has {len(db)} vectors\")\n```\n\n### Configuration Classes\n\n#### `NusterConfig`\n\nComplete configuration for all NusterDB aspects.\n\n```python\n@dataclass\nclass NusterConfig:\n    # Core settings\n    algorithm: Algorithm = Algorithm.FLAT\n    security_level: SecurityLevel = SecurityLevel.NONE\n    distance_metric: DistanceMetric = DistanceMetric.L2\n    \n    # Performance settings\n    use_simd: bool = True\n    use_gpu: bool = True\n    parallel_processing: bool = True\n    cache_size: str = \"512MB\"\n    compression: bool = False\n```\n\n**Methods:**\n- `to_dict()` - Convert to dictionary\n- `to_json()` - Convert to JSON string\n- `from_dict(config_dict)` - Create from dictionary\n- `from_json(json_str)` - Create from JSON\n- `update(**kwargs)` - Create updated configuration\n- `optimize_for_speed()` - Speed-optimized configuration\n- `optimize_for_accuracy()` - Accuracy-optimized configuration\n- `optimize_for_memory()` - Memory-optimized configuration\n\n#### Configuration Enums\n\n```python\nclass Algorithm(Enum):\n    FLAT = \"flat\"              # Exact search\n    IVF = \"ivf\"               # Inverted File Index\n    PQ = \"pq\"                 # Product Quantization\n    LSH = \"lsh\"               # Locality Sensitive Hashing\n    SQ = \"sq\"                 # Scalar Quantization\n    HNSW = \"hnsw\"             # Hierarchical NSW\n    HYBRID = \"hybrid\"         # Multi-algorithm approach\n\nclass SecurityLevel(Enum):\n    NONE = \"none\"                   # No special security\n    
BASIC = \"basic\"                 # Basic encryption\n    ENTERPRISE = \"enterprise\"       # Enterprise-grade security\n    GOVERNMENT = \"government\"       # Government-grade (FIPS 140-2)\n\nclass StorageMode(Enum):\n    MEMORY = \"memory\"               # In-memory only (fastest)\n    PERSISTENT = \"persistent\"       # Disk-based storage (durable)\n    CACHE = \"cache\"                # Memory + disk caching\n    API = \"api\"                    # Remote API connection\n\nclass DistanceMetric(Enum):\n    L2 = \"l2\"                      # Euclidean distance\n    COSINE = \"cosine\"              # Cosine similarity\n    INNER_PRODUCT = \"inner_product\" # Inner product\n    L1 = \"l1\"                      # Manhattan distance\n    HAMMING = \"hamming\"            # Hamming distance\n```\n\n### Client Class\n\n#### `NusterClient`\n\nClient for connecting to NusterDB server instances.\n\n```python\nclass NusterClient:\n    def __init__(\n        self,\n        url: str = \"http://localhost:7878\",\n        timeout: int = 30,\n        retry_attempts: int = 3,\n        api_key: Optional[str] = None,\n        verify_ssl: bool = True\n    ):\n```\n\n**Parameters:**\n- `url`: Server URL\n- `timeout`: Request timeout in seconds\n- `retry_attempts`: Number of retry attempts\n- `api_key`: Optional API key for authentication\n- `verify_ssl`: Verify SSL certificates\n\n**Methods:** Same interface as `NusterDB` but operates over HTTP/REST API.\n\n**Example:**\n```python\n# Connect to server\nclient = nusterdb.NusterClient(\"http://localhost:7878\", api_key=\"your-key\")\n\n# Use same interface as local database\nclient.add(1, [0.1, 0.2, 0.3])\nresults = client.search([0.1, 0.2, 0.3], k=5)\n```\n\n### Utility Functions\n\n#### `create_random_vectors(count, dimension, distribution=\"normal\", seed=None)`\n\nCreate random vectors for testing.\n\n```python\ndef create_random_vectors(\n    count: int, \n    dimension: int, \n    distribution: str = \"normal\",\n    seed: 
Optional[int] = None\n) -> np.ndarray:\n```\n\n**Parameters:**\n- `count`: Number of vectors\n- `dimension`: Vector dimension\n- `distribution`: Distribution type (\"normal\", \"uniform\", \"clustered\")\n- `seed`: Random seed for reproducibility\n\n**Example:**\n```python\n# Create test vectors\nvectors = nusterdb.create_random_vectors(1000, 128, distribution=\"normal\", seed=42)\n\n# Create clustered data\nclustered = nusterdb.create_random_vectors(500, 64, distribution=\"clustered\")\n```\n\n#### `benchmark_performance(db, num_vectors=1000, dimension=128, num_queries=100, k=10)`\n\nBenchmark database performance.\n\n```python\ndef benchmark_performance(\n    db,\n    num_vectors: int = 1000,\n    dimension: int = 128,\n    num_queries: int = 100,\n    k: int = 10\n) -> Dict[str, float]:\n```\n\n**Returns:** Performance metrics dictionary\n\n**Example:**\n```python\n# Benchmark your database\nmetrics = nusterdb.benchmark_performance(db, num_vectors=5000, dimension=768)\nprint(f\"Insert rate: {metrics['insert_rate_per_sec']:.0f} vectors/sec\")\nprint(f\"Search rate: {metrics['search_rate_qps']:.0f} QPS\")\nprint(f\"Average search time: {metrics['avg_search_time_ms']:.2f} ms\")\n```\n\n#### `validate_vectors(vectors, expected_dimension=None)`\n\nValidate and normalize vector data.\n\n```python\ndef validate_vectors(\n    vectors: Union[List[List[float]], np.ndarray], \n    expected_dimension: Optional[int] = None\n) -> np.ndarray:\n```\n\n**Example:**\n```python\n# Validate vectors before adding\nvectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]\nvalidated = nusterdb.validate_vectors(vectors, expected_dimension=2)\n```\n\n#### `load_vectors_from_file(file_path, format=\"auto\")` / `save_vectors_to_file(...)`\n\nLoad and save vectors from various file formats.\n\n```python\n# Load from file\nids, vectors, metadata = nusterdb.load_vectors_from_file(\"data.json\")\ndb.bulk_add(ids, vectors, metadata)\n\n# Save to file\nids = list(range(100))\nvectors = 
nusterdb.create_random_vectors(100, 128)\nnusterdb.save_vectors_to_file(\"backup.json\", ids, vectors)\n```\n\n**Supported formats:** JSON, NumPy (.npy), CSV, HDF5 (.h5)\n\n### Exception Classes\n\n```python\nclass NusterDBError(Exception):\n    \"\"\"Base exception for all NusterDB errors.\"\"\"\n    \nclass ConnectionError(NusterDBError):\n    \"\"\"Connection to server failed.\"\"\"\n    \nclass SecurityError(NusterDBError):\n    \"\"\"Security validation failed.\"\"\"\n    \nclass IndexError(NusterDBError):\n    \"\"\"Index operations failed.\"\"\"\n    \nclass ConfigurationError(NusterDBError):\n    \"\"\"Configuration is invalid.\"\"\"\n    \nclass ValidationError(NusterDBError):\n    \"\"\"Input validation failed.\"\"\"\n```\n\n### Module-Level Convenience Functions\n\n#### `quick_start(dimension, mode=\"memory\")`\n\nQuick start helper for common use cases.\n\n```python\n# Quick setup\ndb = nusterdb.quick_start(128, \"memory\")\ndb.add(1, [0.1, 0.2, ...])\n```\n\n#### `connect(url=\"http://localhost:7878\", **kwargs)`\n\nConnect to NusterDB server.\n\n```python\n# Connect to server\nclient = nusterdb.connect(\"http://localhost:7878\", api_key=\"key\")\n```\n\n#### `create_database(path, dimension, **kwargs)`\n\nCreate a new persistent database.\n\n```python\n# Create persistent database\ndb = nusterdb.create_database(\"./vectors\", 128, algorithm=\"ivf\")\n```\n\n#### Configuration Helpers\n\n```python\n# Get optimized configurations\nspeed_config = nusterdb.optimize_for_speed()\naccuracy_config = nusterdb.optimize_for_accuracy()\nmemory_config = nusterdb.optimize_for_memory()\n\n# Use with database\ndb = nusterdb.NusterDB(dimension=128, **speed_config)\n```\n\n#### `info()`\n\nGet package information.\n\n```python\npackage_info = nusterdb.info()\nprint(f\"Version: {package_info['version']}\")\nprint(f\"Algorithms: {package_info['algorithms']}\")\nprint(f\"Features: {package_info['features']}\")\n```\n\n## Storage Modes\n\n### Memory Mode 
(Fastest)\n```python\n# For temporary, high-speed operations\ndb = nusterdb.NusterDB(mode=\"memory\", dimension=128)\n# Best for: Testing, temporary data, maximum speed\n```\n\n### Persistent Mode (Production Ready)\n```python\n# For production with data persistence\ndb = nusterdb.NusterDB(\n    mode=\"persistent\", \n    path=\"./my_vectors\",\n    dimension=768\n)\n# Best for: Production data, long-term storage, reliability\n```\n\n### Cache Mode (Balanced Performance)\n```python\n# For large datasets with intelligent caching\ndb = nusterdb.NusterDB(\n    mode=\"cache\",\n    cache_size=\"2GB\",\n    dimension=512\n)\n# Best for: Large datasets, memory optimization, balanced performance\n```\n\n### API Mode (Distributed)\n```python\n# Connect to NusterDB server\ndb = nusterdb.NusterDB(mode=\"api\", url=\"http://localhost:7878\")\n# Best for: Microservices, distributed systems, scalability\n```\n\n## Enterprise Security Features\n\n### Standard Security\n```python\n# Basic encryption and security\ndb = nusterdb.NusterDB(\n    mode=\"persistent\",\n    path=\"./secure_vectors\",\n    security_level=\"basic\",           # Standard security\n    encryption_at_rest=True          # Data encryption\n)\n```\n\n### Advanced Security (Enterprise)\n```python\n# Maximum security for sensitive data\ndb = nusterdb.NusterDB(\n    mode=\"persistent\",\n    path=\"./classified_vectors\",\n    security_level=\"enterprise\",      # Enhanced security\n    encryption_at_rest=True,         # AES-256 encryption\n    audit_logging=True,              # Security event tracking\n    access_control=True,             # Role-based permissions\n    quantum_resistant=True           # Future-proof encryption\n)\n```\n\n### Security Features Available\n- **FIPS 140-2 Ready** - Federal cryptographic standards compliance\n- **AES-256 Encryption** - Industry-standard data protection\n- **Quantum-Resistant** - Post-quantum cryptography algorithms\n- **Audit Logging** - Comprehensive security event 
tracking\n- **Access Control** - Multi-level security permissions\n- **Key Management** - Secure key derivation and rotation\n\n## Vector Search Algorithms\n\n### Algorithm Details\n\n- **Flat**: Exact brute-force search for highest accuracy\n- **IVF**: Inverted file structure for balanced performance\n- **LSH**: Locality-sensitive hashing for speed\n- **PQ**: Product quantization for memory efficiency\n- **HNSW**: Hierarchical navigable small world graphs\n- **SQ**: Scalar quantization for reduced memory usage\n\n```python\n# Algorithm-specific configuration\ndb = nusterdb.NusterDB(\n    dimension=768,\n    algorithm=\"ivf\",\n    # IVF-specific parameters\n    ivf_clusters=256,\n    ivf_probe_lists=32\n)\n\n# PQ configuration\ndb = nusterdb.NusterDB(\n    dimension=768,\n    algorithm=\"pq\",\n    pq_subvectors=8,\n    pq_centroids=256\n)\n```\n\n## Performance Benchmarks\n\nBased on internal benchmarking on enterprise hardware:\n\n### NusterDB Performance\n- **Memory Mode**: 15K-30K queries per second\n- **Persistent Mode**: 8K-15K QPS with full durability\n- **Insertion Rate**: 10K+ vectors/sec with persistence\n- **Memory Efficiency**: Zero-copy access, optimized storage\n- **Latency**: Sub-millisecond response times for most queries\n\n### Distance Metrics Supported\n- **L2 (Euclidean)**: Standard Euclidean distance\n- **Cosine**: Cosine similarity for normalized vectors\n- **Inner Product**: Dot product similarity\n- **L1 (Manhattan)**: Manhattan distance\n\n## Configuration Management\n\n### Predefined Configurations\n\n```python\nfrom nusterdb import create_config\n\n# Predefined configurations for common use cases\nconfig = create_config(\"production_speed\")     # Speed-optimized\nconfig = create_config(\"production_accuracy\")  # Accuracy-optimized  \nconfig = create_config(\"memory_constrained\")   # Memory-optimized\nconfig = create_config(\"secure\")              # Security-focused\n\ndb = nusterdb.NusterDB(config=config, dimension=768)\n```\n\n### 
Available Presets\n- `\"development\"` / `\"dev\"` - Development and testing\n- `\"production_speed\"` / `\"prod_speed\"` - Speed-optimized production\n- `\"production_accuracy\"` / `\"prod_accuracy\"` - Accuracy-optimized production\n- `\"government\"` / `\"secure\"` - Government-grade security\n- `\"memory_constrained\"` / `\"low_memory\"` - Memory-optimized\n- `\"high_throughput\"` / `\"throughput\"` - High-throughput applications\n\n### Custom Configurations\n\n```python\n# Create custom configuration\nconfig = nusterdb.NusterConfig(\n    algorithm=nusterdb.Algorithm.IVF,\n    security_level=nusterdb.SecurityLevel.ENTERPRISE,\n    distance_metric=nusterdb.DistanceMetric.COSINE,\n    use_gpu=True,\n    cache_size=\"4GB\"\n)\n\n# Use configuration\ndb = nusterdb.NusterDB(config=config, dimension=768)\n\n# Update configuration\nupdated_config = config.update(use_simd=False, parallel_processing=False)\n```\n\n## Advanced Examples\n\n### Large-Scale Production Setup\n\n```python\nimport nusterdb\nimport numpy as np\n\n# Production configuration with security\ndb = nusterdb.NusterDB(\n    mode=\"persistent\",\n    path=\"/secure/vectors\",\n    dimension=1536,                    # OpenAI embeddings\n    algorithm=\"ivf\",\n    security_level=\"enterprise\",\n    distance_metric=\"cosine\",\n    use_gpu=True,\n    parallel_processing=True,\n    cache_size=\"8GB\",\n    \n    # IVF-specific tuning\n    ivf_clusters=1024,\n    ivf_probe_lists=64,\n    \n    # Security settings\n    encryption_at_rest=True,\n    audit_logging=True,\n    access_control=True\n)\n\n# Bulk data loading with progress tracking\ndef load_embeddings(file_path, batch_size=1000):\n    ids, vectors, metadata = nusterdb.load_vectors_from_file(file_path)\n    \n    total_batches = len(ids) // batch_size + 1\n    for i in range(0, len(ids), batch_size):\n        batch_ids = ids[i:i+batch_size]\n        batch_vectors = vectors[i:i+batch_size]\n        batch_metadata = metadata[i:i+batch_size] if 
metadata else None\n        \n        added = db.bulk_add(batch_ids, batch_vectors, batch_metadata)\n        print(f\"Batch {i//batch_size + 1}/{total_batches}: Added {added} vectors\")\n    \n    # Optimize after bulk loading\n    print(\"Optimizing index...\")\n    db.optimize()\n\n# Advanced search with multiple filters\ndef semantic_search(query_text, filters=None, k=10):\n    # Convert text to embedding (pseudo-code)\n    query_embedding = get_text_embedding(query_text)\n    \n    results = db.search(\n        query=query_embedding,\n        k=k,\n        filters=filters or {},\n        include_metadata=True,\n        include_distances=True\n    )\n    \n    # Post-process results\n    processed_results = []\n    for result in results:\n        processed_results.append({\n            'id': result['id'],\n            'similarity': 1 - result['distance'],  # Convert distance to similarity\n            'metadata': result['metadata'],\n            'confidence': result['distance'] < 0.5  # Confidence threshold\n        })\n    \n    return processed_results\n\n# Usage\nresults = semantic_search(\n    \"machine learning algorithms\",\n    filters={\"category\": \"research\", \"year\": 2023},\n    k=20\n)\n```\n\n### Multi-Modal Search System\n\n```python\nimport nusterdb\n\nclass MultiModalSearchSystem:\n    def __init__(self, base_path: str):\n        # Separate databases for different modalities\n        self.text_db = nusterdb.NusterDB(\n            mode=\"persistent\",\n            path=f\"{base_path}/text\",\n            dimension=768,\n            algorithm=\"hnsw\",\n            distance_metric=\"cosine\"\n        )\n        \n        self.image_db = nusterdb.NusterDB(\n            mode=\"persistent\", \n            path=f\"{base_path}/images\",\n            dimension=2048,\n            algorithm=\"ivf\",\n            distance_metric=\"l2\"\n        )\n        \n        self.audio_db = nusterdb.NusterDB(\n            mode=\"persistent\",\n            
path=f\"{base_path}/audio\", \n            dimension=512,\n            algorithm=\"lsh\",\n            distance_metric=\"cosine\"\n        )\n    \n    def add_content(self, content_id: str, embeddings: dict, metadata: dict):\n        \"\"\"Add multi-modal content.\"\"\"\n        if 'text' in embeddings:\n            self.text_db.add(content_id, embeddings['text'], metadata)\n        \n        if 'image' in embeddings:\n            self.image_db.add(content_id, embeddings['image'], metadata)\n            \n        if 'audio' in embeddings:\n            self.audio_db.add(content_id, embeddings['audio'], metadata)\n    \n    def search_all_modalities(self, query_embeddings: dict, k: int = 10):\n        \"\"\"Search across all modalities and combine results.\"\"\"\n        all_results = {}\n        \n        if 'text' in query_embeddings:\n            text_results = self.text_db.search(query_embeddings['text'], k=k)\n            all_results['text'] = text_results\n            \n        if 'image' in query_embeddings:\n            image_results = self.image_db.search(query_embeddings['image'], k=k)\n            all_results['image'] = image_results\n            \n        if 'audio' in query_embeddings:\n            audio_results = self.audio_db.search(query_embeddings['audio'], k=k)\n            all_results['audio'] = audio_results\n        \n        return self._combine_results(all_results)\n    \n    def _combine_results(self, results_by_modality):\n        \"\"\"Combine and rank results from multiple modalities.\"\"\"\n        # Implementation depends on your fusion strategy\n        combined = {}\n        for modality, results in results_by_modality.items():\n            for result in results:\n                content_id = result['id']\n                if content_id not in combined:\n                    combined[content_id] = {\n                        'id': content_id,\n                        'metadata': result['metadata'],\n                        'scores': {}\n  
                  }\n                combined[content_id]['scores'][modality] = 1 - result['distance']\n        \n        # Sort by combined score\n        for item in combined.values():\n            item['combined_score'] = sum(item['scores'].values()) / len(item['scores'])\n        \n        return sorted(combined.values(), key=lambda x: x['combined_score'], reverse=True)\n\n# Usage\nsearch_system = MultiModalSearchSystem(\"/data/multimodal\")\n\n# Add content\nsearch_system.add_content(\n    \"doc_001\",\n    embeddings={\n        'text': text_embedding,\n        'image': image_embedding\n    },\n    metadata={'title': 'Research Paper', 'type': 'academic'}\n)\n\n# Search\nresults = search_system.search_all_modalities({\n    'text': query_text_embedding,\n    'image': query_image_embedding\n})\n```\n\n### Real-Time Recommendation System\n\n```python\nimport nusterdb\nimport numpy as np  # Needed by _update_user_profile below\nfrom collections import defaultdict\nimport time\n\nclass RecommendationSystem:\n    def __init__(self):\n        self.user_db = nusterdb.NusterDB(\n            mode=\"cache\",\n            dimension=256,\n            algorithm=\"lsh\",\n            cache_size=\"1GB\"\n        )\n        \n        self.item_db = nusterdb.NusterDB(\n            mode=\"persistent\",\n            path=\"./items\",\n            dimension=256,\n            algorithm=\"ivf\"\n        )\n        \n        # Track user interactions\n        self.user_interactions = defaultdict(list)\n    \n    def add_user_profile(self, user_id: str, profile_vector: list, metadata: dict):\n        \"\"\"Add or update user profile.\"\"\"\n        self.user_db.add(user_id, profile_vector, metadata)\n    \n    def add_item(self, item_id: str, feature_vector: list, metadata: dict):\n        \"\"\"Add item to catalog.\"\"\"\n        self.item_db.add(item_id, feature_vector, metadata)\n    \n    def record_interaction(self, user_id: str, item_id: str, interaction_type: str, rating: float = None):\n        \"\"\"Record user-item 
interaction.\"\"\"\n        interaction = {\n            'item_id': item_id,\n            'type': interaction_type,\n            'rating': rating,\n            'timestamp': time.time()\n        }\n        self.user_interactions[user_id].append(interaction)\n        \n        # Update user profile based on interaction\n        self._update_user_profile(user_id, item_id, interaction_type, rating)\n    \n    def get_recommendations(self, user_id: str, k: int = 10, exclude_seen: bool = True):\n        \"\"\"Get personalized recommendations.\"\"\"\n        # Get user profile\n        user_profile = self.user_db.get(user_id)\n        if not user_profile:\n            return self._get_popular_items(k)\n        \n        # Find similar items\n        recommendations = self.item_db.search(\n            user_profile['vector'],\n            k=k * 2,  # Get more to account for filtering\n            include_metadata=True\n        )\n        \n        # Filter out already seen items\n        if exclude_seen:\n            seen_items = {interaction['item_id'] for interaction in self.user_interactions[user_id]}\n            recommendations = [r for r in recommendations if r['id'] not in seen_items]\n        \n        return recommendations[:k]\n    \n    def get_similar_users(self, user_id: str, k: int = 5):\n        \"\"\"Find similar users for collaborative filtering.\"\"\"\n        user_profile = self.user_db.get(user_id)\n        if not user_profile:\n            return []\n        \n        similar_users = self.user_db.search(\n            user_profile['vector'],\n            k=k + 1,  # +1 to exclude self\n            include_metadata=True\n        )\n        \n        # Remove self from results\n        return [u for u in similar_users if u['id'] != user_id]\n    \n    def _update_user_profile(self, user_id: str, item_id: str, interaction_type: str, rating: float):\n        \"\"\"Update user profile based on interaction.\"\"\"\n        # Get current profile and item 
features\n        user_profile = self.user_db.get(user_id)\n        item_data = self.item_db.get(item_id)\n        \n        if not user_profile or not item_data:\n            return\n        \n        # Simple profile update (weighted average)\n        weight = self._get_interaction_weight(interaction_type, rating)\n        current_vector = np.array(user_profile['vector'])\n        item_vector = np.array(item_data['vector'])\n        \n        # Update with exponential moving average\n        alpha = 0.1  # Learning rate\n        updated_vector = (1 - alpha) * current_vector + alpha * weight * item_vector\n        \n        # Update user profile\n        self.user_db.update(user_id, updated_vector.tolist())\n    \n    def _get_interaction_weight(self, interaction_type: str, rating: float = None) -> float:\n        \"\"\"Convert interaction type to weight.\"\"\"\n        weights = {\n            'view': 0.1,\n            'click': 0.3,\n            'like': 0.7,\n            'purchase': 1.0,\n            'rating': rating or 0.5\n        }\n        return weights.get(interaction_type, 0.1)\n    \n    def _get_popular_items(self, k: int):\n        \"\"\"Fallback for new users - return popular items.\"\"\"\n        # Simple implementation - could be enhanced with actual popularity metrics\n        all_items = list(self.item_db)[:k]\n        return all_items\n\n# Usage\nrec_system = RecommendationSystem()\n\n# Add items\nrec_system.add_item(\"item_1\", feature_vector, {\"category\": \"electronics\", \"price\": 299.99})\n\n# Add users\nrec_system.add_user_profile(\"user_1\", profile_vector, {\"age\": 25, \"location\": \"NY\"})\n\n# Record interactions\nrec_system.record_interaction(\"user_1\", \"item_1\", \"purchase\", rating=4.5)\n\n# Get recommendations\nrecommendations = rec_system.get_recommendations(\"user_1\", k=10)\nfor rec in recommendations:\n    print(f\"Recommended: {rec['id']} (similarity: {1-rec['distance']:.3f})\")\n```\n\n## Why Choose NusterDB?\n\n### 
Complete Database Solution\n- Full CRUD operations with transaction support\n- Built-in persistence and data durability\n- Comprehensive APIs for production use\n- Multiple storage modes for different use cases\n\n### Enterprise Security\n- Industry-leading security features\n- FIPS 140-2 compliance ready\n- Quantum-resistant cryptography\n- Comprehensive audit logging and access control\n\n### High Performance\n- Advanced algorithms optimized for different workloads\n- Hardware acceleration (SIMD, multi-threading)\n- Memory-efficient with zero-copy access\n- Intelligent caching for large datasets\n\n### Developer Friendly\n- Single unified API for all storage modes\n- Simple installation with pip\n- Extensive documentation and examples\n- Type hints and comprehensive error handling\n\n## Use Cases\n\n### Recommended For:\n- **AI/ML Applications** requiring fast similarity search\n- **Production Systems** needing reliability and persistence  \n- **Enterprise Environments** with security requirements\n- **Large-Scale Deployments** requiring monitoring and ops tools\n- **Sensitive Data** needing encryption and compliance\n- **Microservices** architecture with API-first design\n\n### Common Applications:\n- Semantic search and document retrieval\n- Image and video similarity search\n- Recommendation systems\n- Anomaly detection\n- Content-based filtering\n- Knowledge base search\n\n## Links & Resources\n\n- [Documentation](https://docs.nusterai.com/nusterdb)\n- [Issues](https://github.com/NusterAI/nusterdb/issues)\n- [Discussions](https://github.com/NusterAI/nusterdb/discussions)\n- [Performance Benchmarks](https://github.com/NusterAI/nusterdb/blob/main/docs/PERFORMANCE_ANALYSIS.md)\n- [Security Guide](https://github.com/NusterAI/nusterdb/blob/main/docs/SECURITY.md)\n\n## License\n\nMIT License - see [LICENSE](LICENSE) file for details.\n\n---\n\n**Ready to build with high-performance vector search?**\n\n```bash\npip install nusterdb\n```\n\nGet enterprise-grade 
vector database with security, persistence, and production features!\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "High-performance, government-grade vector database with advanced indexing algorithms",
    "version": "2.1.3",
    "project_urls": {
        "Bug Tracker": "https://github.com/NusterAI/nusterdb/issues",
        "Changelog": "https://github.com/NusterAI/nusterdb/blob/main/CHANGELOG.md",
        "Documentation": "https://docs.nusterai.com/nusterdb",
        "Homepage": "https://github.com/NusterAI/nusterdb",
        "Repository": "https://github.com/NusterAI/nusterdb"
    },
    "split_keywords": [
        "vector database",
        " similarity search",
        " machine learning",
        " ai",
        " government-grade",
        " security",
        " fips",
        " quantum-resistant"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "17aca280926362ac3adbe971d715403e6d2da436a6d8f280e4b4cbba7fc2f035",
                "md5": "fb5c6faf88785b937b03ee2445c2f586",
                "sha256": "326f6b2d97b39a64033f2169306e8084b8c55c4151e1f2a57a9b9070d4377950"
            },
            "downloads": -1,
            "filename": "nusterdb-2.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "fb5c6faf88785b937b03ee2445c2f586",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 38119,
            "upload_time": "2025-08-04T01:12:17",
            "upload_time_iso_8601": "2025-08-04T01:12:17.877882Z",
            "url": "https://files.pythonhosted.org/packages/17/ac/a280926362ac3adbe971d715403e6d2da436a6d8f280e4b4cbba7fc2f035/nusterdb-2.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-04 01:12:17",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "NusterAI",
    "github_project": "nusterdb",
    "github_not_found": true,
    "lcname": "nusterdb"
}
        