matrixone-python-sdk

| Field | Value |
|---|---|
| Name | matrixone-python-sdk |
| Version | 0.1.4 |
| Home page | https://github.com/matrixorigin/matrixone |
| Summary | A comprehensive Python SDK for MatrixOne database operations with vector search, fulltext search, and advanced features |
| Upload time | 2025-10-15 03:34:25 |
| Author | MatrixOne Team (dev@matrixone.io) |
| Requires Python | >=3.8 |
| License | Apache-2.0 |
| Keywords | matrixone, database, vector, search, sqlalchemy, python |

# MatrixOne Python SDK

[![PyPI version](https://badge.fury.io/py/matrixone-python-sdk.svg)](https://badge.fury.io/py/matrixone-python-sdk)
[![Python Support](https://img.shields.io/pypi/pyversions/matrixone-python-sdk.svg)](https://pypi.org/project/matrixone-python-sdk/)
[![Documentation Status](https://app.readthedocs.org/projects/matrixone/badge/?version=latest)](https://matrixone.readthedocs.io/en/latest/?badge=latest)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

A comprehensive, high-level Python SDK for MatrixOne that provides a SQLAlchemy-like interface for database operations, vector similarity search, fulltext search, snapshot management, PITR, restore operations, table cloning, and more.

---

## 📚 Documentation

**[📖 Complete Documentation on ReadTheDocs](https://matrixone.readthedocs.io/)** ⭐

**Quick Links:**
- 🚀 [Quick Start Guide](https://matrixone.readthedocs.io/en/latest/quickstart.html)
- 🧠 [Vector Search & IVF Index Monitoring](https://matrixone.readthedocs.io/en/latest/vector_guide.html)
- 📋 [Best Practices](https://matrixone.readthedocs.io/en/latest/best_practices.html)
- 📖 [API Reference](https://matrixone.readthedocs.io/en/latest/api/index.html)

---

## ✨ Features

- 🚀 **High Performance**: Optimized for MatrixOne database operations with connection pooling
- 🔄 **Async Support**: Full async/await support with AsyncClient for non-blocking operations
- 🧠 **Vector Search**: Advanced vector similarity search with HNSW and IVF indexing
  - Support for f32 and f64 precision vectors
  - Multiple distance metrics (L2, Cosine, Inner Product)
  - ⭐ **IVF Index Health Monitoring** with `get_ivf_stats()` - Critical for production!
  - High-performance indexing for AI/ML applications
- 🔍 **Fulltext Search**: Powerful fulltext indexing and search with BM25 and TF-IDF
  - Natural language and boolean search modes
  - Multi-column indexes with relevance scoring
- 📊 **Metadata Analysis**: Table and column metadata analysis with statistics
- 📸 **Snapshot Management**: Create and manage database snapshots at multiple levels
- ⏰ **Point-in-Time Recovery**: PITR functionality for precise data recovery
- 🔄 **Table Cloning**: Clone databases and tables efficiently
- 👥 **Account Management**: Comprehensive user and role management
- 📊 **Pub/Sub**: Real-time publication and subscription support
- 🔧 **Version Management**: Automatic backend version detection and compatibility
- 🛡️ **Type Safety**: Full type hints support with comprehensive documentation
- 📚 **SQLAlchemy Integration**: Seamless SQLAlchemy ORM integration with enhanced features

## 🚀 Installation

### Using pip (Recommended)

```bash
pip install matrixone-python-sdk
```

### Install from TestPyPI (Latest Pre-release)

```bash
pip install \
    --index-url https://test.pypi.org/simple/ \
    --extra-index-url https://pypi.org/simple/ \
    matrixone-python-sdk
```

**Note**: The `--extra-index-url` option is required so that pip can install the SDK's dependencies from the official PyPI.

### Using Virtual Environment (Best Practice)

```bash
# Create virtual environment
python -m venv venv

# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
# venv\Scripts\activate

# Install MatrixOne SDK
pip install matrixone-python-sdk

# Verify installation
python -c "import matrixone; print('MatrixOne SDK installed successfully')"
```
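
The one-line import check above only confirms that the package can be imported. As a complementary sketch, the standard library's `importlib.metadata` (Python 3.8+) can report which version is installed. The helper name `installed_version` is made up for this example and is not part of the SDK.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(dist_name: str = "matrixone-python-sdk"):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


version = installed_version()
print(version if version else "matrixone-python-sdk is not installed")
```

This is handy inside a virtual environment, where it is easy to install into one interpreter and run another.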

### Using Conda

```bash
# Create conda environment
conda create -n matrixone python=3.10
conda activate matrixone

# Install MatrixOne SDK
pip install matrixone-python-sdk
```

## Quick Start

### Basic Usage

```python
from matrixone import Client

# Create and connect to MatrixOne
client = Client()
client.connect(
    host='localhost',
    port=6001,
    user='root',
    password='111',
    database='test'
)

# Execute queries
result = client.execute("SELECT 1 as test")
print(result.fetchall())

# Get backend version (auto-detected)
version = client.get_backend_version()
print(f"MatrixOne version: {version}")

client.disconnect()
```

> **📝 Connection Parameters**
> 
> The `connect()` method requires **keyword arguments** (not positional):
> - `database` - **Required**, no default value
> - `host` - Default: `'localhost'`
> - `port` - Default: `6001`
> - `user` - Default: `'root'`
> - `password` - Default: `'111'`
> 
> **Minimal connection** (uses all defaults):
> ```python
> client.connect(database='test')
> ```
> 
> By default, all features (IVF, HNSW, fulltext) are automatically enabled via `on_connect=[ConnectionAction.ENABLE_ALL]`.
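
Because `connect()` takes keyword arguments, the parameters can be assembled in a dictionary and unpacked with `**`. The sketch below reads them from environment variables and falls back to the defaults listed in the note. The `MO_*` variable names and the `connection_kwargs` helper are assumptions made for this example; the SDK does not define them.

```python
import os

# Defaults as documented in the note above; `database` has no default.
DEFAULTS = {"host": "localhost", "port": 6001, "user": "root", "password": "111"}


def connection_kwargs(env=None):
    """Build keyword arguments for client.connect() from environment variables.

    The MO_* variable names are hypothetical, chosen for this sketch only.
    """
    env = os.environ if env is None else env
    database = env.get("MO_DATABASE")
    if not database:
        raise ValueError("MO_DATABASE must be set: connect() requires a database")
    return {
        "host": env.get("MO_HOST", DEFAULTS["host"]),
        "port": int(env.get("MO_PORT", DEFAULTS["port"])),  # env values are strings
        "user": env.get("MO_USER", DEFAULTS["user"]),
        "password": env.get("MO_PASSWORD", DEFAULTS["password"]),
        "database": database,
    }


# A plain dict stands in for os.environ so the example is reproducible.
kwargs = connection_kwargs({"MO_DATABASE": "test", "MO_PORT": "6002"})
print(kwargs)
# With a real client: client.connect(**kwargs)
```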

### Async Usage

```python
import asyncio
from matrixone import AsyncClient

async def main():
    client = AsyncClient()
    await client.connect(
        host='localhost',
        port=6001,
        user='root',
        password='111',
        database='test'
    )
    
    result = await client.execute("SELECT 1 as test")
    print(result.fetchall())
    
    await client.disconnect()

asyncio.run(main())
```
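
In the example above, `disconnect()` is skipped if `execute()` raises. One way to guarantee cleanup is a small async context manager, sketched here with only the standard library. `connected` and `StandInClient` are hypothetical names; the stand-in class exists only so the sketch runs without a server, and it assumes nothing about `AsyncClient` beyond the `connect`, `execute`, and `disconnect` coroutines shown above.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def connected(client, **connect_kwargs):
    """Connect `client`, yield it, and always disconnect afterwards."""
    await client.connect(**connect_kwargs)
    try:
        yield client
    finally:
        await client.disconnect()


class StandInClient:
    """Hypothetical stand-in so the sketch runs without a server."""

    def __init__(self):
        self.events = []

    async def connect(self, **kwargs):
        self.events.append("connect")

    async def execute(self, sql):
        self.events.append("execute")
        raise RuntimeError("simulated query failure")

    async def disconnect(self):
        self.events.append("disconnect")


async def demo():
    client = StandInClient()
    try:
        async with connected(client, database="test") as c:
            await c.execute("SELECT 1")
    except RuntimeError:
        pass  # the failure propagates, but disconnect() has already run
    return client.events


events = asyncio.run(demo())
print(events)
```

With a real `AsyncClient`, the same `async with connected(client, ...)` block would wrap the queries.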

### Snapshot Management

```python
# Create a snapshot
snapshot = client.snapshots.create(
    'my_snapshot',
    'cluster',
    description='Backup before migration'
)

# List snapshots
snapshots = client.snapshots.list()
for snap in snapshots:
    print(f"Snapshot: {snap.name}, Created: {snap.created_at}")

# Clone database from snapshot
client.clone.clone_database(
    'new_database',
    'old_database',
    snapshot_name='my_snapshot'
)
```

### Version Management

```python
# Check if feature is available
if client.is_feature_available('snapshot_creation'):
    snapshot = client.snapshots.create('my_snapshot', 'cluster')
else:
    hint = client.get_version_hint('snapshot_creation')
    print(f"Feature not available: {hint}")

# Check version compatibility
if client.check_version_compatibility('3.0.0', '>='):
    print("Backend supports 3.0.0+ features")
```
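
To make the idea behind a check such as `check_version_compatibility('3.0.0', '>=')` concrete, here is a minimal sketch of comparing dotted versions as integer tuples. It is not the SDK's implementation; `parse_version` and `is_compatible` are names invented for this example. The point is that versions must be compared numerically, because comparing them as strings would rank `3.9.0` above `3.10.0`.

```python
import operator

# Map comparison strings to functions; tuples compare element by element.
_OPS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
    "!=": operator.ne,
}


def parse_version(text):
    """Turn '3.0.1' into (3, 0, 1)."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(backend, required, op=">="):
    """Return True if `backend <op> required` holds for dotted versions."""
    return _OPS[op](parse_version(backend), parse_version(required))


print(is_compatible("3.0.1", "3.0.0", ">="))
print(is_compatible("2.9.9", "3.0.0", ">="))
print(is_compatible("3.10.0", "3.9.0", ">"))
```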

## MatrixOne Version Support

The SDK automatically detects MatrixOne backend versions and handles compatibility:

- **Development Version**: `8.0.30-MatrixOne-v` → `999.0.0` (highest priority)
- **Release Version**: `8.0.30-MatrixOne-v3.0.0` → `3.0.0`
- **Legacy Format**: `MatrixOne 3.0.1` → `3.0.1`

```python
# Check if running development version
if client.is_development_version():
    print("Running development version - all features available")
else:
    print(f"Running release version: {client.get_backend_version()}")
```
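
The mapping from version strings to versions listed above can be sketched as a small parser. This illustrates the three documented formats only and is not the SDK's actual parsing code. `normalize_version` and the regular expressions are assumptions made for this example.

```python
import re

DEV_VERSION = (999, 0, 0)  # development builds sort above every release

_RELEASE = re.compile(r"MatrixOne-v(\d+)\.(\d+)\.(\d+)")  # 8.0.30-MatrixOne-v3.0.0
_DEV = re.compile(r"MatrixOne-v$")                        # 8.0.30-MatrixOne-v
_LEGACY = re.compile(r"MatrixOne (\d+)\.(\d+)\.(\d+)")    # MatrixOne 3.0.1


def normalize_version(raw):
    """Map a backend version string to a (major, minor, patch) tuple."""
    raw = raw.strip()
    match = _RELEASE.search(raw) or _LEGACY.search(raw)
    if match:
        return tuple(int(group) for group in match.groups())
    if _DEV.search(raw):
        return DEV_VERSION
    raise ValueError(f"unrecognized version string: {raw!r}")


for text in ("8.0.30-MatrixOne-v", "8.0.30-MatrixOne-v3.0.0", "MatrixOne 3.0.1"):
    print(text, "->", normalize_version(text))
```

Mapping development builds to `999.0.0` means any "at least version X" check passes on them, which matches the "highest priority" note in the list.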

## Advanced Features

### PITR (Point-in-Time Recovery)

```python
# Create PITR for cluster
pitr = client.pitr.create_cluster_pitr(
    'cluster_pitr',
    range_value=7,
    range_unit='d'
)

# Restore cluster from snapshot
client.restore.restore_cluster('my_snapshot')
```

### Account Management

```python
from matrixone.account import AccountManager

# Initialize account manager
account_manager = AccountManager(client)

# Create user
user = account_manager.create_user('newuser', 'password123')
print(f"Created user: {user.name}")

# Create role  
role = account_manager.create_role('analyst')
print(f"Created role: {role.name}")

# Grant privileges on a specific table (optional)
# Note: the table must exist first
account_manager.grant_privilege(
    'SELECT',           # privilege
    'TABLE',            # object_type
    'users',            # object_name (table name; qualify as database.table where needed)
    to_role='analyst'
)

# Grant role to user
account_manager.grant_role('analyst', 'newuser')
print("Granted role to user")

# List users
users = account_manager.list_users()
for existing_user in users:
    print(f"User: {existing_user.name}")
```

### Vector Search Operations

```python
from matrixone import Client
from matrixone.sqlalchemy_ext import create_vector_column
from matrixone.orm import declarative_base
from sqlalchemy import Column, BigInteger, String, Text
import numpy as np

# Create client and connect
client = Client()
client.connect(
    host='localhost',
    port=6001,
    user='root',
    password='111',
    database='test'
)

# Define vector table using MatrixOne ORM
Base = declarative_base()

class Document(Base):
    __tablename__ = 'documents'
    # IMPORTANT: HNSW index requires BigInteger (BIGINT) primary key
    id = Column(BigInteger, primary_key=True, autoincrement=True)
    title = Column(String(200))
    content = Column(Text)
    embedding = create_vector_column(384, precision='f32')

# Create table using client API (not Base.metadata.create_all)
client.create_table(Document)

# Create HNSW index using SDK (not SQL)
client.vector_ops.enable_hnsw()
client.vector_ops.create_hnsw(
    'documents',  # table name or model - positional argument
    name='idx_embedding',
    column='embedding',
    m=16,
    ef_construction=200
)

# Insert vector data using client API
client.insert(Document, {
    'title': 'Machine Learning Guide',
    'content': 'Comprehensive ML tutorial...',
    'embedding': np.random.rand(384).tolist()
})

# Search similar documents using SDK
query_vector = np.random.rand(384).tolist()
results = client.vector_ops.similarity_search(
    'documents',  # table name or model - positional argument
    vector_column='embedding',
    query_vector=query_vector,
    limit=5,
    distance_type='cosine'
)

for row in results:
    print(f"Document: {row[1]}, Similarity: {row[-1]}")

# Cleanup
client.drop_table(Document)  # Use client API
client.disconnect()
```
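
For intuition about the three distance metrics named in the feature list (L2, cosine, inner product), here are their textbook definitions in plain Python. This is only a reference sketch. It does not show how MatrixOne computes them, and whether the last column of `similarity_search` holds a distance or a similarity should be checked against the API reference.

```python
import math


def l2_distance(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Dot product: larger means more similar for comparable norms."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: 0 for the same direction, 1 for orthogonal."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - inner_product(a, b) / norm


print(l2_distance([0.0, 0.0], [3.0, 4.0]))
print(inner_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))
```

Cosine distance ignores vector length, which is why it is a common choice for text embeddings. L2 and inner product are sensitive to magnitude.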

### ⭐ IVF Index Health Monitoring (Production Critical)

**Monitor your IVF indexes to ensure optimal performance!**

```python
from matrixone import Client

client = Client()
client.connect(host='localhost', port=6001, user='root', password='111', database='test')

# After creating IVF index and inserting data...

# Get IVF index statistics
stats = client.vector_ops.get_ivf_stats("documents", "embedding")

# Analyze index balance
counts = stats['distribution']['centroid_count']
total_centroids = len(counts)
total_vectors = sum(counts)
min_count = min(counts) if counts else 0
max_count = max(counts) if counts else 0
balance_ratio = max_count / min_count if min_count > 0 else float('inf')

print("📊 IVF Index Health Report:")
print(f"  - Total centroids: {total_centroids}")
print(f"  - Total vectors: {total_vectors}")
print(f"  - Balance ratio: {balance_ratio:.2f}")
print(f"  - Min vectors in centroid: {min_count}")
print(f"  - Max vectors in centroid: {max_count}")

# Check if index needs rebuilding
if balance_ratio > 2.5:
    print("⚠️  WARNING: Index is imbalanced and needs rebuilding!")
    print("   Rebuild the index for optimal performance:")
    
    # Rebuild process
    client.vector_ops.drop("documents", "idx_embedding")
    client.vector_ops.create_ivf(
        "documents",
        name="idx_embedding",
        column="embedding",
        lists=100
    )
    print("✅ Index rebuilt successfully")
else:
    print("✅ Index is healthy and well-balanced")

client.disconnect()
```

**Why IVF Stats Matter:**
- 🎯 **Performance**: Unbalanced indexes lead to slow searches
- 📊 **Load Distribution**: Identify hot spots and imbalances
- 🔄 **Rebuild Timing**: Know when to rebuild for optimal performance
- 📈 **Capacity Planning**: Understand data distribution patterns

**When to Rebuild:**
- Balance ratio > 2.5 (moderate imbalance)
- Balance ratio > 3.0 (severe imbalance - rebuild immediately)
- After bulk inserts (>20% of data)
- Performance degradation in searches
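
The two ratio thresholds above can be folded into one helper that works on the list of per-centroid counts (the `stats['distribution']['centroid_count']` value from the earlier example). `ivf_balance` and its status labels are invented for this sketch, and the sample counts are illustrative.

```python
def ivf_balance(counts):
    """Return (balance_ratio, status) for a list of per-centroid vector counts."""
    if not counts:
        return float("inf"), "empty"
    low, high = min(counts), max(counts)
    # An empty centroid makes the ratio unbounded, which counts as severe.
    ratio = high / low if low > 0 else float("inf")
    if ratio > 3.0:
        status = "severe"    # rebuild immediately
    elif ratio > 2.5:
        status = "moderate"  # schedule a rebuild
    else:
        status = "healthy"
    return ratio, status


for sample in ([100, 110, 95], [100, 280], [10, 100], [0, 5]):
    ratio, status = ivf_balance(sample)
    print(sample, f"{ratio:.2f}", status)
```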

### Fulltext Search Operations

```python
from matrixone import Client
from matrixone.sqlalchemy_ext.fulltext_search import boolean_match
from matrixone.orm import declarative_base
from sqlalchemy import Column, Integer, String, Text

# Create client and connect
client = Client()
client.connect(
    host='localhost',
    port=6001,
    user='root',
    password='111',
    database='test'
)

# Define model using MatrixOne ORM
Base = declarative_base()

class Article(Base):
    __tablename__ = 'articles'
    id = Column(Integer, primary_key=True, autoincrement=True)
    title = Column(String(200), nullable=False)
    content = Column(Text, nullable=False)
    category = Column(String(100))

# Create table using client API (not Base.metadata.create_all)
client.create_table(Article)

# Insert some data using client API
articles = [
    {'title': 'Machine Learning Guide', 
     'content': 'Comprehensive machine learning tutorial...', 
     'category': 'AI'},
    {'title': 'Python Programming', 
     'content': 'Learn Python programming basics', 
     'category': 'Programming'},
]
client.batch_insert(Article, articles)

# Create fulltext index using SDK (not SQL)
client.fulltext_index.create(
    'articles',  # table name - positional argument
    name='ftidx_content',
    columns=['title', 'content']
)

# Boolean search with encourage (like natural language)
results = client.query(
    Article.title,
    Article.content,
    boolean_match('title', 'content').encourage('machine learning tutorial')
).execute()

# Boolean search with must/must_not operators
results = client.query(
    Article.title,
    Article.content,
    boolean_match('title', 'content')
        .must('machine')
        .must('learning')
        .must_not('basics')
).execute()

# `results` is a ResultSet object; iterate over its rows
for row in results.rows:
    print(f"Title: {row[0]}, Content: {row[1][:50]}...")

# Cleanup
client.drop_table(Article)  # Use client API
client.disconnect()
```
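
As a rough mental model of the `must` / `must_not` chain above, the following toy filter applies the same logic to in-memory rows: every `must` term has to appear, and no `must_not` term may appear. It is a simplification for illustration only. It does not reflect how MatrixOne tokenizes text or scores relevance, and `boolean_filter` is a made-up helper.

```python
import re


def tokens(text):
    """Lowercase a string and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def boolean_filter(rows, columns, must=(), must_not=()):
    """Keep rows whose indexed columns contain all `must` and no `must_not` terms."""
    hits = []
    for row in rows:
        words = set()
        for col in columns:
            words |= tokens(row[col])
        if all(t in words for t in must) and not any(t in words for t in must_not):
            hits.append(row)
    return hits


articles = [
    {"title": "Machine Learning Guide",
     "content": "Comprehensive machine learning tutorial..."},
    {"title": "Python Programming",
     "content": "Learn Python programming basics"},
]

hits = boolean_filter(articles, ["title", "content"],
                      must=["machine", "learning"], must_not=["basics"])
print([row["title"] for row in hits])
```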

### Metadata Analysis

```python
from matrixone import Client

# Create client and connect
client = Client()
client.connect(
    host='localhost',
    port=6001,
    user='root',
    password='111',
    database='test'
)

# Analyze table metadata - returns structured MetadataRow objects
metadata_rows = client.metadata.scan(
    dbname='test',
    tablename='documents',
    columns='*'  # Get all columns
)

for row in metadata_rows:
    print(f"Column: {row.col_name}")
    print(f"  Rows count: {row.rows_cnt}")
    print(f"  Null count: {row.null_cnt}")
    print(f"  Size: {row.origin_size}")

# Get table brief statistics
brief_stats = client.metadata.get_table_brief_stats(
    dbname='test',
    tablename='documents'
)

table_stats = brief_stats['documents']
print(f"Total rows: {table_stats['row_cnt']}")
print(f"Total nulls: {table_stats['null_cnt']}")
print(f"Original size: {table_stats['original_size']}")
print(f"Compressed size: {table_stats['compress_size']}")

client.disconnect()
```
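
The brief statistics lend themselves to a quick compression-ratio calculation. The sketch below assumes that `original_size` and `compress_size` are numeric byte counts, which the example above does not confirm. The sample dictionary is made-up data shaped like `table_stats`, and `compression_ratio` is a hypothetical helper.

```python
def compression_ratio(table_stats):
    """Return original_size / compress_size, or None when nothing is stored."""
    compressed = table_stats["compress_size"]
    if not compressed:
        return None  # avoid dividing by zero for empty tables
    return table_stats["original_size"] / compressed


# Illustrative numbers only, not real measurements.
sample_stats = {
    "row_cnt": 1000,
    "null_cnt": 0,
    "original_size": 4096000,
    "compress_size": 1024000,
}

ratio = compression_ratio(sample_stats)
print(f"Compression ratio: {ratio:.1f}x")
```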

### Pub/Sub Operations

```python
# List publications
publications = client.pubsub.list_publications()
for pub in publications:
    print(f"Publication: {pub}")

# List subscriptions
subscriptions = client.pubsub.list_subscriptions()
for sub in subscriptions:
    print(f"Subscription: {sub}")

# Drop publication/subscription when needed
try:
    client.pubsub.drop_publication("test_publication")
    client.pubsub.drop_subscription("test_subscription")
except Exception as e:
    print(f"Cleanup: {e}")
```

## Configuration

### Connection Parameters

```python
client = Client(
    connection_timeout=30,
    query_timeout=300,
    auto_commit=True,
    charset='utf8mb4',
    sql_log_mode='auto',  # 'off', 'simple', 'auto', 'full'
    slow_query_threshold=1.0
)
```

### Logging Configuration

```python
from matrixone import Client
from matrixone.logger import create_default_logger
import logging

# Create custom logger
logger = create_default_logger(
    level=logging.INFO,
    sql_log_mode='auto',  # 'off', 'simple', 'auto', 'full'
    slow_query_threshold=1.0,
    max_sql_display_length=500
)

# Use custom logger with client
client = Client(logger=logger)
```
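
To illustrate what a slow-query threshold does conceptually, here is a standalone timing sketch built on the standard `logging` module. It assumes the threshold is expressed in seconds, which the configuration above does not state, and it is unrelated to the SDK's own logger. `timed` and its `clock` parameter are names chosen for this example.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("sql-timing-sketch")


@contextmanager
def timed(sql, slow_query_threshold=1.0, clock=time.perf_counter):
    """Time a block and log a warning when it exceeds the threshold."""
    start = clock()
    result = {"slow": False, "elapsed": 0.0}
    try:
        yield result
    finally:
        result["elapsed"] = clock() - start
        result["slow"] = result["elapsed"] > slow_query_threshold
        level = logging.WARNING if result["slow"] else logging.DEBUG
        log.log(level, "%.3fs %s", result["elapsed"], sql)


# A fake clock makes the demo deterministic: the "query" takes 2.5 seconds.
ticks = iter([0.0, 2.5])
with timed("SELECT 1", clock=lambda: next(ticks)) as slow_run:
    pass

ticks = iter([0.0, 0.2])
with timed("SELECT 1", clock=lambda: next(ticks)) as fast_run:
    pass

print(slow_run, fast_run)
```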

## Error Handling

The SDK raises typed exceptions from `matrixone.exceptions`, so each class of failure can be handled separately:

```python
from matrixone.exceptions import (
    ConnectionError,
    QueryError,
    VersionError,
    SnapshotError
)

try:
    snapshot = client.snapshots.create('test', 'cluster')
except VersionError as e:
    print(f"Version compatibility error: {e}")
except SnapshotError as e:
    print(f"Snapshot operation failed: {e}")
```
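
Typed exceptions also make selective retries straightforward. Below is a generic retry helper with exponential backoff, written against the standard library only. Which SDK exceptions are worth retrying is an application-level decision. Here the built-in `ConnectionError` stands in for whatever exception class would be passed as `retry_on`, and `with_retries` is a hypothetical helper.

```python
import time


def with_retries(operation, retry_on, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call operation(); retry on `retry_on` with exponentially growing delays."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


# Demo: an operation that fails twice, then succeeds.
calls = {"count": 0}
delays = []


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"


# `sleep=delays.append` records the delays instead of actually waiting.
outcome = with_retries(flaky, retry_on=ConnectionError, sleep=delays.append)
print(outcome, delays)
```

Note that importing `ConnectionError` from `matrixone.exceptions`, as in the block above, shadows Python's built-in exception of the same name within that module. An alias such as `import ... as` avoids confusion when both are needed.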

## 🔗 Links

- **📚 Full Documentation**: https://matrixone.readthedocs.io/
- **📦 PyPI Package**: https://pypi.org/project/matrixone-python-sdk/
- **💻 GitHub Repository**: https://github.com/matrixorigin/matrixone/tree/main/clients/python
- **🌐 MatrixOne Docs**: https://docs.matrixorigin.cn/

### Online Examples

The SDK includes 25+ comprehensive examples covering all features:

**Getting Started:**
- Basic connection and database operations
- Async/await operations
- Transaction management
- SQLAlchemy ORM integration

**Vector Search:**
- Vector data types and distance functions
- IVF and HNSW index creation and tuning
- ⭐ **IVF Index Health Monitoring** - Essential for production systems
- Similarity search operations
- Advanced vector optimizations and index rebuilding

**Advanced Features:**
- Fulltext search with BM25/TF-IDF
- Table metadata analysis
- Snapshot and restore operations
- Account and permission management
- Pub/Sub operations
- Connection hooks and logging

### Quick Examples

Clone the repository to access all examples:
```bash
git clone https://github.com/matrixorigin/matrixone.git
cd matrixone/clients/python/examples

# Run basic example
python example_01_basic_connection.py

# Run vector search example
python example_12_vector_basics.py

# Run metadata analysis example
python example_25_metadata_operations.py
```


## Support

- 📧 Email: contact@matrixorigin.cn
- 🐛 Issues: [GitHub Issues](https://github.com/matrixorigin/matrixone/issues)
- 💬 Discussions: [GitHub Discussions](https://github.com/matrixorigin/matrixone/discussions)
- 📖 Documentation: 
  - [MatrixOne Docs (English)](https://docs.matrixorigin.cn/en)
  - [MatrixOne Docs (中文)](https://docs.matrixorigin.cn/)

## License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

---

**MatrixOne Python SDK** - Making MatrixOne database operations simple and powerful in Python.

operations\n- Account and permission management\n- Pub/Sub operations\n- Connection hooks and logging\n\n### Quick Examples\n\nClone the repository to access all examples:\n```bash\ngit clone https://github.com/matrixorigin/matrixone.git\ncd matrixone/clients/python/examples\n\n# Run basic example\npython example_01_basic_connection.py\n\n# Run vector search example\npython example_12_vector_basics.py\n\n# Run metadata analysis example\npython example_25_metadata_operations.py\n```\n\n\n## Support\n\n- \ud83d\udce7 Email: contact@matrixorigin.cn\n- \ud83d\udc1b Issues: [GitHub Issues](https://github.com/matrixorigin/matrixone/issues)\n- \ud83d\udcac Discussions: [GitHub Discussions](https://github.com/matrixorigin/matrixone/discussions)\n- \ud83d\udcd6 Documentation: \n  - [MatrixOne Docs (English)](https://docs.matrixorigin.cn/en)\n  - [MatrixOne Docs (\u4e2d\u6587)](https://docs.matrixorigin.cn/)\n\n## License\n\nThis project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.\n\n---\n\n**MatrixOne Python SDK** - Making MatrixOne database operations simple and powerful in Python.\n",
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "A comprehensive Python SDK for MatrixOne database operations with vector search, fulltext search, and advanced features",
    "version": "0.1.4",
    "project_urls": {
        "Changelog": "https://github.com/matrixorigin/matrixone/blob/main/clients/python/CHANGELOG.md",
        "Documentation": "https://matrixone.readthedocs.io/",
        "Homepage": "https://github.com/matrixorigin/matrixone",
        "Issues": "https://github.com/matrixorigin/matrixone/issues",
        "Repository": "https://github.com/matrixorigin/matrixone"
    },
    "split_keywords": [
        "matrixone",
        " database",
        " vector",
        " search",
        " sqlalchemy",
        " python"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "8c78344a4f787d90b258803413b43d567d3fd2a866153d5c29f31af09f783c30",
                "md5": "9b8a307bcaffbd2b3a7cbe4874ec30ac",
                "sha256": "2d4206fb692c5f32aa6c426cb5c22c9b43246d918140a9d59c601c2906d59843"
            },
            "downloads": -1,
            "filename": "matrixone_python_sdk-0.1.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9b8a307bcaffbd2b3a7cbe4874ec30ac",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 483471,
            "upload_time": "2025-10-15T03:34:23",
            "upload_time_iso_8601": "2025-10-15T03:34:23.703168Z",
            "url": "https://files.pythonhosted.org/packages/8c/78/344a4f787d90b258803413b43d567d3fd2a866153d5c29f31af09f783c30/matrixone_python_sdk-0.1.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d369832fd8d2d42e6add4b78270e7327f2393fefea24913cac75913c723ab290",
                "md5": "1f98bd66e924955d763435839bb8bb78",
                "sha256": "462136662beace530b448705a3627cb456d490943910d8c4b97705141475d475"
            },
            "downloads": -1,
            "filename": "matrixone_python_sdk-0.1.4.tar.gz",
            "has_sig": false,
            "md5_digest": "1f98bd66e924955d763435839bb8bb78",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 253278,
            "upload_time": "2025-10-15T03:34:25",
            "upload_time_iso_8601": "2025-10-15T03:34:25.495210Z",
            "url": "https://files.pythonhosted.org/packages/d3/69/832fd8d2d42e6add4b78270e7327f2393fefea24913cac75913c723ab290/matrixone_python_sdk-0.1.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-15 03:34:25",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "matrixorigin",
    "github_project": "matrixone",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "matrixone-python-sdk"
}
