# kiarina-lib-redisearch
A comprehensive Python client library for [RediSearch](https://redis.io/docs/interact/search-and-query/) with advanced configuration management, schema definition, and both full-text and vector search capabilities.
## Features
- **Full-Text Search**: Advanced text search with stemming, phonetic matching, and fuzzy search
- **Vector Search**: Similarity search using FLAT and HNSW algorithms with multiple distance metrics
- **Schema Management**: Type-safe schema definition with automatic migration support
- **Configuration Management**: Flexible configuration using `pydantic-settings-manager`
- **Sync & Async**: Support for both synchronous and asynchronous operations
- **Advanced Filtering**: Intuitive query builder with type-safe filter expressions
- **Index Management**: Complete index lifecycle management (create, migrate, reset, drop)
- **Type Safety**: Full type hints and Pydantic validation throughout
## Installation
```bash
pip install kiarina-lib-redisearch
```
## Quick Start
### Basic Usage (Sync)
```python
import redis
from kiarina.lib.redisearch import create_redisearch_client, settings_manager

# Define your schema
schema = [
    {"type": "tag", "name": "category"},
    {"type": "text", "name": "title"},
    {"type": "numeric", "name": "price", "sortable": True},
    {"type": "vector", "name": "embedding", "algorithm": "FLAT", "dims": 1536},
]

# Configure settings before creating the client that uses them
settings_manager.user_config = {
    "default": {
        "key_prefix": "products:",
        "index_name": "products_index",
        "index_schema": schema,
    }
}

# Create Redis connection (decode_responses=False is required)
redis_client = redis.Redis(host="localhost", port=6379, decode_responses=False)

# Create RediSearch client
client = create_redisearch_client(
    redis=redis_client,
    config_key="default",  # Optional: use specific configuration
)

# Create index
client.create_index()

# Add documents
client.set({
    "category": "electronics",
    "title": "Wireless Headphones",
    "price": 99.99,
    "embedding": [0.1, 0.2, 0.3, ...],  # 1536-dimensional vector
}, id="product_1")

# Full-text search
results = client.find(
    filter=[["category", "==", "electronics"]],
    return_fields=["title", "price"],
)

# Vector similarity search
results = client.search(
    vector=[0.1, 0.2, 0.3, ...],  # Query vector
    limit=10,
)
```
### Async Usage
```python
import asyncio

import redis.asyncio
from kiarina.lib.redisearch.asyncio import create_redisearch_client


async def main():
    # Create async Redis connection
    redis_client = redis.asyncio.Redis(host="localhost", port=6379, decode_responses=False)

    # Create async RediSearch client
    client = create_redisearch_client(redis=redis_client)

    # All operations are awaitable
    await client.create_index()
    await client.set({"title": "Example"}, id="doc_1")
    results = await client.find()


asyncio.run(main())
```
## Schema Definition
Define your search schema with type-safe field definitions:
### Field Types
#### Tag Fields
```python
{
    "type": "tag",
    "name": "category",
    "separator": ",",         # Default: ","
    "case_sensitive": False,  # Default: False
    "sortable": True,         # Default: False
    "multiple": True          # Allow multiple tags (library-specific)
}
```
#### Text Fields
```python
{
    "type": "text",
    "name": "description",
    "weight": 2.0,               # Default: 1.0
    "no_stem": False,            # Default: False
    "sortable": True,            # Default: False
    "phonetic_matcher": "dm:en"  # Optional phonetic matching
}
```
#### Numeric Fields
```python
{
    "type": "numeric",
    "name": "price",
    "sortable": True,  # Default: False
    "no_index": False  # Default: False
}
```
#### Vector Fields
**FLAT Algorithm (Exact Search)**
```python
{
    "type": "vector",
    "name": "embedding",
    "algorithm": "FLAT",
    "dims": 1536,
    "datatype": "FLOAT32",        # FLOAT32 or FLOAT64
    "distance_metric": "COSINE",  # L2, COSINE, or IP
    "initial_cap": 1000           # Optional initial capacity
}
```
**HNSW Algorithm (Approximate Search)**
```python
{
    "type": "vector",
    "name": "embedding",
    "algorithm": "HNSW",
    "dims": 1536,
    "datatype": "FLOAT32",
    "distance_metric": "COSINE",
    "m": 16,                 # Default: 16
    "ef_construction": 200,  # Default: 200
    "ef_runtime": 10,        # Default: 10
    "epsilon": 0.01          # Default: 0.01
}
```
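RediSearch stores a vector field in a hash as a raw binary blob whose length is `dims` times the element size, which is most likely why the Redis connection must use `decode_responses=False`. The examples in this README pass plain Python lists to `set` and `search`, so the library presumably handles this conversion for you. The sketch below only illustrates the byte layout with the standard library; `pack_float32` and `unpack_float32` are hypothetical helper names, and little-endian order is an assumption that holds on typical x86 and ARM hosts.

```python
import struct


def pack_float32(vector: list[float]) -> bytes:
    """Pack floats into a little-endian FLOAT32 blob (4 bytes per element)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_float32(blob: bytes) -> list[float]:
    """Reverse of pack_float32: decode a FLOAT32 blob back into floats."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


# Toy 4-dimensional vector; these values are exactly representable in FLOAT32.
embedding = [0.5, -1.0, 0.25, 2.0]
blob = pack_float32(embedding)

print(len(blob))             # 4 dims * 4 bytes
print(unpack_float32(blob))  # round-trips to the original values
```

A 1536-dimensional `FLOAT32` embedding therefore occupies 6144 bytes per document before any index overhead.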
## Configuration
This library uses [pydantic-settings-manager](https://github.com/kiarina/pydantic-settings-manager) for flexible configuration management.
### Environment Variables
```bash
# Basic settings
export KIARINA_LIB_REDISEARCH_KEY_PREFIX="myapp:"
export KIARINA_LIB_REDISEARCH_INDEX_NAME="main_index"
export KIARINA_LIB_REDISEARCH_PROTECT_INDEX_DELETION="true"
```
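The variable names follow a simple pattern: the `KIARINA_LIB_REDISEARCH_` prefix plus the upper-cased setting name. The actual parsing is done by pydantic-settings; the sketch below is only a standard-library illustration of that naming convention, and `settings_from_env` is a hypothetical helper that is not part of this library.

```python
import os

ENV_PREFIX = "KIARINA_LIB_REDISEARCH_"


def settings_from_env(environ: dict[str, str]) -> dict[str, str]:
    """Collect prefixed variables and map them to lower-case setting names."""
    return {
        name[len(ENV_PREFIX):].lower(): value
        for name, value in environ.items()
        if name.startswith(ENV_PREFIX)
    }


# Example environment; unrelated variables are ignored.
example_env = {
    "KIARINA_LIB_REDISEARCH_KEY_PREFIX": "myapp:",
    "KIARINA_LIB_REDISEARCH_INDEX_NAME": "main_index",
    "KIARINA_LIB_REDISEARCH_PROTECT_INDEX_DELETION": "true",
    "PATH": "/usr/bin",
}
parsed = settings_from_env(example_env)
print(parsed)

# Against the real process environment:
current = settings_from_env(dict(os.environ))
```

Values stay as strings in this sketch; converting `"true"` to a boolean is the kind of validation that pydantic performs in the real settings class.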
### Programmatic Configuration
```python
from kiarina.lib.redisearch import settings_manager
# Configure multiple environments
settings_manager.user_config = {
    "development": {
        "key_prefix": "dev:",
        "index_name": "dev_index",
        "index_schema": dev_schema,
        "protect_index_deletion": False
    },
    "production": {
        "key_prefix": "prod:",
        "index_name": "prod_index",
        "index_schema": prod_schema,
        "protect_index_deletion": True
    }
}

# Switch configurations
settings_manager.active_key = "production"
```
## Advanced Filtering
Use the intuitive filter API to build complex queries:
### Filter API
```python
import kiarina.lib.redisearch.filter as rf
# Tag filters
filter1 = rf.Tag("category") == "electronics"
filter2 = rf.Tag("tags") == ["new", "featured"] # Multiple tags
filter3 = rf.Tag("brand") != "apple"
# Numeric filters
filter4 = rf.Numeric("price") > 100
filter5 = rf.Numeric("rating") >= 4.5
filter6 = rf.Numeric("stock") <= 10
# Text filters
filter7 = rf.Text("title") == "exact match"
filter8 = rf.Text("description") % "*wireless*" # Wildcard search
filter9 = rf.Text("content") % "%%fuzzy%%" # Fuzzy search
# Combine filters
complex_filter = (
    (rf.Tag("category") == "electronics") &
    (rf.Numeric("price") < 500) &
    (rf.Text("title") % "*headphone*")
)

# Use in searches
results = client.find(filter=complex_filter)
```
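The parentheses around each comparison in the combined filters matter: in Python, `&` and `|` bind more tightly than `==` and `<`, so without parentheses the expression would be grouped differently. The following is a minimal sketch of how such an operator-based builder can work. The `Sketch*` classes are hypothetical stand-ins, not the library's `rf.Tag` or `rf.Numeric`, and the query strings they emit are an assumption based on standard RediSearch query syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchExpr:
    """A query fragment; `&` and `|` combine fragments."""

    query: str

    def __and__(self, other: "SketchExpr") -> "SketchExpr":
        # RediSearch treats space-separated clauses as an intersection (AND).
        return SketchExpr(f"({self.query} {other.query})")

    def __or__(self, other: "SketchExpr") -> "SketchExpr":
        # `|` is the RediSearch union (OR) operator.
        return SketchExpr(f"({self.query} | {other.query})")


class SketchTag:
    """Hypothetical tag field whose `==` returns an expression, not a bool."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, value: str) -> SketchExpr:  # type: ignore[override]
        return SketchExpr("@" + self.name + ":{" + value + "}")


class SketchNumeric:
    """Hypothetical numeric field supporting only `<` in this sketch."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __lt__(self, value: float) -> SketchExpr:
        # "(" marks an exclusive upper bound in a RediSearch numeric range.
        return SketchExpr(f"@{self.name}:[-inf ({value}]")


either = (SketchTag("category") == "electronics") | (SketchTag("category") == "computers")
combined = either & (SketchNumeric("price") < 1000)
print(combined.query)
```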
### Condition Lists
Alternatively, use simple condition lists:
```python
# Equivalent to the complex filter above
conditions = [
    ["category", "==", "electronics"],
    ["price", "<", 500],
    ["title", "like", "*headphone*"]
]

results = client.find(filter=conditions)
```
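To make the condition-list format concrete, here is a hedged sketch of how such a list could be translated into a RediSearch query string. It is not the library's translator: `FIELD_TYPES`, `condition_to_clause`, and `conditions_to_query` are hypothetical names, the set of supported operators is an assumption drawn from the examples in this README, and special characters in values are not escaped.

```python
from typing import Any

# Hypothetical field-type lookup; the library would derive this from index_schema.
FIELD_TYPES = {"category": "tag", "price": "numeric", "title": "text"}

# RediSearch numeric ranges; "(" marks an exclusive bound.
NUMERIC_RANGES = {
    "==": "[{v} {v}]",
    ">": "[({v} +inf]",
    ">=": "[{v} +inf]",
    "<": "[-inf ({v}]",
    "<=": "[-inf {v}]",
}


def condition_to_clause(field: str, op: str, value: Any) -> str:
    """Translate one [field, operator, value] condition into a query clause."""
    kind = FIELD_TYPES[field]
    if kind == "tag" and op in ("==", "!=", "in"):
        values = value if isinstance(value, list) else [value]
        clause = "@" + field + ":{" + "|".join(values) + "}"
        return "-" + clause if op == "!=" else clause
    if kind == "numeric" and op in NUMERIC_RANGES:
        return "@" + field + ":" + NUMERIC_RANGES[op].format(v=value)
    if kind == "text" and op == "like":
        return "@" + field + ":(" + str(value) + ")"
    raise ValueError(f"unsupported condition: {field!r} {op!r}")


def conditions_to_query(conditions: list[list[Any]]) -> str:
    """Join clauses with spaces, which RediSearch interprets as AND."""
    return " ".join(condition_to_clause(*c) for c in conditions)


query = conditions_to_query([
    ["category", "==", "electronics"],
    ["price", "<", 500],
    ["title", "like", "*headphone*"],
])
print(query)
```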
## Search Operations
The library provides three main search operations: `count`, `find`, and `search`. These are the core functions for querying your indexed data.
### 1. Count Documents (`count`)
Count the number of documents matching specific criteria without retrieving the actual documents. This is efficient for getting result counts.
```python
# Count all documents
total = client.count()
print(f"Total documents: {total.total}")

# Count with filters
electronics_count = client.count(
    filter=[["category", "==", "electronics"]]
)
print(f"Electronics products: {electronics_count.total}")

# Complex filter counting
expensive_electronics = client.count(
    filter=[
        ["category", "==", "electronics"],
        ["price", ">", 500]
    ]
)
print(f"Expensive electronics: {expensive_electronics.total}")

# Using filter API
import kiarina.lib.redisearch.filter as rf
filter_expr = (rf.Tag("category") == "electronics") & (rf.Numeric("price") > 500)
count_result = client.count(filter=filter_expr)
```
**Count Result Structure:**
```python
class SearchResult:
    total: int       # Number of matching documents
    duration: float  # Query execution time in milliseconds
    documents: list  # Empty for count operations
```
### 2. Full-Text Search (`find`)
Search and retrieve documents based on filters, with support for sorting, pagination, and field selection.
#### Basic Find Operations
```python
# Find all documents
results = client.find()
print(f"Found {results.total} documents")

# Find with specific fields returned
results = client.find(
    return_fields=["title", "price", "category"]
)
for doc in results.documents:
    print(f"ID: {doc.id}")
    print(f"Title: {doc.mapping['title']}")
    print(f"Price: {doc.mapping['price']}")
```
#### Filtering
```python
# Single filter condition
results = client.find(
    filter=[["category", "==", "electronics"]]
)

# Multiple filter conditions (AND logic)
results = client.find(
    filter=[
        ["category", "==", "electronics"],
        ["price", ">=", 100],
        ["price", "<=", 500]
    ]
)

# Using filter expressions for complex logic
import kiarina.lib.redisearch.filter as rf
complex_filter = (
    (rf.Tag("category") == "electronics") |
    (rf.Tag("category") == "computers")
) & (rf.Numeric("price") < 1000)

results = client.find(filter=complex_filter)
```
#### Sorting
```python
# Sort by price (ascending)
results = client.find(
    sort_by="price",
    sort_desc=False
)

# Sort by rating (descending)
results = client.find(
    filter=[["category", "==", "electronics"]],
    sort_by="rating",
    sort_desc=True
)

# Note: Only sortable fields can be used for sorting
# Define sortable fields in your schema:
# {"type": "numeric", "name": "price", "sortable": True}
```
#### Pagination
```python
# Get first 10 results
results = client.find(limit=10)

# Get next 10 results (pagination)
results = client.find(offset=10, limit=10)

# Get results 21-30
results = client.find(offset=20, limit=10)

# Combine with filtering and sorting
results = client.find(
    filter=[["category", "==", "electronics"]],
    sort_by="price",
    sort_desc=True,
    offset=0,
    limit=20
)
```
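The `offset` and `limit` arguments map directly onto page-based navigation. As a small sketch, the helpers below convert a 1-based page number into those arguments and compute how many pages a result set spans; `page_window` and `page_count` are hypothetical names, not part of the library.

```python
import math


def page_window(page: int, page_size: int) -> dict[str, int]:
    """Convert a 1-based page number into offset/limit keyword arguments."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"offset": (page - 1) * page_size, "limit": page_size}


def page_count(total: int, page_size: int) -> int:
    """Number of pages needed to show `total` results."""
    return math.ceil(total / page_size)


window = page_window(page=3, page_size=10)
print(window)              # offset 20, limit 10: results 21-30
print(page_count(95, 10))  # 95 results need 10 pages
```

The resulting dictionary can be unpacked into a call such as `client.find(**window)`, and `results.total` from a previous query is a natural input for `page_count`.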
#### Field Selection
```python
# Return only specific fields (more efficient)
results = client.find(
    return_fields=["title", "price"]
)

# Return no content, only document IDs (most efficient for counting)
results = client.find(
    return_fields=[]  # or omit return_fields parameter
)

# Include computed fields
results = client.find(
    return_fields=["title", "price", "id"]  # id is automatically computed
)
```
#### Complete Find Example
```python
# Comprehensive search with all options
results = client.find(
    filter=[
        ["category", "in", ["electronics", "computers"]],
        ["price", ">=", 50],
        ["rating", ">=", 4.0]
    ],
    sort_by="price",
    sort_desc=False,
    offset=0,
    limit=25,
    return_fields=["title", "price", "rating", "category"]
)

print(f"Found {results.total} products (showing {len(results.documents)})")
print(f"Query took {results.duration}ms")

for doc in results.documents:
    print(f"- {doc.mapping['title']}: ${doc.mapping['price']} ({doc.mapping['rating']}⭐)")
```
### 3. Vector Similarity Search (`search`)
Perform semantic similarity search using vector embeddings. This is ideal for AI-powered search, recommendation systems, and semantic matching.
#### Basic Vector Search
```python
# Simple vector search
query_vector = [0.1, 0.2, 0.3, ...]  # Your query embedding (must match schema dims)
results = client.search(vector=query_vector)

print(f"Found {results.total} similar documents")
for doc in results.documents:
    print(f"Document: {doc.id}, Similarity Score: {doc.score:.4f}")
```
#### Vector Search with Filtering
```python
# Pre-filter documents before vector search (more efficient)
results = client.search(
    vector=query_vector,
    filter=[["category", "==", "electronics"]],  # Only search within electronics
    limit=10
)

# Complex pre-filtering
results = client.search(
    vector=query_vector,
    filter=[
        ["category", "in", ["electronics", "computers"]],
        ["price", "<=", 1000],
        ["in_stock", "==", "true"]
    ],
    limit=20
)
```
#### Pagination and Field Selection
```python
# Paginated vector search
results = client.search(
    vector=query_vector,
    offset=10,
    limit=10,
    return_fields=["title", "description", "price", "distance"]
)

# Get similarity scores and distances
for doc in results.documents:
    distance = doc.mapping.get('distance', 0)
    score = doc.score  # Normalized similarity score (0-1)
    print(f"{doc.mapping['title']}: score={score:.4f}, distance={distance:.4f}")
```
#### Understanding Vector Search Results
```python
results = client.search(
    vector=query_vector,
    limit=5,
    return_fields=["title", "distance"]
)

for i, doc in enumerate(results.documents, 1):
    print(f"{i}. {doc.mapping['title']}")
    print(f"   Similarity Score: {doc.score:.4f}")  # Higher = more similar
    print(f"   Distance: {doc.mapping['distance']:.4f}")  # Lower = more similar
    print(f"   Document ID: {doc.id}")
    print()
```
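Score and distance move in opposite directions. With the `COSINE` metric, RediSearch reports the cosine distance, which is one minus the cosine similarity. The sketch below computes that distance in plain Python and derives a score as `1 - distance`; that score formula is an assumption for illustration, since this README does not state how the library computes `doc.score`.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance: 1 - cos(angle between a and b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def assumed_score(distance: float) -> float:
    """Assumed mapping from distance to similarity score (not confirmed by the library)."""
    return 1.0 - distance


query = [1.0, 0.0]
candidates = {
    "same direction": [2.0, 0.0],
    "orthogonal": [0.0, 3.0],
    "opposite": [-1.0, 0.0],
}
for label, vector in candidates.items():
    d = cosine_distance(query, vector)
    print(f"{label}: distance={d:.1f}, score={assumed_score(d):.1f}")
```

Under this assumption, vectors pointing the same way give distance 0 and score 1, orthogonal vectors give distance 1 and score 0, and opposite vectors give distance 2 and score -1.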
#### Vector Search Best Practices
```python
# 1. Use appropriate vector dimensions
schema = [{
    "type": "vector",
    "name": "embedding",
    "algorithm": "HNSW",         # or "FLAT"
    "dims": 1536,                # Must match your embedding model
    "distance_metric": "COSINE"  # COSINE, L2, or IP
}]

# 2. Pre-filter for better performance
results = client.search(
    vector=query_vector,
    filter=[["category", "==", "target_category"]],  # Reduce search space
    limit=50  # Don't retrieve more than needed
)

# 3. Use HNSW for large datasets
hnsw_schema = {
    "type": "vector",
    "name": "embedding",
    "algorithm": "HNSW",
    "dims": 1536,
    "m": 16,                 # Connections per node
    "ef_construction": 200,  # Build-time accuracy
    "ef_runtime": 100        # Search-time accuracy
}

# 4. Use FLAT for smaller datasets or exact search
flat_schema = {
    "type": "vector",
    "name": "embedding",
    "algorithm": "FLAT",
    "dims": 1536
}
```
### Search Result Structure
All search operations return a `SearchResult` object:
```python
class SearchResult:
    total: int                 # Total matching documents
    duration: float            # Query execution time (ms)
    documents: list[Document]  # Retrieved documents

class Document:
    key: str                 # Redis key
    id: str                  # Document ID
    score: float             # Relevance/similarity score (-1.0 to 1.0)*
    mapping: dict[str, Any]  # Document fields
```
### Performance Comparison
| Operation | Use Case | Performance | Returns |
|-----------|----------|-------------|---------|
| `count()` | Get result counts | Fastest | Count only |
| `find()` | Full-text search, filtering | Fast | Full documents |
| `search()` | Semantic similarity | Moderate* | Ranked by similarity |

*Vector search performance depends on algorithm (FLAT vs HNSW) and dataset size.
### Combining Operations
```python
# 1. First, check how many results we'll get
count_result = client.count(
    filter=[["category", "==", "electronics"]]
)
print(f"Will search through {count_result.total} electronics")

# 2. If reasonable number, do full-text search
if count_result.total < 10000:
    text_results = client.find(
        filter=[["category", "==", "electronics"]],
        sort_by="rating",
        sort_desc=True,
        limit=100
    )

# 3. For semantic search within results
if query_vector:
    semantic_results = client.search(
        vector=query_vector,
        filter=[["category", "==", "electronics"]],
        limit=20
    )
```
## Index Management
### Index Lifecycle
```python
# Check if index exists
if not client.exists_index():
    client.create_index()

# Get index information
info = client.get_info()
print(f"Index: {info.index_name}, Documents: {info.num_docs}")

# Reset index (delete all documents, recreate index)
client.reset_index()

# Drop index (optionally delete documents)
client.drop_index(delete_documents=True)
```
### Schema Migration
Automatically migrate your index when schema changes:
```python
# Update your schema
new_schema = [
    {"type": "tag", "name": "category"},
    {"type": "text", "name": "title"},
    {"type": "numeric", "name": "price", "sortable": True},
    {"type": "numeric", "name": "rating", "sortable": True},  # New field
    {"type": "vector", "name": "embedding", "algorithm": "HNSW", "dims": 1536}  # Changed algorithm
]

# Update configuration
settings_manager.user_config["production"]["index_schema"] = new_schema

# Migrate (automatically detects changes and recreates index)
client.migrate_index()
```
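Change detection of this kind can be pictured as a comparison of field definitions by name. The sketch below shows one way to do that; it is an illustration only, `diff_schema` is a hypothetical helper, and the README does not describe how `migrate_index` actually decides that a schema has changed.

```python
from typing import Any

Field = dict[str, Any]


def diff_schema(old: list[Field], new: list[Field]) -> dict[str, list[str]]:
    """Report field names that were added, removed, or redefined."""
    old_by_name = {field["name"]: field for field in old}
    new_by_name = {field["name"]: field for field in new}
    shared = old_by_name.keys() & new_by_name.keys()
    return {
        "added": sorted(new_by_name.keys() - old_by_name.keys()),
        "removed": sorted(old_by_name.keys() - new_by_name.keys()),
        "changed": sorted(n for n in shared if old_by_name[n] != new_by_name[n]),
    }


old_schema = [
    {"type": "tag", "name": "category"},
    {"type": "text", "name": "title"},
    {"type": "numeric", "name": "price", "sortable": True},
    {"type": "vector", "name": "embedding", "algorithm": "FLAT", "dims": 1536},
]
updated_schema = [
    {"type": "tag", "name": "category"},
    {"type": "text", "name": "title"},
    {"type": "numeric", "name": "price", "sortable": True},
    {"type": "numeric", "name": "rating", "sortable": True},
    {"type": "vector", "name": "embedding", "algorithm": "HNSW", "dims": 1536},
]

changes = diff_schema(old_schema, updated_schema)
print(changes)
needs_migration = any(changes.values())
```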
## Document Operations
### Adding Documents
```python
# Add single document
client.set({
    "category": "electronics",
    "title": "Wireless Mouse",
    "price": 29.99,
    "rating": 4.5,
    "embedding": [0.1, 0.2, ...]
}, id="mouse_001")

# Add document with ID in mapping
client.set({
    "id": "keyboard_001",
    "category": "electronics",
    "title": "Mechanical Keyboard",
    "price": 129.99,
    "embedding": [0.2, 0.3, ...]
})
```
### Retrieving Documents
```python
# Get single document
doc = client.get("mouse_001")
if doc:
    print(f"Title: {doc.mapping['title']}")
    print(f"Price: {doc.mapping['price']}")

# Get Redis key for document
key = client.get_key("mouse_001")  # Returns "products:mouse_001"
```
### Deleting Documents
```python
# Delete single document
client.delete("mouse_001")
```
## Integration with Other Libraries
### Using with kiarina-lib-redis
```python
from kiarina.lib.redis import get_redis
from kiarina.lib.redisearch import create_redisearch_client
# Get Redis client from kiarina-lib-redis
redis_client = get_redis(decode_responses=False)
# Create RediSearch client
search_client = create_redisearch_client(redis=redis_client)
```
### Custom Redis Configuration
```python
import redis
from kiarina.lib.redisearch import create_redisearch_client

# Custom Redis client with connection pooling
redis_client = redis.Redis(
    host="localhost",
    port=6379,
    db=0,
    decode_responses=False,  # Required!
    max_connections=20,
    socket_timeout=30,
    socket_connect_timeout=10
)

search_client = create_redisearch_client(redis=redis_client)
```
## Error Handling
```python
try:
    client.create_index()
except Exception as e:
    if "Index already exists" in str(e):
        print("Index already exists, continuing...")
    else:
        raise

# Protect against accidental index deletion
settings_manager.user_config["production"]["protect_index_deletion"] = True

# This will return False instead of deleting
success = client.drop_index()
if not success:
    print("Index deletion is protected")
```
## Performance Considerations
### Vector Search Optimization
```python
# Use HNSW for large datasets (faster but approximate)
hnsw_schema = {
"type": "vector",
"name": "embedding",
"algorithm": "HNSW",
"dims": 1536,
"m": 32, # Higher M = better recall, more memory
"ef_construction": 400, # Higher = better index quality, slower indexing
"ef_runtime": 100 # Higher = better recall, slower search
}
# Use FLAT for smaller datasets or exact search
flat_schema = {
"type": "vector",
"name": "embedding",
"algorithm": "FLAT",
"dims": 1536,
"initial_cap": 10000 # Pre-allocate capacity
}
```
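When sizing an index, the raw vector payload is a useful lower bound on memory: documents times dimensions times bytes per element. The sketch below computes only that bound. `raw_vector_bytes` is a hypothetical helper, and the figure excludes index overhead such as HNSW graph links, which grow with `m`.

```python
BYTES_PER_ELEMENT = {"FLOAT32": 4, "FLOAT64": 8}


def raw_vector_bytes(num_docs: int, dims: int, datatype: str = "FLOAT32") -> int:
    """Lower bound on memory used by the vectors themselves, in bytes."""
    return num_docs * dims * BYTES_PER_ELEMENT[datatype]


size = raw_vector_bytes(num_docs=10_000, dims=1536)
print(f"{size} bytes (~{size / 1024 / 1024:.1f} MiB)")
```

Switching the same data to `FLOAT64` doubles this figure, which is one reason `FLOAT32` is the usual choice for embedding models.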
### Indexing Best Practices
```python
# Use appropriate field options
schema = [
{
"type": "tag",
"name": "category",
"sortable": True, # Only if you need sorting
"no_index": False # Set True for storage-only fields
},
{
"type": "text",
"name": "description",
"weight": 1.0, # Adjust relevance weight
"no_stem": False # Enable stemming for better search
}
]
```
## Development
### Prerequisites
- Python 3.12+
- Redis with RediSearch module
- Docker (for running Redis in tests)
### Setup
```bash
# Clone the repository
git clone https://github.com/kiarina/kiarina-python.git
cd kiarina-python
# Setup development environment
mise run setup
# Start Redis with RediSearch for testing
docker compose up -d redis
```
### Running Tests
```bash
# Run all tests for this package
mise run package kiarina-lib-redisearch
# Run specific test categories
uv run --group test pytest packages/kiarina-lib-redisearch/tests/sync/
uv run --group test pytest packages/kiarina-lib-redisearch/tests/async/
# Run with coverage
mise run package:test kiarina-lib-redisearch --coverage
```
## Configuration Reference
| Setting | Environment Variable | Default | Description |
|---------|---------------------|---------|-------------|
| `key_prefix` | `KIARINA_LIB_REDISEARCH_KEY_PREFIX` | `""` | Redis key prefix for documents |
| `index_name` | `KIARINA_LIB_REDISEARCH_INDEX_NAME` | `"default"` | RediSearch index name |
| `index_schema` | - | `None` | Index schema definition (list of field dicts) |
| `protect_index_deletion` | `KIARINA_LIB_REDISEARCH_PROTECT_INDEX_DELETION` | `false` | Prevent accidental index deletion |
## Dependencies
- [redis](https://github.com/redis/redis-py) - Redis client for Python
- [numpy](https://numpy.org/) - Numerical computing (for vector operations)
- [pydantic](https://docs.pydantic.dev/) - Data validation and settings management
- [pydantic-settings](https://docs.pydantic.dev/latest/concepts/pydantic_settings/) - Settings management
- [pydantic-settings-manager](https://github.com/kiarina/pydantic-settings-manager) - Advanced settings management
## License
This project is licensed under the MIT License - see the [LICENSE](../../LICENSE) file for details.
## Contributing
This is a personal project, but contributions are welcome! Please feel free to submit issues or pull requests.
## Related Projects
- [kiarina-python](https://github.com/kiarina/kiarina-python) - The main monorepo containing this package
- [RediSearch](https://redis.io/docs/interact/search-and-query/) - The search and query engine this library connects to
- [kiarina-lib-redis](../kiarina-lib-redis/) - Redis client library for basic Redis operations
- [pydantic-settings-manager](https://github.com/kiarina/pydantic-settings-manager) - Configuration management library used by this package
Use FLAT for smaller datasets or exact search\nflat_schema = {\n \"type\": \"vector\",\n \"name\": \"embedding\",\n \"algorithm\": \"FLAT\",\n \"dims\": 1536\n}\n```\n\n### Search Result Structure\n\nAll search operations return a `SearchResult` object:\n\n```python\nclass SearchResult:\n total: int # Total matching documents\n duration: float # Query execution time (ms)\n documents: list[Document] # Retrieved documents\n\nclass Document:\n key: str # Redis key\n id: str # Document ID\n score: float # Relevance/similarity score (-1.0 to 1.0)*\n mapping: dict[str, Any] # Document fields\n```\n\n### Performance Comparison\n\n| Operation | Use Case | Performance | Returns |\n|-----------|----------|-------------|---------|\n| `count()` | Get result counts | Fastest | Count only |\n| `find()` | Full-text search, filtering | Fast | Full documents |\n| `search()` | Semantic similarity | Moderate* | Ranked by similarity |\n\n*Vector search performance depends on algorithm (FLAT vs HNSW) and dataset size.\n\n### Combining Operations\n\n```python\n# 1. First, check how many results we'll get\ncount_result = client.count(\n filter=[[\"category\", \"==\", \"electronics\"]]\n)\nprint(f\"Will search through {count_result.total} electronics\")\n\n# 2. If reasonable number, do full-text search\nif count_result.total < 10000:\n text_results = client.find(\n filter=[[\"category\", \"==\", \"electronics\"]],\n sort_by=\"rating\",\n sort_desc=True,\n limit=100\n )\n\n# 3. 
For semantic search within results\nif query_vector:\n semantic_results = client.search(\n vector=query_vector,\n filter=[[\"category\", \"==\", \"electronics\"]],\n limit=20\n )\n```\n\n## Index Management\n\n### Index Lifecycle\n\n```python\n# Check if index exists\nif not client.exists_index():\n client.create_index()\n\n# Get index information\ninfo = client.get_info()\nprint(f\"Index: {info.index_name}, Documents: {info.num_docs}\")\n\n# Reset index (delete all documents, recreate index)\nclient.reset_index()\n\n# Drop index (optionally delete documents)\nclient.drop_index(delete_documents=True)\n```\n\n### Schema Migration\n\nAutomatically migrate your index when schema changes:\n\n```python\n# Update your schema\nnew_schema = [\n {\"type\": \"tag\", \"name\": \"category\"},\n {\"type\": \"text\", \"name\": \"title\"},\n {\"type\": \"numeric\", \"name\": \"price\", \"sortable\": True},\n {\"type\": \"numeric\", \"name\": \"rating\", \"sortable\": True}, # New field\n {\"type\": \"vector\", \"name\": \"embedding\", \"algorithm\": \"HNSW\", \"dims\": 1536} # Changed algorithm\n]\n\n# Update configuration\nsettings_manager.user_config[\"production\"][\"index_schema\"] = new_schema\n\n# Migrate (automatically detects changes and recreates index)\nclient.migrate_index()\n```\n\n## Document Operations\n\n### Adding Documents\n\n```python\n# Add single document\nclient.set({\n \"category\": \"electronics\",\n \"title\": \"Wireless Mouse\",\n \"price\": 29.99,\n \"rating\": 4.5,\n \"embedding\": [0.1, 0.2, ...]\n}, id=\"mouse_001\")\n\n# Add document with ID in mapping\nclient.set({\n \"id\": \"keyboard_001\",\n \"category\": \"electronics\",\n \"title\": \"Mechanical Keyboard\",\n \"price\": 129.99,\n \"embedding\": [0.2, 0.3, ...]\n})\n```\n\n### Retrieving Documents\n\n```python\n# Get single document\ndoc = client.get(\"mouse_001\")\nif doc:\n print(f\"Title: {doc.mapping['title']}\")\n print(f\"Price: {doc.mapping['price']}\")\n\n# Get Redis key for 
document\nkey = client.get_key(\"mouse_001\") # Returns \"products:mouse_001\"\n```\n\n### Deleting Documents\n\n```python\n# Delete single document\nclient.delete(\"mouse_001\")\n```\n\n## Integration with Other Libraries\n\n### Using with kiarina-lib-redis\n\n```python\nfrom kiarina.lib.redis import get_redis\nfrom kiarina.lib.redisearch import create_redisearch_client\n\n# Get Redis client from kiarina-lib-redis\nredis_client = get_redis(decode_responses=False)\n\n# Create RediSearch client\nsearch_client = create_redisearch_client(redis=redis_client)\n```\n\n### Custom Redis Configuration\n\n```python\nimport redis\nfrom kiarina.lib.redisearch import create_redisearch_client\n\n# Custom Redis client with connection pooling\nredis_client = redis.Redis(\n host=\"localhost\",\n port=6379,\n db=0,\n decode_responses=False, # Required!\n max_connections=20,\n socket_timeout=30,\n socket_connect_timeout=10\n)\n\nsearch_client = create_redisearch_client(redis=redis_client)\n```\n\n## Error Handling\n\n```python\ntry:\n client.create_index()\nexcept Exception as e:\n if \"Index already exists\" in str(e):\n print(\"Index already exists, continuing...\")\n else:\n raise\n\n# Protect against accidental index deletion\nsettings_manager.user_config[\"production\"][\"protect_index_deletion\"] = True\n\n# This will return False instead of deleting\nsuccess = client.drop_index()\nif not success:\n print(\"Index deletion is protected\")\n```\n\n## Performance Considerations\n\n### Vector Search Optimization\n\n```python\n# Use HNSW for large datasets (faster but approximate)\nhnsw_schema = {\n \"type\": \"vector\",\n \"name\": \"embedding\",\n \"algorithm\": \"HNSW\",\n \"dims\": 1536,\n \"m\": 32, # Higher M = better recall, more memory\n \"ef_construction\": 400, # Higher = better index quality, slower indexing\n \"ef_runtime\": 100 # Higher = better recall, slower search\n}\n\n# Use FLAT for smaller datasets or exact search\nflat_schema = {\n \"type\": \"vector\",\n 
\"name\": \"embedding\",\n \"algorithm\": \"FLAT\",\n \"dims\": 1536,\n \"initial_cap\": 10000 # Pre-allocate capacity\n}\n```\n\n### Indexing Best Practices\n\n```python\n# Use appropriate field options\nschema = [\n {\n \"type\": \"tag\",\n \"name\": \"category\",\n \"sortable\": True, # Only if you need sorting\n \"no_index\": False # Set True for storage-only fields\n },\n {\n \"type\": \"text\",\n \"name\": \"description\",\n \"weight\": 1.0, # Adjust relevance weight\n \"no_stem\": False # Enable stemming for better search\n }\n]\n```\n\n## Development\n\n### Prerequisites\n\n- Python 3.12+\n- Redis with RediSearch module\n- Docker (for running Redis in tests)\n\n### Setup\n\n```bash\n# Clone the repository\ngit clone https://github.com/kiarina/kiarina-python.git\ncd kiarina-python\n\n# Setup development environment\nmise run setup\n\n# Start Redis with RediSearch for testing\ndocker compose up -d redis\n```\n\n### Running Tests\n\n```bash\n# Run all tests for this package\nmise run package kiarina-lib-redisearch\n\n# Run specific test categories\nuv run --group test pytest packages/kiarina-lib-redisearch/tests/sync/\nuv run --group test pytest packages/kiarina-lib-redisearch/tests/async/\n\n# Run with coverage\nmise run package:test kiarina-lib-redisearch --coverage\n```\n\n## Configuration Reference\n\n| Setting | Environment Variable | Default | Description |\n|---------|---------------------|---------|-------------|\n| `key_prefix` | `KIARINA_LIB_REDISEARCH_KEY_PREFIX` | `\"\"` | Redis key prefix for documents |\n| `index_name` | `KIARINA_LIB_REDISEARCH_INDEX_NAME` | `\"default\"` | RediSearch index name |\n| `index_schema` | - | `None` | Index schema definition (list of field dicts) |\n| `protect_index_deletion` | `KIARINA_LIB_REDISEARCH_PROTECT_INDEX_DELETION` | `false` | Prevent accidental index deletion |\n\n## Dependencies\n\n- [redis](https://github.com/redis/redis-py) - Redis client for Python\n- [numpy](https://numpy.org/) - Numerical computing 
(for vector operations)\n- [pydantic](https://docs.pydantic.dev/) - Data validation and settings management\n- [pydantic-settings](https://docs.pydantic.dev/latest/concepts/pydantic_settings/) - Settings management\n- [pydantic-settings-manager](https://github.com/kiarina/pydantic-settings-manager) - Advanced settings management\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](../../LICENSE) file for details.\n\n## Contributing\n\nThis is a personal project, but contributions are welcome! Please feel free to submit issues or pull requests.\n\n## Related Projects\n\n- [kiarina-python](https://github.com/kiarina/kiarina-python) - The main monorepo containing this package\n- [RediSearch](https://redis.io/docs/interact/search-and-query/) - The search and query engine this library connects to\n- [kiarina-lib-redis](../kiarina-lib-redis/) - Redis client library for basic Redis operations\n- [pydantic-settings-manager](https://github.com/kiarina/pydantic-settings-manager) - Configuration management library used by this package\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "RediSearch client library for kiarina namespace",
"version": "1.0.1",
"project_urls": {
"Changelog": "https://github.com/kiarina/kiarina-python/blob/main/packages/kiarina-lib-redisearch/CHANGELOG.md",
"Documentation": "https://github.com/kiarina/kiarina-python/tree/main/packages/kiarina-lib-redisearch#readme",
"Homepage": "https://github.com/kiarina/kiarina-python",
"Issues": "https://github.com/kiarina/kiarina-python/issues",
"Repository": "https://github.com/kiarina/kiarina-python"
},
"split_keywords": [
"database",
" fulltext",
" pydantic",
" redis",
" redisearch",
" search",
" settings",
" vector"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "a0c5db4468d61b19475ec40bde8a0ab083ad156df89167b1bca49ee4dcd57b23",
"md5": "5cc46b1062ad1054cc77dfc32e5a2217",
"sha256": "25de93bd42505ca5aa004d56768f579652caf5a22aa740ce3de1ceb87b837d72"
},
"downloads": -1,
"filename": "kiarina_lib_redisearch-1.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5cc46b1062ad1054cc77dfc32e5a2217",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.12",
"size": 46054,
"upload_time": "2025-09-11T08:52:24",
"upload_time_iso_8601": "2025-09-11T08:52:24.133779Z",
"url": "https://files.pythonhosted.org/packages/a0/c5/db4468d61b19475ec40bde8a0ab083ad156df89167b1bca49ee4dcd57b23/kiarina_lib_redisearch-1.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "5f384543cc455ee4a9af5d9ee5d7f88056a9b8117562840064952e400d1838f4",
"md5": "5836d9a1717d91e16ab970c4e35abd3b",
"sha256": "069e647941a34ee872acf2d2f235ff369397400eefa8066cb58bff2bc2894263"
},
"downloads": -1,
"filename": "kiarina_lib_redisearch-1.0.1.tar.gz",
"has_sig": false,
"md5_digest": "5836d9a1717d91e16ab970c4e35abd3b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.12",
"size": 28595,
"upload_time": "2025-09-11T08:52:29",
"upload_time_iso_8601": "2025-09-11T08:52:29.973858Z",
"url": "https://files.pythonhosted.org/packages/5f/38/4543cc455ee4a9af5d9ee5d7f88056a9b8117562840064952e400d1838f4/kiarina_lib_redisearch-1.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-11 08:52:29",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "kiarina",
"github_project": "kiarina-python",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "kiarina-lib-redisearch"
}