tensorus


Name: tensorus
Version: 0.1.0
Summary: An agentic tensor database with unified SDK, agent orchestration, and intelligent workflows for ML/AI applications.
Upload time: 2025-11-03 18:19:42
Requires Python: >=3.10
licenseMIT License Copyright (c) 2025 Tensorus Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
keywords tensor database agent ai pytorch fastapi streamlit automl reinforcement-learning data-ingestion
requirements torch torchvision segmentation-models-pytorch transformers langchain-google langchain-google-genai numpy tensorly Pillow fastapi pydantic pydantic-settings uvicorn psycopg2-binary streamlit requests python-jose plotly pytest httpx boto3 matplotlib scikit-learn umap-learn pandas arch lifelines semopy gensim joblib opencv-python
---
license: mit
title: Core
sdk: docker
emoji: ๐Ÿ 
colorFrom: blue
colorTo: yellow
short_description: Tensorus Core
---

# Tensorus: Agentic Tensor Database/Data Lake

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![PyPI version](https://img.shields.io/badge/pypi-v0.0.5-blue.svg)](https://pypi.org/project/tensorus/)
[![Docker](https://img.shields.io/badge/docker-%230db7ed.svg?style=flat&logo=docker&logoColor=white)](https://hub.docker.com/r/tensorus/tensorus)
[![API Documentation](https://img.shields.io/badge/API-Documentation-green.svg)](https://docs.tensorus.com/api)

> **🎉 New in v0.0.5:** Unified Python SDK with intuitive API, Agent Orchestrator for multi-agent workflows, and comprehensive examples. See [What's New](#-whats-new-in-v005) for details.

**Tensorus** is a production-ready, specialized data platform focused on the management and agent-driven manipulation of tensor data. It offers a streamlined environment for storing, retrieving, and operating on tensors at scale, providing the foundation for advanced AI and machine learning workflows.

## 🎯 What Makes Tensorus Special

Tensorus bridges the gap between traditional databases and AI/ML requirements by providing:

- **🧠 Intelligent Agent Framework** - Built-in agents for data ingestion, reinforcement learning, AutoML, and embedding generation
- **⚡ High-Performance Tensor Operations** - 40+ optimized operations with 10-100x performance improvements
- **🔍 Natural Language Queries** - Intuitive NQL interface for tensor discovery and analysis
- **📊 Complete Observability** - Full computational lineage and operation history tracking
- **🏗️ Production-Grade Architecture** - Enterprise security, scaling, and deployment capabilities

The core purpose of Tensorus is to **simplify and accelerate** how developers and AI agents interact with tensor datasets, enabling faster development of automated data ingestion, reinforcement learning from stored experiences, AutoML processes, and intelligent data utilization in AI projects.

## 🚀 Quick Start (3 Minutes)

### Installation
```bash
# Install from PyPI
pip install tensorus

# Or install from source for development
git clone https://github.com/tensorus/tensorus.git
cd tensorus
pip install -e .
```

### Basic Usage with Python SDK
```python
from tensorus import Tensorus
import torch

# Initialize Tensorus SDK (minimal dependencies)
ts = Tensorus(
    enable_nql=False,          # Disable if transformers not installed
    enable_embeddings=False,   # Disable if sentence-transformers not installed
    enable_vector_search=False
)

# Create a dataset
ts.create_dataset("my_dataset")

# Create and store tensors
tensor_a = ts.create_tensor(
    [[1, 2], [3, 4]], 
    name="matrix_a",
    dataset="my_dataset"
)

tensor_b = ts.create_tensor(
    [[5, 6], [7, 8]],
    name="matrix_b",
    dataset="my_dataset"
)

# Perform operations
result = ts.matmul(tensor_a.to_tensor(), tensor_b.to_tensor())
print(f"Result shape: {result.shape}")  # (2, 2)

# List all tensors
tensors = ts.list_tensors("my_dataset")
print(f"Stored {len(tensors)} tensors")
```

### Start the API Server
```bash
# Start development server
python -m uvicorn tensorus.api:app --reload --port 8000

# Access interactive API docs at:
# - Swagger UI: http://localhost:8000/docs
# - ReDoc: http://localhost:8000/redoc
```

## 🐍 Python SDK Features

The Tensorus SDK provides a unified interface for all tensor operations, agent coordination, and data management.

### Core SDK Operations

```python
from tensorus import Tensorus

# Full initialization with all features
ts = Tensorus(
    enable_nql=True,              # Natural Query Language
    enable_embeddings=True,       # Embedding generation  
    enable_vector_search=True,    # Vector similarity search
    enable_orchestrator=True,     # Multi-agent workflows
    embedding_model="all-MiniLM-L6-v2"
)

# Dataset management
ts.create_dataset("research_data")
ts.list_datasets()
ts.delete_dataset("old_data")

# Tensor operations
a = ts.create_tensor([[1, 2], [3, 4]], name="matrix_a", dataset="research_data")
b = ts.create_tensor([[5, 6], [7, 8]], name="matrix_b", dataset="research_data")

# Mathematical operations
result = ts.matmul(a.to_tensor(), b.to_tensor())
transposed = ts.transpose(a.to_tensor())
eigenvals = ts.eigenvalues(a.to_tensor())

# Natural language queries (requires enable_nql=True)
results = ts.query("find tensors in research_data where shape is (2, 2)")

# Vector operations (requires enable_embeddings=True)
ts.create_index("docs", dimensions=384, metric="cosine")
ts.embed_and_index(
    texts=["Machine learning paper", "Deep learning tutorial"],
    index_name="docs",
    dataset="research_data"
)
search_results = ts.search("neural networks", index_name="docs", top_k=5)

# Multi-agent workflows (requires enable_orchestrator=True)
workflow = ts.create_workflow("data_pipeline")
ts.orchestrator.add_task(workflow, "embed", "embedding", "generate", {...})
ts.orchestrator.add_task(workflow, "index", "vector", "index", {...}, deps=["embed"])
results = ts.execute_workflow(workflow)
```

### SDK Benefits

- **Unified Interface** - Single entry point for all Tensorus capabilities
- **Lazy Loading** - Agents load only when enabled, reducing dependencies
- **Type Safety** - Full type hints for IDE autocomplete and validation
- **Error Handling** - Comprehensive exception handling with helpful messages
- **Performance** - Optimized for both single-node and distributed workloads
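
The lazy-loading idea can be pictured with a small sketch. This is not the Tensorus implementation: `LazyComponents`, its feature names, and the standard-library modules standing in for heavy optional dependencies are all hypothetical. It only illustrates deferring an import until a feature is both enabled and first used.

```python
import importlib


class LazyComponents:
    """Hypothetical illustration of lazy loading; not the real Tensorus class."""

    def __init__(self, **flags):
        # Feature name -> module to import. Stdlib modules stand in for
        # heavy optional dependencies in this sketch.
        self._modules = {"nql": "json", "embeddings": "statistics"}
        self._enabled = {
            name: bool(flags.get(f"enable_{name}", False)) for name in self._modules
        }
        self._loaded = {}  # nothing is imported at construction time

    def get(self, name):
        if not self._enabled.get(name, False):
            raise RuntimeError(f"{name} is disabled; pass enable_{name}=True")
        if name not in self._loaded:
            # The import happens only here, on first access.
            self._loaded[name] = importlib.import_module(self._modules[name])
        return self._loaded[name]


components = LazyComponents(enable_nql=True)
print(components.get("nql").__name__)
```

Requesting a disabled feature raises a clear error instead of failing later on a missing import, which matches the "helpful messages" goal listed above.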

## 📚 Documentation

For comprehensive documentation, including user guides and examples, please visit our [documentation site](https://docs.tensorus.com).

### Interactive API Documentation

Access the interactive API documentation when the server is running:

- **Swagger UI**: `http://localhost:8000/docs` - Interactive API exploration with "Try it out" functionality
- **ReDoc**: `http://localhost:8000/redoc` - Clean, responsive API documentation

### Quick Links
- [Getting Started Guide](docs/user_guide.md) - Learn the basics of Tensorus
- [Examples](examples/) - Practical code examples including `basic_usage.py` and `complete_workflow_example.py`
- [Deployment Guide](docs/deployment.md) - Production deployment instructions

## 📖 Comprehensive Documentation

### 📚 Learning Resources
- **🎓 [Documentation Hub](docs/index.md)** - Central portal with guided learning paths for all skill levels
- **🚀 [Getting Started Guide](docs/getting_started.md)** - Complete 15-minute tutorial with real examples
- **💡 [Use Case Examples](examples/)** - Real-world implementations and practical guides

### 🔧 Technical References
- **🔍 [Complete API Reference](docs/api_reference.md)** - Full REST API documentation with code samples
- **🏭 [Production Deployment](docs/production_deployment.md)** - Enterprise deployment strategies and operations
- **⚡ [Performance & Scaling](docs/performance_benchmarks.md)** - Benchmarks, optimization, and capacity planning

### 🏢 Business & Strategy
- **🎯 [Executive Overview](docs/executive_overview.md)** - Product positioning, market analysis, and business value
- **📊 [Architecture Guide](docs/index.md#architecture-highlights)** - System design and technical architecture

## 📦 What's New in v0.0.5

**Major Release** - Unified SDK and Agent Orchestration

### New Features
- ✨ **Unified Tensorus SDK** - Single `Tensorus` class with intuitive API for all operations
- 🤖 **Agent Orchestrator** - Multi-agent workflow coordination with DAG-based execution
- 📚 **Updated Examples** - All examples now use the new SDK (`examples/basic_usage.py`, `examples/complete_workflow_example.py`)
- 📊 **Benchmarking Suite** - Comprehensive performance testing framework (`benchmarks/benchmark_suite.py`)
- 🔧 **Lazy Agent Loading** - Agents only load when enabled, reducing startup dependencies
- 📝 **Enhanced Documentation** - Complete SDK reference and implementation guides
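
To make "DAG-based execution" concrete, here is a minimal sketch of dependency-ordered scheduling using Python's standard `graphlib`. It does not show how the Tensorus orchestrator is implemented. The `embed` and `index` task names mirror the SDK example above, while `report` and the `execution_order` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(tasks):
    """Return task names in an order that respects their dependencies.

    `tasks` maps each task name to the list of task names it depends on.
    """
    return list(TopologicalSorter(tasks).static_order())


# "index" depends on "embed"; "report" (hypothetical) depends on "index".
tasks = {"embed": [], "index": ["embed"], "report": ["index"]}
print(execution_order(tasks))

# A circular dependency is rejected rather than silently mis-ordered.
try:
    execution_order({"a": ["b"], "b": ["a"]})
except CycleError:
    print("cycle detected")
```

A real orchestrator would typically also run independent tasks concurrently; `TopologicalSorter.get_ready()` and `done()` support that pattern.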

### Breaking Changes
- **SDK Interface** - New unified API replaces direct component access (migration is straightforward - see Quick Start)
- **Optional Dependencies** - NQL, embeddings, and vector search now require explicit enabling

### Improvements
- Better error messages for missing dependencies
- Cleaner separation of concerns
- Improved performance through optimized initialization
- More intuitive API naming

See [QUICKSTART.md](QUICKSTART.md) for migration guide and [examples/](examples/) for updated code samples.

## Table of Contents

- [What's New in v0.0.5](#-whats-new-in-v005)
- [Python SDK Features](#-python-sdk-features)
- [Key Features](#key-features)
- [Project Structure](#project-structure)
- [Demos](#demos)
- [Architecture](#architecture)
- [Getting Started](#getting-started)
  - [Prerequisites](#prerequisites)
  - [Installation](#installation)
  - [Running the API Server](#running-the-api-server)
  - [Running the Streamlit UI](#running-the-streamlit-ui)
  - [Model Context Protocol Integration](#model-context-protocol-integration)
  - [Running the Agents (Examples)](#running-the-agents-examples)
- [Docker Deployment](#docker-deployment)
- [Environment Configuration](#environment-configuration)
- [Production Deployment](#production-deployment)
- [Testing](#testing)
- [Using Tensorus](#using-tensorus)
  - [API Basics](#api-basics)
  - [Authentication Examples](#authentication-examples)
  - [NQL Query Example](#nql-query-example)
  - [API Endpoints](#api-endpoints)
  - [Vector Database Examples](#vector-database-examples)
  - [Request/Response Schemas](#requestresponse-schemas)
  - [Dataset API Examples](#dataset-api-examples)
  - [Dataset Schemas](#dataset-schemas)
- [Metadata System](#metadata-system)
- [Streamlit UI](#streamlit-ui)
- [Natural Query Language (NQL)](#natural-query-language-nql)
- [Agent Details](#agent-details)
- [Tensorus Models](#tensorus-models)
- [Basic Tensor Operations](#basic-tensor-operations)
- [Tensor Decomposition Operations](#tensor-decomposition-operations)
- [Vector Database Features](#vector-database-features)
- [Completed Features](#completed-features)
- [Future Implementation](#future-implementation)
- [Contributing](#contributing)
- [License](#license)

## 🌟 Core Capabilities

### 🗄️ Advanced Tensor Storage System
*   **High-Performance Storage** - Efficiently store and retrieve PyTorch tensors with rich metadata support
*   **Intelligent Compression** - Multiple algorithms (LZ4, GZIP, quantization) with up to 4x space savings
*   **Schema Validation** - Optional per-dataset schemas enforce metadata fields and tensor shape/dtype constraints
*   **Chunked Processing** - Handle tensors larger than available memory through intelligent chunking
*   **Multi-Backend Support** - Local filesystem, PostgreSQL, S3, and cloud storage backends
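
As an illustration of the schema-validation idea, the sketch below checks a record against shape, dtype, and metadata constraints. The schema layout (`shape`, `dtype`, `metadata` keys) and the `validate_record` helper are assumptions made for this example, not the Tensorus schema format.

```python
def validate_record(schema, shape, dtype, metadata):
    """Return a list of violations; an empty list means the record is valid."""
    errors = []

    expected_shape = schema.get("shape")
    if expected_shape is not None and tuple(shape) != tuple(expected_shape):
        errors.append(f"shape {tuple(shape)} does not match {tuple(expected_shape)}")

    expected_dtype = schema.get("dtype")
    if expected_dtype is not None and dtype != expected_dtype:
        errors.append(f"dtype {dtype!r} does not match {expected_dtype!r}")

    # Each required metadata field must be present and have the declared type.
    for field, field_type in schema.get("metadata", {}).items():
        if field not in metadata:
            errors.append(f"missing metadata field {field!r}")
        elif not isinstance(metadata[field], field_type):
            errors.append(f"metadata field {field!r} is not {field_type.__name__}")

    return errors


schema = {"shape": (2, 2), "dtype": "float32", "metadata": {"source": str}}
print(validate_record(schema, (2, 2), "float32", {"source": "lab"}))
print(validate_record(schema, (3, 2), "int64", {}))
```

Returning every violation at once, rather than raising on the first, tends to make ingestion errors easier to fix in one pass.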

### 🤖 Intelligent Agent Ecosystem
*   **Data Ingestion Agent** - Automatically monitors directories and ingests files as tensors with preprocessing
*   **Reinforcement Learning Agent** - Deep Q-Network (DQN) agent that learns from experiences stored in tensor datasets  
*   **AutoML Agent** - Hyperparameter optimization and model selection using advanced search algorithms
*   **Embedding Agent** - Multi-provider embedding generation with intelligent caching and vector indexing
*   **Extensible Framework** - Build custom agents that interact intelligently with your tensor data

### 🔍 Advanced Query & Search Engine
*   **Natural Query Language (NQL)** - Query tensor data using intuitive, natural language-like syntax
*   **Vector Database Integration** - Advanced similarity search with multi-provider embedding generation
*   **Hybrid Search** - Combine semantic similarity with computational tensor properties  
*   **Geometric Partitioning** - Efficient vector indexing with automatic clustering and freshness layers
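
Hybrid search can be sketched as a property filter followed by a similarity ranking. The record layout (`id`, `shape`, `embedding`) and the `hybrid_search` function below are hypothetical, and the brute-force scan is a stand-in for the partitioned index described above.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_vec, records, required_shape=None, top_k=5):
    """Filter by a tensor property, then rank the survivors by semantic similarity."""
    candidates = [
        r for r in records if required_shape is None or r["shape"] == required_shape
    ]
    ranked = sorted(
        candidates, key=lambda r: cosine(query_vec, r["embedding"]), reverse=True
    )
    return [r["id"] for r in ranked[:top_k]]


records = [
    {"id": "a", "shape": (2, 2), "embedding": [1.0, 0.0]},
    {"id": "b", "shape": (2, 2), "embedding": [0.0, 1.0]},
    {"id": "c", "shape": (3, 3), "embedding": [1.0, 0.0]},
]
print(hybrid_search([1.0, 0.1], records, required_shape=(2, 2)))
```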

### 🔬 Production-Grade Operations
*   **40+ Tensor Operations** - Comprehensive library covering arithmetic, linear algebra, decompositions, and advanced operations
*   **Computational Lineage** - Complete tracking of tensor transformations for reproducible scientific workflows
*   **Operation History** - Full audit trail with performance metrics and error tracking
*   **Asynchronous Processing** - Background operations and job queuing for long-running computations

### 🌐 Developer-Friendly Interface
*   **RESTful API** - FastAPI backend with comprehensive OpenAPI documentation and authentication
*   **Interactive Web UI** - Streamlit-based dashboard for data exploration and agent control
*   **Python SDK** - Rich client library with intuitive APIs and comprehensive error handling
*   **Model Context Protocol** - Standardized integration for AI agents and LLMs via [tensorus/mcp](https://github.com/tensorus/mcp)

### 📊 Enterprise Features
*   **Rich Metadata System** - Pydantic schemas for semantic, lineage, computational, quality, and usage metadata
*   **Security & Authentication** - API key management, role-based access control, and audit logging  
*   **Monitoring & Observability** - Health checks, performance metrics, and comprehensive logging
*   **Scalable Architecture** - Horizontal scaling, load balancing, and distributed processing capabilities

## Project Structure

*   `app.py`: The main Streamlit frontend application (located at the project root).
*   `pages/`: Directory containing individual Streamlit page scripts and shared UI utilities for the dashboard.
    *   `pages/ui_utils.py`: Utility functions specifically for the Streamlit UI.
    *   *(Other page scripts like `01_dashboard.py`, `02_control_panel.py`, etc., define the different views of the dashboard)*
*   `tensorus/`: Directory containing the core `tensorus` library modules (this is the main installable package).
    *   `tensorus/__init__.py`: Makes `tensorus` a Python package.
    *   `tensorus/api.py`: The FastAPI application providing the backend API for Tensorus.
    *   `tensorus/tensor_storage.py`: Core TensorStorage implementation for managing tensor data.
    *   `tensorus/tensor_ops.py`: Library of functions for tensor manipulations.
    *   `tensorus/vector_database.py`: Advanced vector indexing with geometric partitioning and freshness layers.
    *   `tensorus/embedding_agent.py`: Multi-provider embedding generation and vector database integration.
    *   `tensorus/hybrid_search.py`: Hybrid search engine combining semantic similarity with computational tensor properties.
    *   `tensorus/nql_agent.py`: Agent for processing Natural Query Language queries.
    *   `tensorus/ingestion_agent.py`: Agent for ingesting data from various sources.
    *   `tensorus/rl_agent.py`: Agent for Reinforcement Learning tasks.
    *   `tensorus/automl_agent.py`: Agent for AutoML processes.
    *   `tensorus/dummy_env.py`: A simple environment for the RL agent demonstration.
    *   *(Other Python files within `tensorus/` are part of the core library.)*
*   `requirements.txt`: Lists the project's Python dependencies for development and local execution.
*   `pyproject.toml`: Project metadata, dependencies for distribution, and build system configuration (e.g., for PyPI).
*   `README.md`: This file.
*   `LICENSE`: Project license file.
*   `.gitignore`: Specifies intentionally untracked files that Git should ignore.

## 🌐 Live Demos & Integrations

### 🚀 Try Tensorus Online (No Installation Required)

Experience Tensorus directly in your browser via Hugging Face Spaces:

*   **🔗 [Interactive API Documentation](https://tensorus-api.hf.space/docs)** - Full Swagger UI with live examples and real-time testing
*   **📖 [Alternative API Docs](https://tensorus-api.hf.space/redoc)** - Clean ReDoc interface with detailed schemas
*   **📊 [Web Dashboard Demo](https://tensorus-dashboard.hf.space)** - Complete Streamlit UI for data exploration and agent control

### 🤖 AI Agent Integration

**Model Context Protocol (MCP) Support** - Standardized integration for AI agents and LLMs:
*   **Repository:** [tensorus/mcp](https://github.com/tensorus/mcp) - Complete MCP server implementation
*   **Features:** Standardized protocol access to all Tensorus capabilities
*   **Use Cases:** LLM-driven tensor analysis, automated data workflows, intelligent agent interactions

## Architecture

### Tensorus Execution Cycle

```mermaid
graph TD
    %% User Interface Layer
    subgraph UI_Layer ["User Interaction"]
        UI[Streamlit UI]
    end

    %% API Gateway Layer
    subgraph API_Layer ["Backend Services"]
        API[FastAPI Backend]
    end

    %% Core Storage with Method Interface
    subgraph Storage_Layer ["Core Storage - TensorStorage"]
        TS[TensorStorage Core]
        subgraph Storage_Methods ["Storage Interface"]
            TS_insert[insert data metadata]
            TS_query[query query_fn]
            TS_get[get_by_id id]
            TS_sample[sample n]
            TS_update[update_metadata]
        end
        TS --- Storage_Methods
    end

    %% Agent Processing Layer
    subgraph Agent_Layer ["Processing Agents"]
        IA[Ingestion Agent]
        NQLA[NQL Agent]
        RLA[RL Agent]
        AutoMLA[AutoML Agent]
        EA[Embedding Agent]
    end

    %% Vector Database Layer
    subgraph Vector_Layer ["Vector Database"]
        VDB[Vector Index Manager]
        HSE[Hybrid Search Engine]
    end

    %% Model System
    subgraph Model_Layer ["Model System"]
        Registry[Model Registry]
        ModelsPkg[Models Package]
    end

    %% Tensor Operations Library
    subgraph Ops_Layer ["Tensor Operations"]
        TOps[TensorOps Library]
    end

    %% Primary UI Flow
    UI -->|HTTP Requests| API

    %% API Orchestration
    API -->|Command Dispatch| IA
    API -->|Command Dispatch| NQLA
    API -->|Command Dispatch| RLA
    API -->|Command Dispatch| AutoMLA
    API -->|Vector Operations| EA
    API -->|Model Training| Registry
    API -->|Direct Query| TS_query

    %% Vector Database Integration
    EA -->|Vector Indexing| VDB
    HSE -->|Hybrid Search| VDB
    API -->|Search Requests| HSE

    %% Model System Interactions
    Registry -->|Uses Models| ModelsPkg
    Registry -->|Load/Save| TS
    ModelsPkg -->|Tensor Ops| TOps

    %% Agent Storage Interactions
    IA -->|Data Ingestion| TS_insert

    NQLA -->|Query Execution| TS_query
    NQLA -->|Record Retrieval| TS_get

    RLA -->|State Persistence| TS_insert
    RLA -->|Experience Sampling| TS_sample
    RLA -->|State Retrieval| TS_get

    AutoMLA -->|Trial Storage| TS_insert
    AutoMLA -->|Data Retrieval| TS_query

    EA -->|Embedding Storage| TS_insert
    EA -->|Vector Retrieval| TS_query

    %% Computational Operations
    NQLA -->|Vector Operations| TOps
    RLA -->|Policy Evaluation| TOps
    AutoMLA -->|Model Optimization| TOps
    HSE -->|Tensor Analysis| TOps

    %% Indirect Storage Write-back
    TOps -.->|Intermediate Results| TS_insert
```
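
The storage interface named in the diagram (`insert`, `query`, `get_by_id`, `sample`, `update_metadata`) can be sketched as a small in-memory class. The method names come from the diagram, but the signatures, return values, and the `InMemoryTensorStore` class itself are assumptions for illustration and may differ from the real `TensorStorage`.

```python
import random
import uuid


class InMemoryTensorStore:
    """Hypothetical stand-in for the storage interface shown in the diagram."""

    def __init__(self):
        self._records = {}

    def insert(self, data, metadata=None):
        """Store data with optional metadata and return the new record ID."""
        record_id = str(uuid.uuid4())
        self._records[record_id] = {
            "id": record_id,
            "data": data,
            "metadata": dict(metadata or {}),
        }
        return record_id

    def get_by_id(self, record_id):
        """Return the record for an ID (raises KeyError if it is unknown)."""
        return self._records[record_id]

    def query(self, query_fn):
        """Return records for which query_fn(data, metadata) is truthy."""
        return [
            r for r in self._records.values() if query_fn(r["data"], r["metadata"])
        ]

    def sample(self, n):
        """Return up to n records chosen at random without replacement."""
        records = list(self._records.values())
        return random.sample(records, min(n, len(records)))

    def update_metadata(self, record_id, updates):
        """Merge new key/value pairs into a record's metadata."""
        self._records[record_id]["metadata"].update(updates)


store = InMemoryTensorStore()
rid = store.insert([[1, 2], [3, 4]], {"name": "matrix_a"})
store.update_metadata(rid, {"source": "demo"})
matches = store.query(lambda data, meta: meta.get("name") == "matrix_a")
print(len(matches), store.get_by_id(rid)["metadata"])
```

In the diagram, agents such as the RL agent would use `sample` for experience replay, while the NQL agent would rely on `query` and `get_by_id`.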

## 🚀 Installation & Setup

### 📋 System Requirements

#### Minimum Requirements
*   **Python:** 3.10+ (3.11+ recommended for best performance)
*   **Memory:** 4 GB RAM (8+ GB recommended)
*   **Storage:** 10 GB available disk space
*   **OS:** Linux, macOS, Windows with WSL2

#### Production Requirements
*   **CPU:** 8+ cores with 16+ threads
*   **Memory:** 32+ GB RAM (64+ GB for large tensor workloads)
*   **Storage:** 1 TB+ NVMe SSD for optimal I/O performance  
*   **Network:** 10+ Gbps for distributed deployments
*   **See:** [Production Deployment Guide](docs/production_deployment.md) for detailed specifications

### 🔧 Installation Options

#### Option 1: Quick Install (Recommended for New Users)
```bash
# Install latest stable version from PyPI
pip install tensorus

# Start development server
tensorus start --dev

# Access web interface at http://localhost:8000
# API documentation at http://localhost:8000/docs
```

#### Option 2: Feature-Specific Installation
```bash
# Install with GPU acceleration support
# (the quotes stop shells such as zsh from treating the brackets as a glob)
pip install "tensorus[gpu]"

# Install with advanced compression algorithms
pip install "tensorus[compression]"

# Install with monitoring and metrics
pip install "tensorus[monitoring]"

# Install everything (enterprise features)
pip install "tensorus[all]"
```

#### Option 3: Development Installation
```bash
# Clone repository for development and contributions
git clone https://github.com/tensorus/tensorus.git
cd tensorus

# Create isolated virtual environment
python3 -m venv venv
source venv/bin/activate  # Linux/macOS
# venv\Scripts\activate   # Windows

# Install in development mode with all dependencies
./setup.sh
```

**Development Installation Notes:**
- Uses `requirements.txt` and `requirements-test.txt` for full dependency management
- Installs CPU-optimized PyTorch wheels (modify `setup.sh` for GPU versions)
- Includes testing frameworks and development tools
- Heavy ML libraries (`xgboost`, `lightgbm`, etc.) available via `pip install tensorus[models]`
- Audit logging to `tensorus_audit.log` (configurable via `TENSORUS_AUDIT_LOG_PATH`)

#### Option 4: Container Deployment
```bash
# Production deployment with PostgreSQL backend
docker compose up --build

# Quick testing with in-memory storage
docker run -p 8000:8000 tensorus/tensorus:latest

# Custom configuration with environment variables
docker run -p 8000:8000 \
  -e TENSORUS_STORAGE_BACKEND=postgres \
  -e TENSORUS_API_KEYS=your-api-key \
  tensorus/tensorus:latest
```

## ⚡ Performance & Scalability

### 🏆 Benchmark Results
Tensorus reports **10-100x performance improvements** over traditional file-based tensor storage; the measured speedups in the table below range from about 19x to 120x, depending on the operation:

| Operation Type | Traditional Files | Tensorus | Improvement |
|----------------|------------------|----------|-------------|
| **Tensor Retrieval** | 280 ops/sec | 15,000 ops/sec | **53.6x faster** |
| **Query Processing** | 850ms | 45ms | **18.9x faster** |
| **Storage Efficiency** | 1.0x baseline | 4.0x compressed | **75% space saved** |
| **Vector Search** | 15,000ms | 125ms | **120x faster** |
| **Concurrent Operations** | 450 ops/sec | 12,000 ops/sec | **26.7x higher throughput** |
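
The improvement column follows directly from the two measurement columns. As a quick check, the snippet below recomputes each factor from the table's own numbers: for throughput rows a higher value is better, and for latency rows a lower value is better. The `speedup` helper is just for this illustration.

```python
def speedup(baseline, improved, higher_is_better):
    """Ratio by which `improved` beats `baseline`."""
    return improved / baseline if higher_is_better else baseline / improved


factors = {
    "Tensor Retrieval": speedup(280, 15_000, higher_is_better=True),        # ops/sec
    "Query Processing": speedup(850, 45, higher_is_better=False),           # ms
    "Vector Search": speedup(15_000, 125, higher_is_better=False),          # ms
    "Concurrent Operations": speedup(450, 12_000, higher_is_better=True),   # ops/sec
}
for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x")

# 4.0x compression means the stored size is 1/4 of the baseline.
space_saved = 1 - 1 / 4.0
print(f"Storage Efficiency: {space_saved:.0%} space saved")
```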

### 📈 Scaling Characteristics
- **Linear scaling** up to 32+ nodes in distributed deployments
- **Sub-200ms response times** at enterprise scale (millions of tensors)
- **99.9% availability** with proper redundancy configuration
- **Automatic load balancing** and intelligent request routing

## 🎯 Use Cases & Applications

### 🧠 AI/ML Development & Production
- **Model Training Pipelines** - Store training data, model checkpoints, and experiment results
- **Real-time Inference** - Fast retrieval of model weights and feature tensors for serving
- **Experiment Tracking** - Complete lineage of model development with reproducible workflows
- **AutoML Platforms** - Automated hyperparameter optimization and model architecture search

### 🔬 Scientific Computing & Research
- **Numerical Simulations** - Large-scale scientific computing with computational provenance
- **Climate & Weather Modeling** - Multi-dimensional data analysis with temporal tracking
- **Genomics & Bioinformatics** - DNA sequence analysis, protein folding, and molecular dynamics
- **Materials Science** - Quantum chemistry simulations and materials property prediction

### 👁️ Computer Vision & Autonomous Systems
- **Image/Video Processing** - Efficient storage and retrieval of visual data tensors
- **Object Detection & Recognition** - Real-time inference with cached model components
- **Autonomous Vehicles** - Sensor fusion, path planning, and decision-making algorithms
- **Medical Imaging** - DICOM processing, radiological analysis, and diagnostic AI

### 💰 Financial Services & Trading
- **Risk Management** - Real-time portfolio optimization and risk assessment models
- **Algorithmic Trading** - High-frequency trading with microsecond-latency model execution
- **Fraud Detection** - Anomaly detection in transaction patterns and behavioral analysis
- **Credit Scoring** - ML-driven creditworthiness assessment with regulatory compliance

### Running the API Server

1.  Navigate to the project root directory (the directory containing the `tensorus` folder and `pyproject.toml`).
2.  Ensure your virtual environment is activated if you are using one.
3.  Start the FastAPI backend server using:

    ```bash
    uvicorn tensorus.api:app --reload --host 127.0.0.1 --port 7860
    # For external access (e.g., Docker/WSL/other machines), use:
    # uvicorn tensorus.api:app --host 0.0.0.0 --port 7860
    ```

    *   This command launches Uvicorn with the `app` instance defined in `tensorus/api.py`.
    *   Access the API documentation at `http://localhost:7860/docs` or `http://localhost:7860/redoc`.
    *   All dataset and agent endpoints are available once the server is running.

    To use S3 for tensor dataset persistence instead of local disk, set:

    ```bash
    export TENSORUS_TENSOR_STORAGE_PATH="s3://your-bucket/optional/prefix"
    # Ensure AWS credentials are available (env vars, profile, or instance role)
    uvicorn tensorus.api:app --host 0.0.0.0 --port 7860
    ```

### Running the Streamlit UI

1.  In a separate terminal (with the virtual environment activated), navigate to the project root.
2.  Start the Streamlit frontend:

    ```bash
    streamlit run app.py
    ```

    *   Access the UI in your browser at the URL provided by Streamlit (usually `http://localhost:8501`).

### Model Context Protocol Integration

For AI agents and LLMs that need standardized protocol access to Tensorus capabilities, see the separate [Tensorus MCP package](https://github.com/tensorus/mcp) which provides a complete MCP server implementation.

### Running the Agents (Examples)

You can run the example agents directly from their respective files:

*   **RL Agent:**

    ```bash
    python tensorus/rl_agent.py
    ```

*   **AutoML Agent:**

    ```bash
    python tensorus/automl_agent.py
    ```

*   **Ingestion Agent:**

    ```bash
    python tensorus/ingestion_agent.py
    ```

    *   Note: The Ingestion Agent will monitor the `temp_ingestion_source` directory (created automatically if it doesn't exist in the project root) for new files.
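
Directory monitoring of this kind can be sketched as a polling pass that reports files it has not seen before. The `find_new_files` helper is hypothetical and only illustrates the pattern; the actual agent in `tensorus/ingestion_agent.py` may work differently. A temporary directory stands in for `temp_ingestion_source`.

```python
import os
import tempfile


def find_new_files(directory, seen):
    """Return paths in `directory` that are not yet in `seen`, and record them."""
    os.makedirs(directory, exist_ok=True)  # create the source directory if missing
    new_files = []
    with os.scandir(directory) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if entry.is_file() and entry.path not in seen:
                seen.add(entry.path)
                new_files.append(entry.path)
    return new_files


# One polling cycle over a temporary directory.
with tempfile.TemporaryDirectory() as source_dir:
    seen = set()
    with open(os.path.join(source_dir, "sample.csv"), "w") as fh:
        fh.write("1,2,3\n")
    print(find_new_files(source_dir, seen))  # the new file is reported once
    print(find_new_files(source_dir, seen))  # a second pass finds nothing new
```

A long-running agent would call such a function in a loop with a sleep interval and hand each new path to a preprocessing step before inserting the resulting tensor.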

## Docker Deployment

### Docker Quickstart

Run Tensorus with Docker in two ways: a single container (in-memory storage) or a full stack with PostgreSQL via Docker Compose.

#### Option A: Full stack with PostgreSQL (recommended)
1. Install Docker Desktop (or Docker Engine) and Docker Compose v2.
2. Generate an API key:

   ```bash
   python generate_api_key.py --format env
   # Copy the value printed after TENSORUS_API_KEYS=
   ```

3. Open `docker-compose.yml` and set your key. Either:
   - Replace the placeholder in `TENSORUS_VALID_API_KEYS` with your key, or
   - Add `TENSORUS_API_KEYS: "tsr_..."` alongside it. Both are supported; `TENSORUS_API_KEYS` is preferred.

4. Start the stack from the project root:

   ```bash
   docker compose up --build
   ```

   - The API starts on `http://localhost:7860`
   - PostgreSQL is exposed on host `5433` (container `5432`)
   - Audit logs are persisted to `./tensorus_audit.log` via a bind mount

5. Test authentication (Bearer token is recommended):

   ```bash
   # Replace tsr_your_key with the key you generated
   curl -H "Authorization: Bearer tsr_your_key" http://localhost:7860/datasets
   ```

Notes
- The compose file waits for Postgres to become healthy before starting the app.
- Legacy header `X-API-KEY: tsr_your_key` is still accepted for backward compatibility.
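
The same authenticated request can be built from Python. This is a minimal sketch using only the standard library. The `build_request` helper is hypothetical, while the URL, the `/datasets` path, and both header styles come from the steps above.

```python
import urllib.request


def build_request(base_url, api_key, path="/datasets", legacy=False):
    """Build a GET request using Bearer auth, or the legacy X-API-KEY header."""
    if legacy:
        headers = {"X-API-KEY": api_key}
    else:
        headers = {"Authorization": f"Bearer {api_key}"}
    return urllib.request.Request(f"{base_url}{path}", headers=headers)


# Replace tsr_your_key with the key you generated.
req = build_request("http://localhost:7860", "tsr_your_key")
print(req.full_url)
print(req.get_header("Authorization"))

# With the stack running, send it with:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status, resp.read().decode())
```

`urllib` normalizes header capitalization internally (for example `X-api-key`); HTTP header names are case-insensitive, so the server sees the same header.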

Useful commands
```bash
# View logs
docker compose logs -f app

# Rebuild after code changes
docker compose up --build --force-recreate

# Stop stack
docker compose down
```

#### Option B: Single container (in-memory storage)
Use this for quick, ephemeral testing without Postgres.

```bash
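docker build -t tensorus .
# Create the log file first; if it is missing, Docker creates a directory at the bind-mount path
touch tensorus_audit.log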
docker run --rm -p 7860:7860 \
  -e TENSORUS_AUTH_ENABLED=true \
  -e TENSORUS_API_KEYS=tsr_your_key \
  -e TENSORUS_STORAGE_BACKEND=in_memory \
  -v "$(pwd)/tensorus_audit.log:/app/tensorus_audit.log" \
  tensorus
```

Then open `http://localhost:7860/docs`.

WSL2 tip: If you run Docker Desktop on Windows with WSL2, `localhost:7860` works from both Windows and the WSL distro. Keep volumes on the Linux side (`/home/...`) for best performance.

#### GPU acceleration (optional)
The default image uses CPU wheels. For GPUs, install the NVIDIA Container Toolkit and switch to CUDA-enabled PyTorch wheels in your build (e.g., modify `setup.sh` or your Dockerfile). Pass `--gpus all` to `docker run`.

## Environment Configuration

### Environment configuration (reference)
Tensorus reads configuration from environment variables (prefix `TENSORUS_`). Common settings:

- Authentication
  - `TENSORUS_AUTH_ENABLED` (default: `true`)
  - `TENSORUS_API_KEYS`: Comma-separated list of keys (recommended)
  - `TENSORUS_VALID_API_KEYS`: Legacy alternative; comma list or JSON array
  - Usage: Prefer `Authorization: Bearer tsr_...`; legacy `X-API-KEY` also accepted

- Storage backend
  - `TENSORUS_STORAGE_BACKEND`: `in_memory` | `postgres` (default: `in_memory`)
  - Postgres when `postgres`:
    - `TENSORUS_POSTGRES_HOST`, `TENSORUS_POSTGRES_PORT` (default `5432`),
      `TENSORUS_POSTGRES_USER`, `TENSORUS_POSTGRES_PASSWORD`, `TENSORUS_POSTGRES_DB`
    - or `TENSORUS_POSTGRES_DSN` (overrides individual fields)
  - Optional tensor persistence path: `TENSORUS_TENSOR_STORAGE_PATH` (e.g., a local path or URI)

- Security headers
  - `TENSORUS_X_FRAME_OPTIONS` (default `SAMEORIGIN`; set to `NONE` to omit)
  - `TENSORUS_CONTENT_SECURITY_POLICY` (default `default-src 'self'`; set to `NONE` to omit)

- Misc
  - `TENSORUS_AUDIT_LOG_PATH` (default `tensorus_audit.log`)
  - `TENSORUS_MINIMAL_IMPORT`=1 to skip optional model package imports
  - NQL with LLM: `NQL_USE_LLM=true`, `GOOGLE_API_KEY`, optional `NQL_LLM_MODEL`

Example `.env` (for local runs or compose env_file):
```bash
TENSORUS_AUTH_ENABLED=true
TENSORUS_API_KEYS=tsr_your_key
TENSORUS_STORAGE_BACKEND=postgres
TENSORUS_POSTGRES_HOST=db
TENSORUS_POSTGRES_PORT=5432
TENSORUS_POSTGRES_USER=tensorus_user
TENSORUS_POSTGRES_PASSWORD=change_me
TENSORUS_POSTGRES_DB=tensorus_db
TENSORUS_AUDIT_LOG_PATH=/app/tensorus_audit.log
```
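To make the precedence rules above concrete, here is a minimal sketch that resolves API keys and a Postgres DSN from a mapping of environment variables. It is not Tensorus's actual settings loader: `resolve_api_keys` and `resolve_postgres_dsn` are hypothetical helpers, preferring `TENSORUS_API_KEYS` over the legacy variable is an assumption, and the DSN uses the standard `postgresql://` URI form.

```python
import json
from urllib.parse import quote


def resolve_api_keys(env):
    """Read TENSORUS_API_KEYS, else the legacy TENSORUS_VALID_API_KEYS (assumed order)."""
    raw = (env.get("TENSORUS_API_KEYS") or env.get("TENSORUS_VALID_API_KEYS") or "").strip()
    if raw.startswith("["):
        # The legacy variable may hold a JSON array instead of a comma list.
        return [str(key).strip() for key in json.loads(raw) if str(key).strip()]
    return [key.strip() for key in raw.split(",") if key.strip()]


def resolve_postgres_dsn(env):
    """Return TENSORUS_POSTGRES_DSN if set, otherwise build a DSN from the fields."""
    if env.get("TENSORUS_POSTGRES_DSN"):
        return env["TENSORUS_POSTGRES_DSN"]
    user = quote(env["TENSORUS_POSTGRES_USER"], safe="")
    password = quote(env["TENSORUS_POSTGRES_PASSWORD"], safe="")
    host = env["TENSORUS_POSTGRES_HOST"]
    port = env.get("TENSORUS_POSTGRES_PORT", "5432")  # documented default
    database = env["TENSORUS_POSTGRES_DB"]
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


example_env = {
    "TENSORUS_API_KEYS": "tsr_key_one, tsr_key_two",
    "TENSORUS_POSTGRES_HOST": "db",
    "TENSORUS_POSTGRES_USER": "tensorus_user",
    "TENSORUS_POSTGRES_PASSWORD": "change_me",
    "TENSORUS_POSTGRES_DB": "tensorus_db",
}
print(resolve_api_keys(example_env))
print(resolve_postgres_dsn(example_env))
```

Passing `os.environ` instead of `example_env` applies the same logic to the real process environment.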

## Production Deployment

### Production deployment with Docker (step-by-step)
This example uses Docker Compose with PostgreSQL. Adjust for your infra as needed.

1. Generate and store your API key securely
   - `python generate_api_key.py --format env`
   - Prefer secret management (Docker/Swarm/K8s/Vault). For Compose, you can use a file-based secret:

     ```bash
     # secrets/api_key.txt contains only your key value (no quotes)
     echo "tsr_prod_key_..." > secrets/api_key.txt
     ```

2. Configure Compose for production
   - Edit `docker-compose.yml` and set:
     - `TENSORUS_AUTH_ENABLED: "true"`
     - `TENSORUS_API_KEYS: ${TENSORUS_API_KEYS:-}` or point to a secret/file
     - `TENSORUS_STORAGE_BACKEND: postgres` and your Postgres credentials
   - Optionally add `env_file: .env` and put non-secret config there.

3. Harden runtime
   - Put Tensorus behind a reverse proxy (Nginx/Traefik) with TLS
   - Restrict CORS/hosts at the proxy; the app currently allows all by default
   - Set security headers via env vars (see below)

4. Start and verify
   ```bash
   docker compose up -d --build
   docker compose ps
   curl -f -H "Authorization: Bearer tsr_prod_key_..." http://localhost:7860/ || echo "API not ready"
   ```

5. Health and logs
   - Postgres health is checked automatically; the app waits until healthy
   - `docker compose logs -f app`

Security headers
- Override defaults to match your CSP and embedding needs. If set to `NONE` or empty, the header is omitted.

```bash
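# Example: allow Swagger/ReDoc CDNs and a trusted frame host
# Note: ALLOW-FROM is obsolete and ignored by most modern browsers; use the CSP
# `frame-ancestors` directive if you need to allow a specific frame host.
TENSORUS_X_FRAME_OPTIONS="ALLOW-FROM https://example.com"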
TENSORUS_CONTENT_SECURITY_POLICY="default-src 'self'; script-src 'self' https://cdn.jsdelivr.net; style-src 'self' https://fonts.googleapis.com 'unsafe-inline'; font-src 'self' https://fonts.gstatic.com; img-src 'self' https://fastapi.tiangolo.com"
```
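The "NONE or empty omits the header" rule can be sketched as follows. This is an illustration under stated assumptions, not the middleware Tensorus ships: `build_security_headers` is a hypothetical helper, and treating `NONE` case-insensitively is a guess.

```python
def build_security_headers(env):
    """Map the two env vars to response headers, skipping any set to NONE or empty."""
    defaults = {
        "X-Frame-Options": ("TENSORUS_X_FRAME_OPTIONS", "SAMEORIGIN"),
        "Content-Security-Policy": ("TENSORUS_CONTENT_SECURITY_POLICY", "default-src 'self'"),
    }
    headers = {}
    for header, (variable, default) in defaults.items():
        value = env.get(variable, default).strip()
        # Assumption: "NONE" is matched case-insensitively.
        if value and value.upper() != "NONE":
            headers[header] = value
    return headers


print(build_security_headers({}))
print(build_security_headers({"TENSORUS_X_FRAME_OPTIONS": "NONE"}))
```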

Troubleshooting
- 401 Unauthorized: ensure you send `Authorization: Bearer tsr_...` and the key is configured (`TENSORUS_API_KEYS` or `TENSORUS_VALID_API_KEYS`).
- 503 auth not configured: set an API key when auth is enabled.
- DB connection errors: verify Postgres env, port conflicts (host 5433 vs local 5432), and that the DB user/database exist.
- Windows/WSL2 volume performance: keep bind-mounted files on the Linux filesystem for best performance.

## Testing

### Preparing the Test Environment

The tests expect all dependencies from both `requirements.txt` and
`requirements-test.txt` to be installed. A simple setup script is provided
to handle this automatically:

```bash
./setup.sh
```

Run this after creating and activating a Python virtual environment. The script
installs the Tensorus runtime requirements and the additional packages needed
for `pytest`. Once completed, executing `pytest` from the repository root will
automatically discover and run the entire suite; no manual package discovery is
required.

### Test Suite Dependencies

The Python tests rely on packages from both `requirements.txt` and
`requirements-test.txt`. The latter includes `httpx` and other packages
used by the test suite. **Always run `./setup.sh` before executing
`pytest` to install these requirements**:

```bash
./setup.sh
```

### Running Tests

Tensorus includes Python unit tests. To set up the environment and run them:

1. Install all dependencies using the setup script **before running any tests**:

    ```bash
    ./setup.sh
    ```

    This script installs packages from `requirements.txt` and `requirements-test.txt` (which requires `fastapi>=0.110` for Pydantic v2 support).

2. Run the Python test suite:

    ```bash
    pytest
    ```

    All tests should pass without errors when dependencies are properly installed.

## Using Tensorus

### API Basics

Base URL: `http://localhost:7860`

Authentication:
- Preferred: send `Authorization: Bearer tsr_<your_key>`
- Legacy (still supported): `X-API-KEY: tsr_<your_key>`

PowerShell notes:
- Use double quotes for JSON and escape inner quotes, or run in WSL/Git Bash for copy/paste fidelity.
- Alternatively, use `--%` to stop PowerShell from interpreting special characters.

### API Endpoints

The API provides the following main endpoints:

*   **Datasets:**
    *   `POST /datasets/create`: Create a new dataset.
    *   `POST /datasets/{name}/ingest`: Ingest a tensor into a dataset.
    *   `GET /datasets/{name}/fetch`: Retrieve all records from a dataset.
    *   `GET /datasets/{name}/records`: Retrieve a page of records. Supports `offset` (start index, default `0`) and `limit` (max results, default `100`).
    *   `GET /datasets`: List all available datasets.
    *   `GET /datasets/{name}/count`: Count records in a dataset.
    *   `GET /datasets/{dataset_name}/tensors/{record_id}`: Retrieve a tensor by record ID.
    *   `DELETE /datasets/{dataset_name}`: Delete a dataset.
    *   `DELETE /datasets/{dataset_name}/tensors/{record_id}`: Delete a tensor by record ID.
    *   `PUT /datasets/{dataset_name}/tensors/{record_id}/metadata`: Update tensor metadata.
*   **Querying:**
    *   `POST /api/v1/query`: Execute an NQL query.
*   **Vector Database:**
    *   `POST /api/v1/vector/embed`: Generate and store embeddings from text.
    *   `POST /api/v1/vector/search`: Perform vector similarity search.
    *   `POST /api/v1/vector/hybrid-search`: Execute hybrid semantic-computational search.
    *   `POST /api/v1/vector/tensor-workflow`: Run tensor workflow with lineage tracking.
    *   `POST /api/v1/vector/index/build`: Build vector indexes with geometric partitioning.
    *   `GET /api/v1/vector/models`: List available embedding models.
    *   `GET /api/v1/vector/stats/{dataset_name}`: Get embedding statistics for a dataset.
    *   `GET /api/v1/vector/metrics`: Get performance metrics.
    *   `DELETE /api/v1/vector/vectors/{dataset_name}`: Delete vectors from a dataset.
*   **Operation History & Lineage:**
    *   `GET /api/v1/operations/recent`: Get recent operations with optional filtering by type/status.
    *   `GET /api/v1/operations/tensor/{tensor_id}`: Get all operations that involved a specific tensor.
    *   `GET /api/v1/operations/statistics`: Get aggregate operation statistics.
    *   `GET /api/v1/operations/types`: List available operation types.
    *   `GET /api/v1/operations/statuses`: List available operation statuses.
    *   `GET /api/v1/lineage/tensor/{tensor_id}`: Get computational lineage for a tensor.
    *   `GET /api/v1/lineage/tensor/{tensor_id}/dot`: DOT graph for lineage visualization.
    *   `GET /api/v1/lineage/tensor/{source_tensor_id}/path/{target_tensor_id}`: Operation path between two tensors.
*   **Agents:**
    *   `GET /agents`: List all registered agents.
    *   `GET /agents/{agent_id}/status`: Get the status of a specific agent.
    *   `POST /agents/{agent_id}/start`: Start an agent.
    *   `POST /agents/{agent_id}/stop`: Stop an agent.
    *   `GET /agents/{agent_id}/logs`: Get recent logs for an agent.
    *   `GET /agents/{agent_id}/config`: Get stored configuration for an agent.
    *   `POST /agents/{agent_id}/configure`: Update agent configuration.
*   **Metrics & Monitoring:**
    *   `GET /metrics/dashboard`: Get aggregated dashboard metrics.

### Vector Database Examples

Note on path parameter names across endpoints:
- Datasets CRUD often uses `name` in path: e.g., `/datasets/{name}/ingest`, `/datasets/{name}/records`
- Tensor CRUD uses `dataset_name` + `record_id`: e.g., `/datasets/{dataset_name}/tensors/{record_id}`
- Vector API consistently uses `dataset_name` in path and body

For an end-to-end quickstart with PowerShell-friendly curl commands and authentication setup, see the "Vector & Embedding API Quickstart" section of `DEMO.md`.

#### Generate & Store Embeddings
```bash
curl -X POST "http://localhost:7860/api/v1/vector/embed" \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "texts": ["Machine learning algorithms", "Deep neural networks"],
    "dataset_name": "ai_research",
    "model_name": "all-mpnet-base-v2",
    "namespace": "research",
    "tenant_id": "team_alpha"
  }'
```

#### Semantic Similarity Search
```bash
curl -X POST "http://localhost:7860/api/v1/vector/search" \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "artificial intelligence models",
    "dataset_name": "ai_research",
    "k": 5,
    "similarity_threshold": 0.7,
    "namespace": "research"
  }'
```

Example response:
```json
{
  "success": true,
  "query": "artificial intelligence models",
  "total_results": 2,
  "search_time_ms": 8.42,
  "results": [
    {
      "record_id": "rec_123",
      "similarity_score": 0.9153,
      "rank": 1,
      "source_text": "Deep learning models for AI",
      "metadata": {"source": "paper_db", "year": 2024},
      "namespace": "research",
      "tenant_id": "team_alpha"
    },
    {
      "record_id": "rec_456",
      "similarity_score": 0.8831,
      "rank": 2,
      "source_text": "AI model architectures",
      "metadata": {"source": "notes"},
      "namespace": "research",
      "tenant_id": "team_alpha"
    }
  ]
}
```
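A client might consume a response of this shape as sketched below. The field names come from the example response above; `top_matches` is a hypothetical helper, and the abbreviated payload exists only to keep the example self-contained.

```python
import json

# Abbreviated copy of the example response shown above.
response_text = """
{
  "success": true,
  "total_results": 2,
  "results": [
    {"record_id": "rec_456", "similarity_score": 0.8831, "rank": 2},
    {"record_id": "rec_123", "similarity_score": 0.9153, "rank": 1}
  ]
}
"""


def top_matches(payload, min_score=0.0):
    """Return (record_id, score) pairs ordered by rank, keeping scores >= min_score."""
    if not payload.get("success"):
        raise ValueError("search request did not succeed")
    hits = sorted(payload.get("results", []), key=lambda hit: hit["rank"])
    return [
        (hit["record_id"], hit["similarity_score"])
        for hit in hits
        if hit["similarity_score"] >= min_score
    ]


payload = json.loads(response_text)
print(top_matches(payload, min_score=0.9))
```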

#### Hybrid Computational Search
```bash
curl -X POST "http://localhost:7860/api/v1/vector/hybrid-search" \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "text_query": "neural network weights",
    "dataset_name": "model_tensors", 
    "tensor_operations": [
      {
        "operation_name": "svd",
        "description": "Singular value decomposition",
        "parameters": {}
      }
    ],
    "similarity_weight": 0.7,
    "computation_weight": 0.3,
    "filters": {
      "preferred_shape": [512, 512],
      "sparsity_preference": 0.1
    }
  }'
```

### Agents API Examples

#### List Agents
```bash
curl -s -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/agents"
```

#### Start Agent
```bash
curl -X POST -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/agents/ingestion/start"
```

#### Agent Status & Logs
```bash
curl -s -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/agents/ingestion/status"

curl -s -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/agents/ingestion/logs?lines=50"
```

### Operation History & Lineage Examples

#### Recent Operations
```bash
curl -s -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/api/v1/operations/recent?limit=50"
```

#### Tensor Lineage (JSON)
```bash
curl -s -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/api/v1/lineage/tensor/{tensor_id}"
```

#### Tensor Lineage (DOT Graph)
```bash
curl -s -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/api/v1/lineage/tensor/{tensor_id}/dot"
```

#### Operation Path Between Two Tensors
```bash
curl -s -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/api/v1/lineage/tensor/{source_tensor_id}/path/{target_tensor_id}"
```

### Authentication Examples

Recommended (Bearer):
```bash
curl -s \
  -H "Authorization: Bearer tsr_your_api_key" \
  "http://localhost:7860/datasets"
```

Legacy header (still supported):
```bash
curl -s \
  -H "X-API-KEY: tsr_your_api_key" \
  "http://localhost:7860/datasets"
```
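The same two header styles can be produced from Python with only the standard library. This sketch builds the request without sending it; `build_request` is a hypothetical helper, not part of any Tensorus SDK.

```python
from urllib.request import Request


def build_request(url, api_key, legacy=False):
    """Create an unsent GET request carrying the API key in the chosen header."""
    if legacy:
        headers = {"X-API-KEY": api_key}
    else:
        headers = {"Authorization": f"Bearer {api_key}"}
    return Request(url, headers=headers, method="GET")


bearer_request = build_request("http://localhost:7860/datasets", "tsr_your_api_key")
legacy_request = build_request("http://localhost:7860/datasets", "tsr_your_api_key", legacy=True)
print(bearer_request.header_items())
print(legacy_request.header_items())
```

To send a request, pass it to `urllib.request.urlopen`; this needs a running server. `urllib` stores header names in capitalized form (`X-api-key`), which is harmless because HTTP header names are case-insensitive.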

### NQL Query Example

```bash
curl -X POST "http://localhost:7860/api/v1/query" \
  -H "Authorization: Bearer tsr_your_api_key" \
  -H "Content-Type: application/json" \
  -d "{\"query\": \"find tensors from 'my_dataset' where metadata.source = 'api_ingest' limit 10\"}"
```
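Because NQL queries contain single quotes, shell quoting is easy to get wrong. One way around this is to build the JSON body programmatically. The sketch below only constructs the payload; `nql_payload` is a hypothetical helper, and the query shape is copied from the example above.

```python
import json


def nql_payload(dataset, field, value, limit=10):
    """Build the JSON body for POST /api/v1/query without hand-escaping quotes."""
    query = f"find tensors from '{dataset}' where metadata.{field} = '{value}' limit {limit}"
    return json.dumps({"query": query})


body = nql_payload("my_dataset", "source", "api_ingest")
print(body)
```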

### Request/Response Schemas

Below are the primary Pydantic models used by the API. See `tensorus/api.py` and `tensorus/api/models.py` for full details.

* **ApiResponse** (`tensorus/api.py`)
  - `success: bool`
  - `message: str`
  - `data: Any | null`

* **DatasetCreateRequest** (`tensorus/api.py`)
  - `name: str`

* **TensorInput** (`tensorus/api.py`)
  - `shape: List[int]`
  - `dtype: str`
  - `data: List[Any] | int | float`
  - `metadata: Dict[str, Any] | null`

* **TensorOutput** (`tensorus/api/models.py`)
  - `record_id: str`
  - `shape: List[int]`
  - `dtype: str`
  - `data: List[Any] | int | float`
  - `metadata: Dict[str, Any]`

* **NQLQueryRequest / NQLResponse** (`tensorus/api/models.py`)
  - Request: `query: str`
  - Response: `success: bool`, `message: str`, `count?: int`, `results?: List[TensorOutput]`

* **VectorSearchQuery** (`tensorus/api/models.py`)
  - `query: str`, `dataset_name: str`, `k: int = 5`, `namespace?: str`, `tenant_id?: str`, `similarity_threshold?: float`, `include_vectors: bool = false`

* **OperationHistoryRequest** (`tensorus/api/models.py`)
  - Filters: `tensor_id?`, `operation_type?`, `status?`, `user_id?`, `session_id?`, `start_time?`, `end_time?`, `limit: int = 100`

* **LineageResponse** (`tensorus/api/models.py`)
  - Key fields: `tensor_id`, `root_tensor_ids`, `max_depth`, `total_operations`, `lineage_nodes[]`, `operations[]`, timestamps


#### 🧩 Examples

Successful response (`ApiResponse` shape):
```json
{
  "success": true,
  "message": "Tensor ingested successfully.",
  "data": { "record_id": "abc123" }
}
```

Not Found (FastAPI error shape):
```json
{
  "detail": "Not Found"
}
```

Unauthorized:
```json
{
  "detail": "Not authenticated"
}
```

### Dataset API Examples

All requests require authentication by default: `-H "Authorization: Bearer your_api_key"` (legacy `X-API-KEY` also supported).

#### Create Dataset
```bash
curl -X POST "http://localhost:7860/datasets/create" \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"name": "my_dataset"}'
```

#### Ingest Tensor
```bash
curl -X POST "http://localhost:7860/datasets/my_dataset/ingest" \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "shape": [2, 3],
    "dtype": "float32",
    "data": [[1.0,2.0,3.0],[4.0,5.0,6.0]],
    "metadata": {"source": "api_ingest", "label": "row_batch_1"}
  }'
```

#### Fetch Entire Dataset
```bash
curl -s -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/datasets/my_dataset/fetch"
```

#### Fetch Records (Pagination)
```bash
curl -s -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/datasets/my_dataset/records?offset=0&limit=50"
```
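To walk a whole dataset with `offset` and `limit`, a client can loop until it receives a short page. The sketch below keeps the HTTP call abstract: `fetch_page` is a hypothetical callable that returns a list of records for one page, and how the records are unwrapped from the real response body is not shown here.

```python
def iter_records(fetch_page, limit=100):
    """Yield records page by page until a short or empty page is returned."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:
            break
        offset += limit


# Stand-in for a real call to /datasets/{name}/records?offset=...&limit=...
fake_dataset = [{"record_id": f"rec_{i}"} for i in range(7)]


def fake_fetch(offset, limit):
    return fake_dataset[offset:offset + limit]


all_records = list(iter_records(fake_fetch, limit=3))
print(len(all_records))
```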

#### Count Records
```bash
curl -s -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/datasets/my_dataset/count"
```

#### Get Tensor By ID
```bash
curl -s -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/datasets/my_dataset/tensors/{record_id}"
```

#### Update Tensor Metadata (replace entire metadata)
```bash
curl -X PUT "http://localhost:7860/datasets/my_dataset/tensors/{record_id}/metadata" \
  -H "Authorization: Bearer your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"new_metadata": {"source": "sensor_A", "priority": "high"}}'
```

#### Delete Tensor
```bash
curl -X DELETE -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/datasets/my_dataset/tensors/{record_id}"
```

#### Delete Dataset
```bash
curl -X DELETE -H "Authorization: Bearer your_api_key" \
  "http://localhost:7860/datasets/my_dataset"
```

### Dataset Schemas

Datasets can optionally include a schema when created. The schema defines
required metadata fields and expected tensor `shape` and `dtype`. Inserts that
violate the schema will raise a validation error.

Example:

```python
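import torch

# `storage` is assumed to be an existing TensorStorage instance.
schema = {
    "shape": [3, 10],
    "dtype": "float32",
    "metadata": {"source": "str", "value": "int"}
}
storage.create_dataset("my_ds", schema=schema)
# Matches the schema: shape (3, 10), float32, and both metadata fields present.
storage.insert("my_ds", torch.rand(3, 10), {"source": "sensor", "value": 5})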
```
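The kind of check a schema implies can be sketched in plain Python. This is not the validation code in TensorStorage: `schema_violations` is a hypothetical helper, and mapping the type names `"str"`, `"int"`, `"float"`, and `"bool"` to Python types is an assumption.

```python
def schema_violations(schema, shape, dtype, metadata):
    """Return a list of human-readable problems; an empty list means the record is valid."""
    problems = []
    if "shape" in schema and list(shape) != list(schema["shape"]):
        problems.append(f"shape {list(shape)} does not match {list(schema['shape'])}")
    if "dtype" in schema and dtype != schema["dtype"]:
        problems.append(f"dtype {dtype} does not match {schema['dtype']}")
    type_names = {"str": str, "int": int, "float": float, "bool": bool}  # assumed mapping
    for field, type_name in schema.get("metadata", {}).items():
        if field not in metadata:
            problems.append(f"missing required metadata field '{field}'")
        elif not isinstance(metadata[field], type_names[type_name]):
            problems.append(f"metadata field '{field}' must be of type {type_name}")
    return problems


example_schema = {
    "shape": [3, 10],
    "dtype": "float32",
    "metadata": {"source": "str", "value": "int"},
}
print(schema_violations(example_schema, (3, 10), "float32", {"source": "sensor", "value": 5}))
print(schema_violations(example_schema, (3, 5), "float32", {"source": "sensor"}))
```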

## Metadata System

Tensorus includes a detailed metadata subsystem for describing tensors beyond their raw data. Each tensor has a `TensorDescriptor` and can be associated with optional semantic, lineage, computational, quality, relational, and usage metadata. The metadata storage backend is pluggable, supporting in-memory storage for quick testing or PostgreSQL for persistence. Search and aggregation utilities allow querying across these metadata fields. See [metadata_schemas.md](docs/metadata_schemas.md) for schema details.

## Streamlit UI

The Streamlit UI provides a user-friendly interface for:

*   **Dashboard:** View basic system metrics and agent status.
*   **Agent Control:** Start, stop, and view logs for agents.
*   **NQL Chat:** Enter natural language queries and view results.
*   **Data Explorer:** Browse datasets, preview data, and perform tensor operations.

## Natural Query Language (NQL)

Tensorus ships with a simple regex-based Natural Query Language for retrieving
tensors by metadata. You can issue NQL queries via the API or from the "NQL
Chat" page in the Streamlit UI.

See also: [NQL Query Example](#nql-query-example) for a minimal API request.
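To show what "regex-based" means in practice, here is a toy parser for a single query shape, the one used in the NQL Query Example. It is not the grammar Tensorus implements: `NQL_PATTERN` and `parse_nql` are hypothetical, and the real parser may accept other phrasings and operators.

```python
import re

# One illustrative pattern: dataset, optional metadata equality filter, optional limit.
NQL_PATTERN = re.compile(
    r"find tensors from '(?P<dataset>[^']+)'"
    r"(?: where metadata\.(?P<field>\w+) = '(?P<value>[^']*)')?"
    r"(?: limit (?P<limit>\d+))?$",
    re.IGNORECASE,
)


def parse_nql(query):
    """Parse the toy query shape into a dict, or return None if it does not match."""
    match = NQL_PATTERN.match(query.strip())
    if match is None:
        return None
    parsed = match.groupdict()
    if parsed["limit"] is not None:
        parsed["limit"] = int(parsed["limit"])
    return parsed


print(parse_nql("find tensors from 'my_dataset' where metadata.source = 'api_ingest' limit 10"))
print(parse_nql("show me all images containing a dog"))
```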

### Enabling LLM rewriting

Set `NQL_USE_LLM=true` to enable parsing of free-form queries with
Google's Gemini model. Provide your API key in the `GOOGLE_API_KEY`
environment variable and optionally set `NQL_LLM_MODEL` (defaults to
`gemini-2.0-flash`) to choose the model version. The agent sends the
current dataset schema and your query to Gemini via
`langchain-google`. If the model or key is unavailable, the agent
silently falls back to the regex-based parser.

Example query using the LLM parser:

```text
show me all images containing a dog from dataset animals where source is "mobile"
```

This phrasing is more natural than the regex format and will be
rewritten into a structured NQL query by Gemini.
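The fallback behaviour described in this section can be sketched with two injected callables. Everything here is hypothetical: `parse_with_fallback`, the stand-in rewriter, and the stand-in regex parser are illustrations only, and the real agent's control flow may differ.

```python
def parse_with_fallback(query, regex_parse, llm_rewrite=None):
    """Try the LLM rewriter first; on any failure, parse the raw query with the regex parser."""
    if llm_rewrite is not None:
        try:
            parsed = regex_parse(llm_rewrite(query))
            if parsed is not None:
                return parsed
        except Exception:
            pass  # Missing key, network error, etc.: fall through without raising.
    return regex_parse(query)


def fake_regex_parse(query):
    # Stand-in for the regex parser: accepts only queries starting with "find tensors".
    return {"structured": query} if query.startswith("find tensors") else None


def working_rewrite(query):
    # Stand-in for a successful LLM call.
    return "find tensors from 'animals' where metadata.source = 'mobile'"


def failing_rewrite(query):
    # Stand-in for an unavailable model or missing API key.
    raise RuntimeError("LLM unavailable")


print(parse_with_fallback("show me dogs from mobile", fake_regex_parse, working_rewrite))
print(parse_with_fallback("show me dogs from mobile", fake_regex_parse, failing_rewrite))
```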

## Agent Details

### Data Ingestion Agent

*   **Functionality:** Monitors a source directory for new files, preprocesses them into tensors, and inserts them into TensorStorage.
*   **Supported File Types:** CSV, PNG, JPG, JPEG, TIF, TIFF (can be extended).
*   **Preprocessing:** Uses default functions for CSV and images (resize, normalize).
*   **Configuration:**
    *   `source_directory`: The directory to monitor.
    *   `polling_interval_sec`: How often to check for new files.
    *   `preprocessing_rules`: A dictionary mapping file extensions to custom preprocessing functions.
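
The directory-polling step can be sketched with the standard library alone. This is not the agent's implementation: `find_new_files` is a hypothetical helper, and tracking already-seen paths in a set is an assumption about how "new" files are detected.

```python
import tempfile
from pathlib import Path

SUPPORTED_EXTENSIONS = {".csv", ".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def find_new_files(source_directory, seen):
    """Return supported files not yet in `seen`, and record them as seen."""
    new_files = []
    for path in sorted(Path(source_directory).iterdir()):
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files


# Demo in a temporary directory: only the CSV is picked up, and only once.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "readings.csv").write_text("1,2,3\n")
    (Path(tmp) / "notes.txt").write_text("ignored\n")
    seen_paths = set()
    first_pass = [p.name for p in find_new_files(tmp, seen_paths)]
    second_pass = [p.name for p in find_new_files(tmp, seen_paths)]
print(first_pass, second_pass)
```

A real loop would call `find_new_files` every `polling_interval_sec` seconds and hand each new path to the preprocessing function registered for its extension.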

### RL Agent

*   **Functionality:** A Deep Q-Network (DQN) agent that learns from experiences stored in TensorStorage.
*   **Environment:** Uses a `DummyEnv` for demonstration.
*   **Experience Storage:** Stores experiences (state, action, reward, next_state, done) in TensorStorage.
*   **Training:** Implements epsilon-greedy exploration and target network updates.
*   **Configuration:**
    *   `state_dim`: Dimensionality of the environment state.
    *   `action_dim`: Number of discrete actions.
    *   `hidden_size`: Hidden layer size for the DQN.
    *   `lr`: Learning rate.
    *   `gamma`: Discount factor.
    *   `epsilon_*`: Epsilon-greedy parameters.
    *   `target_update_freq`: Target network update frequency.
    *   `batch_size`: Experience batch size.
    *   `experience_dataset`: Dataset name for experiences.
    *   `state_dataset`: Dataset name for state tensors.
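
Epsilon-greedy exploration with a decaying epsilon can be sketched without any deep-learning dependency. The parameter names `epsilon_start`, `epsilon_end`, and `epsilon_decay` are assumptions standing in for the `epsilon_*` settings, and the exponential schedule is one common choice rather than the agent's confirmed behaviour.

```python
import random


def epsilon_at(step, epsilon_start=1.0, epsilon_end=0.05, epsilon_decay=0.995):
    """Exponentially decay epsilon per step, never dropping below epsilon_end."""
    return max(epsilon_end, epsilon_start * (epsilon_decay ** step))


def select_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise the highest-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda action: q_values[action])


q_values = [0.1, 0.9, 0.3]  # Stand-in for the DQN's output on one state.
print(epsilon_at(0), epsilon_at(10_000))
print(select_action(q_values, epsilon=0.0))
```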

### AutoML Agent

*   **Functionality:** Performs hyperparameter optimization using random search.
*   **Model:** Trains a simple `DummyMLP` model.
*   **Search Space:** Configurable hyperparameter search space (learning rate, hidden size, activation).
*   **Evaluation:** Trains and evaluates models on synthetic data.
*   **Results:** Stores trial results (parameters, score) in TensorStorage.
*   **Configuration:**
    *   `search_space`: Dictionary defining the hyperparameter search space.
    *   `input_dim`: Input dimension for the model.
    *   `output_dim`: Output dimension for the model.
    *   `task_type`: Type of task ('regression' or 'classification').
    *   `results_dataset`: Dataset name for storing results.
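
Random search over a hyperparameter space can be sketched as follows. The search-space encoding (tuples for continuous ranges, lists for discrete choices) and the toy objective are assumptions for illustration; the agent's own `search_space` format and its `DummyMLP` training loop are not shown here.

```python
import random


def sample_params(search_space, rng):
    """Draw one configuration: lists are choices, (low, high) tuples are uniform ranges."""
    params = {}
    for name, spec in search_space.items():
        if isinstance(spec, tuple):
            low, high = spec
            params[name] = rng.uniform(low, high)
        else:
            params[name] = rng.choice(spec)
    return params


def random_search(search_space, evaluate, n_trials=20, seed=0):
    """Return (best_params, best_score, trials), where lower scores are better."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        params = sample_params(search_space, rng)
        trials.append((params, evaluate(params)))
    best_params, best_score = min(trials, key=lambda trial: trial[1])
    return best_params, best_score, trials


def toy_loss(params):
    # Stand-in for "train a model and return its validation loss".
    penalty = 0.0 if params["hidden_size"] == 64 else 0.1
    return abs(params["lr"] - 0.01) + penalty


search_space = {
    "lr": (1e-4, 1e-1),
    "hidden_size": [32, 64, 128],
    "activation": ["relu", "tanh"],
}
best_params, best_score, trials = random_search(search_space, toy_loss)
print(best_params, best_score)
```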

### Embedding Agent

*   **Functionality:** Multi-provider embedding generation with intelligent caching and vector database integration.
*   **Providers:** Supports Sentence Transformers, OpenAI, and extensible architecture for additional providers.
*   **Features:** Automatic batching, embedding caching, vector indexing, and performance monitoring.
*   **Configuration:**
    *   `default_provider`: Default embedding provider to use.
    *   `default_model`: Default model for embedding generation.
    *   `batch_size`: Batch size for embedding generation.
    *   `cache_ttl`: Time-to-live for embedding cache entries.
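
A time-to-live cache in front of an embedding function can be sketched like this. `TTLCache` is a hypothetical class rather than the agent's cache, and the injected clock exists only so the expiry behaviour can be demonstrated without waiting.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # Fresh hit: skip the expensive call.
        value = compute(key)
        self._entries[key] = (now, value)
        return value


calls = []


def fake_embed(text):
    # Stand-in for a real embedding model call.
    calls.append(text)
    return [float(len(text))]


fake_now = [0.0]
cache = TTLCache(ttl=60, clock=lambda: fake_now[0])
cache.get_or_compute("hello", fake_embed)
cache.get_or_compute("hello", fake_embed)  # Served from the cache.
fake_now[0] = 61.0
cache.get_or_compute("hello", fake_embed)  # Entry expired, so it is recomputed.
print(len(calls))
```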

## Tensorus Models

The collection of example models previously bundled with Tensorus now lives in
a separate repository: [tensorus/models](https://github.com/tensorus/models).
Install it with:

```bash
pip install tensorus-models
```

When the package is installed, Tensorus will automatically import it. Set the
environment variable `TENSORUS_MINIMAL_IMPORT=1` before importing Tensorus to
skip this optional dependency and keep startup lightweight.

## Basic Tensor Operations

This section details the core tensor manipulation functionalities provided by `tensor_ops.py`. These operations are designed to be robust, with built-in type and shape checking where appropriate.

#### Arithmetic Operations

*   `add(t1, t2)`: Element-wise addition of two tensors, or a tensor and a scalar.
*   `subtract(t1, t2)`: Element-wise subtraction of two tensors, or a tensor and a scalar.
*   `multiply(t1, t2)`: Element-wise multiplication of two tensors, or a tensor and a scalar.
*   `divide(t1, t2)`: Element-wise division of two tensors, or a tensor and a scalar. Includes checks for division by zero.
*   `power(t1, t2)`: Raises each element in `t1` to the power of `t2`. Supports tensor or scalar exponents.
*   `log(tensor)`: Element-wise natural logarithm with warnings for non-positive values.

#### Matrix and Dot Operations

*   `matmul(t1, t2)`: Matrix multiplication of two tensors, supporting various dimensionalities (e.g., 2D matrices, batched matrix multiplication).
*   `dot(t1, t2)`: Computes the dot product of two 1D tensors.
*   `outer(t1, t2)`: Computes the outer product of two 1-D tensors.
*   `cross(t1, t2, dim=-1)`: Computes the cross product along the specified dimension (size must be 3).
*   `matrix_eigendecomposition(matrix_A)`: Returns eigenvalues and eigenvectors of a square matrix.
*   `matrix_trace(matrix_A)`: Computes the trace of a 2-D matrix.
*   `tensor_trace(tensor_A, axis1=0, axis2=1)`: Trace of a tensor along two axes.
*   `svd(matrix)`: Singular value decomposition of a matrix, returns `U`, `S`, and `Vh`.
*   `qr_decomposition(matrix)`: QR decomposition returning `Q` and `R`.
*   `lu_decomposition(matrix)`: LU decomposition returning permutation `P`, lower `L`, and upper `U` matrices.
*   `cholesky_decomposition(matrix)`: Cholesky factor of a symmetric positive-definite matrix.
*   `matrix_inverse(matrix)`: Inverse of a square matrix.
*   `matrix_determinant(matrix)`: Determinant of a square matrix.
*   `matrix_rank(matrix)`: Rank of a matrix.

#### Reduction Operations

*   `sum(tensor, dim=None, keepdim=False)`: Computes the sum of tensor elements over specified dimensions.
*   `mean(tensor, dim=None, keepdim=False)`: Computes the mean of tensor elements over specified dimensions. Tensor is cast to float for calculation.
*   `min(tensor, dim=None, keepdim=False)`: Finds the minimum value in a tensor, optionally along a dimension. Returns values and indices if `dim` is specified.
*   `max(tensor, dim=None, keepdim=False)`: Finds the maximum value in a tensor, optionally along a dimension. Returns values and indices if `dim` is specified.
*   `variance(tensor, dim=None, unbiased=False, keepdim=False)`: Variance of tensor elements.
*   `covariance(matrix_X, matrix_Y=None, rowvar=True, bias=False, ddof=None)`: Covariance matrix estimation.
*   `correlation(matrix_X, matrix_Y=None, rowvar=True)`: Correlation coefficient matrix.

#### Reshaping and Slicing

*   `reshape(tensor, shape)`: Changes the shape of a tensor without changing its data.
*   `transpose(tensor, dim0, dim1)`: Swaps two dimensions of a tensor.
*   `permute(tensor, dims)`: Permutes the dimensions of a tensor according to the specified order.
*   `flatten(tensor, start_dim=0, end_dim=-1)`: Flattens a range of dimensions into a single dimension.
*   `squeeze(tensor, dim=None)`: Removes dimensions of size 1, or a specific dimension if provided.
*   `unsqueeze(tensor, dim)`: Inserts a dimension of size 1 at the given position.

#### Concatenation and Splitting

*   `concatenate(tensors, dim=0)`: Joins a sequence of tensors along an existing dimension.
*   `stack(tensors, dim=0)`: Joins a sequence of tensors along a new dimension.

#### Advanced Operations

*   `einsum(equation, *tensors)`: Applies Einstein summation convention to the input tensors based on the provided equation string.
*   `compute_gradient(func, tensor)`: Returns the gradient of a scalar `func` with respect to `tensor`.
*   `compute_jacobian(func, tensor)`: Computes the Jacobian matrix of a vector function.
*   `convolve_1d(signal_x, kernel_w, mode='valid')`: 1-D convolution using `torch.nn.functional.conv1d`.
*   `convolve_2d(image_I, kernel_K, mode='valid')`: 2-D convolution using `torch.nn.functional.conv2d`.
*   `frobenius_norm(tensor)`: Calculates the Frobenius norm.
*   `l1_norm(tensor)`: Calculates the L1 norm (sum of absolute values).

## Tensor Decomposition Operations

Tensorus includes a library of higher-order tensor factorizations in
`tensor_decompositions.py`. These operations mirror the algorithms
available in TensorLy and related libraries.

* **CP Decomposition** – Canonical Polyadic factorization returning
  weights and factor matrices.
* **NTF-CP Decomposition** – Non-negative CP using
  `non_negative_parafac`.
* **Tucker Decomposition** – Standard Tucker factorization for specified
  ranks.
* **Non-negative Tucker / Partial Tucker** – Variants with HOOI and
  non-negative constraints.
* **HOSVD** – Higher-order SVD (Tucker with full ranks).
* **Tensor Train (TT)** – Sequence of TT cores representing the tensor.
* **TT-SVD** – TT factorization via SVD initialization.
* **Tensor Ring (TR)** – Circular variant of TT.
* **Hierarchical Tucker (HT)** – Decomposition using a dimension tree.
* **Block Term Decomposition (BTD)** – Sum of Tucker-1 terms for 3-way
  tensors.
* **t-SVD** – Tensor singular value decomposition based on the
  t-product.

Examples of how to call these methods are provided in
[`tensorus/tensor_decompositions.py`](tensorus/tensor_decompositions.py).

## Vector Database Features

### Embedding Generation

Tensorus supports multiple embedding providers for generating high-quality vector representations of text:

*   **Sentence Transformers**: Local models including all-MiniLM-L6-v2, all-mpnet-base-v2, and specialized models
*   **OpenAI**: Cloud-based models like text-embedding-3-small and text-embedding-3-large
*   **Extensible Architecture**: Easy integration of additional embedding providers

### Vector Indexing

Advanced vector indexing capabilities for efficient similarity search:

*   **Geometric Partitioning**: Automatic distribution of vectors across partitions using k-means clustering
*   **Freshness Layers**: Real-time updates without requiring full index rebuilds
*   **FAISS Integration**: High-performance similarity search with multiple distance metrics
*   **Multi-tenancy**: Namespace and tenant isolation for secure multi-user deployments

### Hybrid Search

Unique hybrid search capabilities that combine semantic similarity with computational tensor properties:

*   **Semantic Scoring**: Traditional vector similarity search based on text embeddings
*   **Computational Scoring**: Mathematical property evaluation including shape compatibility, sparsity, rank analysis
*   **Operation Compatibility**: Scoring tensors based on suitability for specific mathematical operations
*   **Combined Ranking**: Weighted combination of semantic and computational relevance scores
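
The combined ranking can be pictured as a weighted sum of the two scores, using the `similarity_weight` and `computation_weight` fields from the hybrid search request. The linear form is an assumption, as are the helper names and the candidate scores, which are made up for the demonstration.

```python
def combined_score(semantic, computational, similarity_weight=0.7, computation_weight=0.3):
    """Weighted sum of the two relevance scores (both assumed to lie in [0, 1])."""
    return similarity_weight * semantic + computation_weight * computational


def rank_candidates(candidates, **weights):
    """Sort (record_id, semantic, computational) tuples by combined score, best first."""
    scored = [
        (record_id, combined_score(semantic, computational, **weights))
        for record_id, semantic, computational in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Illustrative scores only.
candidates = [("rec_a", 0.92, 0.10), ("rec_b", 0.80, 0.90), ("rec_c", 0.50, 0.95)]
ranking = rank_candidates(candidates)
print([record_id for record_id, _ in ranking])
```

Shifting weight toward `computation_weight` favours tensors whose mathematical properties fit the requested operations, even when their text similarity is lower.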

### Tensor Workflows

Execute complex mathematical workflows with full computational lineage tracking:

*   **Workflow Execution**: Chain multiple tensor operations with intermediate result storage
*   **Lineage Tracking**: Complete provenance tracking of tensor transformations
*   **Scientific Reproducibility**: Full audit trail of computational steps for research applications
*   **Intermediate Storage**: Optional preservation of intermediate results for analysis
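
The idea of chaining operations while recording lineage can be sketched in a few lines. `run_workflow`, the `t0`, `t1`, ... identifiers, and the list-based "tensors" are all hypothetical; the real workflow endpoint stores tensors and lineage in Tensorus rather than in local variables.

```python
def run_workflow(initial, steps):
    """Apply named steps in order, recording which value each result was derived from."""
    lineage = []
    current_id, current = "t0", initial
    for index, (name, operation) in enumerate(steps, start=1):
        result = operation(current)
        result_id = f"t{index}"
        lineage.append({"operation": name, "input": current_id, "output": result_id})
        current_id, current = result_id, result
    return current, lineage


steps = [
    ("scale", lambda values: [2 * v for v in values]),
    ("shift", lambda values: [v + 1 for v in values]),
    ("sum", sum),
]
result, lineage = run_workflow([1, 2, 3], steps)
print(result)
for entry in lineage:
    print(entry)
```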

## Completed Features

The current codebase implements all of the items listed in
[Key Features](#key-features). Tensorus already provides efficient tensor
storage with optional file persistence, a natural query language, a flexible
agent framework, a RESTful API, a Streamlit UI, robust tensor operations, and
advanced vector database capabilities. The modular architecture makes future
extensions straightforward.

## Future Implementation

*   **Enhanced NQL:** Integrate a local or remote LLM for more robust natural language understanding.
*   **Advanced Agents:** Develop more sophisticated agents for specific tasks (e.g., anomaly detection, forecasting).
*   **Persistent Storage Backend:** Replace/augment current file-based persistence with more robust database or cloud storage solutions (e.g., PostgreSQL, S3, MinIO).
*   **Advanced Vector Indexing:** Implement HNSW and IVF-PQ algorithms for even more efficient similarity search.
*   **Scalability & Performance:**
    *   Implement tensor chunking for very large tensors.
    *   Optimize query performance with indexing.
    *   Asynchronous operations for agents and API calls.
*   **Security:** Implement authentication and authorization mechanisms for the API and UI.
*   **Real-World Integration:**
    *   Connect Ingestion Agent to more data sources (e.g., cloud storage, databases, APIs).
    *   Integrate RL Agent with real-world environments or more complex simulations.
*   **Advanced AutoML:**
    *   Implement sophisticated search algorithms (e.g., Bayesian Optimization, Hyperband).
    *   Support for diverse model architectures and custom models.
*   **Model Management:** Add capabilities for saving, loading, versioning, and deploying trained models (from RL/AutoML).
*   **Streaming Data Support:** Enhance Ingestion Agent to handle real-time streaming data.
*   **Resource Management:** Add tools and controls for monitoring and managing the resource consumption (CPU, memory) of agents.
*   **Improved UI/UX:** Continuously refine the Streamlit UI for better usability and richer visualizations.
*   **Comprehensive Testing:** Expand unit, integration, and end-to-end tests.
*   **Multi-modal Embeddings:** Support for image, audio, and video embeddings alongside text.
*   **Distributed Architecture:** Multi-node deployments for large-scale vector search workloads.

## 🤝 Community & Contributing

### 💬 Get Help & Support

**Community Resources:**
- **📚 [Documentation Hub](docs/index.md)** - Comprehensive guides and tutorials
- **💬 [GitHub Discussions](https://github.com/tensorus/tensorus/discussions)** - Ask questions and share ideas
- **🐛 [Issue Tracker](https://github.com/tensorus/tensorus/issues)** - Bug reports and feature requests
- **🏷️ [Stack Overflow](https://stackoverflow.com/questions/tagged/tensorus)** - Technical Q&A with the community

**Enterprise Support:**
- **📧 Technical Support**: support@tensorus.com
- **📧 Sales & Partnerships**: sales@tensorus.com
- **📧 Security Issues**: security@tensorus.com

### 🚀 Contributing to Tensorus

We welcome contributions from the community! Here's how to get involved:

#### 🐛 Report Issues
- Use our [issue templates](https://github.com/tensorus/tensorus/issues/new/choose) for bug reports
- Include system information, reproduction steps, and expected behavior
- Search existing issues before creating new ones

#### 🔧 Code Contributions
1. **Fork** the repository and create a feature branch
2. **Develop** with proper tests and documentation
3. **Test** your changes locally using `pytest`
4. **Submit** a pull request with clear description and examples
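
For step 3, `pytest` collects any function whose name starts with `test_` and treats a failed `assert` as a test failure. The sketch below shows the general shape of a test that could accompany a change; `normalize_rows` is a hypothetical helper invented for this example and is not part of the Tensorus codebase.

```python
def normalize_rows(matrix):
    """Scale each row so its entries sum to 1; all-zero rows are returned unchanged."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([value / total for value in row] if total else list(row))
    return result


def test_rows_sum_to_one():
    # Typical input: each output row should sum to 1.
    assert normalize_rows([[1, 3], [2, 2]]) == [[0.25, 0.75], [0.5, 0.5]]


def test_zero_row_is_left_unchanged():
    # Edge case: avoid dividing by zero on an all-zero row.
    assert normalize_rows([[0, 0]]) == [[0, 0]]
```

Saving this as a file whose name starts with `test_` and running `pytest` from the same directory should report both tests as passing.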

#### 📖 Documentation Improvements
- Fix typos, improve clarity, and add examples
- Translate documentation to other languages
- Create tutorials and use case guides
- Update API documentation and code comments

#### 💡 Feature Requests & Ideas
- Propose new features via [GitHub Discussions](https://github.com/tensorus/tensorus/discussions)
- Provide detailed use cases and implementation suggestions
- Participate in design discussions and RFC processes

**Development Resources:**
- **📋 [Contributing Guide](CONTRIBUTING.md)** - Detailed contribution guidelines
- **📜 [Code of Conduct](CODE_OF_CONDUCT.md)** - Community standards and expectations
- **🏗️ [Development Setup](docs/getting_started.md#development-installation)** - Local development environment

## 📄 License & Legal

**MIT License** - See the [LICENSE](LICENSE) file for the complete terms, including the warranty disclaimer.

```
Copyright (c) 2025 Tensorus

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
```

**Third-Party Licenses:** This project includes dependencies with their own licenses. See `requirements.txt` and individual package documentation for details.

---

<div align="center">

### 🌟 Ready to Transform Your Tensor Workflows?

[![Get Started](https://img.shields.io/badge/📚_Get_Started-blue?style=for-the-badge&logo=rocket)](docs/getting_started.md)
[![Live Demo](https://img.shields.io/badge/🚀_Try_Demo-green?style=for-the-badge&logo=play)](https://tensorus-dashboard.hf.space)
[![API Docs](https://img.shields.io/badge/📖_API_Reference-orange?style=for-the-badge&logo=swagger)](docs/api_reference.md)
[![Enterprise](https://img.shields.io/badge/🏢_Enterprise-purple?style=for-the-badge&logo=building)](mailto:sales@tensorus.com)

### ⭐ **Star us on GitHub** | **🔄 Share with your team** | **📢 Follow for updates**

*Tensorus - Empowering Intelligent Tensor Data Management*

</div>

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "tensorus",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "tensor, database, agent, ai, pytorch, fastapi, streamlit, automl, reinforcement-learning, data-ingestion",
    "author": null,
    "author_email": "Tensorus Team <ai@tensorus.com>",
    "download_url": "https://files.pythonhosted.org/packages/13/e6/98e22da049cc91622e5f169e364e5ae04f27d7b66eb437ba6f2c411a30eb/tensorus-0.1.0.tar.gz",
    "platform": null,
    "description": "---\nlicense: mit\ntitle: Core\nsdk: docker\nemoji: \ud83d\udc20\ncolorFrom: blue\ncolorTo: yellow\nshort_description: Tensorus Core\n---\n\n# Tensorus: Agentic Tensor Database/Data Lake\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)\n[![PyPI version](https://img.shields.io/badge/pypi-v0.0.5-blue.svg)](https://pypi.org/project/tensorus/)\n[![Docker](https://img.shields.io/badge/docker-%230db7ed.svg?style=flat&logo=docker&logoColor=white)](https://hub.docker.com/r/tensorus/tensorus)\n[![API Documentation](https://img.shields.io/badge/API-Documentation-green.svg)](https://docs.tensorus.com/api)\n\n> **\ud83c\udf89 New in v0.0.5:** Unified Python SDK with intuitive API, Agent Orchestrator for multi-agent workflows, and comprehensive examples. See [What's New](#-whats-new-in-v005) for details.\n\n**Tensorus** is a production-ready, specialized data platform focused on the management and agent-driven manipulation of tensor data. 
It offers a streamlined environment for storing, retrieving, and operating on tensors at scale, providing the foundation for advanced AI and machine learning workflows.\n\n## \ud83c\udfaf What Makes Tensorus Special\n\nTensorus bridges the gap between traditional databases and AI/ML requirements by providing:\n\n- **\ud83e\udde0 Intelligent Agent Framework** - Built-in agents for data ingestion, reinforcement learning, AutoML, and embedding generation\n- **\u26a1 High-Performance Tensor Operations** - 40+ optimized operations with 10-100x performance improvements\n- **\ud83d\udd0d Natural Language Queries** - Intuitive NQL interface for tensor discovery and analysis\n- **\ud83d\udcca Complete Observability** - Full computational lineage and operation history tracking\n- **\ud83c\udfd7\ufe0f Production-Grade Architecture** - Enterprise security, scaling, and deployment capabilities\n\nThe core purpose of Tensorus is to **simplify and accelerate** how developers and AI agents interact with tensor datasets, enabling faster development of automated data ingestion, reinforcement learning from stored experiences, AutoML processes, and intelligent data utilization in AI projects.\n\n## \ud83d\ude80 Quick Start (3 Minutes)\n\n### Installation\n```bash\n# Install from PyPI\npip install tensorus\n\n# Or install from source for development\ngit clone https://github.com/tensorus/tensorus.git\ncd tensorus\npip install -e .\n```\n\n### Basic Usage with Python SDK\n```python\nfrom tensorus import Tensorus\nimport torch\n\n# Initialize Tensorus SDK (minimal dependencies)\nts = Tensorus(\n    enable_nql=False,          # Disable if transformers not installed\n    enable_embeddings=False,   # Disable if sentence-transformers not installed\n    enable_vector_search=False\n)\n\n# Create a dataset\nts.create_dataset(\"my_dataset\")\n\n# Create and store tensors\ntensor_a = ts.create_tensor(\n    [[1, 2], [3, 4]], \n    name=\"matrix_a\",\n    dataset=\"my_dataset\"\n)\n\ntensor_b = 
ts.create_tensor(\n    [[5, 6], [7, 8]],\n    name=\"matrix_b\",\n    dataset=\"my_dataset\"\n)\n\n# Perform operations\nresult = ts.matmul(tensor_a.to_tensor(), tensor_b.to_tensor())\nprint(f\"Result shape: {result.shape}\")  # (2, 2)\n\n# List all tensors\ntensors = ts.list_tensors(\"my_dataset\")\nprint(f\"Stored {len(tensors)} tensors\")\n```\n\n### Start the API Server\n```bash\n# Start development server\npython -m uvicorn tensorus.api:app --reload --port 8000\n\n# Access interactive API docs at:\n# - Swagger UI: http://localhost:8000/docs\n# - ReDoc: http://localhost:8000/redoc\n```\n\n## \ud83d\udc0d Python SDK Features\n\nThe Tensorus SDK provides a unified interface for all tensor operations, agent coordination, and data management.\n\n### Core SDK Operations\n\n```python\nfrom tensorus import Tensorus\n\n# Full initialization with all features\nts = Tensorus(\n    enable_nql=True,              # Natural Query Language\n    enable_embeddings=True,       # Embedding generation  \n    enable_vector_search=True,    # Vector similarity search\n    enable_orchestrator=True,     # Multi-agent workflows\n    embedding_model=\"all-MiniLM-L6-v2\"\n)\n\n# Dataset management\nts.create_dataset(\"research_data\")\nts.list_datasets()\nts.delete_dataset(\"old_data\")\n\n# Tensor operations\na = ts.create_tensor([[1, 2], [3, 4]], name=\"matrix_a\", dataset=\"research_data\")\nb = ts.create_tensor([[5, 6], [7, 8]], name=\"matrix_b\", dataset=\"research_data\")\n\n# Mathematical operations\nresult = ts.matmul(a.to_tensor(), b.to_tensor())\ntransposed = ts.transpose(a.to_tensor())\neigenvals = ts.eigenvalues(a.to_tensor())\n\n# Natural language queries (requires enable_nql=True)\nresults = ts.query(\"find tensors in research_data where shape is (2, 2)\")\n\n# Vector operations (requires enable_embeddings=True)\nts.create_index(\"docs\", dimensions=384, metric=\"cosine\")\nts.embed_and_index(\n    texts=[\"Machine learning paper\", \"Deep learning tutorial\"],\n    
index_name=\"docs\",\n    dataset=\"research_data\"\n)\nsearch_results = ts.search(\"neural networks\", index_name=\"docs\", top_k=5)\n\n# Multi-agent workflows (requires enable_orchestrator=True)\nworkflow = ts.create_workflow(\"data_pipeline\")\nts.orchestrator.add_task(workflow, \"embed\", \"embedding\", \"generate\", {...})\nts.orchestrator.add_task(workflow, \"index\", \"vector\", \"index\", {...}, deps=[\"embed\"])\nresults = ts.execute_workflow(workflow)\n```\n\n### SDK Benefits\n\n- **Unified Interface** - Single entry point for all Tensorus capabilities\n- **Lazy Loading** - Agents load only when enabled, reducing dependencies\n- **Type Safety** - Full type hints for IDE autocomplete and validation\n- **Error Handling** - Comprehensive exception handling with helpful messages\n- **Performance** - Optimized for both single-node and distributed workloads\n\n## \ud83d\udcda Documentation\n\nFor comprehensive documentation, including user guides and examples, please visit our [documentation site](https://docs.tensorus.com).\n\n### Interactive API Documentation\n\nAccess the interactive API documentation when the server is running:\n\n- **Swagger UI**: `http://localhost:8000/docs` - Interactive API exploration with \"Try it out\" functionality\n- **ReDoc**: `http://localhost:8000/redoc` - Clean, responsive API documentation\n\n### Quick Links\n- [Getting Started Guide](docs/user_guide.md) - Learn the basics of Tensorus\n- [Examples](examples/) - Practical code examples including `basic_usage.py` and `complete_workflow_example.py`\n- [Deployment Guide](docs/deployment.md) - Production deployment instructions\n\n## \ud83d\udcd6 Comprehensive Documentation\n\n### \ud83d\udcda Learning Resources\n- **\ud83c\udf93 [Documentation Hub](docs/index.md)** - Central portal with guided learning paths for all skill levels\n- **\ud83d\ude80 [Getting Started Guide](docs/getting_started.md)** - Complete 15-minute tutorial with real examples\n- **\ud83d\udca1 [Use Case 
Examples](examples/)** - Real-world implementations and practical guides\n\n### \ud83d\udd27 Technical References  \n- **\ud83d\udd0d [Complete API Reference](docs/api_reference.md)** - Full REST API documentation with code samples\n- **\ud83c\udfed [Production Deployment](docs/production_deployment.md)** - Enterprise deployment strategies and operations\n- **\u26a1 [Performance & Scaling](docs/performance_benchmarks.md)** - Benchmarks, optimization, and capacity planning\n\n### \ud83c\udfe2 Business & Strategy\n- **\ud83c\udfaf [Executive Overview](docs/executive_overview.md)** - Product positioning, market analysis, and business value\n- **\ud83d\udcca [Architecture Guide](docs/index.md#architecture-highlights)** - System design and technical architecture\n\n## \ud83d\udce6 What's New in v0.0.5\n\n**Major Release** - Unified SDK and Agent Orchestration\n\n### New Features\n- \u2728 **Unified Tensorus SDK** - Single `Tensorus` class with intuitive API for all operations\n- \ud83e\udd16 **Agent Orchestrator** - Multi-agent workflow coordination with DAG-based execution\n- \ud83d\udcda **Updated Examples** - All examples now use the new SDK (`examples/basic_usage.py`, `examples/complete_workflow_example.py`)\n- \ud83d\udcca **Benchmarking Suite** - Comprehensive performance testing framework (`benchmarks/benchmark_suite.py`)\n- \ud83d\udd27 **Lazy Agent Loading** - Agents only load when enabled, reducing startup dependencies\n- \ud83d\udcdd **Enhanced Documentation** - Complete SDK reference and implementation guides\n\n### Breaking Changes\n- **SDK Interface** - New unified API replaces direct component access (migration is straightforward - see Quick Start)\n- **Optional Dependencies** - NQL, embeddings, and vector search now require explicit enabling\n\n### Improvements\n- Better error messages for missing dependencies\n- Cleaner separation of concerns\n- Improved performance through optimized initialization\n- More intuitive API naming\n\nSee 
[QUICKSTART.md](QUICKSTART.md) for migration guide and [examples/](examples/) for updated code samples.\n\n## Table of Contents\n\n- [What's New in v0.0.5](#-whats-new-in-v005)\n- [Python SDK Features](#-python-sdk-features)\n- [Key Features](#key-features)\n- [Project Structure](#project-structure)\n- [Demos](#demos)\n- [Architecture](#architecture)\n- [Getting Started](#getting-started)\n  - [Prerequisites](#prerequisites)\n  - [Installation](#installation)\n  - [Running the API Server](#running-the-api-server)\n  - [Running the Streamlit UI](#running-the-streamlit-ui)\n  - [Model Context Protocol Integration](#model-context-protocol-integration)\n  - [Running the Agents (Examples)](#running-the-agents-examples)\n- [Docker Deployment](#docker-deployment)\n- [Environment Configuration](#environment-configuration)\n- [Production Deployment](#production-deployment)\n- [Testing](#testing)\n- [Using Tensorus](#using-tensorus)\n  - [API Basics](#api-basics)\n  - [Authentication Examples](#authentication-examples)\n  - [NQL Query Example](#nql-query-example)\n  - [API Endpoints](#api-endpoints)\n  - [Vector Database Examples](#vector-database-examples)\n  - [Request/Response Schemas](#requestresponse-schemas)\n  - [Dataset API Examples](#dataset-api-examples)\n  - [Dataset Schemas](#dataset-schemas)\n- [Metadata System](#metadata-system)\n- [Streamlit UI](#streamlit-ui)\n- [Natural Query Language (NQL)](#natural-query-language-nql)\n- [Agent Details](#agent-details)\n- [Tensorus Models](#tensorus-models)\n- [Basic Tensor Operations](#basic-tensor-operations)\n- [Tensor Decomposition Operations](#tensor-decomposition-operations)\n- [Vector Database Features](#vector-database-features)\n- [Completed Features](#completed-features)\n- [Future Implementation](#future-implementation)\n- [Contributing](#contributing)\n- [License](#license)\n\n## \ud83c\udf1f Core Capabilities\n\n### \ud83d\uddc4\ufe0f Advanced Tensor Storage System\n*   **High-Performance Storage** - 
Efficiently store and retrieve PyTorch tensors with rich metadata support\n*   **Intelligent Compression** - Multiple algorithms (LZ4, GZIP, quantization) with up to 4x space savings\n*   **Schema Validation** - Optional per-dataset schemas enforce metadata fields and tensor shape/dtype constraints\n*   **Chunked Processing** - Handle tensors larger than available memory through intelligent chunking\n*   **Multi-Backend Support** - Local filesystem, PostgreSQL, S3, and cloud storage backends\n\n### \ud83e\udd16 Intelligent Agent Ecosystem  \n*   **Data Ingestion Agent** - Automatically monitors directories and ingests files as tensors with preprocessing\n*   **Reinforcement Learning Agent** - Deep Q-Network (DQN) agent that learns from experiences stored in tensor datasets  \n*   **AutoML Agent** - Hyperparameter optimization and model selection using advanced search algorithms\n*   **Embedding Agent** - Multi-provider embedding generation with intelligent caching and vector indexing\n*   **Extensible Framework** - Build custom agents that interact intelligently with your tensor data\n\n### \ud83d\udd0d Advanced Query & Search Engine\n*   **Natural Query Language (NQL)** - Query tensor data using intuitive, natural language-like syntax\n*   **Vector Database Integration** - Advanced similarity search with multi-provider embedding generation\n*   **Hybrid Search** - Combine semantic similarity with computational tensor properties  \n*   **Geometric Partitioning** - Efficient vector indexing with automatic clustering and freshness layers\n\n### \ud83d\udd2c Production-Grade Operations\n*   **40+ Tensor Operations** - Comprehensive library covering arithmetic, linear algebra, decompositions, and advanced operations\n*   **Computational Lineage** - Complete tracking of tensor transformations for reproducible scientific workflows\n*   **Operation History** - Full audit trail with performance metrics and error tracking\n*   **Asynchronous Processing** - Background 
operations and job queuing for long-running computations\n\n### \ud83c\udf10 Developer-Friendly Interface\n*   **RESTful API** - FastAPI backend with comprehensive OpenAPI documentation and authentication\n*   **Interactive Web UI** - Streamlit-based dashboard for data exploration and agent control\n*   **Python SDK** - Rich client library with intuitive APIs and comprehensive error handling\n*   **Model Context Protocol** - Standardized integration for AI agents and LLMs via [tensorus/mcp](https://github.com/tensorus/mcp)\n\n### \ud83d\udcca Enterprise Features\n*   **Rich Metadata System** - Pydantic schemas for semantic, lineage, computational, quality, and usage metadata\n*   **Security & Authentication** - API key management, role-based access control, and audit logging  \n*   **Monitoring & Observability** - Health checks, performance metrics, and comprehensive logging\n*   **Scalable Architecture** - Horizontal scaling, load balancing, and distributed processing capabilities\n\n## Project Structure\n\n*   `app.py`: The main Streamlit frontend application (located at the project root).\n*   `pages/`: Directory containing individual Streamlit page scripts and shared UI utilities for the dashboard.\n    *   `pages/ui_utils.py`: Utility functions specifically for the Streamlit UI.\n    *   *(Other page scripts like `01_dashboard.py`, `02_control_panel.py`, etc., define the different views of the dashboard)*\n*   `tensorus/`: Directory containing the core `tensorus` library modules (this is the main installable package).\n    *   `tensorus/__init__.py`: Makes `tensorus` a Python package.\n    *   `tensorus/api.py`: The FastAPI application providing the backend API for Tensorus.\n    *   `tensorus/tensor_storage.py`: Core TensorStorage implementation for managing tensor data.\n    *   `tensorus/tensor_ops.py`: Library of functions for tensor manipulations.\n    *   `tensorus/vector_database.py`: Advanced vector indexing with geometric partitioning and freshness 
layers.\n    *   `tensorus/embedding_agent.py`: Multi-provider embedding generation and vector database integration.\n    *   `tensorus/hybrid_search.py`: Hybrid search engine combining semantic similarity with computational tensor properties.\n    *   `tensorus/nql_agent.py`: Agent for processing Natural Query Language queries.\n    *   `tensorus/ingestion_agent.py`: Agent for ingesting data from various sources.\n    *   `tensorus/rl_agent.py`: Agent for Reinforcement Learning tasks.\n    *   `tensorus/automl_agent.py`: Agent for AutoML processes.\n    *   `tensorus/dummy_env.py`: A simple environment for the RL agent demonstration.\n    *   *(Other Python files within `tensorus/` are part of the core library.)*\n*   `requirements.txt`: Lists the project's Python dependencies for development and local execution.\n*   `pyproject.toml`: Project metadata, dependencies for distribution, and build system configuration (e.g., for PyPI).\n*   `README.md`: This file.\n*   `LICENSE`: Project license file.\n*   `.gitignore`: Specifies intentionally untracked files that Git should ignore.\n\n## \ud83c\udf10 Live Demos & Integrations\n\n### \ud83d\ude80 Try Tensorus Online (No Installation Required)\n\nExperience Tensorus directly in your browser via Huggingface Spaces:\n\n*   **\ud83d\udd17 [Interactive API Documentation](https://tensorus-api.hf.space/docs)** - Full Swagger UI with live examples and real-time testing\n*   **\ud83d\udcd6 [Alternative API Docs](https://tensorus-api.hf.space/redoc)** - Clean ReDoc interface with detailed schemas\n*   **\ud83d\udcca [Web Dashboard Demo](https://tensorus-dashboard.hf.space)** - Complete Streamlit UI for data exploration and agent control\n\n### \ud83e\udd16 AI Agent Integration\n\n**Model Context Protocol (MCP) Support** - Standardized integration for AI agents and LLMs:\n*   **Repository:** [tensorus/mcp](https://github.com/tensorus/mcp) - Complete MCP server implementation\n*   **Features:** Standardized protocol access to all 
Tensorus capabilities\n*   **Use Cases:** LLM-driven tensor analysis, automated data workflows, intelligent agent interactions\n\n## Architecture\n\n### Tensorus Execution Cycle\n\n```mermaid\ngraph TD\n    %% User Interface Layer\n    subgraph UI_Layer [\"User Interaction\"]\n        UI[Streamlit UI]\n    end\n\n    %% API Gateway Layer\n    subgraph API_Layer [\"Backend Services\"]\n        API[FastAPI Backend]\n    end\n\n    %% Core Storage with Method Interface\n    subgraph Storage_Layer [\"Core Storage - TensorStorage\"]\n        TS[TensorStorage Core]\n        subgraph Storage_Methods [\"Storage Interface\"]\n            TS_insert[insert data metadata]\n            TS_query[query query_fn]\n            TS_get[get_by_id id]\n            TS_sample[sample n]\n            TS_update[update_metadata]\n        end\n        TS --- Storage_Methods\n    end\n\n    %% Agent Processing Layer\n    subgraph Agent_Layer [\"Processing Agents\"]\n        IA[Ingestion Agent]\n        NQLA[NQL Agent]\n        RLA[RL Agent]\n        AutoMLA[AutoML Agent]\n        EA[Embedding Agent]\n    end\n\n    %% Vector Database Layer\n    subgraph Vector_Layer [\"Vector Database\"]\n        VDB[Vector Index Manager]\n        HSE[Hybrid Search Engine]\n    end\n\n    %% Model System\n    subgraph Model_Layer [\"Model System\"]\n        Registry[Model Registry]\n        ModelsPkg[Models Package]\n    end\n\n    %% Tensor Operations Library\n    subgraph Ops_Layer [\"Tensor Operations\"]\n        TOps[TensorOps Library]\n    end\n\n    %% Primary UI Flow\n    UI -->|HTTP Requests| API\n\n    %% API Orchestration\n    API -->|Command Dispatch| IA\n    API -->|Command Dispatch| NQLA\n    API -->|Command Dispatch| RLA\n    API -->|Command Dispatch| AutoMLA\n    API -->|Vector Operations| EA\n    API -->|Model Training| Registry\n    API -->|Direct Query| TS_query\n\n    %% Vector Database Integration\n    EA -->|Vector Indexing| VDB\n    HSE -->|Hybrid Search| VDB\n    API -->|Search Requests| 
HSE\n\n    %% Model System Interactions\n    Registry -->|Uses Models| ModelsPkg\n    Registry -->|Load/Save| TS\n    ModelsPkg -->|Tensor Ops| TOps\n\n    %% Agent Storage Interactions\n    IA -->|Data Ingestion| TS_insert\n\n    NQLA -->|Query Execution| TS_query\n    NQLA -->|Record Retrieval| TS_get\n\n    RLA -->|State Persistence| TS_insert\n    RLA -->|Experience Sampling| TS_sample\n    RLA -->|State Retrieval| TS_get\n\n    AutoMLA -->|Trial Storage| TS_insert\n    AutoMLA -->|Data Retrieval| TS_query\n\n    EA -->|Embedding Storage| TS_insert\n    EA -->|Vector Retrieval| TS_query\n\n    %% Computational Operations\n    NQLA -->|Vector Operations| TOps\n    RLA -->|Policy Evaluation| TOps\n    AutoMLA -->|Model Optimization| TOps\n    HSE -->|Tensor Analysis| TOps\n\n    %% Indirect Storage Write-back\n    TOps -.->|Intermediate Results| TS_insert\n```\n\n## \ud83d\ude80 Installation & Setup\n\n### \ud83d\udccb System Requirements\n\n#### Minimum Requirements\n*   **Python:** 3.10+ (3.11+ recommended for best performance)\n*   **Memory:** 4 GB RAM (8+ GB recommended)\n*   **Storage:** 10 GB available disk space\n*   **OS:** Linux, macOS, Windows with WSL2\n\n#### Production Requirements\n*   **CPU:** 8+ cores with 16+ threads\n*   **Memory:** 32+ GB RAM (64+ GB for large tensor workloads)\n*   **Storage:** 1 TB+ NVMe SSD for optimal I/O performance  \n*   **Network:** 10+ Gbps for distributed deployments\n*   **See:** [Production Deployment Guide](docs/production_deployment.md) for detailed specifications\n\n### \ud83d\udd27 Installation Options\n\n#### Option 1: Quick Install (Recommended for New Users)\n```bash\n# Install latest stable version from PyPI\npip install tensorus\n\n# Start development server\ntensorus start --dev\n\n# Access web interface at http://localhost:8000\n# API documentation at http://localhost:8000/docs\n```\n\n#### Option 2: Feature-Specific Installation\n```bash\n# Install with GPU acceleration support\npip install 
tensorus[gpu]\n\n# Install with advanced compression algorithms\npip install tensorus[compression]\n\n# Install with monitoring and metrics\npip install tensorus[monitoring]\n\n# Install everything (enterprise features)\npip install tensorus[all]\n```\n\n#### Option 3: Development Installation\n```bash\n# Clone repository for development and contributions\ngit clone https://github.com/tensorus/tensorus.git\ncd tensorus\n\n# Create isolated virtual environment\npython3 -m venv venv\nsource venv/bin/activate  # Linux/macOS\n# venv\\Scripts\\activate   # Windows\n\n# Install in development mode with all dependencies\n./setup.sh\n```\n\n**Development Installation Notes:**\n- Uses `requirements.txt` and `requirements-test.txt` for full dependency management\n- Installs CPU-optimized PyTorch wheels (modify `setup.sh` for GPU versions)\n- Includes testing frameworks and development tools\n- Heavy ML libraries (`xgboost`, `lightgbm`, etc.) available via `pip install tensorus[models]`\n- Audit logging to `tensorus_audit.log` (configurable via `TENSORUS_AUDIT_LOG_PATH`)\n\n#### Option 4: Container Deployment\n```bash\n# Production deployment with PostgreSQL backend\ndocker compose up --build\n\n# Quick testing with in-memory storage\ndocker run -p 8000:8000 tensorus/tensorus:latest\n\n# Custom configuration with environment variables\ndocker run -p 8000:8000 \\\n  -e TENSORUS_STORAGE_BACKEND=postgres \\\n  -e TENSORUS_API_KEYS=your-api-key \\\n  tensorus/tensorus:latest\n```\n\n## \u26a1 Performance & Scalability\n\n### \ud83c\udfc6 Benchmark Results\nTensorus delivers **10-100x performance improvements** over traditional file-based tensor storage:\n\n| Operation Type | Traditional Files | Tensorus | Improvement |\n|----------------|------------------|----------|-------------|\n| **Tensor Retrieval** | 280 ops/sec | 15,000 ops/sec | **53.6x faster** |\n| **Query Processing** | 850ms | 45ms | **18.9x faster** |\n| **Storage Efficiency** | 1.0x baseline | 4.0x compressed | 
**75% space saved** |\n| **Vector Search** | 15,000ms | 125ms | **120x faster** |\n| **Concurrent Operations** | 450 ops/sec | 12,000 ops/sec | **26.7x higher throughput** |\n\n### \ud83d\udcc8 Scaling Characteristics\n- **Linear scaling** up to 32+ nodes in distributed deployments\n- **Sub-200ms response times** at enterprise scale (millions of tensors)\n- **99.9% availability** with proper redundancy configuration\n- **Automatic load balancing** and intelligent request routing\n\n## \ud83c\udfaf Use Cases & Applications\n\n### \ud83e\udde0 AI/ML Development & Production\n- **Model Training Pipelines** - Store training data, model checkpoints, and experiment results\n- **Real-time Inference** - Fast retrieval of model weights and feature tensors for serving\n- **Experiment Tracking** - Complete lineage of model development with reproducible workflows\n- **AutoML Platforms** - Automated hyperparameter optimization and model architecture search\n\n### \ud83d\udd2c Scientific Computing & Research\n- **Numerical Simulations** - Large-scale scientific computing with computational provenance\n- **Climate & Weather Modeling** - Multi-dimensional data analysis with temporal tracking\n- **Genomics & Bioinformatics** - DNA sequence analysis, protein folding, and molecular dynamics\n- **Materials Science** - Quantum chemistry simulations and materials property prediction\n\n### \ud83d\udc41\ufe0f Computer Vision & Autonomous Systems\n- **Image/Video Processing** - Efficient storage and retrieval of visual data tensors\n- **Object Detection & Recognition** - Real-time inference with cached model components\n- **Autonomous Vehicles** - Sensor fusion, path planning, and decision-making algorithms\n- **Medical Imaging** - DICOM processing, radiological analysis, and diagnostic AI\n\n### \ud83d\udcb0 Financial Services & Trading\n- **Risk Management** - Real-time portfolio optimization and risk assessment models\n- **Algorithmic Trading** - High-frequency trading with 
microsecond-latency model execution\n- **Fraud Detection** - Anomaly detection in transaction patterns and behavioral analysis\n- **Credit Scoring** - ML-driven creditworthiness assessment with regulatory compliance\n\n### Running the API Server\n\n1.  Navigate to the project root directory (the directory containing the `tensorus` folder and `pyproject.toml`).\n2.  Ensure your virtual environment is activated if you are using one.\n3.  Start the FastAPI backend server using:\n\n    ```bash\n    uvicorn tensorus.api:app --reload --host 127.0.0.1 --port 7860\n    # For external access (e.g., Docker/WSL/other machines), use:\n    # uvicorn tensorus.api:app --host 0.0.0.0 --port 7860\n    ```\n\n    *   This command launches Uvicorn with the `app` instance defined in `tensorus/api.py`.\n    *   Access the API documentation at `http://localhost:7860/docs` or `http://localhost:7860/redoc`.\n    *   All dataset and agent endpoints are available once the server is running.\n\n    To use S3 for tensor dataset persistence instead of local disk, set:\n\n    ```bash\n    export TENSORUS_TENSOR_STORAGE_PATH=\"s3://your-bucket/optional/prefix\"\n    # Ensure AWS credentials are available (env vars, profile, or instance role)\n    uvicorn tensorus.api:app --host 0.0.0.0 --port 7860\n    ```\n\n### Running the Streamlit UI\n\n1.  In a separate terminal (with the virtual environment activated), navigate to the project root.\n2.  
Start the Streamlit frontend:\n\n    ```bash\n    streamlit run app.py\n    ```\n\n    *   Access the UI in your browser at the URL provided by Streamlit (usually `http://localhost:8501`).\n\n### Model Context Protocol Integration\n\nFor AI agents and LLMs that need standardized protocol access to Tensorus capabilities, see the separate [Tensorus MCP package](https://github.com/tensorus/mcp) which provides a complete MCP server implementation.\n\n### Running the Agents (Examples)\n\nYou can run the example agents directly from their respective files:\n\n*   **RL Agent:**\n\n    ```bash\n    python tensorus/rl_agent.py\n    ```\n\n*   **AutoML Agent:**\n\n    ```bash\n    python tensorus/automl_agent.py\n    ```\n\n*   **Ingestion Agent:**\n\n    ```bash\n    python tensorus/ingestion_agent.py\n    ```\n\n    *   Note: The Ingestion Agent will monitor the `temp_ingestion_source` directory (created automatically if it doesn't exist in the project root) for new files.\n\n## Docker Deployment\n\n### Docker Quickstart\n\nRun Tensorus with Docker in two ways: a single container (in\u2011memory storage) or a full stack with PostgreSQL via Docker Compose.\n\n#### Option A: Full stack with PostgreSQL (recommended)\n1. Install Docker Desktop (or Docker Engine) and Docker Compose v2.\n2. Generate an API key:\n\n   ```bash\n   python generate_api_key.py --format env\n   # Copy the value printed after TENSORUS_API_KEYS=\n   ```\n\n3. Open `docker-compose.yml` and set your key. Either:\n   - Replace the placeholder in `TENSORUS_VALID_API_KEYS` with your key, or\n   - Add `TENSORUS_API_KEYS: \"tsr_...\"` alongside it. Both are supported; `TENSORUS_API_KEYS` is preferred.\n\n4. Start the stack from the project root:\n\n   ```bash\n   docker compose up --build\n   ```\n\n   - The API starts on `http://localhost:7860`\n   - PostgreSQL is exposed on host `5433` (container `5432`)\n   - Audit logs are persisted to `./tensorus_audit.log` via a bind mount\n\n5. 
Test authentication (Bearer token is recommended):\n\n   ```bash\n   # Replace tsr_your_key with the key you generated\n   curl -H \"Authorization: Bearer tsr_your_key\" http://localhost:7860/datasets\n   ```\n\nNotes\n- The compose file waits for Postgres to become healthy before starting the app.\n- Legacy header `X-API-KEY: tsr_your_key` is still accepted for backward compatibility.\n\nUseful commands\n```bash\n# View logs\ndocker compose logs -f app\n\n# Rebuild after code changes\ndocker compose up --build --force-recreate\n\n# Stop stack\ndocker compose down\n```\n\n#### Option B: Single container (in\u2011memory storage)\nUse this for quick, ephemeral testing without Postgres.\n\n```bash\ndocker build -t tensorus .\ndocker run --rm -p 7860:7860 \\\n  -e TENSORUS_AUTH_ENABLED=true \\\n  -e TENSORUS_API_KEYS=tsr_your_key \\\n  -e TENSORUS_STORAGE_BACKEND=in_memory \\\n  -v \"$(pwd)/tensorus_audit.log:/app/tensorus_audit.log\" \\\n  tensorus\n```\n\nThen open `http://localhost:7860/docs`.\n\nWSL2 tip: If you run Docker Desktop on Windows with WSL2, `localhost:7860` works from both Windows and the WSL distro. Keep volumes on the Linux side (`/home/...`) for best performance.\n\n#### GPU acceleration (optional)\nThe default image uses CPU wheels. For GPUs, install the NVIDIA Container Toolkit and switch to CUDA\u2011enabled PyTorch wheels in your build (e.g., modify `setup.sh` or your Dockerfile). Pass `--gpus all` to `docker run`.\n\n## Environment Configuration\n\n### Environment configuration (reference)\nTensorus reads configuration from environment variables (prefix `TENSORUS_`). 
Common settings:\n\n- Authentication\n  - `TENSORUS_AUTH_ENABLED` (default: `true`)\n  - `TENSORUS_API_KEYS`: Comma\u2011separated list of keys (recommended)\n  - `TENSORUS_VALID_API_KEYS`: Legacy alternative; comma\u2011separated list or JSON array\n  - Usage: Prefer `Authorization: Bearer tsr_...`; legacy `X-API-KEY` also accepted\n\n- Storage backend\n  - `TENSORUS_STORAGE_BACKEND`: `in_memory` | `postgres` (default: `in_memory`)\n  - Postgres settings (used when the backend is `postgres`):\n    - `TENSORUS_POSTGRES_HOST`, `TENSORUS_POSTGRES_PORT` (default `5432`),\n      `TENSORUS_POSTGRES_USER`, `TENSORUS_POSTGRES_PASSWORD`, `TENSORUS_POSTGRES_DB`\n    - or `TENSORUS_POSTGRES_DSN` (overrides individual fields)\n  - Optional tensor persistence path: `TENSORUS_TENSOR_STORAGE_PATH` (e.g., a local path or URI)\n\n- Security headers\n  - `TENSORUS_X_FRAME_OPTIONS` (default `SAMEORIGIN`; set to `NONE` to omit)\n  - `TENSORUS_CONTENT_SECURITY_POLICY` (default `default-src 'self'`; set to `NONE` to omit)\n\n- Misc\n  - `TENSORUS_AUDIT_LOG_PATH` (default `tensorus_audit.log`)\n  - `TENSORUS_MINIMAL_IMPORT=1` to skip optional model package imports\n  - NQL with LLM: `NQL_USE_LLM=true`, `GOOGLE_API_KEY`, optional `NQL_LLM_MODEL`\n\nExample `.env` (for local runs or compose env_file):\n```bash\nTENSORUS_AUTH_ENABLED=true\nTENSORUS_API_KEYS=tsr_your_key\nTENSORUS_STORAGE_BACKEND=postgres\nTENSORUS_POSTGRES_HOST=db\nTENSORUS_POSTGRES_PORT=5432\nTENSORUS_POSTGRES_USER=tensorus_user\nTENSORUS_POSTGRES_PASSWORD=change_me\nTENSORUS_POSTGRES_DB=tensorus_db\nTENSORUS_AUDIT_LOG_PATH=/app/tensorus_audit.log\n```\n\n## Production Deployment\n\n### Production deployment with Docker (step\u2011by\u2011step)\nThis example uses Docker Compose with PostgreSQL. Adjust for your infra as needed.\n\n1. Generate and store your API key securely\n   - `python generate_api_key.py --format env`\n   - Prefer secret management (Docker/Swarm/K8s/Vault). 
For Compose, you can use a file\u2011based secret:\n\n     ```bash\n     # secrets/api_key.txt contains only your key value (no quotes)\n     echo \"tsr_prod_key_...\" > secrets/api_key.txt\n     ```\n\n2. Configure Compose for production\n   - Edit `docker-compose.yml` and set:\n     - `TENSORUS_AUTH_ENABLED: \"true\"`\n     - `TENSORUS_API_KEYS: ${TENSORUS_API_KEYS:-}` or point to a secret/file\n     - `TENSORUS_STORAGE_BACKEND: postgres` and your Postgres credentials\n   - Optionally add `env_file: .env` and put non\u2011secret config there.\n\n3. Harden runtime\n   - Put Tensorus behind a reverse proxy (Nginx/Traefik) with TLS\n   - Restrict CORS/hosts at the proxy; the app currently allows all by default\n   - Set security headers via env vars (see below)\n\n4. Start and verify\n   ```bash\n   docker compose up -d --build\n   docker compose ps\n   curl -f -H \"Authorization: Bearer tsr_prod_key_...\" http://localhost:7860/ || echo \"API not ready\"\n   ```\n\n5. Health and logs\n   - Postgres health is checked automatically; the app waits until healthy\n   - `docker compose logs -f app`\n\nSecurity headers\n- Override defaults to match your CSP and embedding needs. 
If set to `NONE` or empty, the header is omitted.\n\n```bash\n# Example: allow Swagger/ReDoc CDNs and a trusted frame host.\n# ALLOW-FROM is obsolete and ignored by current browsers, so the frame host is\n# also declared through the CSP frame-ancestors directive.\nTENSORUS_X_FRAME_OPTIONS=\"ALLOW-FROM https://example.com\"\nTENSORUS_CONTENT_SECURITY_POLICY=\"default-src 'self'; script-src 'self' https://cdn.jsdelivr.net; style-src 'self' https://fonts.googleapis.com 'unsafe-inline'; font-src 'self' https://fonts.gstatic.com; img-src 'self' https://fastapi.tiangolo.com; frame-ancestors 'self' https://example.com\"\n```\n\nTroubleshooting\n- 401 Unauthorized: ensure you send `Authorization: Bearer tsr_...` and the key is configured (`TENSORUS_API_KEYS` or `TENSORUS_VALID_API_KEYS`).\n- 503 auth not configured: set an API key when auth is enabled.\n- DB connection errors: verify the Postgres environment variables, check for port conflicts (host 5433 vs local 5432), and confirm that the DB user and database exist.\n- Windows/WSL2 volume performance: keep bind\u2011mounted files on the Linux filesystem for best performance.\n\n## Testing\n\n### Preparing the Test Environment\n\nThe tests expect all dependencies from both `requirements.txt` and\n`requirements-test.txt` to be installed. A simple setup script is provided\nto handle this automatically:\n\n```bash\n./setup.sh\n```\n\nRun this after creating and activating a Python virtual environment. The script\ninstalls the Tensorus runtime requirements and the additional packages needed\nfor `pytest`. Once completed, executing `pytest` from the repository root will\nautomatically discover and run the entire suite; no manual package discovery is\nrequired.\n\n### Test Suite Dependencies\n\nThe Python tests rely on packages from both `requirements.txt` and\n`requirements-test.txt`. The latter includes `httpx` and other packages\nused by the test suite. **Always run `./setup.sh` before executing\n`pytest` to install these requirements**:\n\n```bash\n./setup.sh\n```\n\n### Running Tests\n\nTensorus includes Python unit tests. To set up the environment and run them:\n\n1. 
Install all dependencies using the setup script **before running any tests**:\n\n    ```bash\n    ./setup.sh\n    ```\n\n    This script installs packages from `requirements.txt` and `requirements-test.txt` (the latter requires `fastapi>=0.110` for Pydantic v2 support).\n\n2. Run the Python test suite:\n\n    ```bash\n    pytest\n    ```\n\n    All tests should pass without errors when dependencies are properly installed.\n\n## Using Tensorus\n\n### API Basics\n\nBase URL: `http://localhost:7860`\n\nAuthentication:\n- Preferred: send `Authorization: Bearer tsr_<your_key>`\n- Legacy (still supported): `X-API-KEY: tsr_<your_key>`\n\nPowerShell notes:\n- Use double quotes for JSON and escape inner quotes, or run the commands in WSL/Git Bash for copy/paste fidelity.\n- Alternatively, use the stop\u2011parsing token `--%` to stop PowerShell from interpreting special characters.\n\n### API Endpoints\n\nThe API provides the following main endpoints:\n\n*   **Datasets:**\n    *   `POST /datasets/create`: Create a new dataset.\n    *   `POST /datasets/{name}/ingest`: Ingest a tensor into a dataset.\n    *   `GET /datasets/{name}/fetch`: Retrieve all records from a dataset.\n    *   `GET /datasets/{name}/records`: Retrieve a page of records. 
Supports `offset` (start index, default `0`) and `limit` (max results, default `100`).\n    *   `GET /datasets`: List all available datasets.\n    *   `GET /datasets/{name}/count`: Count records in a dataset.\n    *   `GET /datasets/{dataset_name}/tensors/{record_id}`: Retrieve a tensor by record ID.\n    *   `DELETE /datasets/{dataset_name}`: Delete a dataset.\n    *   `DELETE /datasets/{dataset_name}/tensors/{record_id}`: Delete a tensor by record ID.\n    *   `PUT /datasets/{dataset_name}/tensors/{record_id}/metadata`: Update tensor metadata.\n*   **Querying:**\n    *   `POST /api/v1/query`: Execute an NQL query.\n*   **Vector Database:**\n    *   `POST /api/v1/vector/embed`: Generate and store embeddings from text.\n    *   `POST /api/v1/vector/search`: Perform vector similarity search.\n    *   `POST /api/v1/vector/hybrid-search`: Execute hybrid semantic-computational search.\n    *   `POST /api/v1/vector/tensor-workflow`: Run tensor workflow with lineage tracking.\n    *   `POST /api/v1/vector/index/build`: Build vector indexes with geometric partitioning.\n    *   `GET /api/v1/vector/models`: List available embedding models.\n    *   `GET /api/v1/vector/stats/{dataset_name}`: Get embedding statistics for a dataset.\n    *   `GET /api/v1/vector/metrics`: Get performance metrics.\n    *   `DELETE /api/v1/vector/vectors/{dataset_name}`: Delete vectors from a dataset.\n*   **Operation History & Lineage:**\n    *   `GET /api/v1/operations/recent`: Get recent operations with optional filtering by type/status.\n    *   `GET /api/v1/operations/tensor/{tensor_id}`: Get all operations that involved a specific tensor.\n    *   `GET /api/v1/operations/statistics`: Get aggregate operation statistics.\n    *   `GET /api/v1/operations/types`: List available operation types.\n    *   `GET /api/v1/operations/statuses`: List available operation statuses.\n    *   `GET /api/v1/lineage/tensor/{tensor_id}`: Get computational lineage for a tensor.\n    *   `GET 
/api/v1/lineage/tensor/{tensor_id}/dot`: DOT graph for lineage visualization.\n    *   `GET /api/v1/lineage/tensor/{source_tensor_id}/path/{target_tensor_id}`: Operation path between two tensors.\n*   **Agents:**\n    *   `GET /agents`: List all registered agents.\n    *   `GET /agents/{agent_id}/status`: Get the status of a specific agent.\n    *   `POST /agents/{agent_id}/start`: Start an agent.\n    *   `POST /agents/{agent_id}/stop`: Stop an agent.\n    *   `GET /agents/{agent_id}/logs`: Get recent logs for an agent.\n    *   `GET /agents/{agent_id}/config`: Get stored configuration for an agent.\n    *   `POST /agents/{agent_id}/configure`: Update agent configuration.\n*   **Metrics & Monitoring:**\n    *   `GET /metrics/dashboard`: Get aggregated dashboard metrics.\n\n### Vector Database Examples\n\nNote on path parameter names across endpoints:\n- Datasets CRUD often uses `name` in path: e.g., `/datasets/{name}/ingest`, `/datasets/{name}/records`\n- Tensor CRUD uses `dataset_name` + `record_id`: e.g., `/datasets/{dataset_name}/tensors/{record_id}`\n- Vector API consistently uses `dataset_name` in path and body\n\nFor an end-to-end quickstart with PowerShell-friendly curl commands and authentication setup, see `DEMO.md` \u2192 \"Vector & Embedding API Quickstart\".\n\n#### Generate & Store Embeddings\n```bash\ncurl -X POST \"http://localhost:7860/api/v1/vector/embed\" \\\n  -H \"Authorization: Bearer your_api_key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"texts\": [\"Machine learning algorithms\", \"Deep neural networks\"],\n    \"dataset_name\": \"ai_research\",\n    \"model_name\": \"all-mpnet-base-v2\",\n    \"namespace\": \"research\",\n    \"tenant_id\": \"team_alpha\"\n  }'\n```\n\n#### Semantic Similarity Search\n```bash\ncurl -X POST \"http://localhost:7860/api/v1/vector/search\" \\\n  -H \"Authorization: Bearer your_api_key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"query\": \"artificial intelligence 
models\",\n    \"dataset_name\": \"ai_research\",\n    \"k\": 5,\n    \"similarity_threshold\": 0.7,\n    \"namespace\": \"research\"\n  }'\n```\n\nExample response:\n```json\n{\n  \"success\": true,\n  \"query\": \"artificial intelligence models\",\n  \"total_results\": 2,\n  \"search_time_ms\": 8.42,\n  \"results\": [\n    {\n      \"record_id\": \"rec_123\",\n      \"similarity_score\": 0.9153,\n      \"rank\": 1,\n      \"source_text\": \"Deep learning models for AI\",\n      \"metadata\": {\"source\": \"paper_db\", \"year\": 2024},\n      \"namespace\": \"research\",\n      \"tenant_id\": \"team_alpha\"\n    },\n    {\n      \"record_id\": \"rec_456\",\n      \"similarity_score\": 0.8831,\n      \"rank\": 2,\n      \"source_text\": \"AI model architectures\",\n      \"metadata\": {\"source\": \"notes\"},\n      \"namespace\": \"research\",\n      \"tenant_id\": \"team_alpha\"\n    }\n  ]\n}\n```\n\n#### Hybrid Computational Search\n```bash\ncurl -X POST \"http://localhost:7860/api/v1/vector/hybrid-search\" \\\n  -H \"Authorization: Bearer your_api_key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"text_query\": \"neural network weights\",\n    \"dataset_name\": \"model_tensors\", \n    \"tensor_operations\": [\n      {\n        \"operation_name\": \"svd\",\n        \"description\": \"Singular value decomposition\",\n        \"parameters\": {}\n      }\n    ],\n    \"similarity_weight\": 0.7,\n    \"computation_weight\": 0.3,\n    \"filters\": {\n      \"preferred_shape\": [512, 512],\n      \"sparsity_preference\": 0.1\n    }\n  }'\n```\n\n### Agents API Examples\n\n#### List Agents\n```bash\ncurl -s -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/agents\"\n```\n\n#### Start Agent\n```bash\ncurl -X POST -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/agents/ingestion/start\"\n```\n\n#### Agent Status & Logs\n```bash\ncurl -s -H \"Authorization: Bearer your_api_key\" \\\n  
\"http://localhost:7860/agents/ingestion/status\"\n\ncurl -s -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/agents/ingestion/logs?lines=50\"\n```\n\n### Operation History & Lineage Examples\n\n#### Recent Operations\n```bash\ncurl -s -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/api/v1/operations/recent?limit=50\"\n```\n\n#### Tensor Lineage (JSON)\n```bash\ncurl -s -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/api/v1/lineage/tensor/{tensor_id}\"\n```\n\n#### Tensor Lineage (DOT Graph)\n```bash\ncurl -s -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/api/v1/lineage/tensor/{tensor_id}/dot\"\n```\n\n#### Operation Path Between Two Tensors\n```bash\ncurl -s -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/api/v1/lineage/tensor/{source_tensor_id}/path/{target_tensor_id}\"\n```\n\n### Authentication Examples\n\nRecommended (Bearer):\n```bash\ncurl -s \\\n  -H \"Authorization: Bearer tsr_your_api_key\" \\\n  \"http://localhost:7860/datasets\"\n```\n\nLegacy header (still supported):\n```bash\ncurl -s \\\n  -H \"X-API-KEY: tsr_your_api_key\" \\\n  \"http://localhost:7860/datasets\"\n```\n\n### NQL Query Example\n\n```bash\ncurl -X POST \"http://localhost:7860/api/v1/query\" \\\n  -H \"Authorization: Bearer tsr_your_api_key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"query\": \"find tensors from 'my_dataset' where metadata.source = 'api_ingest' limit 10\"\n  }'\n```\n\n### Request/Response Schemas\n\nBelow are the primary Pydantic models used by the API. 
See `tensorus/api.py` and `tensorus/api/models.py` for full details.\n\n* **ApiResponse** (`tensorus/api.py`)\n  - `success: bool`\n  - `message: str`\n  - `data: Any | null`\n\n* **DatasetCreateRequest** (`tensorus/api.py`)\n  - `name: str`\n\n* **TensorInput** (`tensorus/api.py`)\n  - `shape: List[int]`\n  - `dtype: str`\n  - `data: List[Any] | int | float`\n  - `metadata: Dict[str, Any] | null`\n\n* **TensorOutput** (`tensorus/api/models.py`)\n  - `record_id: str`\n  - `shape: List[int]`\n  - `dtype: str`\n  - `data: List[Any] | int | float`\n  - `metadata: Dict[str, Any]`\n\n* **NQLQueryRequest / NQLResponse** (`tensorus/api/models.py`)\n  - Request: `query: str`\n  - Response: `success: bool`, `message: str`, `count?: int`, `results?: List[TensorOutput]`\n\n* **VectorSearchQuery** (`tensorus/api/models.py`)\n  - `query: str`, `dataset_name: str`, `k: int = 5`, `namespace?: str`, `tenant_id?: str`, `similarity_threshold?: float`, `include_vectors: bool = false`\n\n* **OperationHistoryRequest** (`tensorus/api/models.py`)\n  - Filters: `tensor_id?`, `operation_type?`, `status?`, `user_id?`, `session_id?`, `start_time?`, `end_time?`, `limit: int = 100`\n\n* **LineageResponse** (`tensorus/api/models.py`)\n  - Key fields: `tensor_id`, `root_tensor_ids`, `max_depth`, `total_operations`, `lineage_nodes[]`, `operations[]`, timestamps\n\n\n#### \ud83e\udde9 Examples\n\nSuccess (`ApiResponse` shape):\n```json\n{\n  \"success\": true,\n  \"message\": \"Tensor ingested successfully.\",\n  \"data\": { \"record_id\": \"abc123\" }\n}\n```\n\nNot Found (FastAPI error shape):\n```json\n{\n  \"detail\": \"Not Found\"\n}\n```\n\nUnauthorized:\n```json\n{\n  \"detail\": \"Not authenticated\"\n}\n```\n\n### Dataset API Examples\n\nAll requests require authentication by default: `-H \"Authorization: Bearer your_api_key\"` (legacy `X-API-KEY` also supported).\n\n#### Create Dataset\n```bash\ncurl -X POST \"http://localhost:7860/datasets/create\" \\\n  -H \"Authorization: Bearer 
your_api_key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"name\": \"my_dataset\"}'\n```\n\n#### Ingest Tensor\n```bash\ncurl -X POST \"http://localhost:7860/datasets/my_dataset/ingest\" \\\n  -H \"Authorization: Bearer your_api_key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"shape\": [2, 3],\n    \"dtype\": \"float32\",\n    \"data\": [[1.0,2.0,3.0],[4.0,5.0,6.0]],\n    \"metadata\": {\"source\": \"api_ingest\", \"label\": \"row_batch_1\"}\n  }'\n```\n\n#### Fetch Entire Dataset\n```bash\ncurl -s -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/datasets/my_dataset/fetch\"\n```\n\n#### Fetch Records (Pagination)\n```bash\ncurl -s -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/datasets/my_dataset/records?offset=0&limit=50\"\n```\n\n#### Count Records\n```bash\ncurl -s -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/datasets/my_dataset/count\"\n```\n\n#### Get Tensor By ID\n```bash\ncurl -s -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/datasets/my_dataset/tensors/{record_id}\"\n```\n\n#### Update Tensor Metadata (replace entire metadata)\n```bash\ncurl -X PUT \"http://localhost:7860/datasets/my_dataset/tensors/{record_id}/metadata\" \\\n  -H \"Authorization: Bearer your_api_key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"new_metadata\": {\"source\": \"sensor_A\", \"priority\": \"high\"}}'\n```\n\n#### Delete Tensor\n```bash\ncurl -X DELETE -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/datasets/my_dataset/tensors/{record_id}\"\n```\n\n#### Delete Dataset\n```bash\ncurl -X DELETE -H \"Authorization: Bearer your_api_key\" \\\n  \"http://localhost:7860/datasets/my_dataset\"\n```\n\n### Dataset Schemas\n\nDatasets can optionally include a schema when created. The schema defines\nrequired metadata fields and expected tensor `shape` and `dtype`. 
Inserts that\nviolate the schema will raise a validation error.\n\nExample:\n\n```python\nschema = {\n    \"shape\": [3, 10],\n    \"dtype\": \"float32\",\n    \"metadata\": {\"source\": \"str\", \"value\": \"int\"}\n}\nstorage.create_dataset(\"my_ds\", schema=schema)\nstorage.insert(\"my_ds\", torch.rand(3, 10), {\"source\": \"sensor\", \"value\": 5})\n```\n\n## Metadata System\n\nTensorus includes a detailed metadata subsystem for describing tensors beyond their raw data. Each tensor has a `TensorDescriptor` and can be associated with optional semantic, lineage, computational, quality, relational, and usage metadata. The metadata storage backend is pluggable, supporting in-memory storage for quick testing or PostgreSQL for persistence. Search and aggregation utilities allow querying across these metadata fields. See [metadata_schemas.md](docs/metadata_schemas.md) for schema details.\n\n## Streamlit UI\n\nThe Streamlit UI provides a user-friendly interface for:\n\n*   **Dashboard:** View basic system metrics and agent status.\n*   **Agent Control:** Start, stop, and view logs for agents.\n*   **NQL Chat:** Enter natural language queries and view results.\n*   **Data Explorer:** Browse datasets, preview data, and perform tensor operations.\n\n## Natural Query Language (NQL)\n\nTensorus ships with a simple regex\u2011based Natural Query Language for retrieving\ntensors by metadata. You can issue NQL queries via the API or from the \"NQL\nChat\" page in the Streamlit UI.\n\nSee also: [NQL Query Example](#nql-query-example) for a minimal API request.\n\n### Enabling LLM rewriting\n\nSet `NQL_USE_LLM=true` to enable parsing of free\u2011form queries with\nGoogle's Gemini model. Provide your API key in the `GOOGLE_API_KEY`\nenvironment variable and optionally set `NQL_LLM_MODEL` (defaults to\n`gemini-2.0-flash`) to choose the model version. The agent sends the\ncurrent dataset schema and your query to Gemini via\n`langchain-google`. 
If the model or key is unavailable, the agent\nsilently falls back to the regex-based parser.\n\nExample query using the LLM parser:\n\n```text\nshow me all images containing a dog from dataset animals where source is \"mobile\"\n```\n\nThis phrasing is more natural than the regex format; Gemini\nrewrites it into a structured NQL query.\n\n## Agent Details\n\n### Data Ingestion Agent\n\n*   **Functionality:** Monitors a source directory for new files, preprocesses them into tensors, and inserts them into TensorStorage.\n*   **Supported File Types:** CSV, PNG, JPG, JPEG, TIF, TIFF (can be extended).\n*   **Preprocessing:** Uses default functions for CSV and images (resize, normalize).\n*   **Configuration:**\n    *   `source_directory`: The directory to monitor.\n    *   `polling_interval_sec`: How often, in seconds, to check for new files.\n    *   `preprocessing_rules`: A dictionary mapping file extensions to custom preprocessing functions.\n\n### RL Agent\n\n*   **Functionality:** A Deep Q-Network (DQN) agent that learns from experiences stored in TensorStorage.\n*   **Environment:** Uses a `DummyEnv` for demonstration.\n*   **Experience Storage:** Stores experiences (state, action, reward, next_state, done) in TensorStorage.\n*   **Training:** Implements epsilon-greedy exploration and target network updates.\n*   **Configuration:**\n    *   `state_dim`: Dimensionality of the environment state.\n    *   `action_dim`: Number of discrete actions.\n    *   `hidden_size`: Hidden layer size for the DQN.\n    *   `lr`: Learning rate.\n    *   `gamma`: Discount factor.\n    *   `epsilon_*`: Epsilon-greedy parameters.\n    *   `target_update_freq`: Target network update frequency.\n    *   `batch_size`: Experience batch size.\n    *   `experience_dataset`: Dataset name for experiences.\n    *   `state_dataset`: Dataset name for state tensors.\n\n### AutoML Agent\n\n*   **Functionality:** Performs hyperparameter optimization using random search.\n*   **Model:** Trains a 
simple `DummyMLP` model.\n*   **Search Space:** Configurable hyperparameter search space (learning rate, hidden size, activation).\n*   **Evaluation:** Trains and evaluates models on synthetic data.\n*   **Results:** Stores trial results (parameters, score) in TensorStorage.\n*   **Configuration:**\n    *   `search_space`: Dictionary defining the hyperparameter search space.\n    *   `input_dim`: Input dimension for the model.\n    *   `output_dim`: Output dimension for the model.\n    *   `task_type`: Type of task ('regression' or 'classification').\n    *   `results_dataset`: Dataset name for storing results.\n\n### Embedding Agent\n\n*   **Functionality:** Multi-provider embedding generation with intelligent caching and vector database integration.\n*   **Providers:** Supports Sentence Transformers, OpenAI, and extensible architecture for additional providers.\n*   **Features:** Automatic batching, embedding caching, vector indexing, and performance monitoring.\n*   **Configuration:**\n    *   `default_provider`: Default embedding provider to use.\n    *   `default_model`: Default model for embedding generation.\n    *   `batch_size`: Batch size for embedding generation.\n    *   `cache_ttl`: Time-to-live for embedding cache entries.\n\n## Tensorus Models\n\nThe collection of example models previously bundled with Tensorus now lives in\na separate repository: [tensorus/models](https://github.com/tensorus/models).\nInstall it with:\n\n```bash\npip install tensorus-models\n```\n\nWhen the package is installed, Tensorus will automatically import it. Set the\nenvironment variable `TENSORUS_MINIMAL_IMPORT=1` before importing Tensorus to\nskip this optional dependency and keep startup lightweight.\n\n## Basic Tensor Operations\n\nThis section details the core tensor manipulation functionalities provided by `tensor_ops.py`. 
These operations are designed to be robust, with built-in type and shape checking where appropriate.\n\n#### Arithmetic Operations\n\n*   `add(t1, t2)`: Element-wise addition of two tensors, or a tensor and a scalar.\n*   `subtract(t1, t2)`: Element-wise subtraction of two tensors, or a tensor and a scalar.\n*   `multiply(t1, t2)`: Element-wise multiplication of two tensors, or a tensor and a scalar.\n*   `divide(t1, t2)`: Element-wise division of two tensors, or a tensor and a scalar. Includes checks for division by zero.\n*   `power(t1, t2)`: Raises each element in `t1` to the power of `t2`. Supports tensor or scalar exponents.\n*   `log(tensor)`: Element-wise natural logarithm with warnings for non-positive values.\n\n#### Matrix and Dot Operations\n\n*   `matmul(t1, t2)`: Matrix multiplication of two tensors, supporting various dimensionalities (e.g., 2D matrices, batched matrix multiplication).\n*   `dot(t1, t2)`: Computes the dot product of two 1D tensors.\n*   `outer(t1, t2)`: Computes the outer product of two 1\u2011D tensors.\n*   `cross(t1, t2, dim=-1)`: Computes the cross product along the specified dimension (size must be 3).\n*   `matrix_eigendecomposition(matrix_A)`: Returns eigenvalues and eigenvectors of a square matrix.\n*   `matrix_trace(matrix_A)`: Computes the trace of a 2-D matrix.\n*   `tensor_trace(tensor_A, axis1=0, axis2=1)`: Trace of a tensor along two axes.\n*   `svd(matrix)`: Singular value decomposition of a matrix, returns `U`, `S`, and `Vh`.\n*   `qr_decomposition(matrix)`: QR decomposition returning `Q` and `R`.\n*   `lu_decomposition(matrix)`: LU decomposition returning permutation `P`, lower `L`, and upper `U` matrices.\n*   `cholesky_decomposition(matrix)`: Cholesky factor of a symmetric positive-definite matrix.\n*   `matrix_inverse(matrix)`: Inverse of a square matrix.\n*   `matrix_determinant(matrix)`: Determinant of a square matrix.\n*   `matrix_rank(matrix)`: Rank of a matrix.\n\n#### Reduction Operations\n\n*   `sum(tensor, 
dim=None, keepdim=False)`: Computes the sum of tensor elements over specified dimensions.\n*   `mean(tensor, dim=None, keepdim=False)`: Computes the mean of tensor elements over specified dimensions. Tensor is cast to float for calculation.\n*   `min(tensor, dim=None, keepdim=False)`: Finds the minimum value in a tensor, optionally along a dimension. Returns values and indices if `dim` is specified.\n*   `max(tensor, dim=None, keepdim=False)`: Finds the maximum value in a tensor, optionally along a dimension. Returns values and indices if `dim` is specified.\n*   `variance(tensor, dim=None, unbiased=False, keepdim=False)`: Variance of tensor elements.\n*   `covariance(matrix_X, matrix_Y=None, rowvar=True, bias=False, ddof=None)`: Covariance matrix estimation.\n*   `correlation(matrix_X, matrix_Y=None, rowvar=True)`: Correlation coefficient matrix.\n\n#### Reshaping and Slicing\n\n*   `reshape(tensor, shape)`: Changes the shape of a tensor without changing its data.\n*   `transpose(tensor, dim0, dim1)`: Swaps two dimensions of a tensor.\n*   `permute(tensor, dims)`: Permutes the dimensions of a tensor according to the specified order.\n*   `flatten(tensor, start_dim=0, end_dim=-1)`: Flattens a range of dimensions into a single dimension.\n*   `squeeze(tensor, dim=None)`: Removes dimensions of size 1, or a specific dimension if provided.\n*   `unsqueeze(tensor, dim)`: Inserts a dimension of size 1 at the given position.\n\n#### Concatenation and Splitting\n\n*   `concatenate(tensors, dim=0)`: Joins a sequence of tensors along an existing dimension.\n*   `stack(tensors, dim=0)`: Joins a sequence of tensors along a new dimension.\n\n#### Advanced Operations\n\n*   `einsum(equation, *tensors)`: Applies Einstein summation convention to the input tensors based on the provided equation string.\n*   `compute_gradient(func, tensor)`: Returns the gradient of a scalar `func` with respect to `tensor`.\n*   `compute_jacobian(func, tensor)`: Computes the Jacobian matrix of a 
vector function.\n*   `convolve_1d(signal_x, kernel_w, mode='valid')`: 1\u2011D convolution using `torch.nn.functional.conv1d`.\n*   `convolve_2d(image_I, kernel_K, mode='valid')`: 2\u2011D convolution using `torch.nn.functional.conv2d`.\n*   `frobenius_norm(tensor)`: Calculates the Frobenius norm.\n*   `l1_norm(tensor)`: Calculates the L1 norm (sum of absolute values).\n\n## Tensor Decomposition Operations\n\nTensorus includes a library of higher\u2011order tensor factorizations in\n`tensor_decompositions.py`. These operations mirror the algorithms\navailable in TensorLy and related libraries.\n\n* **CP Decomposition** \u2013 Canonical Polyadic factorization returning\n  weights and factor matrices.\n* **NTF\u2011CP Decomposition** \u2013 Non\u2011negative CP using\n  `non_negative_parafac`.\n* **Tucker Decomposition** \u2013 Standard Tucker factorization for specified\n  ranks.\n* **Non\u2011negative Tucker / Partial Tucker** \u2013 Variants with HOOI and\n  non\u2011negative constraints.\n* **HOSVD** \u2013 Higher\u2011order SVD (Tucker with full ranks).\n* **Tensor Train (TT)** \u2013 Sequence of TT cores representing the tensor.\n* **TT\u2011SVD** \u2013 TT factorization via SVD initialization.\n* **Tensor Ring (TR)** \u2013 Circular variant of TT.\n* **Hierarchical Tucker (HT)** \u2013 Decomposition using a dimension tree.\n* **Block Term Decomposition (BTD)** \u2013 Sum of Tucker\u20111 terms for 3\u2011way\n  tensors.\n* **t\u2011SVD** \u2013 Tensor singular value decomposition based on the\n  t\u2011product.\n\nExamples of how to call these methods are provided in\n[`tensorus/tensor_decompositions.py`](tensorus/tensor_decompositions.py).\n\n## Vector Database Features\n\n### Embedding Generation\n\nTensorus supports multiple embedding providers for generating high-quality vector representations of text:\n\n*   **Sentence Transformers**: Local models including all-MiniLM-L6-v2, all-mpnet-base-v2, and specialized models\n*   **OpenAI**: Cloud-based models 
like text-embedding-3-small and text-embedding-3-large\n*   **Extensible Architecture**: Easy integration of additional embedding providers\n\n### Vector Indexing\n\nAdvanced vector indexing capabilities for efficient similarity search:\n\n*   **Geometric Partitioning**: Automatic distribution of vectors across partitions using k-means clustering\n*   **Freshness Layers**: Real-time updates without requiring full index rebuilds\n*   **FAISS Integration**: High-performance similarity search with multiple distance metrics\n*   **Multi-tenancy**: Namespace and tenant isolation for secure multi-user deployments\n\n### Hybrid Search\n\nUnique hybrid search capabilities that combine semantic similarity with computational tensor properties:\n\n*   **Semantic Scoring**: Traditional vector similarity search based on text embeddings\n*   **Computational Scoring**: Mathematical property evaluation including shape compatibility, sparsity, rank analysis\n*   **Operation Compatibility**: Scoring tensors based on suitability for specific mathematical operations\n*   **Combined Ranking**: Weighted combination of semantic and computational relevance scores\n\n### Tensor Workflows\n\nExecute complex mathematical workflows with full computational lineage tracking:\n\n*   **Workflow Execution**: Chain multiple tensor operations with intermediate result storage\n*   **Lineage Tracking**: Complete provenance tracking of tensor transformations\n*   **Scientific Reproducibility**: Full audit trail of computational steps for research applications\n*   **Intermediate Storage**: Optional preservation of intermediate results for analysis\n\n## Completed Features\n\nThe current codebase implements all of the items listed in\n[Key Features](#key-features). Tensorus already provides efficient tensor\nstorage with optional file persistence, a natural query language, a flexible\nagent framework, a RESTful API, a Streamlit UI, robust tensor operations, and\nadvanced vector database capabilities. 
The modular architecture makes future\nextensions straightforward.\n\n## Future Implementation\n\n*   **Enhanced NQL:** Integrate a local or remote LLM for more robust natural language understanding.\n*   **Advanced Agents:** Develop more sophisticated agents for specific tasks (e.g., anomaly detection, forecasting).\n*   **Persistent Storage Backend:** Replace/augment current file-based persistence with more robust database or cloud storage solutions (e.g., PostgreSQL, S3, MinIO).\n*   **Advanced Vector Indexing:** Implement HNSW and IVF-PQ algorithms for even more efficient similarity search.\n*   **Scalability & Performance:**\n    *   Implement tensor chunking for very large tensors.\n    *   Optimize query performance with indexing.\n    *   Asynchronous operations for agents and API calls.\n*   **Security:** Implement authentication and authorization mechanisms for the API and UI.\n*   **Real-World Integration:**\n    *   Connect Ingestion Agent to more data sources (e.g., cloud storage, databases, APIs).\n    *   Integrate RL Agent with real-world environments or more complex simulations.\n*   **Advanced AutoML:**\n    *   Implement sophisticated search algorithms (e.g., Bayesian Optimization, Hyperband).\n    *   Support for diverse model architectures and custom models.\n*   **Model Management:** Add capabilities for saving, loading, versioning, and deploying trained models (from RL/AutoML).\n*   **Streaming Data Support:** Enhance Ingestion Agent to handle real-time streaming data.\n*   **Resource Management:** Add tools and controls for monitoring and managing the resource consumption (CPU, memory) of agents.\n*   **Improved UI/UX:** Continuously refine the Streamlit UI for better usability and richer visualizations.\n*   **Comprehensive Testing:** Expand unit, integration, and end-to-end tests.\n*   **Multi-modal Embeddings:** Support for image, audio, and video embeddings alongside text.\n*   **Distributed Architecture:** Multi-node deployments for 
large-scale vector search workloads.\n\n## \ud83e\udd1d Community & Contributing\n\n### \ud83d\udcac Get Help & Support\n\n**Community Resources:**\n- **\ud83d\udcda [Documentation Hub](docs/index.md)** - Comprehensive guides and tutorials\n- **\ud83d\udcac [GitHub Discussions](https://github.com/tensorus/tensorus/discussions)** - Ask questions and share ideas\n- **\ud83d\udc1b [Issue Tracker](https://github.com/tensorus/tensorus/issues)** - Bug reports and feature requests\n- **\ud83c\udff7\ufe0f [Stack Overflow](https://stackoverflow.com/questions/tagged/tensorus)** - Technical Q&A with the community\n\n**Enterprise Support:**\n- **\ud83d\udce7 Technical Support**: support@tensorus.com\n- **\ud83d\udce7 Sales & Partnerships**: sales@tensorus.com  \n- **\ud83d\udce7 Security Issues**: security@tensorus.com\n\n### \ud83d\ude80 Contributing to Tensorus\n\nWe welcome contributions from the community! Here's how to get involved:\n\n#### \ud83d\udc1b Report Issues\n- Use our [issue templates](https://github.com/tensorus/tensorus/issues/new/choose) for bug reports\n- Include system information, reproduction steps, and expected behavior\n- Search existing issues before creating new ones\n\n#### \ud83d\udd27 Code Contributions\n1. **Fork** the repository and create a feature branch\n2. **Develop** with proper tests and documentation\n3. **Test** your changes locally using `pytest`\n4. 
**Submit** a pull request with a clear description and examples\n\n#### \ud83d\udcd6 Documentation Improvements\n- Fix typos, improve clarity, and add examples\n- Translate documentation to other languages  \n- Create tutorials and use case guides\n- Update API documentation and code comments\n\n#### \ud83d\udca1 Feature Requests & Ideas\n- Propose new features via [GitHub Discussions](https://github.com/tensorus/tensorus/discussions)\n- Provide detailed use cases and implementation suggestions\n- Participate in design discussions and RFC processes\n\n**Development Resources:**\n- **\ud83d\udccb [Contributing Guide](CONTRIBUTING.md)** - Detailed contribution guidelines\n- **\ud83d\udcdc [Code of Conduct](CODE_OF_CONDUCT.md)** - Community standards and expectations\n- **\ud83c\udfd7\ufe0f [Development Setup](docs/getting_started.md#development-installation)** - Local development environment\n\n## \ud83d\udcc4 License & Legal\n\n**MIT License** - See the [LICENSE](LICENSE) file for complete terms.\n\n```\nCopyright (c) 2025 Tensorus\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n```\n\n**Third-Party Licenses:** This project includes dependencies with their own licenses. 
See `requirements.txt` and individual package documentation for details.\n\n---\n\n<div align=\"center\">\n\n### \ud83c\udf1f Ready to Transform Your Tensor Workflows?\n\n[![Get Started](https://img.shields.io/badge/\ud83d\udcda_Get_Started-blue?style=for-the-badge&logo=rocket)](docs/getting_started.md)\n[![Live Demo](https://img.shields.io/badge/\ud83d\ude80_Try_Demo-green?style=for-the-badge&logo=play)](https://tensorus-dashboard.hf.space)\n[![API Docs](https://img.shields.io/badge/\ud83d\udcd6_API_Reference-orange?style=for-the-badge&logo=swagger)](docs/api_reference.md)\n[![Enterprise](https://img.shields.io/badge/\ud83c\udfe2_Enterprise-purple?style=for-the-badge&logo=building)](mailto:sales@tensorus.com)\n\n### \u2b50 **Star us on GitHub** | **\ud83d\udd04 Share with your team** | **\ud83d\udce2 Follow for updates**\n\n*Tensorus - Empowering Intelligent Tensor Data Management*\n\n</div>\n",
    "bugtrack_url": null,
    "license": "MIT License\n        \n        Copyright (c) 2025 Tensorus\n        \n        Permission is hereby granted, free of charge, to any person obtaining a copy\n        of this software and associated documentation files (the \"Software\"), to deal\n        in the Software without restriction, including without limitation the rights\n        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n        copies of the Software, and to permit persons to whom the Software is\n        furnished to do so, subject to the following conditions:\n        \n        The above copyright notice and this permission notice shall be included in all\n        copies or substantial portions of the Software.\n        \n        THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\n        SOFTWARE.\n        ",
    "summary": "An agentic tensor database with unified SDK, agent orchestration, and intelligent workflows for ML/AI applications.",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://tensorus.com",
        "Repository": "https://github.com/tensorus/tensorus"
    },
    "split_keywords": [
        "tensor",
        " database",
        " agent",
        " ai",
        " pytorch",
        " fastapi",
        " streamlit",
        " automl",
        " reinforcement-learning",
        " data-ingestion"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6992c83f463ce83c60bd1202e06e500143cdaf7587dc62426984226cd6c9ef26",
                "md5": "db518d5ce45f7336ff8303f9b24079f2",
                "sha256": "9493f4db310e4c5e1711073d38d291612fdde40b0e7f091b4f3e3b02c59060e2"
            },
            "downloads": -1,
            "filename": "tensorus-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "db518d5ce45f7336ff8303f9b24079f2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 241742,
            "upload_time": "2025-11-03T18:19:40",
            "upload_time_iso_8601": "2025-11-03T18:19:40.543045Z",
            "url": "https://files.pythonhosted.org/packages/69/92/c83f463ce83c60bd1202e06e500143cdaf7587dc62426984226cd6c9ef26/tensorus-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "13e698e22da049cc91622e5f169e364e5ae04f27d7b66eb437ba6f2c411a30eb",
                "md5": "d0f49f2e958eaad252f8bb9d98cc0538",
                "sha256": "8836bc692383a1720c113765ca8b51314bf74a0040c29ce5e8c65b874b34d6d6"
            },
            "downloads": -1,
            "filename": "tensorus-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "d0f49f2e958eaad252f8bb9d98cc0538",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 340054,
            "upload_time": "2025-11-03T18:19:42",
            "upload_time_iso_8601": "2025-11-03T18:19:42.487046Z",
            "url": "https://files.pythonhosted.org/packages/13/e6/98e22da049cc91622e5f169e364e5ae04f27d7b66eb437ba6f2c411a30eb/tensorus-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-11-03 18:19:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "tensorus",
    "github_project": "tensorus",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "torch",
            "specs": [
                [
                    ">=",
                    "1.13.0"
                ]
            ]
        },
        {
            "name": "torchvision",
            "specs": [
                [
                    ">=",
                    "0.14.0"
                ]
            ]
        },
        {
            "name": "segmentation-models-pytorch",
            "specs": []
        },
        {
            "name": "transformers",
            "specs": []
        },
        {
            "name": "langchain-google",
            "specs": [
                [
                    ">=",
                    "0.1.1"
                ]
            ]
        },
        {
            "name": "langchain-google-genai",
            "specs": [
                [
                    ">=",
                    "2.1.5"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.21.0"
                ]
            ]
        },
        {
            "name": "tensorly",
            "specs": []
        },
        {
            "name": "Pillow",
            "specs": [
                [
                    ">=",
                    "9.0.0"
                ]
            ]
        },
        {
            "name": "fastapi",
            "specs": [
                [
                    ">=",
                    "0.110.0"
                ]
            ]
        },
        {
            "name": "pydantic",
            "specs": [
                [
                    ">=",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "pydantic-settings",
            "specs": [
                [
                    ">=",
                    "2.0"
                ]
            ]
        },
        {
            "name": "uvicorn",
            "specs": [
                [
                    ">=",
                    "0.20.0"
                ]
            ]
        },
        {
            "name": "psycopg2-binary",
            "specs": [
                [
                    ">=",
                    "2.9.0"
                ]
            ]
        },
        {
            "name": "streamlit",
            "specs": [
                [
                    ">=",
                    "1.25.0"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    ">=",
                    "2.28.0"
                ]
            ]
        },
        {
            "name": "python-jose",
            "specs": []
        },
        {
            "name": "plotly",
            "specs": [
                [
                    ">=",
                    "5.10.0"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    ">=",
                    "7.0.0"
                ]
            ]
        },
        {
            "name": "httpx",
            "specs": [
                [
                    ">=",
                    "0.28.1"
                ]
            ]
        },
        {
            "name": "boto3",
            "specs": [
                [
                    ">=",
                    "1.28.0"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    ">=",
                    "3.5.0"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    ">=",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "umap-learn",
            "specs": []
        },
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "1.5.0"
                ]
            ]
        },
        {
            "name": "arch",
            "specs": [
                [
                    ">=",
                    "5.7"
                ]
            ]
        },
        {
            "name": "lifelines",
            "specs": [
                [
                    ">=",
                    "0.28"
                ]
            ]
        },
        {
            "name": "semopy",
            "specs": [
                [
                    ">=",
                    "2.3"
                ]
            ]
        },
        {
            "name": "gensim",
            "specs": []
        },
        {
            "name": "joblib",
            "specs": []
        },
        {
            "name": "opencv-python",
            "specs": []
        }
    ],
    "lcname": "tensorus"
}
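
The `digests` entries in the record above can be used to check a downloaded release file before installing it. The following is a minimal sketch, assuming the wheel or sdist has already been saved locally under its original filename; the helper names `sha256_of` and `verify_download` are illustrative and not part of Tensorus or PyPI tooling. The expected values are copied from the `sha256` fields in the `urls` list.

```python
import hashlib
from pathlib import Path

# SHA-256 digests copied from the "urls" entries of the metadata record.
EXPECTED_SHA256 = {
    "tensorus-0.1.0-py3-none-any.whl":
        "9493f4db310e4c5e1711073d38d291612fdde40b0e7f091b4f3e3b02c59060e2",
    "tensorus-0.1.0.tar.gz":
        "8836bc692383a1720c113765ca8b51314bf74a0040c29ce5e8c65b874b34d6d6",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Compare a local file's SHA-256 with the digest listed for its filename.

    Hypothetical helper: raises KeyError if the filename is not in `expected`.
    """
    path = Path(path)
    if path.name not in expected:
        raise KeyError(f"no digest recorded for {path.name}")
    return sha256_of(path) == expected[path.name]
```

Calling `verify_download("tensorus-0.1.0.tar.gz")` returns `True` only when the local file's hash matches the recorded value. A `False` result suggests a corrupted or altered download.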
        