# Nexus: AI-Native Distributed Filesystem
**Version 0.1.0** | AI Agent Infrastructure Platform
Nexus is a complete AI agent infrastructure platform that combines a distributed unified filesystem, self-evolving agent memory, intelligent document processing, and deployment from local development to hosted production, all from a single codebase.
## Features
### Foundation
- **Distributed Unified Filesystem**: Multi-backend abstraction (S3, GDrive, SharePoint, LocalFS)
- **Tiered Storage**: Hot/Warm/Cold tiers with automatic lineage tracking
- **Content-Addressable Storage**: 30-50% storage savings via deduplication
- **"Everything as a File" Paradigm**: Configuration, memory, jobs, and commands as files
### Agent Intelligence
- **Self-Evolving Memory**: Agent memory with automatic consolidation
- **Memory Versioning**: Track knowledge evolution over time
- **Multi-Agent Sharing**: Shared memory spaces within tenants
- **Memory Analytics**: Effectiveness tracking and insights
- **Prompt Version Control**: Track prompt evolution with lineage
- **Training Data Management**: Version-controlled datasets with deduplication
- **Prompt Optimization**: Multi-candidate testing, execution traces, tradeoff analysis
- **Experiment Tracking**: Organize optimization runs, per-example results, regression detection
### Content Processing
- **Rich Format Parsing**: Extensible parsers (PDF, Excel, CSV, JSON, images)
- **LLM KV Cache Management**: 50-90% cost savings on AI queries
- **Semantic Chunking**: Better search via intelligent document segmentation
- **MCP Integration**: Native Model Context Protocol server
- **Document Type Detection**: Automatic routing to appropriate parsers
### Operations
- **Resumable Jobs**: Checkpointing system survives restarts
- **OAuth Token Management**: Auto-refreshing credentials
- **Backend Auto-Mount**: Automatic recognition and mounting
- **Resource Management**: CPU throttling and rate limiting
- **Work Queue Detection**: SQL views for efficient task scheduling and dependency resolution
## Deployment Modes
Nexus supports two deployment modes from a single codebase:
| Mode | Use Case | Setup Time | Scaling |
|------|----------|------------|---------|
| **Local** | Individual developers, CLI tools, prototyping | 60 seconds | Single machine (~10GB) |
| **Hosted** | Teams and production (auto-scales) | Sign up | Automatic (GB to Petabytes) |
**Note**: Hosted mode scales infrastructure automatically, so you don't choose between "monolithic" and "distributed" deployments. Nexus handles that for you based on your usage.
### Quick Start: Local Mode
```python
import asyncio

import nexus


async def main():
    # Zero-deployment filesystem with AI features
    # Config auto-discovered from nexus.yaml or environment
    nx = nexus.connect()

    # `async with` is only valid inside a coroutine, hence the main() wrapper
    async with nx:
        # Write and read files
        await nx.write("/workspace/data.txt", b"Hello World")
        content = await nx.read("/workspace/data.txt")

        # Semantic search across documents
        results = await nx.semantic_search(
            "/docs/**/*.pdf",
            query="authentication implementation"
        )

        # LLM-powered document reading with KV cache
        answer = await nx.llm_read(
            "/reports/q4.pdf",
            prompt="Summarize key findings",
            model="claude-sonnet-4"
        )


asyncio.run(main())
```
**Config file (`nexus.yaml`):**
```yaml
mode: local
data_dir: ./nexus-data
cache_size_mb: 100
enable_vector_search: true
```
### Quick Start: Hosted Mode
**Coming Soon!** Sign up for early access at [nexus.ai](https://nexus.ai)
```python
import asyncio

import nexus


async def main():
    # Connect to Nexus hosted instance
    # Infrastructure scales automatically based on your usage
    nx = nexus.connect(
        api_key="your-api-key",
        endpoint="https://api.nexus.ai"
    )

    async with nx:
        # Same API as local mode!
        await nx.write("/workspace/data.txt", b"Hello World")
        content = await nx.read("/workspace/data.txt")


asyncio.run(main())
```
**For self-hosted deployments**, see the [S3-Compatible HTTP Server](#s3-compatible-http-server) section below for deployment instructions.
## Storage Backends
Nexus supports multiple storage backends through a unified API. All backends use **Content-Addressable Storage (CAS)** for automatic deduplication.
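To illustrate why content-addressable storage deduplicates automatically, here is a minimal sketch that is independent of Nexus. The `ContentStore` class and the choice of SHA-256 are assumptions made for illustration; the hashing scheme Nexus actually uses is not specified in this document.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: blobs are keyed by their SHA-256 digest."""

    def __init__(self):
        self.blobs = {}  # digest -> bytes (each distinct content stored once)
        self.paths = {}  # virtual path -> digest

    def write(self, path: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same key, so it is stored only once.
        self.blobs.setdefault(digest, data)
        self.paths[path] = digest
        return digest

    def read(self, path: str) -> bytes:
        return self.blobs[self.paths[path]]


store = ContentStore()
store.write("/workspace/a.txt", b"Hello World")
store.write("/workspace/copy-of-a.txt", b"Hello World")
store.write("/workspace/b.txt", b"Something else")
print(len(store.paths), "paths,", len(store.blobs), "blobs")  # 3 paths, 2 blobs
```

Two paths with the same bytes share one blob, which is where storage savings come from when a workspace contains many duplicate files.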
### Local Backend (Default)
Store files on local filesystem:
```python
import nexus
# Auto-detected from config or uses default
nx = nexus.connect()
# Or explicitly configure
nx = nexus.connect(config={
    "backend": "local",
    "data_dir": "./nexus-data"
})
```
### Google Cloud Storage (GCS) Backend
Store files in Google Cloud Storage with local metadata:
```python
import nexus
# Connect with GCS backend
nx = nexus.connect(config={
    "backend": "gcs",
    "gcs_bucket_name": "my-nexus-bucket",
    "gcs_project_id": "my-gcp-project",  # Optional
    "gcs_credentials_path": "/path/to/credentials.json",  # Optional
})
```
**Authentication Methods:**
1. **Service Account Key**: Provide `gcs_credentials_path`
2. **Application Default Credentials** (used when `gcs_credentials_path` is not provided), resolved from one of:
   - `GOOGLE_APPLICATION_CREDENTIALS` environment variable
   - `gcloud auth application-default login` credentials
   - GCE/Cloud Run service account (when running on GCP)
**Using Config File (`nexus.yaml`):**
```yaml
backend: gcs
gcs_bucket_name: my-nexus-bucket
gcs_project_id: my-gcp-project # Optional
# gcs_credentials_path: /path/to/credentials.json # Optional
```
**Using Environment Variables:**
```bash
export NEXUS_BACKEND=gcs
export NEXUS_GCS_BUCKET_NAME=my-nexus-bucket
export NEXUS_GCS_PROJECT_ID=my-gcp-project # Optional
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json # Optional
```
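The environment variables above line up with the config keys used in `nexus.connect(config=...)`. The sketch below shows that correspondence; the `ENV_TO_CONFIG` table and `config_from_env` helper are hypothetical names for illustration and are not part of the Nexus API, which may resolve environment variables differently.

```python
import os

# Hypothetical mapping from the environment variables shown above
# to the config keys used in the Python examples.
ENV_TO_CONFIG = {
    "NEXUS_BACKEND": "backend",
    "NEXUS_GCS_BUCKET_NAME": "gcs_bucket_name",
    "NEXUS_GCS_PROJECT_ID": "gcs_project_id",
}


def config_from_env(environ=None):
    """Build a config dict from whichever NEXUS_* variables are set."""
    environ = os.environ if environ is None else environ
    return {
        key: environ[var]
        for var, key in ENV_TO_CONFIG.items()
        if environ.get(var)  # skip unset or empty variables
    }


example_env = {
    "NEXUS_BACKEND": "gcs",
    "NEXUS_GCS_BUCKET_NAME": "my-nexus-bucket",
}
print(config_from_env(example_env))
# The result could then be passed as: nexus.connect(config=config_from_env())
```

`GOOGLE_APPLICATION_CREDENTIALS` is deliberately left out of the mapping because it is read by Google's client libraries rather than by a Nexus config key.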
**CLI Usage with GCS:**
```bash
# Write file to GCS
nexus write /workspace/data.txt "Hello GCS!" \
--backend=gcs \
--gcs-bucket=my-nexus-bucket
# Or use config file (simpler!)
nexus write /workspace/data.txt "Hello GCS!" --config=nexus.yaml
```
### Advanced: Direct Backend API
For advanced use cases, instantiate backends directly:
```python
from nexus import NexusFS, LocalBackend, GCSBackend
# Local backend
nx_local = NexusFS(
    backend=LocalBackend("/path/to/data"),
    db_path="./metadata.db"
)
# GCS backend
nx_gcs = NexusFS(
    backend=GCSBackend(
        bucket_name="my-bucket",
        project_id="my-project",
        credentials_path="/path/to/creds.json"
    ),
    db_path="./gcs-metadata.db"
)
# Same API for both!
nx_local.write("/file.txt", b"data")
nx_gcs.write("/file.txt", b"data")
```
### Backend Comparison
| Feature | Local Backend | GCS Backend |
|---------|--------------|-------------|
| **Content Storage** | Local filesystem | Google Cloud Storage |
| **Metadata Storage** | Local SQLite | Local SQLite |
| **Deduplication** | ✅ CAS (30-50% savings) | ✅ CAS (30-50% savings) |
| **Multi-machine Access** | ❌ Single machine | ✅ Shared across machines |
| **Durability** | Single disk | 99.999999999% (11 nines) |
| **Latency** | <1ms (local) | 10-50ms (network) |
| **Cost** | Free (local disk) | GCS storage pricing |
| **Use Case** | Development, single machine | Teams, production, backup |
### Coming Soon
- **Amazon S3 Backend** (v0.7.0)
- **Azure Blob Storage** (v0.7.0)
- **Google Drive** (v0.7.0)
- **SharePoint** (v0.7.0)
## Installation
### Using pip (Recommended)
```bash
# Install core Nexus
pip install nexus-ai-fs
# Install with FUSE support
pip install "nexus-ai-fs[fuse]"  # Quotes keep shells such as zsh from expanding the brackets
# Install with PostgreSQL support
pip install "nexus-ai-fs[postgres]"
# Install everything
pip install "nexus-ai-fs[all]"  # All features (FUSE + PostgreSQL + future plugins)
# Verify installation
nexus --version
```
### Installing First-Party Plugins (Local Development)
First-party plugins are in development and not yet published to PyPI. Install from source:
```bash
# Clone repository
git clone https://github.com/nexi-lab/nexus.git
cd nexus
# Install Nexus
pip install -e .
# Install plugins from local source
pip install -e ./nexus-plugin-anthropic # Claude Skills API
pip install -e ./nexus-plugin-skill-seekers # Doc scraper
# Verify plugins
nexus plugins list
```
See [PLUGIN_INSTALLATION.md](./PLUGIN_INSTALLATION.md) for detailed instructions.
### From Source (Development)
```bash
# Clone the repository
git clone https://github.com/nexi-lab/nexus.git
cd nexus
# Install using uv (recommended for faster installs)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"
# Or using pip
pip install -e ".[dev]"
```
### Development Setup
```bash
# Install development dependencies
uv pip install -e ".[dev,test]"
# Run tests
pytest
# Run type checking
mypy src/nexus
# Format code
ruff format .
# Lint
ruff check .
```
## CLI Usage
Nexus provides a beautiful command-line interface for all file operations. After installation, the `nexus` command will be available.
### Quick Start
```bash
# Initialize a new workspace
nexus init ./my-workspace
# Write a file
nexus write /workspace/hello.txt "Hello, Nexus!"
# Read a file
nexus cat /workspace/hello.txt
# List files
nexus ls /workspace
nexus ls /workspace --recursive
nexus ls /workspace --long # Detailed view with metadata
```
### Available Commands
#### File Operations
```bash
# Write content to a file
nexus write /path/to/file.txt "content"
echo "content" | nexus write /path/to/file.txt --input -
# Display file contents (with syntax highlighting)
nexus cat /workspace/code.py
# Copy files
nexus cp /source.txt /dest.txt
# Delete files
nexus rm /workspace/old-file.txt
nexus rm /workspace/old-file.txt --force # Skip confirmation
# Show file information
nexus info /workspace/data.txt
```
#### Directory Operations
```bash
# Create directory
nexus mkdir /workspace/data
nexus mkdir /workspace/deep/nested/dir --parents
# Remove directory
nexus rmdir /workspace/data
nexus rmdir /workspace/data --recursive --force
```
#### File Discovery
```bash
# List files
nexus ls /workspace
nexus ls /workspace --recursive
nexus ls /workspace --long # Show size, modified time, etag
# Find files by pattern (glob)
nexus glob "**/*.py" # All Python files recursively
nexus glob "*.txt" --path /workspace # Text files in workspace
nexus glob "test_*.py" # Test files
# Search file contents (grep)
nexus grep "TODO" # Find all TODO comments
nexus grep "def \w+" --file-pattern "**/*.py" # Find function definitions
nexus grep "error" --ignore-case # Case-insensitive search
nexus grep "TODO" --max-results 50 # Limit results
# Search modes (v0.2.0+)
nexus grep "revenue" --file-pattern "**/*.pdf" # Auto mode: tries parsed first
nexus grep "revenue" --file-pattern "**/*.pdf" --search-mode=parsed # Only parsed content
nexus grep "TODO" --search-mode=raw # Only raw text (skip parsing)
# Result shows source type
# Match: TODO (parsed) ← from parsed PDF
# Match: TODO (raw) ← from source code
```
#### File Permissions (v0.3.0)
```bash
# Change file permissions
nexus chmod 755 /workspace/script.sh
nexus chmod rw-r--r-- /workspace/data.txt
# Change file owner and group
nexus chown alice /workspace/file.txt
nexus chgrp developers /workspace/code/
# View ACL entries
nexus getfacl /workspace/file.txt
# Manage ACL entries
nexus setfacl user:alice:rw- /workspace/file.txt
nexus setfacl group:developers:r-x /workspace/code/
nexus setfacl deny:user:bob /workspace/secret.txt
nexus setfacl user:alice:rwx /workspace/file.txt --remove
```
**Supported Formats:**
- **Octal**: `755`, `0o644`, `0755`
- **Symbolic**: `rwxr-xr-x`, `rw-r--r--`
- **ACL Entries**: `user:<name>:rwx`, `group:<name>:r-x`, `deny:user:<name>`
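The octal and symbolic formats describe the same nine permission bits. As a rough illustration of how the two relate, the sketch below converts either form to an integer mode; `parse_mode` is a hypothetical helper written for this example, not a Nexus function.

```python
def parse_mode(text: str) -> int:
    """Convert an octal ("755", "0755", "0o644") or symbolic ("rwxr-xr-x") string to a mode."""
    if len(text) == 9 and set(text) <= set("rwx-"):
        mode = 0
        for i, ch in enumerate(text):
            expected = "rwx"[i % 3]  # each triplet is read, write, execute
            if ch == expected:
                mode |= 1 << (8 - i)  # leftmost character is the highest bit
            elif ch != "-":
                raise ValueError(f"unexpected {ch!r} at position {i} in {text!r}")
        return mode
    # Octal forms: strip an optional "0o" prefix, then parse base 8.
    digits = text[2:] if text.lower().startswith("0o") else text
    return int(digits, 8)


for sample in ("755", "0o644", "0755", "rwxr-xr-x", "rw-r--r--"):
    print(f"{sample:>10} -> {oct(parse_mode(sample))}")
```

Both `755` and `rwxr-xr-x` resolve to the same value, so either spelling can be used interchangeably in the `chmod` examples above.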
#### ReBAC - Relationship-Based Access Control (v0.3.0)
Nexus implements Zanzibar-style relationship-based authorization for team-based permissions, hierarchical access, and dynamic permission inheritance.
```bash
# Create relationship tuples
nexus rebac create agent alice member-of group eng-team
nexus rebac create group eng-team owner-of file project-docs
nexus rebac create file folder-parent parent-of file folder-child
# Check permissions (with graph traversal)
nexus rebac check agent alice member-of group eng-team # Direct check
nexus rebac check agent alice owner-of file project-docs # Inherited via group
# Find all subjects with a permission
nexus rebac expand owner-of file project-docs # Returns: alice (via eng-team)
nexus rebac expand member-of group eng-team # Returns: alice, bob, ...
# Delete relationships
nexus rebac delete <tuple-id>
# Create temporary access (expires automatically)
nexus rebac create agent alice viewer-of file temp-report \
--expires "2025-12-31T23:59:59"
```
**ReBAC Features:**
- **Relationship Types**: `member-of`, `owner-of`, `viewer-of`, `editor-of`, `parent-of`
- **Graph Traversal**: Recursive permission checking through relationship chains
- **Permission Inheritance**: Team ownership, hierarchical folders, group membership
- **Caching**: 5-minute TTL with automatic invalidation on changes
- **Expiring Access**: Temporary permissions with automatic cleanup
- **Cycle Detection**: Prevents infinite loops in relationship graphs
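The graph traversal and cycle detection described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the Nexus implementation: `RelationGraph` is a hypothetical class, subjects and objects are `(type, id)` pairs mirroring the CLI arguments, and inheritance is limited to `member-of` and `parent-of` edges.

```python
class RelationGraph:
    """Toy Zanzibar-style tuple store with recursive permission checks."""

    def __init__(self):
        self.tuples = set()  # {(subject, relation, object)}

    def create(self, subject, relation, obj):
        self.tuples.add((subject, relation, obj))

    def check(self, subject, relation, obj, _seen=None):
        seen = set() if _seen is None else _seen
        key = (subject, relation, obj)
        if key in seen:  # cycle detection: this question is already being explored
            return False
        seen.add(key)
        if key in self.tuples:  # direct relationship
            return True
        for s, r, o in self.tuples:
            # Group membership: subject is a member of o, and o holds the relation.
            if s == subject and r == "member-of" and self.check(o, relation, obj, seen):
                return True
            # Folder hierarchy: s is the parent of obj, and subject holds the relation on s.
            if r == "parent-of" and o == obj and self.check(subject, relation, s, seen):
                return True
        return False


graph = RelationGraph()
alice, team, docs = ("agent", "alice"), ("group", "eng-team"), ("file", "project-docs")
graph.create(alice, "member-of", team)
graph.create(team, "owner-of", docs)
print(graph.check(alice, "owner-of", docs))             # True: inherited via eng-team
print(graph.check(("agent", "bob"), "owner-of", docs))  # False: no path to the file
```

The `seen` set is what keeps mutually nested groups from recursing forever; a production system would also add caching and expiry, which this sketch omits.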
**Example Use Cases:**
```bash
# Team-based file access
nexus rebac create agent alice member-of group engineering
nexus rebac create group engineering owner-of file /projects/backend
# alice now has owner permission on /projects/backend
# Hierarchical folder permissions
nexus rebac create agent bob owner-of file /workspace/parent-folder
nexus rebac create file /workspace/parent-folder parent-of file /workspace/parent-folder/child
# bob automatically has owner permission on child folder
# Temporary collaborator access
nexus rebac create agent charlie viewer-of file /reports/q4.pdf \
--expires "2025-01-31T23:59:59"
# charlie's access expires automatically on Jan 31, 2025
```
#### Work Queue Operations
```bash
# Query work items by status
nexus work ready --limit 10 # Get ready work items (high priority first)
nexus work pending # Get pending work items
nexus work blocked # Get blocked work items (with dependency info)
nexus work in-progress # Get currently processing items
# View aggregate statistics
nexus work status # Show counts for all work queues
# Output as JSON (for scripting)
nexus work ready --json
nexus work status --json
```
**Note**: Work items are files with special metadata (status, priority, depends_on, worker_id). See `docs/SQL_VIEWS_FOR_WORK_DETECTION.md` for details on setting up work queues.
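As a rough model of how dependency metadata separates ready from blocked work, consider the sketch below. It is not the SQL-view implementation referenced above; the status names (`pending`, `completed`), the rule that a larger `priority` number runs first, and the `classify` helper are all assumptions for illustration.

```python
def classify(items):
    """Split pending work items into (ready, blocked) lists of paths."""
    done = {path for path, meta in items.items() if meta["status"] == "completed"}
    ready, blocked = [], []
    for path, meta in items.items():
        if meta["status"] != "pending":
            continue
        # An item is ready only when every dependency has completed.
        if all(dep in done for dep in meta.get("depends_on", [])):
            ready.append(path)
        else:
            blocked.append(path)
    ready.sort(key=lambda path: items[path]["priority"], reverse=True)
    return ready, blocked


items = {
    "/jobs/extract.json": {"status": "completed", "priority": 1, "depends_on": []},
    "/jobs/index.json": {"status": "pending", "priority": 5, "depends_on": ["/jobs/extract.json"]},
    "/jobs/notify.json": {"status": "pending", "priority": 9, "depends_on": ["/jobs/report.json"]},
    "/jobs/report.json": {"status": "pending", "priority": 2, "depends_on": ["/jobs/extract.json"]},
}
ready, blocked = classify(items)
print("ready:", ready)
print("blocked:", blocked)
```

Here `notify.json` has the highest priority but stays blocked until `report.json` completes, which is the distinction `nexus work ready` and `nexus work blocked` expose.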
### Examples
**Initialize and populate a workspace:**
```bash
# Create workspace
nexus init ./my-project
# Create structure
nexus mkdir /workspace/src --data-dir ./my-project/nexus-data
nexus mkdir /workspace/tests --data-dir ./my-project/nexus-data
# Add files
echo "print('Hello World')" | nexus write /workspace/src/main.py --input - \
--data-dir ./my-project/nexus-data
# List everything
nexus ls / --recursive --long --data-dir ./my-project/nexus-data
```
**Find and analyze code:**
```bash
# Find all Python files
nexus glob "**/*.py"
# Search for TODO comments
nexus grep "TODO|FIXME" --file-pattern "**/*.py"
# Find all test files
nexus glob "**/test_*.py"
# Search for function definitions
nexus grep "^def \w+\(" --file-pattern "**/*.py"
```
**Work with data:**
```bash
# Write JSON data
echo '{"name": "test", "value": 42}' | nexus write /data/config.json --input -
# Display with syntax highlighting
nexus cat /data/config.json
# Get file information
nexus info /data/config.json
```
### Global Options
All commands support these global options:
```bash
# Use custom config file
nexus ls /workspace --config /path/to/config.yaml
# Override data directory
nexus ls /workspace --data-dir /path/to/nexus-data
# Combine both (config takes precedence)
nexus ls /workspace --config ./my-config.yaml --data-dir ./data
```
### Plugin Management
Nexus has a modular plugin system for external integrations:
```bash
# List installed plugins
nexus plugins list
# Get detailed plugin information
nexus plugins info anthropic
nexus plugins info skill-seekers
# Install a plugin
nexus plugins install anthropic
nexus plugins install skill-seekers
# Enable/disable plugins
nexus plugins enable anthropic
nexus plugins disable anthropic
# Uninstall a plugin
nexus plugins uninstall skill-seekers
```
**First-party plugins (local development only - not yet on PyPI):**
- **anthropic** - Claude Skills API integration (upload/download/manage skills)
- **skill-seekers** - Generate skills from documentation websites
**Installation:**
```bash
# Install from local source
pip install -e ./nexus-plugin-anthropic
pip install -e ./nexus-plugin-skill-seekers
```
**Using plugin commands:**
```bash
# Anthropic plugin commands
nexus anthropic upload-skill my-skill
nexus anthropic list-skills
nexus anthropic import-github canvas-design
# Skill Seekers plugin commands
nexus skill-seekers generate https://react.dev/ --name react-basics
nexus skill-seekers import /path/to/SKILL.md
nexus skill-seekers list
```
See detailed documentation:
- [Plugin Installation Guide](./PLUGIN_INSTALLATION.md) - **Start here for setup**
- [nexus-plugin-anthropic](./nexus-plugin-anthropic/README.md) - Anthropic plugin docs
- [nexus-plugin-skill-seekers](./nexus-plugin-skill-seekers/README.md) - Skill Seekers docs
**Try plugin examples:**
```bash
# CLI demo - plugin management commands
./examples/plugin_cli_demo.sh
# SDK demo - programmatic plugin usage
python examples/plugin_sdk_demo.py
```
### Help
Get help for any command:
```bash
nexus --help # Show all commands
nexus ls --help # Show help for ls command
nexus grep --help # Show help for grep command
nexus plugins --help # Show plugin management commands
```
## Remote Nexus Server
Nexus includes a JSON-RPC server that exposes the full NexusFileSystem interface over HTTP, enabling remote filesystem access and FUSE mounts to remote servers.
### Quick Start
#### Method 1: Using the Startup Script (Recommended)
```bash
# Navigate to nexus directory
cd /path/to/nexus
# Start with defaults (host: 0.0.0.0, port: 8080, no auth)
./start-server.sh
# Or with custom options
./start-server.sh --host localhost --port 8080 --api-key mysecret
```
#### Method 2: Direct Command
```bash
# Start the server (optional API key authentication)
nexus serve --host 0.0.0.0 --port 8080 --api-key mysecret
```

Then use the remote filesystem from Python:

```python
from nexus import RemoteNexusFS

nx = RemoteNexusFS(
    server_url="http://localhost:8080",
    api_key="mysecret"  # Optional
)
# Same API as local NexusFS!
nx.write("/workspace/hello.txt", b"Hello Remote!")
content = nx.read("/workspace/hello.txt")
files = nx.list("/workspace", recursive=True)
```
### Features
- **Full NexusFS Interface**: All filesystem operations exposed over RPC under `/api/nfs/` (read, write, list, glob, grep, mkdir, etc.)
- **JSON-RPC 2.0 Protocol**: Standard RPC protocol with proper error handling
- **API Key Authentication**: Optional Bearer token authentication for security
- **Backend Agnostic**: Works with local and GCS backends
- **FUSE Compatible**: Mount remote Nexus servers as local filesystems
### Remote Client Usage
```python
from nexus import RemoteNexusFS
# Connect to remote server
nx = RemoteNexusFS(
server_url="http://your-server:8080",
api_key="your-api-key" # Optional
)
# All standard operations work
nx.write("/workspace/data.txt", b"content")
content = nx.read("/workspace/data.txt")
files = nx.list("/workspace", recursive=True)
results = nx.glob("**/*.py")
matches = nx.grep("TODO", file_pattern="*.py")
```
### Server Options
```bash
# Start with custom host/port
nexus serve --host 0.0.0.0 --port 8080
# Start with API key authentication
nexus serve --api-key mysecret
# Start with GCS backend
nexus serve --backend=gcs --gcs-bucket=my-bucket --api-key mysecret
# Custom data directory
nexus serve --data-dir /path/to/data
```
### Testing the Server
Once the server is running, verify it's working:
```bash
# Health check
curl http://localhost:8080/health
# Expected: {"status": "healthy", "service": "nexus-rpc"}
# Check available methods
curl http://localhost:8080/api/nfs/status
# Expected: {"status": "running", "service": "nexus-rpc", "version": "1.0", "methods": [...]}
# List files (JSON-RPC)
curl -X POST http://localhost:8080/api/nfs/list \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"method": "list",
"params": {"path": "/", "recursive": false, "details": true},
"id": 1
}'
# With API key
curl -X POST http://localhost:8080/api/nfs/list \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer mysecret" \
  -d '{"jsonrpc": "2.0", "method": "list", "params": {"path": "/"}, "id": 1}'
```
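The same JSON-RPC call can be assembled from Python with only the standard library. This is a sketch under assumptions: `build_rpc_request` is a hypothetical helper, and the `/api/nfs/<method>` URL pattern is generalized from the `list` example above rather than confirmed for every method.

```python
import json
import urllib.request


def build_rpc_request(base_url, method, params, api_key=None, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 POST request."""
    payload = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{base_url}/api/nfs/{method}",  # assumed URL pattern, see lead-in
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_rpc_request("http://localhost:8080", "list", {"path": "/"}, api_key="mysecret")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Building the request separately from sending it makes the payload easy to inspect or log before any network call is made.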
### Troubleshooting
**Port Already in Use:**
```bash
# Find and kill process using port 8080
lsof -ti:8080 | xargs kill -9
# Or use a different port
nexus serve --port 8081
```
**Module Not Found:**
```bash
# Activate virtual environment and install
source .venv/bin/activate
pip install -e .
```
**Permission Denied:**
```bash
# Use a directory you have write access to
nexus serve --data-dir ~/nexus-data
```
### Deploying Nexus Server
#### Google Cloud Platform (Recommended)
Deploy to GCP with a single command using the automated deployment script:
```bash
# Quick start
./deploy-gcp.sh --project-id YOUR-PROJECT-ID --api-key mysecret
# With GCS backend
./deploy-gcp.sh \
--project-id YOUR-PROJECT-ID \
--gcs-bucket your-nexus-bucket \
--api-key mysecret \
--machine-type e2-standard-2
```
**Features:**
- ✅ Automated VM provisioning (Ubuntu 22.04)
- ✅ Systemd service with auto-restart
- ✅ Firewall configuration
- ✅ GCS backend support
- ✅ Production-ready setup
**See [GCP Deployment Guide](docs/deployment/GCP_DEPLOYMENT.md) for complete instructions.**
#### Docker Deployment
Deploy using Docker for consistent environments and easy management:
```bash
# Quick start with Docker Compose
cp .env.docker.example .env
# Edit .env with your configuration
docker-compose up -d
# Or run directly
docker build -t nexus-server:latest .
docker run -d \
--name nexus-server \
--restart unless-stopped \
-p 8080:8080 \
-v nexus-data:/app/data \
-e NEXUS_API_KEY="your-api-key" \
nexus-server:latest
# Deploy to GCP with Docker (automated)
./deploy-gcp-docker.sh \
--project-id your-project-id \
--api-key mysecret \
--build-local
```
**Features:**
- ✅ Multi-stage build for optimized image size (~300MB)
- ✅ Non-root user for security
- ✅ Health checks and auto-restart
- ✅ GCS backend support
- ✅ Docker Compose for easy orchestration
**See [Docker Deployment Guide](docs/deployment/DOCKER_DEPLOYMENT.md) for complete instructions.**
**Deployment Features:**
- **Persistent Metadata**: SQLite database stored on VM disk at `/var/lib/nexus/`
- **Content Storage**: All file content stored in configured backend (GCS, local, etc.)
- **Content Deduplication**: CAS-based storage with 30-50% savings
- **Full NexusFS API**: All operations available remotely
## FUSE Mount: Use Standard Unix Tools (v0.2.0)
Mount Nexus to a local path and use **any standard Unix tool** seamlessly - `ls`, `cat`, `grep`, `vim`, and more!
### Installation
First, install FUSE support:
```bash
# Install Nexus with FUSE support
pip install "nexus-ai-fs[fuse]"
# Platform-specific FUSE library:
# macOS: Install macFUSE from https://osxfuse.github.io/
# Linux: sudo apt-get install fuse3 # or equivalent for your distro
```
### Quick Start
```bash
# Mount Nexus to local path (smart mode by default)
nexus mount /mnt/nexus
# Now use ANY standard Unix tools!
ls -la /mnt/nexus/workspace/
cat /mnt/nexus/workspace/notes.txt
grep -r "TODO" /mnt/nexus/workspace/
find /mnt/nexus -name "*.py"
vim /mnt/nexus/workspace/code.py
git clone /some/repo /mnt/nexus/repos/myproject
# Unmount when done
nexus unmount /mnt/nexus
```
### Quick Start Examples
**Example 1: Default (Explicit Views) - Best for Mixed Workflows**
```bash
# Mount normally
nexus mount /mnt/nexus
# Binary tools work directly
evince /mnt/nexus/docs/report.pdf # PDF viewer works ✓
# Add .txt for text operations
cat /mnt/nexus/docs/report.pdf.txt # Read as text
grep "results" /mnt/nexus/docs/*.pdf.txt
# Virtual views auto-generated
ls /mnt/nexus/docs/
# → report.pdf
# → report.pdf.txt (virtual)
# → report.pdf.md (virtual)
```
**Example 2: Auto-Parse - Best for Search-Heavy Workflows**
```bash
# Mount with auto-parse
nexus mount /mnt/nexus --auto-parse
# grep works directly on PDFs!
grep "results" /mnt/nexus/docs/*.pdf # No .txt needed! ✓
cat /mnt/nexus/docs/report.pdf # Returns text ✓
# Search across everything
grep -r "TODO" /mnt/nexus/workspace/ # Searches PDFs, Excel, etc.
# Binary via .raw/ when needed
evince /mnt/nexus/.raw/docs/report.pdf # For PDF viewer
```
**Example 3: Real-World Script**
```bash
#!/bin/bash
# Find all PDFs mentioning "invoice"
# Mount in background - command returns immediately!
nexus mount /mnt/nexus --auto-parse --daemon
# (No blocking - script continues immediately)
# Mount is ready - grep works on PDFs!
grep -l "invoice" /mnt/nexus/documents/*.pdf
# Process results
# Reading line by line keeps filenames with spaces intact
grep -l "invoice" /mnt/nexus/documents/*.pdf | while IFS= read -r pdf; do
    echo "Found in: $pdf"
    grep -n "invoice" "$pdf" | head -5
done
# Clean up
nexus unmount /mnt/nexus
```
**Remote server example:**
```bash
#!/bin/bash
# Search PDFs on remote Nexus server
# Mount remote server in background
nexus mount /mnt/nexus \
--remote-url http://nexus-server:8080 \
--auto-parse \
--daemon
# Command returns immediately - daemon process runs in background
# You can now use standard Unix tools on remote filesystem!
# Search across remote PDFs
grep -r "TODO" /mnt/nexus/workspace/ | head -20
# Find large files
find /mnt/nexus -type f -size +10M
# Clean up when done
nexus unmount /mnt/nexus
```
### File Access: Two Modes
Nexus supports **two ways** to access files - choose what fits your workflow:
#### 1. Explicit Views (Default) - Best for Compatibility
Binary files return binary, use `.txt`/`.md` suffixes for parsed content:
```bash
nexus mount /mnt/nexus
# Binary files work with native tools
evince /mnt/nexus/docs/report.pdf # PDF viewer gets binary ✓
libreoffice /mnt/nexus/data/sheet.xlsx # Excel app gets binary ✓
# Add .txt to search/read as text
cat /mnt/nexus/docs/report.pdf.txt # Returns parsed text
grep "pattern" /mnt/nexus/docs/*.pdf.txt
# Virtual views appear automatically
ls /mnt/nexus/docs/
# → report.pdf
# → report.pdf.txt (virtual view)
# → report.pdf.md (virtual view)
```
**When to use:** You want both binary tools AND text search to work
#### 2. Auto-Parse Mode - Best for Search/Grep
Binary files return parsed text directly, use `.raw/` for binary:
```bash
nexus mount /mnt/nexus --auto-parse
# Binary files return text directly - perfect for grep!
cat /mnt/nexus/docs/report.pdf # Returns parsed text ✓
grep "pattern" /mnt/nexus/docs/*.pdf # Works directly! ✓
less /mnt/nexus/docs/report.pdf # Page through text ✓
# Access binary via .raw/ when needed
evince /mnt/nexus/.raw/docs/report.pdf # PDF viewer gets binary
# No .txt/.md suffixes - files return text by default
ls /mnt/nexus/docs/
# → report.pdf (returns text when read)
```
**When to use:** Text search is your primary use case, binary tools are secondary
### Mount Modes (Content Parsing)
Control **what** gets parsed:
```bash
# Smart mode (default) - Auto-detect file types
nexus mount /mnt/nexus --mode=smart
# ✅ PDFs, Excel, Word → parsed
# ✅ .py, .txt, .md → pass-through
# ✅ Best for mixed content
# Text mode - Parse everything aggressively
nexus mount /mnt/nexus --mode=text
# ✅ All files parsed to text
# ⚠️ Slower (always parses)
# Binary mode - No parsing at all
nexus mount /mnt/nexus --mode=binary
# ✅ All files return binary
# ❌ grep won't work on PDFs
```
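The three modes amount to a simple per-file decision. The sketch below models it; the `PARSEABLE_EXTENSIONS` set and `should_parse` function are hypothetical, and the real smart-mode detection may look at more than the file extension.

```python
import os

# Assumed set of formats that smart mode would parse; the real list may differ.
PARSEABLE_EXTENSIONS = {".pdf", ".xlsx", ".docx"}


def should_parse(filename, mode="smart"):
    """Decide whether a file's content is parsed to text before being returned."""
    if mode == "binary":
        return False  # never parse
    if mode == "text":
        return True   # always parse
    if mode == "smart":
        ext = os.path.splitext(filename)[1].lower()
        return ext in PARSEABLE_EXTENSIONS
    raise ValueError(f"unknown mount mode: {mode!r}")


for name in ("report.pdf", "main.py"):
    print(name, {m: should_parse(name, m) for m in ("smart", "text", "binary")})
```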
### Comparison Table
| Feature | Explicit Views (default) | Auto-Parse Mode (`--auto-parse`) |
|---------|-------------------------|-----------------------------------|
| **PDF viewers work** | ✅ `evince file.pdf` | ⚠️ `evince .raw/file.pdf` |
| **grep on PDFs** | ⚠️ `grep *.pdf.txt` | ✅ `grep *.pdf` |
| **Excel apps work** | ✅ `libreoffice file.xlsx` | ⚠️ `libreoffice .raw/file.xlsx` |
| **Best for** | Binary tools + search | Text search primary use case |
| **Virtual views** | `.txt`, `.md` suffixes | No suffixes needed |
| **Binary access** | Direct (`file.pdf`) | Via `.raw/` directory |
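The difference between the two access modes comes down to how a path under the mount point is mapped to a stored file and a view. The sketch below models that mapping; `resolve` and `BINARY_EXTENSIONS` are hypothetical names, paths are relative to the mount root, and the actual FUSE layer may handle more cases.

```python
from pathlib import PurePosixPath

PARSED_SUFFIXES = (".txt", ".md")
BINARY_EXTENSIONS = {".pdf", ".xlsx", ".docx"}  # assumed parseable formats


def resolve(path, auto_parse=False):
    """Return (stored_path, view), where view is 'raw' or 'parsed'."""
    p = PurePosixPath(path)
    if auto_parse:
        parts = p.parts
        # /.raw/docs/report.pdf -> raw bytes of /docs/report.pdf
        if len(parts) > 1 and parts[1] == ".raw":
            return str(PurePosixPath("/", *parts[2:])), "raw"
        view = "parsed" if p.suffix.lower() in BINARY_EXTENSIONS else "raw"
        return str(p), view
    # Explicit views: /docs/report.pdf.txt -> parsed text of /docs/report.pdf
    if p.suffix in PARSED_SUFFIXES:
        inner = p.with_suffix("")
        if inner.suffix.lower() in BINARY_EXTENSIONS:
            return str(inner), "parsed"
    return str(p), "raw"


print(resolve("/docs/report.pdf"))                        # explicit views: binary
print(resolve("/docs/report.pdf.txt"))                    # explicit views: parsed text
print(resolve("/docs/report.pdf", auto_parse=True))       # auto-parse: parsed text
print(resolve("/.raw/docs/report.pdf", auto_parse=True))  # auto-parse: binary
```

A plain `notes.txt` is not treated as a virtual view because stripping `.txt` does not leave a parseable extension.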
### Background (Daemon) Mode
Run the mount in the background and return to your shell immediately:
```bash
# Mount in background - command returns immediately
nexus mount /mnt/nexus --daemon
# ✓ Mounted Nexus to /mnt/nexus
#
# To unmount:
# nexus unmount /mnt/nexus
#
# (Shell prompt returns immediately, mount runs in background)
# Mount is active - you can use it immediately
ls /mnt/nexus
cat /mnt/nexus/workspace/file.txt
# Check daemon status
ps aux | grep "nexus mount" | grep -v grep
# youruser 43097 ... nexus mount /mnt/nexus --daemon
# Later, unmount when done
nexus unmount /mnt/nexus
```
**How it works:**
- Command returns to shell immediately (using double-fork technique)
- Background daemon process keeps mount active
- Daemon survives terminal close and persists until unmount
- Safe to close your terminal - mount stays active
**Local Mount:**
```bash
# Mount local Nexus data in background
nexus mount /mnt/nexus --daemon
```
**Remote Mount:**
```bash
# Mount remote Nexus server in background
nexus mount /mnt/nexus --remote-url http://your-server:8080 --daemon
# With API key authentication
nexus mount /mnt/nexus \
--remote-url http://your-server:8080 \
--api-key your-secret-key \
--daemon
```
### Performance & Caching (v0.2.0)
FUSE mounts include automatic caching for improved performance. Caching is **enabled by default** with sensible defaults - no configuration needed for most users.
**Default Performance:**
- ✅ Attribute caching (1024 entries, 60s TTL) - Makes `ls` and `stat` operations faster
- ✅ Content caching (100 files) - Speeds up repeated file reads
- ✅ Parsed content caching (50 files) - Accelerates PDF/Excel text extraction
- ✅ Automatic cache invalidation on writes/deletes - Always consistent
**Advanced: Custom Cache Configuration**
For power users with specific performance requirements:
```python
from nexus import connect
from nexus.fuse import mount_nexus
nx = connect(config={"data_dir": "./nexus-data"})
# Custom cache configuration
cache_config = {
    "attr_cache_size": 2048,    # Double the attribute cache (default: 1024)
    "attr_cache_ttl": 120,      # Cache attributes for 2 minutes (default: 60s)
    "content_cache_size": 200,  # Cache 200 files (default: 100)
    "parsed_cache_size": 100,   # Cache 100 parsed files (default: 50)
    "enable_metrics": True      # Track cache hit/miss rates (default: False)
}
fuse = mount_nexus(
    nx,
    "/mnt/nexus",
    mode="smart",
    cache_config=cache_config,
    foreground=False
)
# View cache performance (if metrics enabled)
# Note: Access via fuse.fuse.operations.cache
```
**Cache Configuration Options:**
| Option | Default | Description |
|--------|---------|-------------|
| `attr_cache_size` | 1024 | Max number of cached file attribute entries |
| `attr_cache_ttl` | 60 | Time-to-live for attributes in seconds |
| `content_cache_size` | 100 | Max number of cached file contents |
| `parsed_cache_size` | 50 | Max number of cached parsed contents (PDFs, etc.) |
| `enable_metrics` | False | Enable cache hit/miss tracking |
**When to Tune Cache Settings:**
- **Large directory listings**: Increase `attr_cache_size` to 2048+ and `attr_cache_ttl` to 120+
- **Many small files**: Increase `content_cache_size` to 500+
- **Heavy PDF/Excel use**: Increase `parsed_cache_size` to 200+
- **Performance analysis**: Enable `enable_metrics` to measure cache effectiveness
- **Memory-constrained**: Decrease all cache sizes (e.g., 512 / 50 / 25)
**Notes:**
- Caches are **thread-safe** - safe for concurrent access
- Caches are **automatically invalidated** on file writes, deletes, and renames
- Default settings work well for most use cases - tune only if needed
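To make the size, TTL, and invalidation settings concrete, here is a minimal sketch of a bounded cache with expiry. It is an illustration, not the cache Nexus ships: the `TTLCache` class, its least-recently-used eviction policy, and the injectable `clock` are assumptions.

```python
import threading
import time
from collections import OrderedDict


class TTLCache:
    """Bounded cache with per-entry expiry and least-recently-used eviction."""

    def __init__(self, max_size=1024, ttl=60.0, clock=time.monotonic):
        self.max_size, self.ttl, self.clock = max_size, ttl, clock
        self._entries = OrderedDict()  # key -> (expires_at, value)
        self._lock = threading.Lock()  # guards all access for thread safety
        self.hits = self.misses = 0

    def get(self, key):
        with self._lock:
            entry = self._entries.get(key)
            if entry is None or entry[0] <= self.clock():
                self._entries.pop(key, None)  # drop expired entry, if any
                self.misses += 1
                return None
            self._entries.move_to_end(key)    # mark as recently used
            self.hits += 1
            return entry[1]

    def put(self, key, value):
        with self._lock:
            self._entries[key] = (self.clock() + self.ttl, value)
            self._entries.move_to_end(key)
            while len(self._entries) > self.max_size:
                self._entries.popitem(last=False)  # evict least recently used

    def invalidate(self, key):
        """Call on write, delete, or rename so readers never see stale data."""
        with self._lock:
            self._entries.pop(key, None)


attrs = TTLCache(max_size=1024, ttl=60)
attrs.put("/workspace/file.txt", {"size": 11})
print(attrs.get("/workspace/file.txt"))
attrs.invalidate("/workspace/file.txt")
print(attrs.get("/workspace/file.txt"), attrs.hits, attrs.misses)
```

The `hits` and `misses` counters correspond to what `enable_metrics` would track; a low hit rate suggests raising the size or TTL for that cache.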
### Troubleshooting FUSE Mounts
#### Check Mount Status
```bash
# Check if daemon process is running
ps aux | grep "nexus mount" | grep -v grep
# Check mount points
mount | grep nexus
# List files in mount point (should show files, not empty)
ls -la /mnt/nexus/
```
#### Common Issues
**Mount appears empty or shows "Transport endpoint is not connected":**
```bash
# Unmount the stale mount point
nexus unmount /mnt/nexus
# Or force unmount (macOS)
umount -f /mnt/nexus
# Or unmount via FUSE (Linux); use -uz for a lazy unmount if the mount is busy
fusermount -u /mnt/nexus
# Then remount
nexus mount /mnt/nexus --daemon
```
**Process won't die (stuck in 'D' or 'U' state):**
```bash
# Find stuck processes
# Match on the process state column only (D = Linux, U = macOS)
ps -eo pid,stat,command | awk '$2 ~ /^[DU]/ && /nexus/'
# Force kill
kill -9 <PID>
# If process is still stuck (uninterruptible I/O), try:
# macOS: umount -f /mnt/nexus
# Linux: fusermount -uz /mnt/nexus
# Note: Stuck processes in 'D' state typically resolve after unmount
# If they persist, they'll be cleaned up on system reboot
```
**"Directory not empty" error when mounting:**
```bash
# Unmount first
nexus unmount /mnt/nexus
# Or remove and recreate the mount point
rm -rf /mnt/nexus && mkdir /mnt/nexus
# Then mount
nexus mount /mnt/nexus --daemon
```
**Permission denied errors:**
```bash
# Ensure FUSE is installed
# macOS: Install macFUSE from https://osxfuse.github.io/
# Linux: sudo apt-get install fuse3
# Check mount point permissions
ls -ld /mnt/nexus
# Should be owned by your user
# Create mount point with correct permissions
mkdir -p /mnt/nexus
chmod 755 /mnt/nexus
```
**Connection refused (remote mounts):**
```bash
# Check server is running
curl http://your-server:8080/health
# Test connectivity
ping your-server
# Verify API key (if required)
nexus mount /mnt/nexus \
--remote-url http://your-server:8080 \
--api-key your-key \
--daemon
```
**Multiple mounts to same mount point:**
```bash
# Check for existing mounts
mount | grep /mnt/nexus
# Unmount all instances
nexus unmount /mnt/nexus
# Kill any lingering processes
pkill -f "nexus mount /mnt/nexus"
# Clean mount and remount
rm -rf /mnt/nexus && mkdir /mnt/nexus
nexus mount /mnt/nexus --daemon
```
#### Debug Mode
For detailed debugging output:
```bash
# Run in foreground with debug output
nexus mount /mnt/nexus --debug
# This will show all FUSE operations in real-time
# Press Ctrl+C to stop
```
### rclone-style CLI Commands (v0.2.0)
Nexus provides efficient file operations inspired by rclone, with automatic deduplication and progress tracking:
#### Sync Command
One-way synchronization with hash-based change detection:
```bash
# Sync local directory to Nexus (only copies changed files)
nexus sync ./local/dataset/ /workspace/training/
# Preview changes before syncing (dry-run)
nexus sync ./data/ /workspace/backup/ --dry-run
# Mirror sync - delete extra files in destination
nexus sync /workspace/source/ /workspace/dest/ --delete
# Disable hash comparison (force copy all files)
nexus sync ./data/ /workspace/ --no-checksum
```
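The hash-based change detection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code behind `nexus sync`: the helper names `file_sha256` and `plan_sync` are hypothetical, and the destination is modeled as a plain dict of relative path to SHA-256 digest.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def plan_sync(source_dir, dest_hashes, delete=False):
    """Return (to_copy, to_delete) as lists of relative paths.

    dest_hashes maps relative path -> SHA-256 hex digest of the destination copy.
    """
    source_dir = Path(source_dir)
    source_hashes = {
        p.relative_to(source_dir).as_posix(): file_sha256(p)
        for p in sorted(source_dir.rglob("*"))
        if p.is_file()
    }
    # Copy files that are new or whose content hash differs.
    to_copy = [rel for rel, h in source_hashes.items() if dest_hashes.get(rel) != h]
    # With delete=True, also remove destination files that no longer exist in the source.
    to_delete = sorted(set(dest_hashes) - set(source_hashes)) if delete else []
    return to_copy, to_delete
```

Returning a plan instead of performing the copy is what makes a dry run cheap: the same plan can be printed for preview or executed.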
#### Copy Command
Smart copy with automatic deduplication:
```bash
# Copy directory recursively (skips identical files)
nexus copy ./local/data/ /workspace/project/ --recursive
# Copy within Nexus (leverages CAS deduplication)
nexus copy /workspace/source/ /workspace/dest/ --recursive
# Copy Nexus to local
nexus copy /workspace/data/ ./backup/ --recursive
# Copy single file
nexus copy /workspace/file.txt /workspace/copy.txt
# Disable checksum verification
nexus copy ./data/ /workspace/ --recursive --no-checksum
```
#### Move Command
Efficient file/directory moves with confirmation prompts:
```bash
# Move file (rename if possible, copy+delete otherwise)
nexus move /workspace/old.txt /workspace/new.txt
# Move directory without confirmation
nexus move /workspace/old_dir/ /archives/2024/ --force
```
#### Tree Command
Visualize directory structure as ASCII tree:
```bash
# Show full directory tree
nexus tree /workspace/
# Limit depth to 2 levels
nexus tree /workspace/ -L 2
# Show file sizes
nexus tree /workspace/ --show-size
```
#### Size Command
Calculate directory sizes with human-readable output:
```bash
# Calculate total size
nexus size /workspace/project/
# Human-readable output (KB, MB, GB)
nexus size /workspace/ --human
# Show top 10 largest files
nexus size /workspace/ --human --details
```
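As a rough illustration of human-readable size output, the sketch below formats a byte count with KB/MB/GB suffixes. The function name `human_size` is hypothetical, and the use of 1024-based units is an assumption; the actual `nexus size --human` output may differ.

```python
def human_size(num_bytes):
    """Format a byte count using 1024-based units (an assumption for this sketch)."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if size < 1024 or unit == "TB":
            # Whole bytes need no decimals; larger units get one decimal place.
            return f"{size:.0f} {unit}" if unit == "B" else f"{size:.1f} {unit}"
        size /= 1024
```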
**Features:**
- **Hash-based change detection** - Only copies changed files
- **Progress bars** - Visual feedback for long operations
- **Dry-run mode** - Preview changes before execution
- **Cross-platform paths** - Works with local filesystem and Nexus paths
- **Automatic deduplication** - Leverages Content-Addressable Storage (CAS)
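The deduplication that Content-Addressable Storage provides can be illustrated with a small in-memory sketch. It assumes blobs are keyed by their SHA-256 digest and tracked with reference counts; the class `ContentStore` and its methods are hypothetical and do not reflect the actual Nexus storage layer.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: identical content is stored once."""

    def __init__(self):
        self._blobs = {}  # sha256 hex -> bytes
        self._refs = {}   # sha256 hex -> number of paths pointing at the blob
        self._paths = {}  # logical path -> sha256 hex

    def write(self, path, data):
        key = hashlib.sha256(data).hexdigest()
        old = self._paths.get(path)
        if old == key:
            return key  # content unchanged, nothing to store
        if key not in self._blobs:
            self._blobs[key] = data
        self._refs[key] = self._refs.get(key, 0) + 1
        self._paths[path] = key
        if old is not None:
            self._release(old)
        return key

    def read(self, path):
        return self._blobs[self._paths[path]]

    def delete(self, path):
        self._release(self._paths.pop(path))

    def _release(self, key):
        # Drop the blob once no path references it any more.
        self._refs[key] -= 1
        if self._refs[key] == 0:
            del self._refs[key], self._blobs[key]

    def stored_bytes(self):
        return sum(len(blob) for blob in self._blobs.values())
```

Copying a file within such a store only adds a path entry and bumps a reference count, which is why copies of identical content cost almost no extra space.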
### Performance Comparison
| Method | Speed | Content-Aware | Use Case |
|--------|-------|---------------|----------|
| `grep -r "pattern" /mnt/nexus/` | Medium | ✅ Yes (via mount) | Interactive use |
| `nexus grep "pattern"` | **Fast** (DB-backed) | ✅ Yes | Large-scale search |
| Standard tools | Familiar | ✅ Yes (via mount) | Day-to-day work |
### Use Cases
**Interactive Development**:
```bash
# Mount for interactive work
nexus mount /mnt/nexus
vim /mnt/nexus/workspace/code.py
git clone /mnt/nexus/repos/myproject
```
**Bulk Operations**:
```bash
# Use rclone-style commands for efficiency
nexus sync /local/dataset/ /workspace/training-data/
nexus tree /workspace/ > structure.txt
```
**Automated Workflows**:
```bash
# Standard Unix tools in scripts
find /mnt/nexus -name "*.pdf" -exec grep -l "invoice" {} \;
rsync -av /mnt/nexus/workspace/ /backup/
```
## Architecture
### Agent Workspace Structure
Every agent gets a structured workspace at `/workspace/{tenant}/{agent}/`:
```
/workspace/acme-corp/research-agent/
├── .nexus/                      # Nexus metadata (Git-trackable)
│   ├── agent.yaml               # Agent configuration
│   ├── commands/                # Custom commands (markdown files)
│   │   ├── analyze-codebase.md
│   │   └── summarize-docs.md
│   ├── jobs/                    # Background job definitions
│   │   └── daily-summary.yaml
│   ├── memory/                  # File-based memory
│   │   ├── project-knowledge.md
│   │   └── recent-tasks.jsonl
│   └── secrets.encrypted        # KMS-encrypted credentials
├── data/                        # Agent's working data
│   ├── inputs/
│   └── outputs/
└── INSTRUCTIONS.md              # Agent instructions (auto-loaded)
```
### Path Namespace
```
/
├── workspace/     # Agent scratch space (hot tier, ephemeral)
├── shared/        # Shared tenant data (warm tier, persistent)
├── external/      # Pass-through backends (no content storage)
├── system/        # System metadata (admin-only)
└── archives/      # Cold storage (read-only)
```
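One way to picture how the top-level directory determines storage behavior is a small lookup keyed on the first path segment. This is a hedged sketch: `Namespace`, `NAMESPACES`, and `resolve_namespace` are hypothetical names, and the tier labels simply restate the comments in the tree above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Namespace:
    tier: str
    read_only: bool = False
    admin_only: bool = False


# Mirrors the namespace tree above; the labels are illustrative.
NAMESPACES = {
    "workspace": Namespace("hot"),
    "shared": Namespace("warm"),
    "external": Namespace("pass-through"),
    "system": Namespace("system", admin_only=True),
    "archives": Namespace("cold", read_only=True),
}


def resolve_namespace(path):
    """Map an absolute virtual path to the namespace of its first segment."""
    if not path.startswith("/"):
        raise ValueError(f"path must start with '/': {path!r}")
    root = path.split("/")[1]
    try:
        return NAMESPACES[root]
    except KeyError:
        raise ValueError(f"unknown namespace: {root!r}") from None
```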
## Core Components
### File System Operations
```python
import nexus
# Works in both local and hosted modes
# Mode determined by config file or environment
nx = nexus.connect()
# Run inside an async function (for example via asyncio.run)
async with nx:
    # Basic operations
    await nx.write("/workspace/data.txt", b"content")
    content = await nx.read("/workspace/data.txt")
    await nx.delete("/workspace/data.txt")
    # Batch operations (`sources` and `destinations` are lists of paths you supply)
    files = await nx.list("/workspace/", recursive=True)
    results = await nx.copy_batch(sources, destinations)
    # File discovery
    python_files = await nx.glob("**/*.py")
    todos = await nx.grep(r"TODO:|FIXME:", file_pattern="*.py")
```
### Semantic Search
```python
# Search across documents with vector embeddings
async with nexus.connect() as nx:
    results = await nx.semantic_search(
        path="/docs/",
        query="How does authentication work?",
        limit=10,
        filters={"file_type": "markdown"}
    )
    for result in results:
        print(f"{result.path}:{result.line} - {result.text}")
```
### LLM-Powered Reading
```python
# Read documents with AI, with automatic KV cache
async with nexus.connect() as nx:
    answer = await nx.llm_read(
        path="/reports/q4-2024.pdf",
        prompt="What were the top 3 challenges?",
        model="claude-sonnet-4",
        max_tokens=1000
    )
```
### Agent Memory
```python
# Store and retrieve agent memories
async with nexus.connect() as nx:
    await nx.store_memory(
        content="User prefers TypeScript over JavaScript",
        memory_type="preference",
        tags=["coding", "languages"]
    )
    memories = await nx.search_memories(
        query="programming language preferences",
        limit=5
    )
```
### Prompt Optimization (Coming in v0.9.5)
```python
# Track multiple prompt candidates during optimization
async with nexus.connect() as nx:
    # Start optimization run
    run_id = await nx.start_optimization_run(
        module_name="SearchModule",
        objectives=["accuracy", "latency", "cost"]
    )
    # Store prompt candidates with detailed traces
    for candidate in prompt_variants:
        version_id = await nx.store_prompt_version(
            module_name="SearchModule",
            prompt_template=candidate.template,
            metrics={"accuracy": 0.85, "latency_ms": 450},
            run_id=run_id
        )
        # Store execution traces for debugging
        await nx.store_execution_trace(
            prompt_version_id=version_id,
            inputs=test_inputs,
            outputs=predictions,
            intermediate_steps=reasoning_chain
        )
    # Analyze tradeoffs across candidates
    analysis = await nx.analyze_prompt_tradeoffs(
        run_id=run_id,
        objectives=["accuracy", "latency_ms", "cost_per_query"]
    )
    # Get per-example results to find failure patterns
    failures = await nx.get_failing_examples(
        prompt_version_id=version_id,
        limit=20
    )
```
### Custom Commands
Create `/workspace/{tenant}/{agent}/.nexus/commands/semantic-search.md`:
```markdown
---
name: semantic-search
description: Search codebase semantically
allowed-tools: [semantic_read, glob, grep]
required-scopes: [read]
model: sonnet
---
## Your task
Given query: {{query}}
1. Use `glob` to find relevant files by pattern
2. Use `semantic_read` to extract relevant sections
3. Summarize findings with file:line citations
```
Execute via API:
```python
async with nexus.connect() as nx:
    result = await nx.execute_command(
        "semantic-search",
        context={"query": "authentication implementation"}
    )
```
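How the `context` values might reach the `{{query}}` placeholder in the command file can be sketched with a simple template substitution. This assumes placeholders are plain `{{name}}` tokens filled from the `context` dict; the function `render_command` is hypothetical, and the real command runner may use a different templating mechanism.

```python
import re

# Matches {{ name }} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_command(template, context):
    """Replace each {{name}} placeholder with the matching context value."""

    def substitute(match):
        name = match.group(1)
        if name not in context:
            raise KeyError(f"missing context value for placeholder {name!r}")
        return str(context[name])

    return _PLACEHOLDER.sub(substitute, template)
```

Failing loudly on a missing value, rather than leaving the placeholder in the prompt, keeps a malformed command from being sent to the model.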
### Skills System (v0.3.0)
Manage reusable AI agent skills with SKILL.md format, progressive disclosure, lifecycle management, and dependency resolution:
```python
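import nexus
from nexus.skills import SkillRegistry, SkillManager, SkillExporter
# Note: the `await` calls below must run inside an async function (for example via asyncio.run)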
# Initialize filesystem
nx = nexus.connect()
# Create skill registry
registry = SkillRegistry(nx)
# Discover skills from three tiers (agent > tenant > system)
# Loads metadata only - lightweight and fast
await registry.discover()
# List available skills
skills = registry.list_skills()
# ['analyze-code', 'data-processing', 'report-generation']
# Get skill metadata (no content loading)
metadata = registry.get_metadata("analyze-code")
print(f"{metadata.name}: {metadata.description}")
# analyze-code: Analyzes code quality and structure
# Load full skill content (lazy loading + caching)
skill = await registry.get_skill("analyze-code")
print(skill.content) # Full markdown content
# Resolve dependencies automatically (DAG with cycle detection)
deps = await registry.resolve_dependencies("complex-skill")
# ['base-skill', 'helper-skill', 'complex-skill']
# Create skill manager for lifecycle operations
manager = SkillManager(nx, registry)
# Create new skill from template
await manager.create_skill(
"my-analyzer",
description="Analyzes code quality and structure",
template="code-generation", # basic, data-analysis, code-generation, document-processing, api-integration
author="Alice",
tier="agent"
)
# Fork existing skill with lineage tracking
await manager.fork_skill(
"analyze-code",
"my-custom-analyzer",
tier="agent",
author="Bob"
)
# Publish skill to tenant library
await manager.publish_skill(
"my-analyzer",
source_tier="agent",
target_tier="tenant"
)
# Export skills to .zip (vendor-neutral)
exporter = SkillExporter(registry)
# Export with dependencies
await exporter.export_skill(
"analyze-code",
output_path="analyze-code.zip",
format="claude", # Enforces 8MB limit
include_dependencies=True
)
# Validate before export
valid, msg, size = await exporter.validate_export("large-skill", format="claude")
if not valid:
    print(f"Cannot export: {msg}")
# Enterprise Features (NEW in v0.3.0)
from nexus.skills import (
SkillAnalyticsTracker,
SkillGovernance,
SkillAuditLogger,
AuditAction
)
# Track skill usage and analytics
tracker = SkillAnalyticsTracker(db_connection)
await tracker.track_usage(
"analyze-code",
agent_id="alice",
execution_time=1.5,
success=True
)
# Get analytics for a skill
analytics = await tracker.get_skill_analytics("analyze-code")
print(f"Success rate: {analytics.success_rate:.1%}")
print(f"Avg execution time: {analytics.avg_execution_time:.2f}s")
# Get dashboard metrics
dashboard = await tracker.get_dashboard_metrics()
print(f"Total skills: {dashboard.total_skills}")
print(f"Most used: {dashboard.most_used_skills[:5]}")
# Governance - approval workflow for org-wide skills
gov = SkillGovernance(db_connection)
# Submit for approval
approval_id = await gov.submit_for_approval(
"my-analyzer",
submitted_by="alice",
reviewers=["bob", "charlie"],
comments="Ready for team-wide use"
)
# Approve skill
await gov.approve_skill(approval_id, reviewed_by="bob", comments="Excellent work!")
is_approved = await gov.is_approved("my-analyzer")
# Audit logging for compliance
audit = SkillAuditLogger(db_connection)
# Log skill operations
await audit.log(
"analyze-code",
AuditAction.EXECUTED,
agent_id="alice",
details={"execution_time": 1.5, "success": True}
)
# Query audit logs
logs = await audit.query_logs(skill_name="analyze-code", action=AuditAction.EXECUTED)
# Generate compliance report
report = await audit.generate_compliance_report(tenant_id="tenant1")
print(f"Total operations: {report['total_operations']}")
print(f"Top skills: {report['top_skills'][:5]}")
# Search skills by description
results = await manager.search_skills("code analysis", limit=5)
for skill_name, score in results:
    print(f"{skill_name}: {score:.1f}")
```
#### Skills CLI Commands (v0.3.0)
Nexus provides comprehensive CLI commands for skill management:
```bash
# List all skills
nexus skills list
nexus skills list --tenant # Show tenant skills
nexus skills list --system # Show system skills
nexus skills list --tier agent # Filter by tier
# Create new skill from template
nexus skills create my-skill --description "My custom skill"
nexus skills create data-viz --description "Data visualization" --template data-analysis
nexus skills create analyzer --description "Code analyzer" --author Alice
# Fork existing skill
nexus skills fork analyze-code my-analyzer
nexus skills fork data-analysis custom-analysis --author Bob
# Publish skill to tenant library
nexus skills publish my-skill
nexus skills publish shared-skill --from-tier tenant --to-tier system
# Search skills by description
nexus skills search "data analysis"
nexus skills search "code" --tier tenant --limit 5
# Show detailed skill information
nexus skills info analyze-code
nexus skills info data-analysis
# Export skill to .zip package (vendor-neutral)
nexus skills export my-skill --output ./my-skill.zip
nexus skills export analyze-code --output ./export.zip --format claude
nexus skills export my-skill --output ./export.zip --no-deps # Exclude dependencies
# Validate skill format and size limits
nexus skills validate my-skill
nexus skills validate analyze-code --format claude
# Calculate skill size
nexus skills size my-skill
nexus skills size analyze-code --human
```
**Available Templates:**
- `basic` - Simple skill template
- `data-analysis` - Data processing and analysis
- `code-generation` - Code generation and modification
- `document-processing` - Document parsing and analysis
- `api-integration` - API integration and data fetching
**Export Formats:**
- `generic` - Vendor-neutral .zip format (no size limit)
- `claude` - Anthropic Claude format (8MB limit enforced)
- `openai` - OpenAI format (validation only, ready for future plugins)
**Note**: External API integrations (uploading to Claude API, OpenAI, etc.) will be implemented as plugins in v0.3.5+ to maintain vendor neutrality. The core CLI provides generic export functionality.
**SKILL.md Format:**
```markdown
---
name: analyze-code
description: Analyzes code quality and structure
version: 1.0.0
author: Your Name
requires:
- base-parser
- ast-analyzer
---
# Code Analysis Skill
This skill analyzes code for quality metrics...
## Usage
1. Parse the code files
2. Run static analysis
3. Generate report
```
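To show what parsing this format involves, here is a minimal sketch that splits a SKILL.md document into frontmatter metadata and body. It handles only the subset used above (scalar `key: value` pairs and `- item` lists) and is not the Nexus parser; `parse_skill_md` is a hypothetical name, and a real implementation would more likely rely on a full YAML library.

```python
def parse_skill_md(text):
    """Split a SKILL.md document into (metadata, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter line")
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        raise ValueError("frontmatter is missing its closing '---'") from None

    metadata, current_key = {}, None
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- "):
            # A list item belongs to the most recent key that had no inline value.
            if not isinstance(metadata.get(current_key), list):
                raise ValueError(f"list item without a list key: {stripped!r}")
            metadata[current_key].append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {stripped!r}")
        current_key = key.strip()
        value = value.strip()
        metadata[current_key] = value if value else []

    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body
```

Because the metadata sits entirely in the frontmatter, a registry can read just the lines up to the closing `---` during discovery and defer loading the body.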
**Features:**
- **Progressive Disclosure**: Load metadata during discovery, full content on-demand
- **Lazy Loading**: Skills cached only when accessed
- **Three-Tier Hierarchy**: Agent skills override tenant/system skills
- **Dependency Resolution**: Automatic DAG resolution with cycle detection
- **Skill Lifecycle**: Create, fork, and publish skills with lineage tracking
- **Template System**: 5 pre-built templates (basic, data-analysis, code-generation, document-processing, api-integration)
- **Vendor-Neutral Export**: Generic .zip format with Claude/OpenAI validation
- **Usage Analytics**: Track performance, success rates, dashboard metrics (NEW in v0.3.0)
- **Governance**: Approval workflows for team-wide skill publication (NEW in v0.3.0)
- **Audit Logging**: Complete compliance tracking and reporting (NEW in v0.3.0)
- **Skill Search**: Find skills by description with relevance scoring (NEW in v0.3.0)
- **Comprehensive Tests**: 156 passing tests (31%+ overall coverage, 65-91% skills module)
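Dependency resolution with cycle detection, as listed above, can be sketched as a depth-first traversal that emits dependencies before the skills that need them. This is an illustration, not the registry's code: the standalone function `resolve_dependencies` and the `requires` mapping (skill name to list of required skill names) are assumptions.

```python
def resolve_dependencies(skill, requires):
    """Return the skill and its transitive dependencies, dependencies first."""
    order, done, in_progress = [], set(), []

    def visit(name):
        if name in done:
            return
        if name in in_progress:
            # Reaching a skill that is still being visited means a cycle.
            cycle = in_progress[in_progress.index(name):] + [name]
            raise ValueError("dependency cycle: " + " -> ".join(cycle))
        in_progress.append(name)
        for dep in requires.get(name, []):
            visit(dep)
        in_progress.pop()
        done.add(name)
        order.append(name)

    visit(skill)
    return order
```

With `complex-skill` requiring `base-skill` and `helper-skill`, and `helper-skill` requiring `base-skill`, the result lists `base-skill` first and `complex-skill` last, so each skill can be loaded after everything it depends on.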
**Skill Tiers:**
- **Agent** (`/workspace/.nexus/skills/`) - Personal skills (highest priority)
- **Tenant** (`/shared/skills/`) - Team-shared skills
- **System** (`/system/skills/`) - Built-in skills (lowest priority)
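The tier priority above amounts to a layered lookup in which a higher-priority tier shadows a lower one for the same skill name. A minimal sketch, assuming each tier is represented as a dict of skill name to metadata (the names `TIER_PRIORITY` and `merge_tiers` are hypothetical):

```python
TIER_PRIORITY = ("agent", "tenant", "system")  # highest priority first


def merge_tiers(skills_by_tier):
    """Return skill name -> (tier, metadata), letting higher tiers win."""
    resolved = {}
    # Apply the lowest-priority tier first so later tiers overwrite it.
    for tier in reversed(TIER_PRIORITY):
        for name, metadata in skills_by_tier.get(tier, {}).items():
            resolved[name] = (tier, metadata)
    return resolved
```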
## Technology Stack
### Core
- **Language**: Python 3.11+
- **API Framework**: FastAPI
- **Database**: PostgreSQL / SQLite (configurable via environment variable)
- **Cache**: Redis (prod) / In-memory (dev)
- **Vector DB**: Qdrant
- **Object Storage**: S3-compatible, GCS, Azure Blob
### AI/ML
- **LLM Providers**: Anthropic Claude, OpenAI, Google Gemini
- **Embeddings**: text-embedding-3-large, voyage-ai
- **Parsing**: PyPDF2, pandas, openpyxl, Pillow
### Infrastructure
- **Orchestration**: Kubernetes (distributed mode)
- **Monitoring**: Prometheus + Grafana
- **Logging**: Structlog + Loki
- **Admin UI**: Simple HTML/JS (jobs, memories, files, operations)
## Performance Targets
| Metric | Target | Impact |
|--------|--------|--------|
| Write Throughput | 500-1000 MB/s | 10-50× vs direct backend |
| Read Latency | <10ms | 10-50× vs remote storage |
| Memory Search | <100ms | Vector search across memories |
| Storage Savings | 30-50% | CAS deduplication |
| Job Resumability | 100% | Survives all restarts |
| LLM Cache Hit Rate | 50-90% | Major cost savings |
| Prompt Versioning | Full lineage | Track optimization history |
| Training Data Dedup | 30-50% | CAS-based deduplication |
| Prompt Optimization | Multi-candidate | Test multiple strategies in parallel |
| Trace Storage | Full execution logs | Debug failures, analyze patterns |
## Configuration
### Local Mode
```python
import nexus
# Config via Python (useful for programmatic configuration)
nx = nexus.connect(config={
"mode": "local",
"data_dir": "./nexus-data",
"cache_size_mb": 100,
"enable_vector_search": True
})
# Or let it auto-discover from nexus.yaml
nx = nexus.connect()
```
### Self-Hosted Deployment
For organizations that want to run their own Nexus instance, create `config.yaml`:
```yaml
mode: server  # local or server
database:
  url: postgresql://user:pass@localhost/nexus
  # or for SQLite: sqlite:///./nexus.db
  # Can also use NEXUS_DATABASE_URL or POSTGRES_URL environment variable
cache:
  type: redis  # memory, redis
  url: redis://localhost:6379
vector_db:
  type: qdrant
  url: http://localhost:6333
backends:
  - type: s3
    bucket: my-company-files
    region: us-east-1
  - type: gdrive
    credentials_path: ./gdrive-creds.json
auth:
  jwt_secret: your-secret-key
  token_expiry_hours: 24
rate_limits:
  default: "100/minute"
  semantic_search: "10/minute"
  llm_read: "50/hour"
```
Run server:
```bash
nexus server --config config.yaml
```
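The `rate_limits` values in the configuration use a `count/period` notation. The sketch below shows one plausible way to interpret such a string and enforce it with a fixed-window counter. The names `parse_rate_limit` and `FixedWindowLimiter`, the set of accepted periods, and the fixed-window strategy are all assumptions; the server may enforce limits differently.

```python
import time

_PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_rate_limit(spec):
    """Turn a string such as '100/minute' into (count, window_seconds)."""
    count, _, period = spec.partition("/")
    if not count.isdigit() or period not in _PERIOD_SECONDS:
        raise ValueError(f"invalid rate limit: {spec!r}")
    return int(count), _PERIOD_SECONDS[period]


class FixedWindowLimiter:
    """Allow at most `limit` calls per window, resetting when the window ends."""

    def __init__(self, spec, clock=time.monotonic):
        self.limit, self.window = parse_rate_limit(spec)
        self._clock = clock
        self._window_start = None
        self._used = 0

    def allow(self):
        now = self._clock()
        if self._window_start is None or now - self._window_start >= self.window:
            self._window_start, self._used = now, 0  # start a fresh window
        if self._used >= self.limit:
            return False
        self._used += 1
        return True
```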
## Security
### Multi-Layer Security Model
1. **API Key Authentication**: Tenant and agent identification
2. **Row-Level Security (RLS)**: Database-level tenant isolation
3. **Type-Level Validation**: Fail-fast validation before database operations
4. **UNIX-Style Permissions**: Owner, group, and mode bits (v0.3.0)
5. **ACL Permissions**: Fine-grained access control lists (v0.3.0)
6. **ReBAC (Relationship-Based Access Control)**: Zanzibar-style authorization (v0.3.0)
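To illustrate the relationship-based layer, the sketch below checks a permission over `(subject, relation, object)` tuples by following `member-of` edges breadth-first, with a visited set guarding against membership cycles. It is a simplified model, not the Nexus Check API: the function `check` is hypothetical, and treating `member-of` as the only inherited relation is an assumption.

```python
from collections import deque


def check(tuples, subject, relation, obj):
    """True if subject holds relation on obj directly or via group membership."""
    groups = {}  # subject -> set of groups it is a member of
    for s, r, o in tuples:
        if r == "member-of":
            groups.setdefault(s, set()).add(o)
    grants = {(s, o) for s, r, o in tuples if r == relation}

    seen, queue = {subject}, deque([subject])
    while queue:
        current = queue.popleft()
        if (current, obj) in grants:
            return True
        for group in groups.get(current, ()):
            if group not in seen:  # skip already-visited groups (cycle guard)
                seen.add(group)
                queue.append(group)
    return False
```

For example, if `alice` is a member of `eng`, `eng` is a member of `staff`, and `staff` is `viewer-of` a file, then `check(tuples, "alice", "viewer-of", file)` succeeds through two membership hops.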
### Type-Level Validation (NEW in v0.1.0)
All domain types have validation methods that are called automatically before database operations. This provides:
- **Fail Fast**: Catch invalid data before expensive database operations
- **Clear Error Messages**: Actionable feedback for developers and API consumers
- **Data Integrity**: Prevent invalid data from entering the database
- **Consistent Validation**: Same rules across all code paths
```python
from nexus.core.metadata import FileMetadata
from nexus.core.exceptions import ValidationError
# Validation happens automatically on put()
# `store` is an existing metadata store instance
try:
    metadata = FileMetadata(
        path="/data/file.txt",  # Must start with /
        backend_name="local",
        physical_path="/storage/file.txt",
        size=1024,  # Must be >= 0
    )
    store.put(metadata)  # Validates before DB operation
except ValidationError as e:
    print(f"Validation failed: {e}")
    # Example: "size cannot be negative, got -1"
```
**Validation Rules:**
- Paths must start with `/` and not contain null bytes
- File sizes and ref counts must be non-negative
- Required fields (path, backend_name, physical_path, etc.) must not be empty
- Content hashes must be valid 64-character SHA-256 hex strings
- Metadata keys must be ≤ 255 characters
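The rules above can be expressed as a single `validate()` method that raises on the first violation. The sketch below is a stand-in written for illustration: `FileRecord` and the local `ValidationError` class are hypothetical and are not the `FileMetadata` and `ValidationError` types shipped in `nexus.core`.

```python
import re
from dataclasses import dataclass
from typing import Optional


class ValidationError(ValueError):
    """Local stand-in for the library's validation exception."""


_SHA256_HEX = re.compile(r"[0-9a-fA-F]{64}")


@dataclass
class FileRecord:
    path: str
    backend_name: str
    physical_path: str
    size: int
    content_hash: Optional[str] = None

    def validate(self):
        """Raise ValidationError on the first rule that fails."""
        for name in ("path", "backend_name", "physical_path"):
            if not getattr(self, name):
                raise ValidationError(f"{name} is required")
        if not self.path.startswith("/"):
            raise ValidationError(f"path must start with '/', got {self.path!r}")
        if "\x00" in self.path:
            raise ValidationError("path must not contain null bytes")
        if self.size < 0:
            raise ValidationError(f"size cannot be negative, got {self.size}")
        if self.content_hash is not None and not _SHA256_HEX.fullmatch(self.content_hash):
            raise ValidationError("content_hash must be a 64-character SHA-256 hex string")
```

Calling `validate()` at the top of every write path is what gives the fail-fast behavior: invalid records are rejected before any database round trip.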
### Example: Multi-Tenancy Isolation
```sql
-- RLS automatically filters queries by tenant
-- SET LOCAL only takes effect inside a transaction
BEGIN;
SET LOCAL app.current_tenant_id = '<tenant_uuid>';
-- All queries are filtered, even if application code omits a tenant condition
SELECT * FROM file_paths WHERE path = '/data';
-- Returns only rows for current tenant
COMMIT;
```
## Testing
```bash
# Run all tests
pytest
# Run with coverage
pytest --cov=nexus --cov-report=html
# Run specific test file
pytest tests/test_filesystem.py
# Run integration tests
pytest tests/integration/ -v
# Run performance tests
pytest tests/performance/ --benchmark-only
```
## Documentation
- [Plugin Development Guide](./docs/PLUGIN_DEVELOPMENT.md) - Create your own Nexus plugins
- [Plugin System Overview](./docs/PLUGIN_SYSTEM.md) - Plugin architecture and design
- [PostgreSQL Setup Guide](./docs/POSTGRESQL_SETUP.md) - Configure PostgreSQL for production
- [SQL Views for Work Detection](./docs/SQL_VIEWS_FOR_WORK_DETECTION.md) - Work queue patterns
- [API Reference](./docs/api/) - Detailed API documentation
- [Getting Started](./docs/getting-started/) - Quick start guides
- [Deployment Guide](./docs/deployment/) - Production deployment
## Contributing
We welcome contributions! Please see [CONTRIBUTING.md](./CONTRIBUTING.md) for details.
```bash
# Fork the repo and clone
git clone https://github.com/yourusername/nexus.git
cd nexus
# Create a feature branch
git checkout -b feature/your-feature
# Make changes and test
uv pip install -e ".[dev,test]"
pytest
# Format and lint
ruff format .
ruff check .
# Commit and push
git commit -am "Add your feature"
git push origin feature/your-feature
```
## License
Apache 2.0 License - see [LICENSE](./LICENSE) for details.
## Roadmap
### v0.1.0 - Local Mode Foundation (Current)
- [x] Core embedded filesystem (read/write/delete)
- [x] SQLite metadata store
- [x] Local filesystem backend
- [x] Basic file operations (list, glob, grep)
- [x] Virtual path routing
- [x] Directory operations (mkdir, rmdir, is_directory)
- [x] Basic CLI interface with Click and Rich
- [x] Metadata export/import (JSONL format)
- [x] SQL views for ready work detection
- [x] In-memory caching
- [x] Batch operations (avoid N+1 queries)
- [x] Type-level validation
### v0.2.0 - FUSE Mount & Content-Aware Operations (Current)
- [x] **FUSE filesystem mount** - Mount Nexus to local path (e.g., `/mnt/nexus`)
- [x] **Smart read mode** - Return parsed text for binary files (PDFs, Excel, etc.)
- [x] **Virtual file views** - Auto-generate `.txt` and `.md` views for binary files
- [x] **Content parser framework** - Extensible parser system for document types (MarkItDown)
- [x] **PDF parser** - Extract text and markdown from PDFs
- [x] **Excel/CSV parser** - Parse spreadsheets to structured data
- [x] **Content-aware file access** - Access parsed content via virtual views
- [x] **Document type detection** - Auto-detect MIME types and route to parsers
- [x] **Mount CLI commands** - `nexus mount`, `nexus unmount`
- [x] **Mount modes** - Binary, text, and smart modes
- [x] **.raw directory** - Access original binary files
- [x] **Background daemon mode** - Run mount in background with `--daemon`
- [x] **All FUSE operations** - read, write, create, delete, mkdir, rmdir, rename, truncate
- [x] **Unit tests** - Comprehensive test coverage for FUSE operations
- [x] **rclone-style CLI commands** - `sync`, `copy`, `move`, `tree`, `size` with progress bars
- [ ] **Background parsing** - Async content parsing on write
- [x] **FUSE performance optimizations** - Caching (TTL/LRU), cache invalidation, metrics
- [ ] **Image OCR parser** - Extract text from images (PNG, JPEG)
### v0.3.0 - File Permissions & Skills System
**Permissions (Complete):**
- [x] **UNIX-style file permissions** (owner, group, mode)
- [x] **Permission operations** (chmod, chown, chgrp)
- [x] **ACL (Access Control List)** support
- [x] **CLI commands** (getfacl, setfacl)
- [x] **Database schema** for permissions and ACL entries
- [x] **Comprehensive tests** (91 passing tests)
- [x] **ReBAC (Relationship-Based Access Control)** - Zanzibar-style authorization
- [x] **Relationship types** - member-of, owner-of, viewer-of, editor-of, parent-of
- [x] **Permission inheritance via relationships** - Team ownership, group membership
- [x] **Relationship graph queries** - Graph traversal with cycle detection
- [x] **Namespaced tuples** - (subject, relation, object) authorization model
- [x] **Check API** - Fast permission checks with 5-minute TTL caching
- [x] **Expand API** - Discover all subjects with specific permissions
- [x] **Relationship management** - Create, delete, query relationships via CLI
- [x] **Expiring tuples** - Temporary permissions with automatic cleanup
- [x] **Comprehensive ReBAC tests** (14 passing tests, 100% pass rate)
**Permissions (Remaining):**
- [ ] **Default permission policies** per namespace
- [ ] **Permission inheritance** for new files
- [ ] **Permission checking** in all file operations
- [ ] **Permission migration** for existing files
**Skills System (Core - Vendor Neutral):**
- [x] **SKILL.md parser** - Parse Anthropic-compatible SKILL.md with frontmatter
- [x] **Skill registry** - Progressive disclosure, lazy loading, three-tier hierarchy
- [x] **Skill discovery** - Scan `/workspace/.nexus/skills/`, `/shared/skills/`, `/system/skills/`
- [x] **Dependency resolution** - Automatic DAG resolution with cycle detection
- [x] **Skill export** - Export to generic formats (validate, pack, size check)
- [x] **Skill templates** - 5 pre-built templates (basic, data-analysis, code-generation, document-processing, api-integration)
- [x] **Skill lifecycle** - Create, fork, publish workflows with lineage tracking
- [x] **Comprehensive tests** - 156 passing tests (31%+ overall coverage, 65-91% skills module)
- [x] **Skill analytics** - Usage tracking, success rates, execution time, dashboard metrics
- [x] **Skill search** - Text-based search across skill descriptions with relevance scoring
- [x] **Skill governance** - Approval workflow for org-wide skills (submit, approve, reject)
- [x] **Audit trails** - Log all skill operations, compliance reporting, query by filters
- [ ] **Skill versioning** - CAS-backed version control with history tracking
- [ ] **Semantic skill search** - Vector-based search across skill descriptions
- [x] **CLI commands** - `list`, `create`, `fork`, `publish`, `search`, `info`, `export`, `validate`, `size` (see issue #88)
**Note**: External integrations (Claude API upload/download, OpenAI, etc.) will be implemented as **plugins** in v0.3.5+ to maintain vendor neutrality. Core Nexus provides generic skill export (`nexus skills export --format claude`), while `nexus-plugin-anthropic` handles API-specific operations.
### v0.3.5 - Plugin System & External Integrations
- [ ] **Plugin discovery** - Entry point-based plugin discovery
- [ ] **Plugin registry** - Register and manage installed plugins
- [ ] **Plugin CLI namespace** - `nexus <plugin-name> <command>` pattern
- [ ] **Plugin hooks** - Lifecycle hooks (before_write, after_read, etc.)
- [ ] **Plugin configuration** - Per-plugin config in `~/.nexus/plugins/<name>/`
- [ ] **Plugin manager** - `nexus plugins list/install/uninstall/info`
- [ ] **First-party plugins:**
- [ ] `nexus-plugin-anthropic` - Claude API integration (upload/download skills)
- [ ] `nexus-plugin-openai` - OpenAI API integration
- [ ] `nexus-plugin-skill-seekers` - Integration with Skill_Seekers scraper
### v0.4.0 - AI Integration
- [ ] LLM provider abstraction
- [ ] Anthropic Claude integration
- [ ] OpenAI integration
- [ ] Basic KV cache for prompts
- [ ] Semantic search (vector embeddings)
- [ ] LLM-powered document reading
### v0.5.0 - Agent Workspaces
- [ ] Agent workspace structure
- [ ] File-based configuration (.nexus/)
- [ ] Custom command system (markdown)
- [ ] Basic agent memory storage
- [ ] Memory consolidation
- [ ] Memory reflection phase (ACE-inspired: extract insights from execution trajectories)
- [ ] Strategy/playbook organization (ACE-inspired: organize memories as reusable strategies)
### v0.6.0 - Server Mode (Self-Hosted & Managed)
- [ ] FastAPI REST API
- [ ] API key authentication
- [ ] Multi-tenancy support
- [ ] PostgreSQL support
- [ ] Redis caching
- [ ] Docker deployment
- [ ] Batch/transaction APIs (atomic multi-operation updates)
- [ ] Optimistic locking for concurrent writes
- [ ] Auto-scaling configuration (for hosted deployments)
### v0.7.0 - Extended Features & Event System
- [ ] S3 backend support
- [ ] Google Drive backend
- [ ] Job system with checkpointing
- [ ] OAuth token management
- [ ] MCP server implementation
- [ ] Webhook/event system (file changes, memory updates, job events)
- [ ] Watch API for real-time updates (streaming changes to clients)
- [ ] Server-Sent Events (SSE) support for live monitoring
- [ ] Simple admin UI (jobs, memories, files, operation logs)
- [ ] Operation logs table (track storage operations for debugging)
### v0.8.0 - Advanced AI Features & Rich Query
- [ ] Advanced KV cache with context tracking
- [ ] Memory versioning and lineage
- [ ] Multi-agent memory sharing
- [ ] Enhanced semantic search
- [ ] Importance-based memory preservation (ACE-inspired: prevent brevity bias in consolidation)
- [ ] Context-aware memory retrieval (include execution context in search)
- [ ] Automated strategy extraction (LLM-powered extraction from successful trajectories)
- [ ] Rich memory query language (filter by metadata, importance, task type, date ranges, etc.)
- [ ] Memory query builder API (fluent interface for complex queries)
- [ ] Combined vector + metadata search (hybrid search)
### v0.9.0 - Production Readiness
- [ ] Monitoring and observability
- [ ] Performance optimization
- [ ] Comprehensive testing
- [ ] Security hardening
- [ ] Documentation completion
- [ ] Optional OpenTelemetry export (for framework integration)
### v0.9.5 - Prompt Engineering & Optimization
- [ ] Prompt version control with lineage tracking
- [ ] Training dataset storage with CAS deduplication
- [ ] Evaluation metrics time series (performance tracking)
- [ ] Frozen inference snapshots (immutable program state)
- [ ] Experiment tracking export (MLflow, W&B integration)
- [ ] Prompt diff viewer (compare versions)
- [ ] Regression detection alerts (performance drops)
- [ ] Multi-candidate pool management (concurrent prompt testing)
- [ ] Execution trace storage (detailed run logs for debugging)
- [ ] Per-example evaluation results (granular performance tracking)
- [ ] Optimization run grouping (experiment management)
- [ ] Multi-objective tradeoff analysis (accuracy vs latency vs cost)
### v0.10.0 - Production Infrastructure & Auto-Scaling
- [ ] Automatic infrastructure scaling
- [ ] Redis distributed locks (for large deployments)
- [ ] PostgreSQL replication (for high availability)
- [ ] Kubernetes deployment templates
- [ ] Multi-region load balancing
- [ ] Automatic migration from single-node to distributed
### v1.0.0 - Production Release
- [ ] Complete feature set
- [ ] Production-tested
- [ ] Comprehensive documentation
- [ ] Migration tools
- [ ] Enterprise support
## Support
- **Issues**: [GitHub Issues](https://github.com/nexi-lab/nexus/issues)
- **Discussions**: [GitHub Discussions](https://github.com/nexi-lab/nexus/discussions)
- **Email**: support@nexus.example.com
- **Slack**: [Join our community](https://nexus-community.slack.com)
---
Built with ❤️ by the Nexus team
Raw data
{
"_id": null,
"home_page": null,
"name": "nexus-ai-fs",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": "Nexus Team <team@nexus.example.com>",
"keywords": "agents, ai, content-addressable, distributed, filesystem, llm, storage, vector-search",
"author": null,
"author_email": "Nexus Team <team@nexus.example.com>",
"download_url": "https://files.pythonhosted.org/packages/35/f8/f941b3dc08dbad3aa52e67413d3b4c18e08abf1f60dfd4fd5191fa6cf1b4/nexus_ai_fs-0.2.5.tar.gz",
"platform": null,
"description": "# Nexus: AI-Native Distributed Filesystem\n\n[](https://github.com/nexi-lab/nexus/actions/workflows/test.yml)\n[](https://github.com/nexi-lab/nexus/actions/workflows/lint.yml)\n[](https://badge.fury.io/py/nexus-ai-fs)\n[](https://opensource.org/licenses/Apache-2.0)\n[](https://www.python.org/downloads/)\n\n**Version 0.1.0** | AI Agent Infrastructure Platform\n\nNexus is a complete AI agent infrastructure platform that combines distributed unified filesystem, self-evolving agent memory, intelligent document processing, and seamless deployment from local development to hosted production\u2014all from a single codebase.\n\n## Features\n\n### Foundation\n- **Distributed Unified Filesystem**: Multi-backend abstraction (S3, GDrive, SharePoint, LocalFS)\n- **Tiered Storage**: Hot/Warm/Cold tiers with automatic lineage tracking\n- **Content-Addressable Storage**: 30-50% storage savings via deduplication\n- **\"Everything as a File\" Paradigm**: Configuration, memory, jobs, and commands as files\n\n### Agent Intelligence\n- **Self-Evolving Memory**: Agent memory with automatic consolidation\n- **Memory Versioning**: Track knowledge evolution over time\n- **Multi-Agent Sharing**: Shared memory spaces within tenants\n- **Memory Analytics**: Effectiveness tracking and insights\n- **Prompt Version Control**: Track prompt evolution with lineage\n- **Training Data Management**: Version-controlled datasets with deduplication\n- **Prompt Optimization**: Multi-candidate testing, execution traces, tradeoff analysis\n- **Experiment Tracking**: Organize optimization runs, per-example results, regression detection\n\n### Content Processing\n- **Rich Format Parsing**: Extensible parsers (PDF, Excel, CSV, JSON, images)\n- **LLM KV Cache Management**: 50-90% cost savings on AI queries\n- **Semantic Chunking**: Better search via intelligent document segmentation\n- **MCP Integration**: Native Model Context Protocol server\n- **Document Type Detection**: Automatic routing to 
appropriate parsers\n\n### Operations\n- **Resumable Jobs**: Checkpointing system survives restarts\n- **OAuth Token Management**: Auto-refreshing credentials\n- **Backend Auto-Mount**: Automatic recognition and mounting\n- **Resource Management**: CPU throttling and rate limiting\n- **Work Queue Detection**: SQL views for efficient task scheduling and dependency resolution\n\n## Deployment Modes\n\nNexus supports two deployment modes from a single codebase:\n\n| Mode | Use Case | Setup Time | Scaling |\n|------|----------|------------|---------|\n| **Local** | Individual developers, CLI tools, prototyping | 60 seconds | Single machine (~10GB) |\n| **Hosted** | Teams and production (auto-scales) | Sign up | Automatic (GB to Petabytes) |\n\n**Note**: Hosted mode automatically scales infrastructure under the hood\u2014you don't choose between \"monolithic\" or \"distributed\". Nexus handles that for you based on your usage.\n\n### Quick Start: Local Mode\n\n```python\nimport nexus\n\n# Zero-deployment filesystem with AI features\n# Config auto-discovered from nexus.yaml or environment\nnx = nexus.connect()\n\nasync with nx:\n # Write and read files\n await nx.write(\"/workspace/data.txt\", b\"Hello World\")\n content = await nx.read(\"/workspace/data.txt\")\n\n # Semantic search across documents\n results = await nx.semantic_search(\n \"/docs/**/*.pdf\",\n query=\"authentication implementation\"\n )\n\n # LLM-powered document reading with KV cache\n answer = await nx.llm_read(\n \"/reports/q4.pdf\",\n prompt=\"Summarize key findings\",\n model=\"claude-sonnet-4\"\n )\n```\n\n**Config file (`nexus.yaml`):**\n```yaml\nmode: local\ndata_dir: ./nexus-data\ncache_size_mb: 100\nenable_vector_search: true\n```\n\n### Quick Start: Hosted Mode\n\n**Coming Soon!** Sign up for early access at [nexus.ai](https://nexus.ai)\n\n```python\nimport nexus\n\n# Connect to Nexus hosted instance\n# Infrastructure scales automatically based on your usage\nnx = nexus.connect(\n 
api_key=\"your-api-key\",\n endpoint=\"https://api.nexus.ai\"\n)\n\nasync with nx:\n # Same API as local mode!\n await nx.write(\"/workspace/data.txt\", b\"Hello World\")\n content = await nx.read(\"/workspace/data.txt\")\n```\n\n**For self-hosted deployments**, see the [S3-Compatible HTTP Server](#s3-compatible-http-server) section below for deployment instructions.\n\n## Storage Backends\n\nNexus supports multiple storage backends through a unified API. All backends use **Content-Addressable Storage (CAS)** for automatic deduplication.\n\n### Local Backend (Default)\n\nStore files on local filesystem:\n\n```python\nimport nexus\n\n# Auto-detected from config or uses default\nnx = nexus.connect()\n\n# Or explicitly configure\nnx = nexus.connect(config={\n \"backend\": \"local\",\n \"data_dir\": \"./nexus-data\"\n})\n```\n\n### Google Cloud Storage (GCS) Backend\n\nStore files in Google Cloud Storage with local metadata:\n\n```python\nimport nexus\n\n# Connect with GCS backend\nnx = nexus.connect(config={\n \"backend\": \"gcs\",\n \"gcs_bucket_name\": \"my-nexus-bucket\",\n \"gcs_project_id\": \"my-gcp-project\", # Optional\n \"gcs_credentials_path\": \"/path/to/credentials.json\", # Optional\n})\n```\n\n**Authentication Methods:**\n1. **Service Account Key**: Provide `gcs_credentials_path`\n2. 
**Application Default Credentials** (if not provided):\n - `GOOGLE_APPLICATION_CREDENTIALS` environment variable\n - `gcloud auth application-default login` credentials\n - GCE/Cloud Run service account (when running on GCP)\n\n**Using Config File (`nexus.yaml`):**\n```yaml\nbackend: gcs\ngcs_bucket_name: my-nexus-bucket\ngcs_project_id: my-gcp-project # Optional\n# gcs_credentials_path: /path/to/credentials.json # Optional\n```\n\n**Using Environment Variables:**\n```bash\nexport NEXUS_BACKEND=gcs\nexport NEXUS_GCS_BUCKET_NAME=my-nexus-bucket\nexport NEXUS_GCS_PROJECT_ID=my-gcp-project # Optional\nexport GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json # Optional\n```\n\n**CLI Usage with GCS:**\n```bash\n# Write file to GCS\nnexus write /workspace/data.txt \"Hello GCS!\" \\\n --backend=gcs \\\n --gcs-bucket=my-nexus-bucket\n\n# Or use config file (simpler!)\nnexus write /workspace/data.txt \"Hello GCS!\" --config=nexus.yaml\n```\n\n### Advanced: Direct Backend API\n\nFor advanced use cases, instantiate backends directly:\n\n```python\nfrom nexus import NexusFS, LocalBackend, GCSBackend\n\n# Local backend\nnx_local = NexusFS(\n backend=LocalBackend(\"/path/to/data\"),\n db_path=\"./metadata.db\"\n)\n\n# GCS backend\nnx_gcs = NexusFS(\n backend=GCSBackend(\n bucket_name=\"my-bucket\",\n project_id=\"my-project\",\n credentials_path=\"/path/to/creds.json\"\n ),\n db_path=\"./gcs-metadata.db\"\n)\n\n# Same API for both!\nnx_local.write(\"/file.txt\", b\"data\")\nnx_gcs.write(\"/file.txt\", b\"data\")\n```\n\n### Backend Comparison\n\n| Feature | Local Backend | GCS Backend |\n|---------|--------------|-------------|\n| **Content Storage** | Local filesystem | Google Cloud Storage |\n| **Metadata Storage** | Local SQLite | Local SQLite |\n| **Deduplication** | \u2705 CAS (30-50% savings) | \u2705 CAS (30-50% savings) |\n| **Multi-machine Access** | \u274c Single machine | \u2705 Shared across machines |\n| **Durability** | Single disk | 99.999999999% (11 nines) 
|\n| **Latency** | <1ms (local) | 10-50ms (network) |\n| **Cost** | Free (local disk) | GCS storage pricing |\n| **Use Case** | Development, single machine | Teams, production, backup |\n\n### Coming Soon\n\n- **Amazon S3 Backend** (v0.7.0)\n- **Azure Blob Storage** (v0.7.0)\n- **Google Drive** (v0.7.0)\n- **SharePoint** (v0.7.0)\n\n## Installation\n\n### Using pip (Recommended)\n\n```bash\n# Install core Nexus\npip install nexus-ai-fs\n\n# Install with FUSE support\npip install nexus-ai-fs[fuse]\n\n# Install with PostgreSQL support\npip install nexus-ai-fs[postgres]\n\n# Install everything\npip install nexus-ai-fs[all] # All features (FUSE + PostgreSQL + future plugins)\n\n# Verify installation\nnexus --version\n```\n\n### Installing First-Party Plugins (Local Development)\n\nFirst-party plugins are in development and not yet published to PyPI. Install from source:\n\n```bash\n# Clone repository\ngit clone https://github.com/nexi-lab/nexus.git\ncd nexus\n\n# Install Nexus\npip install -e .\n\n# Install plugins from local source\npip install -e ./nexus-plugin-anthropic # Claude Skills API\npip install -e ./nexus-plugin-skill-seekers # Doc scraper\n\n# Verify plugins\nnexus plugins list\n```\n\nSee [PLUGIN_INSTALLATION.md](./PLUGIN_INSTALLATION.md) for detailed instructions.\n\n### From Source (Development)\n\n```bash\n# Clone the repository\ngit clone https://github.com/nexi-lab/nexus.git\ncd nexus\n\n# Install using uv (recommended for faster installs)\ncurl -LsSf https://astral.sh/uv/install.sh | sh\nuv venv\nsource .venv/bin/activate # On Windows: .venv\\Scripts\\activate\nuv pip install -e \".[dev]\"\n\n# Or using pip\npip install -e \".[dev]\"\n```\n\n### Development Setup\n\n```bash\n# Install development dependencies\nuv pip install -e \".[dev,test]\"\n\n# Run tests\npytest\n\n# Run type checking\nmypy src/nexus\n\n# Format code\nruff format .\n\n# Lint\nruff check .\n```\n\n## CLI Usage\n\nNexus provides a beautiful command-line interface for all file 
operations. After installation, the `nexus` command will be available.\n\n### Quick Start\n\n```bash\n# Initialize a new workspace\nnexus init ./my-workspace\n\n# Write a file\nnexus write /workspace/hello.txt \"Hello, Nexus!\"\n\n# Read a file\nnexus cat /workspace/hello.txt\n\n# List files\nnexus ls /workspace\nnexus ls /workspace --recursive\nnexus ls /workspace --long # Detailed view with metadata\n```\n\n### Available Commands\n\n#### File Operations\n\n```bash\n# Write content to a file\nnexus write /path/to/file.txt \"content\"\necho \"content\" | nexus write /path/to/file.txt --input -\n\n# Display file contents (with syntax highlighting)\nnexus cat /workspace/code.py\n\n# Copy files\nnexus cp /source.txt /dest.txt\n\n# Delete files\nnexus rm /workspace/old-file.txt\nnexus rm /workspace/old-file.txt --force # Skip confirmation\n\n# Show file information\nnexus info /workspace/data.txt\n```\n\n#### Directory Operations\n\n```bash\n# Create directory\nnexus mkdir /workspace/data\nnexus mkdir /workspace/deep/nested/dir --parents\n\n# Remove directory\nnexus rmdir /workspace/data\nnexus rmdir /workspace/data --recursive --force\n```\n\n#### File Discovery\n\n```bash\n# List files\nnexus ls /workspace\nnexus ls /workspace --recursive\nnexus ls /workspace --long # Show size, modified time, etag\n\n# Find files by pattern (glob)\nnexus glob \"**/*.py\" # All Python files recursively\nnexus glob \"*.txt\" --path /workspace # Text files in workspace\nnexus glob \"test_*.py\" # Test files\n\n# Search file contents (grep)\nnexus grep \"TODO\" # Find all TODO comments\nnexus grep \"def \\w+\" --file-pattern \"**/*.py\" # Find function definitions\nnexus grep \"error\" --ignore-case # Case-insensitive search\nnexus grep \"TODO\" --max-results 50 # Limit results\n\n# Search modes (v0.2.0+)\nnexus grep \"revenue\" --file-pattern \"**/*.pdf\" # Auto mode: tries parsed first\nnexus grep \"revenue\" --file-pattern \"**/*.pdf\" --search-mode=parsed # Only parsed 
content\nnexus grep \"TODO\" --search-mode=raw # Only raw text (skip parsing)\n\n# Result shows source type\n# Match: TODO (parsed) \u2190 from parsed PDF\n# Match: TODO (raw) \u2190 from source code\n```\n\n#### File Permissions (v0.3.0)\n\n```bash\n# Change file permissions\nnexus chmod 755 /workspace/script.sh\nnexus chmod rw-r--r-- /workspace/data.txt\n\n# Change file owner and group\nnexus chown alice /workspace/file.txt\nnexus chgrp developers /workspace/code/\n\n# View ACL entries\nnexus getfacl /workspace/file.txt\n\n# Manage ACL entries\nnexus setfacl user:alice:rw- /workspace/file.txt\nnexus setfacl group:developers:r-x /workspace/code/\nnexus setfacl deny:user:bob /workspace/secret.txt\nnexus setfacl user:alice:rwx /workspace/file.txt --remove\n```\n\n**Supported Formats:**\n- **Octal**: `755`, `0o644`, `0755`\n- **Symbolic**: `rwxr-xr-x`, `rw-r--r--`\n- **ACL Entries**: `user:<name>:rwx`, `group:<name>:r-x`, `deny:user:<name>`\n\n#### ReBAC - Relationship-Based Access Control (v0.3.0)\n\nNexus implements Zanzibar-style relationship-based authorization for team-based permissions, hierarchical access, and dynamic permission inheritance.\n\n```bash\n# Create relationship tuples\nnexus rebac create agent alice member-of group eng-team\nnexus rebac create group eng-team owner-of file project-docs\nnexus rebac create file folder-parent parent-of file folder-child\n\n# Check permissions (with graph traversal)\nnexus rebac check agent alice member-of group eng-team # Direct check\nnexus rebac check agent alice owner-of file project-docs # Inherited via group\n\n# Find all subjects with a permission\nnexus rebac expand owner-of file project-docs # Returns: alice (via eng-team)\nnexus rebac expand member-of group eng-team # Returns: alice, bob, ...\n\n# Delete relationships\nnexus rebac delete <tuple-id>\n\n# Create temporary access (expires automatically)\nnexus rebac create agent alice viewer-of file temp-report \\\n --expires 
\"2025-12-31T23:59:59\"\n```\n\n**ReBAC Features:**\n- **Relationship Types**: `member-of`, `owner-of`, `viewer-of`, `editor-of`, `parent-of`\n- **Graph Traversal**: Recursive permission checking through relationship chains\n- **Permission Inheritance**: Team ownership, hierarchical folders, group membership\n- **Caching**: 5-minute TTL with automatic invalidation on changes\n- **Expiring Access**: Temporary permissions with automatic cleanup\n- **Cycle Detection**: Prevents infinite loops in relationship graphs\n\n**Example Use Cases:**\n```bash\n# Team-based file access\nnexus rebac create agent alice member-of group engineering\nnexus rebac create group engineering owner-of file /projects/backend\n# alice now has owner permission on /projects/backend\n\n# Hierarchical folder permissions\nnexus rebac create agent bob owner-of file /workspace/parent-folder\nnexus rebac create file /workspace/parent-folder parent-of file /workspace/parent-folder/child\n# bob automatically has owner permission on child folder\n\n# Temporary collaborator access\nnexus rebac create agent charlie viewer-of file /reports/q4.pdf \\\n --expires \"2025-01-31T23:59:59\"\n# charlie's access expires automatically on Jan 31, 2025\n```\n\n#### Work Queue Operations\n\n```bash\n# Query work items by status\nnexus work ready --limit 10 # Get ready work items (high priority first)\nnexus work pending # Get pending work items\nnexus work blocked # Get blocked work items (with dependency info)\nnexus work in-progress # Get currently processing items\n\n# View aggregate statistics\nnexus work status # Show counts for all work queues\n\n# Output as JSON (for scripting)\nnexus work ready --json\nnexus work status --json\n```\n\n**Note**: Work items are files with special metadata (status, priority, depends_on, worker_id). 
See `docs/SQL_VIEWS_FOR_WORK_DETECTION.md` for details on setting up work queues.\n\n### Examples\n\n**Initialize and populate a workspace:**\n\n```bash\n# Create workspace\nnexus init ./my-project\n\n# Create structure\nnexus mkdir /workspace/src --data-dir ./my-project/nexus-data\nnexus mkdir /workspace/tests --data-dir ./my-project/nexus-data\n\n# Add files\necho \"print('Hello World')\" | nexus write /workspace/src/main.py --input - \\\n --data-dir ./my-project/nexus-data\n\n# List everything\nnexus ls / --recursive --long --data-dir ./my-project/nexus-data\n```\n\n**Find and analyze code:**\n\n```bash\n# Find all Python files\nnexus glob \"**/*.py\"\n\n# Search for TODO comments\nnexus grep \"TODO|FIXME\" --file-pattern \"**/*.py\"\n\n# Find all test files\nnexus glob \"**/test_*.py\"\n\n# Search for function definitions\nnexus grep \"^def \\w+\\(\" --file-pattern \"**/*.py\"\n```\n\n**Work with data:**\n\n```bash\n# Write JSON data\necho '{\"name\": \"test\", \"value\": 42}' | nexus write /data/config.json --input -\n\n# Display with syntax highlighting\nnexus cat /data/config.json\n\n# Get file information\nnexus info /data/config.json\n```\n\n### Global Options\n\nAll commands support these global options:\n\n```bash\n# Use custom config file\nnexus ls /workspace --config /path/to/config.yaml\n\n# Override data directory\nnexus ls /workspace --data-dir /path/to/nexus-data\n\n# Combine both (config takes precedence)\nnexus ls /workspace --config ./my-config.yaml --data-dir ./data\n```\n\n### Plugin Management\n\nNexus has a modular plugin system for external integrations:\n\n```bash\n# List installed plugins\nnexus plugins list\n\n# Get detailed plugin information\nnexus plugins info anthropic\nnexus plugins info skill-seekers\n\n# Install a plugin\nnexus plugins install anthropic\nnexus plugins install skill-seekers\n\n# Enable/disable plugins\nnexus plugins enable anthropic\nnexus plugins disable anthropic\n\n# Uninstall a plugin\nnexus plugins uninstall 
skill-seekers\n```\n\n**First-party plugins (local development only - not yet on PyPI):**\n- **anthropic** - Claude Skills API integration (upload/download/manage skills)\n- **skill-seekers** - Generate skills from documentation websites\n\n**Installation:**\n```bash\n# Install from local source\npip install -e ./nexus-plugin-anthropic\npip install -e ./nexus-plugin-skill-seekers\n```\n\n**Using plugin commands:**\n```bash\n# Anthropic plugin commands\nnexus anthropic upload-skill my-skill\nnexus anthropic list-skills\nnexus anthropic import-github canvas-design\n\n# Skill Seekers plugin commands\nnexus skill-seekers generate https://react.dev/ --name react-basics\nnexus skill-seekers import /path/to/SKILL.md\nnexus skill-seekers list\n```\n\nSee detailed documentation:\n- [Plugin Installation Guide](./PLUGIN_INSTALLATION.md) - **Start here for setup**\n- [nexus-plugin-anthropic](./nexus-plugin-anthropic/README.md) - Anthropic plugin docs\n- [nexus-plugin-skill-seekers](./nexus-plugin-skill-seekers/README.md) - Skill Seekers docs\n\n**Try plugin examples:**\n```bash\n# CLI demo - plugin management commands\n./examples/plugin_cli_demo.sh\n\n# SDK demo - programmatic plugin usage\npython examples/plugin_sdk_demo.py\n```\n\n### Help\n\nGet help for any command:\n\n```bash\nnexus --help # Show all commands\nnexus ls --help # Show help for ls command\nnexus grep --help # Show help for grep command\nnexus plugins --help # Show plugin management commands\n```\n\n## Remote Nexus Server\n\nNexus includes a JSON-RPC server that exposes the full NexusFileSystem interface over HTTP, enabling remote filesystem access and FUSE mounts to remote servers.\n\n### Quick Start\n\n#### Method 1: Using the Startup Script (Recommended)\n\n```bash\n# Navigate to nexus directory\ncd /path/to/nexus\n\n# Start with defaults (host: 0.0.0.0, port: 8080, no auth)\n./start-server.sh\n\n# Or with custom options\n./start-server.sh --host localhost --port 8080 --api-key mysecret\n```\n\n#### Method 
2: Direct Command\n\n```bash\n# Start the server (optional API key authentication)\nnexus serve --host 0.0.0.0 --port 8080 --api-key mysecret\n\n# Use remote filesystem from Python\nfrom nexus import RemoteNexusFS\n\nnx = RemoteNexusFS(\n server_url=\"http://localhost:8080\",\n api_key=\"mysecret\" # Optional\n)\n\n# Same API as local NexusFS!\nnx.write(\"/workspace/hello.txt\", b\"Hello Remote!\")\ncontent = nx.read(\"/workspace/hello.txt\")\nfiles = nx.list(\"/workspace\", recursive=True)\n```\n\n### Features\n\n- **Full NFS Interface**: All filesystem operations exposed over RPC (read, write, list, glob, grep, mkdir, etc.)\n- **JSON-RPC 2.0 Protocol**: Standard RPC protocol with proper error handling\n- **API Key Authentication**: Optional Bearer token authentication for security\n- **Backend Agnostic**: Works with local and GCS backends\n- **FUSE Compatible**: Mount remote Nexus servers as local filesystems\n\n### Remote Client Usage\n\n```python\nfrom nexus import RemoteNexusFS\n\n# Connect to remote server\nnx = RemoteNexusFS(\n server_url=\"http://your-server:8080\",\n api_key=\"your-api-key\" # Optional\n)\n\n# All standard operations work\nnx.write(\"/workspace/data.txt\", b\"content\")\ncontent = nx.read(\"/workspace/data.txt\")\nfiles = nx.list(\"/workspace\", recursive=True)\nresults = nx.glob(\"**/*.py\")\nmatches = nx.grep(\"TODO\", file_pattern=\"*.py\")\n```\n\n### Server Options\n\n```bash\n# Start with custom host/port\nnexus serve --host 0.0.0.0 --port 8080\n\n# Start with API key authentication\nnexus serve --api-key mysecret\n\n# Start with GCS backend\nnexus serve --backend=gcs --gcs-bucket=my-bucket --api-key mysecret\n\n# Custom data directory\nnexus serve --data-dir /path/to/data\n```\n\n### Testing the Server\n\nOnce the server is running, verify it's working:\n\n```bash\n# Health check\ncurl http://localhost:8080/health\n# Expected: {\"status\": \"healthy\", \"service\": \"nexus-rpc\"}\n\n# Check available methods\ncurl 
http://localhost:8080/api/nfs/status\n# Expected: {\"status\": \"running\", \"service\": \"nexus-rpc\", \"version\": \"1.0\", \"methods\": [...]}\n\n# List files (JSON-RPC)\ncurl -X POST http://localhost:8080/api/nfs/list \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"jsonrpc\": \"2.0\",\n \"method\": \"list\",\n \"params\": {\"path\": \"/\", \"recursive\": false, \"details\": true},\n \"id\": 1\n }'\n\n# With API key\ncurl -X POST http://localhost:8080/api/nfs/list \\\n -H \"Content-Type: application/json\" \\\n -H \"Authorization: Bearer mysecretkey\" \\\n -d '{\"jsonrpc\": \"2.0\", \"method\": \"list\", \"params\": {\"path\": \"/\"}, \"id\": 1}'\n```\n\n### Troubleshooting\n\n**Port Already in Use:**\n```bash\n# Find and kill process using port 8080\nlsof -ti:8080 | xargs kill -9\n\n# Or use a different port\nnexus serve --port 8081\n```\n\n**Module Not Found:**\n```bash\n# Activate virtual environment and install\nsource .venv/bin/activate\npip install -e .\n```\n\n**Permission Denied:**\n```bash\n# Use a directory you have write access to\nnexus serve --data-dir ~/nexus-data\n```\n\n### Deploying Nexus Server\n\n#### Google Cloud Platform (Recommended)\n\nDeploy to GCP with a single command using the automated deployment script:\n\n```bash\n# Quick start\n./deploy-gcp.sh --project-id YOUR-PROJECT-ID --api-key mysecret\n\n# With GCS backend\n./deploy-gcp.sh \\\n --project-id YOUR-PROJECT-ID \\\n --gcs-bucket your-nexus-bucket \\\n --api-key mysecret \\\n --machine-type e2-standard-2\n```\n\n**Features:**\n- \u2705 Automated VM provisioning (Ubuntu 22.04)\n- \u2705 Systemd service with auto-restart\n- \u2705 Firewall configuration\n- \u2705 GCS backend support\n- \u2705 Production-ready setup\n\n**See [GCP Deployment Guide](docs/deployment/GCP_DEPLOYMENT.md) for complete instructions.**\n\n#### Docker Deployment\n\nDeploy using Docker for consistent environments and easy management:\n\n```bash\n# Quick start with Docker Compose\ncp .env.docker.example 
.env\n# Edit .env with your configuration\ndocker-compose up -d\n\n# Or run directly\ndocker build -t nexus-server:latest .\ndocker run -d \\\n --name nexus-server \\\n --restart unless-stopped \\\n -p 8080:8080 \\\n -v nexus-data:/app/data \\\n -e NEXUS_API_KEY=\"your-api-key\" \\\n nexus-server:latest\n\n# Deploy to GCP with Docker (automated)\n./deploy-gcp-docker.sh \\\n --project-id your-project-id \\\n --api-key mysecret \\\n --build-local\n```\n\n**Features:**\n- \u2705 Multi-stage build for optimized image size (~300MB)\n- \u2705 Non-root user for security\n- \u2705 Health checks and auto-restart\n- \u2705 GCS backend support\n- \u2705 Docker Compose for easy orchestration\n\n**See [Docker Deployment Guide](docs/deployment/DOCKER_DEPLOYMENT.md) for complete instructions.**\n\n**Deployment Features:**\n- **Persistent Metadata**: SQLite database stored on VM disk at `/var/lib/nexus/`\n- **Content Storage**: All file content stored in configured backend (GCS, local, etc.)\n- **Content Deduplication**: CAS-based storage with 30-50% savings\n- **Full NFS API**: All operations available remotely\n\n## FUSE Mount: Use Standard Unix Tools (v0.2.0)\n\nMount Nexus to a local path and use **any standard Unix tool** seamlessly - `ls`, `cat`, `grep`, `vim`, and more!\n\n### Installation\n\nFirst, install FUSE support:\n\n```bash\n# Install Nexus with FUSE support\npip install nexus-ai-fs[fuse]\n\n# Platform-specific FUSE library:\n# macOS: Install macFUSE from https://osxfuse.github.io/\n# Linux: sudo apt-get install fuse3 # or equivalent for your distro\n```\n\n### Quick Start\n\n```bash\n# Mount Nexus to local path (smart mode by default)\nnexus mount /mnt/nexus\n\n# Now use ANY standard Unix tools!\nls -la /mnt/nexus/workspace/\ncat /mnt/nexus/workspace/notes.txt\ngrep -r \"TODO\" /mnt/nexus/workspace/\nfind /mnt/nexus -name \"*.py\"\nvim /mnt/nexus/workspace/code.py\ngit clone /some/repo /mnt/nexus/repos/myproject\n\n# Unmount when done\nnexus unmount 
/mnt/nexus\n```\n\n### Quick Start Examples\n\n**Example 1: Default (Explicit Views) - Best for Mixed Workflows**\n\n```bash\n# Mount normally\nnexus mount /mnt/nexus\n\n# Binary tools work directly\nevince /mnt/nexus/docs/report.pdf # PDF viewer works \u2713\n\n# Add .txt for text operations\ncat /mnt/nexus/docs/report.pdf.txt # Read as text\ngrep \"results\" /mnt/nexus/docs/*.pdf.txt\n\n# Virtual views auto-generated\nls /mnt/nexus/docs/\n# \u2192 report.pdf\n# \u2192 report.pdf.txt (virtual)\n# \u2192 report.pdf.md (virtual)\n```\n\n**Example 2: Auto-Parse - Best for Search-Heavy Workflows**\n\n```bash\n# Mount with auto-parse\nnexus mount /mnt/nexus --auto-parse\n\n# grep works directly on PDFs!\ngrep \"results\" /mnt/nexus/docs/*.pdf # No .txt needed! \u2713\ncat /mnt/nexus/docs/report.pdf # Returns text \u2713\n\n# Search across everything\ngrep -r \"TODO\" /mnt/nexus/workspace/ # Searches PDFs, Excel, etc.\n\n# Binary via .raw/ when needed\nevince /mnt/nexus/.raw/docs/report.pdf # For PDF viewer\n```\n\n**Example 3: Real-World Script**\n\n```bash\n#!/bin/bash\n# Find all PDFs mentioning \"invoice\"\n\n# Mount in background - command returns immediately!\nnexus mount /mnt/nexus --auto-parse --daemon\n# (No blocking - script continues immediately)\n\n# Mount is ready - grep works on PDFs!\ngrep -l \"invoice\" /mnt/nexus/documents/*.pdf\n\n# Process results\nfor pdf in $(grep -l \"invoice\" /mnt/nexus/documents/*.pdf); do\n echo \"Found in: $pdf\"\n grep -n \"invoice\" \"$pdf\" | head -5\ndone\n\n# Clean up\nnexus unmount /mnt/nexus\n```\n\n**Remote server example:**\n\n```bash\n#!/bin/bash\n# Search PDFs on remote Nexus server\n\n# Mount remote server in background\nnexus mount /mnt/nexus \\\n --remote-url http://nexus-server:8080 \\\n --auto-parse \\\n --daemon\n\n# Command returns immediately - daemon process runs in background\n# You can now use standard Unix tools on remote filesystem!\n\n# Search across remote PDFs\ngrep -r \"TODO\" /mnt/nexus/workspace/ 
| head -20\n\n# Find large files\nfind /mnt/nexus -type f -size +10M\n\n# Clean up when done\nnexus unmount /mnt/nexus\n```\n\n### File Access: Two Modes\n\nNexus supports **two ways** to access files - choose what fits your workflow:\n\n#### 1. Explicit Views (Default) - Best for Compatibility\n\nBinary files return binary, use `.txt`/`.md` suffixes for parsed content:\n\n```bash\nnexus mount /mnt/nexus\n\n# Binary files work with native tools\nevince /mnt/nexus/docs/report.pdf # PDF viewer gets binary \u2713\nlibreoffice /mnt/nexus/data/sheet.xlsx # Excel app gets binary \u2713\n\n# Add .txt to search/read as text\ncat /mnt/nexus/docs/report.pdf.txt # Returns parsed text\ngrep \"pattern\" /mnt/nexus/docs/*.pdf.txt\n\n# Virtual views appear automatically\nls /mnt/nexus/docs/\n# \u2192 report.pdf\n# \u2192 report.pdf.txt (virtual view)\n# \u2192 report.pdf.md (virtual view)\n```\n\n**When to use:** You want both binary tools AND text search to work\n\n#### 2. Auto-Parse Mode - Best for Search/Grep\n\nBinary files return parsed text directly, use `.raw/` for binary:\n\n```bash\nnexus mount /mnt/nexus --auto-parse\n\n# Binary files return text directly - perfect for grep!\ncat /mnt/nexus/docs/report.pdf # Returns parsed text \u2713\ngrep \"pattern\" /mnt/nexus/docs/*.pdf # Works directly! 
\u2713\nless /mnt/nexus/docs/report.pdf # Page through text \u2713\n\n# Access binary via .raw/ when needed\nevince /mnt/nexus/.raw/docs/report.pdf # PDF viewer gets binary\n\n# No .txt/.md suffixes - files return text by default\nls /mnt/nexus/docs/\n# \u2192 report.pdf (returns text when read)\n```\n\n**When to use:** Text search is your primary use case, binary tools are secondary\n\n### Mount Modes (Content Parsing)\n\nControl **what** gets parsed:\n\n```bash\n# Smart mode (default) - Auto-detect file types\nnexus mount /mnt/nexus --mode=smart\n# \u2705 PDFs, Excel, Word \u2192 parsed\n# \u2705 .py, .txt, .md \u2192 pass-through\n# \u2705 Best for mixed content\n\n# Text mode - Parse everything aggressively\nnexus mount /mnt/nexus --mode=text\n# \u2705 All files parsed to text\n# \u26a0\ufe0f Slower (always parses)\n\n# Binary mode - No parsing at all\nnexus mount /mnt/nexus --mode=binary\n# \u2705 All files return binary\n# \u274c grep won't work on PDFs\n```\n\n### Comparison Table\n\n| Feature | Explicit Views (default) | Auto-Parse Mode (`--auto-parse`) |\n|---------|-------------------------|-----------------------------------|\n| **PDF viewers work** | \u2705 `evince file.pdf` | \u26a0\ufe0f `evince .raw/file.pdf` |\n| **grep on PDFs** | \u26a0\ufe0f `grep *.pdf.txt` | \u2705 `grep *.pdf` |\n| **Excel apps work** | \u2705 `libreoffice file.xlsx` | \u26a0\ufe0f `libreoffice .raw/file.xlsx` |\n| **Best for** | Binary tools + search | Text search primary use case |\n| **Virtual views** | `.txt`, `.md` suffixes | No suffixes needed |\n| **Binary access** | Direct (`file.pdf`) | Via `.raw/` directory |\n\n### Background (Daemon) Mode\n\nRun the mount in the background and return to your shell immediately:\n\n```bash\n# Mount in background - command returns immediately\nnexus mount /mnt/nexus --daemon\n# \u2713 Mounted Nexus to /mnt/nexus\n#\n# To unmount:\n# nexus unmount /mnt/nexus\n#\n# (Shell prompt returns immediately, mount runs in background)\n\n# Mount 
is active - you can use it immediately\nls /mnt/nexus\ncat /mnt/nexus/workspace/file.txt\n\n# Check daemon status\nps aux | grep \"nexus mount\" | grep -v grep\n# jinjingzhou 43097 ... nexus mount /mnt/nexus --daemon\n\n# Later, unmount when done\nnexus unmount /mnt/nexus\n```\n\n**How it works:**\n- Command returns to shell immediately (using double-fork technique)\n- Background daemon process keeps mount active\n- Daemon survives terminal close and persists until unmount\n- Safe to close your terminal - mount stays active\n\n**Local Mount:**\n```bash\n# Mount local Nexus data in background\nnexus mount /mnt/nexus --daemon\n```\n\n**Remote Mount:**\n```bash\n# Mount remote Nexus server in background\nnexus mount /mnt/nexus --remote-url http://your-server:8080 --daemon\n\n# With API key authentication\nnexus mount /mnt/nexus \\\n --remote-url http://your-server:8080 \\\n --api-key your-secret-key \\\n --daemon\n```\n\n### Performance & Caching (v0.2.0)\n\nFUSE mounts include automatic caching for improved performance. 
Caching is **enabled by default** with sensible defaults - no configuration needed for most users.\n\n**Default Performance:**\n- \u2705 Attribute caching (1024 entries, 60s TTL) - Makes `ls` and `stat` operations faster\n- \u2705 Content caching (100 files) - Speeds up repeated file reads\n- \u2705 Parsed content caching (50 files) - Accelerates PDF/Excel text extraction\n- \u2705 Automatic cache invalidation on writes/deletes - Always consistent\n\n**Advanced: Custom Cache Configuration**\n\nFor power users with specific performance requirements:\n\n```python\nfrom nexus import connect\nfrom nexus.fuse import mount_nexus\n\nnx = connect(config={\"data_dir\": \"./nexus-data\"})\n\n# Custom cache configuration\ncache_config = {\n \"attr_cache_size\": 2048, # Double the attribute cache (default: 1024)\n \"attr_cache_ttl\": 120, # Cache attributes for 2 minutes (default: 60s)\n \"content_cache_size\": 200, # Cache 200 files (default: 100)\n \"parsed_cache_size\": 100, # Cache 100 parsed files (default: 50)\n \"enable_metrics\": True # Track cache hit/miss rates (default: False)\n}\n\nfuse = mount_nexus(\n nx,\n \"/mnt/nexus\",\n mode=\"smart\",\n cache_config=cache_config,\n foreground=False\n)\n\n# View cache performance (if metrics enabled)\n# Note: Access via fuse.fuse.operations.cache\n```\n\n**Cache Configuration Options:**\n\n| Option | Default | Description |\n|--------|---------|-------------|\n| `attr_cache_size` | 1024 | Max number of cached file attribute entries |\n| `attr_cache_ttl` | 60 | Time-to-live for attributes in seconds |\n| `content_cache_size` | 100 | Max number of cached file contents |\n| `parsed_cache_size` | 50 | Max number of cached parsed contents (PDFs, etc.) 
|\n| `enable_metrics` | False | Enable cache hit/miss tracking |\n\n**When to Tune Cache Settings:**\n\n- **Large directory listings**: Increase `attr_cache_size` to 2048+ and `attr_cache_ttl` to 120+\n- **Many small files**: Increase `content_cache_size` to 500+\n- **Heavy PDF/Excel use**: Increase `parsed_cache_size` to 200+\n- **Performance analysis**: Set `enable_metrics` to `True` to measure cache effectiveness\n- **Memory-constrained**: Decrease all cache sizes (e.g., `attr_cache_size` 512, `content_cache_size` 50, `parsed_cache_size` 25)\n\n**Notes:**\n- Caches are **thread-safe** - safe for concurrent access\n- Caches are **automatically invalidated** on file writes, deletes, and renames\n- Default settings work well for most use cases - tune only if needed\n\n### Troubleshooting FUSE Mounts\n\n#### Check Mount Status\n\n```bash\n# Check if daemon process is running\nps aux | grep \"nexus mount\" | grep -v grep\n\n# Check mount points\nmount | grep nexus\n\n# List files in mount point (should show files, not empty)\nls -la /mnt/nexus/\n```\n\n#### Common Issues\n\n**Mount appears empty or shows \"Transport endpoint is not connected\":**\n```bash\n# Unmount the stale mount point\nnexus unmount /mnt/nexus\n\n# Or force unmount (macOS)\numount -f /mnt/nexus\n\n# Or unmount via fusermount (Linux)\nfusermount -u /mnt/nexus\n\n# Then remount\nnexus mount /mnt/nexus --daemon\n```\n\n**Process won't die (stuck in 'D' or 'U' state):**\n```bash\n# Find stuck processes (STAT is column 8 of `ps aux`; match states starting with D or U)\nps aux | awk '/nexus/ && $8 ~ /^[DU]/'\n\n# Force kill\nkill -9 <PID>\n\n# If process is still stuck (uninterruptible I/O), try:\n# macOS: umount -f /mnt/nexus\n# Linux: fusermount -uz /mnt/nexus\n\n# Note: Stuck processes in 'D' state typically resolve after unmount\n# If they persist, they'll be cleaned up on system reboot\n```\n\n**\"Directory not empty\" error when mounting:**\n```bash\n# Unmount first\nnexus unmount /mnt/nexus\n\n# Or remove and recreate the mount point\nrm -rf /mnt/nexus && mkdir /mnt/nexus\n\n# Then mount\nnexus mount /mnt/nexus 
--daemon\n```\n\n**Permission denied errors:**\n```bash\n# Ensure FUSE is installed\n# macOS: Install macFUSE from https://osxfuse.github.io/\n# Linux: sudo apt-get install fuse3\n\n# Check mount point permissions\nls -ld /mnt/nexus\n# Should be owned by your user\n\n# Create mount point with correct permissions\nmkdir -p /mnt/nexus\nchmod 755 /mnt/nexus\n```\n\n**Connection refused (remote mounts):**\n```bash\n# Check server is running\ncurl http://your-server:8080/health\n\n# Test connectivity\nping your-server\n\n# Verify API key (if required)\nnexus mount /mnt/nexus \\\n --remote-url http://your-server:8080 \\\n --api-key your-key \\\n --daemon\n```\n\n**Multiple mounts to same mount point:**\n```bash\n# Check for existing mounts\nmount | grep /mnt/nexus\n\n# Unmount all instances\nnexus unmount /mnt/nexus\n\n# Kill any lingering processes\npkill -f \"nexus mount /mnt/nexus\"\n\n# Clean mount and remount\nrm -rf /mnt/nexus && mkdir /mnt/nexus\nnexus mount /mnt/nexus --daemon\n```\n\n#### Debug Mode\n\nFor detailed debugging output:\n\n```bash\n# Run in foreground with debug output\nnexus mount /mnt/nexus --debug\n\n# This will show all FUSE operations in real-time\n# Press Ctrl+C to stop\n```\n\n### rclone-style CLI Commands (v0.2.0)\n\nNexus provides efficient file operations inspired by rclone, with automatic deduplication and progress tracking:\n\n#### Sync Command\nOne-way synchronization with hash-based change detection:\n\n```bash\n# Sync local directory to Nexus (only copies changed files)\nnexus sync ./local/dataset/ /workspace/training/\n\n# Preview changes before syncing (dry-run)\nnexus sync ./data/ /workspace/backup/ --dry-run\n\n# Mirror sync - delete extra files in destination\nnexus sync /workspace/source/ /workspace/dest/ --delete\n\n# Disable hash comparison (force copy all files)\nnexus sync ./data/ /workspace/ --no-checksum\n```\n\n#### Copy Command\nSmart copy with automatic deduplication:\n\n```bash\n# Copy directory recursively (skips 
identical files)\nnexus copy ./local/data/ /workspace/project/ --recursive\n\n# Copy within Nexus (leverages CAS deduplication)\nnexus copy /workspace/source/ /workspace/dest/ --recursive\n\n# Copy Nexus to local\nnexus copy /workspace/data/ ./backup/ --recursive\n\n# Copy single file\nnexus copy /workspace/file.txt /workspace/copy.txt\n\n# Disable checksum verification\nnexus copy ./data/ /workspace/ --recursive --no-checksum\n```\n\n#### Move Command\nEfficient file/directory moves with confirmation prompts:\n\n```bash\n# Move file (rename if possible, copy+delete otherwise)\nnexus move /workspace/old.txt /workspace/new.txt\n\n# Move directory without confirmation\nnexus move /workspace/old_dir/ /archives/2024/ --force\n```\n\n#### Tree Command\nVisualize directory structure as ASCII tree:\n\n```bash\n# Show full directory tree\nnexus tree /workspace/\n\n# Limit depth to 2 levels\nnexus tree /workspace/ -L 2\n\n# Show file sizes\nnexus tree /workspace/ --show-size\n```\n\n#### Size Command\nCalculate directory sizes with human-readable output:\n\n```bash\n# Calculate total size\nnexus size /workspace/project/\n\n# Human-readable output (KB, MB, GB)\nnexus size /workspace/ --human\n\n# Show top 10 largest files\nnexus size /workspace/ --human --details\n```\n\n**Features:**\n- **Hash-based deduplication** - Only copies changed files\n- **Progress bars** - Visual feedback for long operations\n- **Dry-run mode** - Preview changes before execution\n- **Cross-platform paths** - Works with local filesystem and Nexus paths\n- **Automatic deduplication** - Leverages Content-Addressable Storage (CAS)\n\n### Performance Comparison\n\n| Method | Speed | Content-Aware | Use Case |\n|--------|-------|---------------|----------|\n| `grep -r /mnt/nexus/` | Medium | \u2705 Yes (via mount) | Interactive use |\n| `nexus grep \"pattern\"` | **Fast** (DB-backed) | \u2705 Yes | Large-scale search |\n| Standard tools | Familiar | \u2705 Yes (via mount) | Day-to-day work |\n\n### Use 
Cases\n\n**Interactive Development**:\n```bash\n# Mount for interactive work\nnexus mount /mnt/nexus\nvim /mnt/nexus/workspace/code.py\ngit clone /mnt/nexus/repos/myproject\n```\n\n**Bulk Operations**:\n```bash\n# Use rclone-style commands for efficiency\nnexus sync /local/dataset/ /workspace/training-data/\nnexus tree /workspace/ > structure.txt\n```\n\n**Automated Workflows**:\n```bash\n# Standard Unix tools in scripts\nfind /mnt/nexus -name \"*.pdf\" -exec grep -l \"invoice\" {} \\;\nrsync -av /mnt/nexus/workspace/ /backup/\n```\n\n## Architecture\n\n### Agent Workspace Structure\n\nEvery agent gets a structured workspace at `/workspace/{tenant}/{agent}/`:\n\n```\n/workspace/acme-corp/research-agent/\n\u251c\u2500\u2500 .nexus/ # Nexus metadata (Git-trackable)\n\u2502 \u251c\u2500\u2500 agent.yaml # Agent configuration\n\u2502 \u251c\u2500\u2500 commands/ # Custom commands (markdown files)\n\u2502 \u2502 \u251c\u2500\u2500 analyze-codebase.md\n\u2502 \u2502 \u2514\u2500\u2500 summarize-docs.md\n\u2502 \u251c\u2500\u2500 jobs/ # Background job definitions\n\u2502 \u2502 \u2514\u2500\u2500 daily-summary.yaml\n\u2502 \u251c\u2500\u2500 memory/ # File-based memory\n\u2502 \u2502 \u251c\u2500\u2500 project-knowledge.md\n\u2502 \u2502 \u2514\u2500\u2500 recent-tasks.jsonl\n\u2502 \u2514\u2500\u2500 secrets.encrypted # KMS-encrypted credentials\n\u251c\u2500\u2500 data/ # Agent's working data\n\u2502 \u251c\u2500\u2500 inputs/\n\u2502 \u2514\u2500\u2500 outputs/\n\u2514\u2500\u2500 INSTRUCTIONS.md # Agent instructions (auto-loaded)\n```\n\n### Path Namespace\n\n```\n/\n\u251c\u2500\u2500 workspace/ # Agent scratch space (hot tier, ephemeral)\n\u251c\u2500\u2500 shared/ # Shared tenant data (warm tier, persistent)\n\u251c\u2500\u2500 external/ # Pass-through backends (no content storage)\n\u251c\u2500\u2500 system/ # System metadata (admin-only)\n\u2514\u2500\u2500 archives/ # Cold storage (read-only)\n```\n\n## Core Components\n\n### File System 
Operations\n\n```python\nimport nexus\n\n# Works in both local and hosted modes\n# Mode determined by config file or environment\nnx = nexus.connect()\n\nasync with nx:\n # Basic operations\n await nx.write(\"/workspace/data.txt\", b\"content\")\n content = await nx.read(\"/workspace/data.txt\")\n await nx.delete(\"/workspace/data.txt\")\n\n # Batch operations\n files = await nx.list(\"/workspace/\", recursive=True)\n # `sources` and `destinations` are placeholders for caller-defined path lists\n results = await nx.copy_batch(sources, destinations)\n\n # File discovery\n python_files = await nx.glob(\"**/*.py\")\n todos = await nx.grep(r\"TODO:|FIXME:\", file_pattern=\"*.py\")\n```\n\n### Semantic Search\n\n```python\n# Search across documents with vector embeddings\nasync with nexus.connect() as nx:\n results = await nx.semantic_search(\n path=\"/docs/\",\n query=\"How does authentication work?\",\n limit=10,\n filters={\"file_type\": \"markdown\"}\n )\n\n for result in results:\n     print(f\"{result.path}:{result.line} - {result.text}\")\n```\n\n### LLM-Powered Reading\n\n```python\n# Read documents with AI, with automatic KV cache\nasync with nexus.connect() as nx:\n answer = await nx.llm_read(\n path=\"/reports/q4-2024.pdf\",\n prompt=\"What were the top 3 challenges?\",\n model=\"claude-sonnet-4\",\n max_tokens=1000\n )\n```\n\n### Agent Memory\n\n```python\n# Store and retrieve agent memories\nasync with nexus.connect() as nx:\n await nx.store_memory(\n content=\"User prefers TypeScript over JavaScript\",\n memory_type=\"preference\",\n tags=[\"coding\", \"languages\"]\n )\n\n memories = await nx.search_memories(\n query=\"programming language preferences\",\n limit=5\n )\n```\n\n### Prompt Optimization (Coming in v0.9.5)\n\n```python\n# Track multiple prompt candidates during optimization\nasync with nexus.connect() as nx:\n # Start optimization run\n run_id = await nx.start_optimization_run(\n module_name=\"SearchModule\",\n objectives=[\"accuracy\", \"latency\", \"cost\"]\n )\n\n # Store prompt candidates with detailed traces\n for 
candidate in prompt_variants:\n     version_id = await nx.store_prompt_version(\n         module_name=\"SearchModule\",\n         prompt_template=candidate.template,\n         metrics={\"accuracy\": 0.85, \"latency_ms\": 450},\n         run_id=run_id\n     )\n\n     # Store execution traces for debugging\n     await nx.store_execution_trace(\n         prompt_version_id=version_id,\n         inputs=test_inputs,\n         outputs=predictions,\n         intermediate_steps=reasoning_chain\n     )\n\n # Analyze tradeoffs across candidates\n analysis = await nx.analyze_prompt_tradeoffs(\n run_id=run_id,\n objectives=[\"accuracy\", \"latency_ms\", \"cost_per_query\"]\n )\n\n # Get per-example results to find failure patterns\n failures = await nx.get_failing_examples(\n prompt_version_id=version_id,\n limit=20\n )\n```\n\n### Custom Commands\n\nCreate `/workspace/{tenant}/{agent}/.nexus/commands/semantic-search.md`:\n\n```markdown\n---\nname: semantic-search\ndescription: Search codebase semantically\nallowed-tools: [semantic_read, glob, grep]\nrequired-scopes: [read]\nmodel: sonnet\n---\n\n## Your task\n\nGiven query: {{query}}\n\n1. Use `glob` to find relevant files by pattern\n2. Use `semantic_read` to extract relevant sections\n3. 
Summarize findings with file:line citations\n```\n\nExecute via API:\n\n```python\nasync with nexus.connect() as nx:\n result = await nx.execute_command(\n \"semantic-search\",\n context={\"query\": \"authentication implementation\"}\n )\n```\n\n### Skills System (v0.3.0)\n\nManage reusable AI agent skills using the SKILL.md format, with progressive disclosure, lifecycle management, and dependency resolution:\n\n```python\nimport nexus\nfrom nexus.skills import SkillRegistry, SkillManager, SkillExporter\n\n# Initialize filesystem\nnx = nexus.connect()\n\n# Create skill registry\nregistry = SkillRegistry(nx)\n\n# Discover skills from three tiers (agent > tenant > system)\n# Loads metadata only - lightweight and fast\nawait registry.discover()\n\n# List available skills\nskills = registry.list_skills()\n# ['analyze-code', 'data-processing', 'report-generation']\n\n# Get skill metadata (no content loading)\nmetadata = registry.get_metadata(\"analyze-code\")\nprint(f\"{metadata.name}: {metadata.description}\")\n# analyze-code: Analyzes code quality and structure\n\n# Load full skill content (lazy loading + caching)\nskill = await registry.get_skill(\"analyze-code\")\nprint(skill.content) # Full markdown content\n\n# Resolve dependencies automatically (DAG with cycle detection)\ndeps = await registry.resolve_dependencies(\"complex-skill\")\n# ['base-skill', 'helper-skill', 'complex-skill']\n\n# Create skill manager for lifecycle operations\nmanager = SkillManager(nx, registry)\n\n# Create new skill from template\nawait manager.create_skill(\n \"my-analyzer\",\n description=\"Analyzes code quality and structure\",\n template=\"code-generation\", # basic, data-analysis, code-generation, document-processing, api-integration\n author=\"Alice\",\n tier=\"agent\"\n)\n\n# Fork existing skill with lineage tracking\nawait manager.fork_skill(\n \"analyze-code\",\n \"my-custom-analyzer\",\n tier=\"agent\",\n author=\"Bob\"\n)\n\n# Publish skill to tenant library\nawait manager.publish_skill(\n 
\"my-analyzer\",\n source_tier=\"agent\",\n target_tier=\"tenant\"\n)\n\n# Export skills to .zip (vendor-neutral)\nexporter = SkillExporter(registry)\n\n# Export with dependencies\nawait exporter.export_skill(\n \"analyze-code\",\n output_path=\"analyze-code.zip\",\n format=\"claude\", # Enforces 8MB limit\n include_dependencies=True\n)\n\n# Validate before export\nvalid, msg, size = await exporter.validate_export(\"large-skill\", format=\"claude\")\nif not valid:\n print(f\"Cannot export: {msg}\")\n\n# Enterprise Features (NEW in v0.3.0)\nfrom nexus.skills import (\n SkillAnalyticsTracker,\n SkillGovernance,\n SkillAuditLogger,\n AuditAction\n)\n\n# Track skill usage and analytics\ntracker = SkillAnalyticsTracker(db_connection)\nawait tracker.track_usage(\n \"analyze-code\",\n agent_id=\"alice\",\n execution_time=1.5,\n success=True\n)\n\n# Get analytics for a skill\nanalytics = await tracker.get_skill_analytics(\"analyze-code\")\nprint(f\"Success rate: {analytics.success_rate:.1%}\")\nprint(f\"Avg execution time: {analytics.avg_execution_time:.2f}s\")\n\n# Get dashboard metrics\ndashboard = await tracker.get_dashboard_metrics()\nprint(f\"Total skills: {dashboard.total_skills}\")\nprint(f\"Most used: {dashboard.most_used_skills[:5]}\")\n\n# Governance - approval workflow for org-wide skills\ngov = SkillGovernance(db_connection)\n\n# Submit for approval\napproval_id = await gov.submit_for_approval(\n \"my-analyzer\",\n submitted_by=\"alice\",\n reviewers=[\"bob\", \"charlie\"],\n comments=\"Ready for team-wide use\"\n)\n\n# Approve skill\nawait gov.approve_skill(approval_id, reviewed_by=\"bob\", comments=\"Excellent work!\")\nis_approved = await gov.is_approved(\"my-analyzer\")\n\n# Audit logging for compliance\naudit = SkillAuditLogger(db_connection)\n\n# Log skill operations\nawait audit.log(\n \"analyze-code\",\n AuditAction.EXECUTED,\n agent_id=\"alice\",\n details={\"execution_time\": 1.5, \"success\": True}\n)\n\n# Query audit logs\nlogs = await 
audit.query_logs(skill_name=\"analyze-code\", action=AuditAction.EXECUTED)\n\n# Generate compliance report\nreport = await audit.generate_compliance_report(tenant_id=\"tenant1\")\nprint(f\"Total operations: {report['total_operations']}\")\nprint(f\"Top skills: {report['top_skills'][:5]}\")\n\n# Search skills by description\nresults = await manager.search_skills(\"code analysis\", limit=5)\nfor skill_name, score in results:\n print(f\"{skill_name}: {score:.1f}\")\n```\n\n#### Skills CLI Commands (v0.3.0)\n\nNexus provides comprehensive CLI commands for skill management:\n\n```bash\n# List all skills\nnexus skills list\nnexus skills list --tenant # Show tenant skills\nnexus skills list --system # Show system skills\nnexus skills list --tier agent # Filter by tier\n\n# Create new skill from template\nnexus skills create my-skill --description \"My custom skill\"\nnexus skills create data-viz --description \"Data visualization\" --template data-analysis\nnexus skills create analyzer --description \"Code analyzer\" --author Alice\n\n# Fork existing skill\nnexus skills fork analyze-code my-analyzer\nnexus skills fork data-analysis custom-analysis --author Bob\n\n# Publish skill to tenant library\nnexus skills publish my-skill\nnexus skills publish shared-skill --from-tier tenant --to-tier system\n\n# Search skills by description\nnexus skills search \"data analysis\"\nnexus skills search \"code\" --tier tenant --limit 5\n\n# Show detailed skill information\nnexus skills info analyze-code\nnexus skills info data-analysis\n\n# Export skill to .zip package (vendor-neutral)\nnexus skills export my-skill --output ./my-skill.zip\nnexus skills export analyze-code --output ./export.zip --format claude\nnexus skills export my-skill --output ./export.zip --no-deps # Exclude dependencies\n\n# Validate skill format and size limits\nnexus skills validate my-skill\nnexus skills validate analyze-code --format claude\n\n# Calculate skill size\nnexus skills size my-skill\nnexus skills 
size analyze-code --human\n```\n\n**Available Templates:**\n- `basic` - Simple skill template\n- `data-analysis` - Data processing and analysis\n- `code-generation` - Code generation and modification\n- `document-processing` - Document parsing and analysis\n- `api-integration` - API integration and data fetching\n\n**Export Formats:**\n- `generic` - Vendor-neutral .zip format (no size limit)\n- `claude` - Anthropic Claude format (8MB limit enforced)\n- `openai` - OpenAI format (validation only, ready for future plugins)\n\n**Note**: External API integrations (uploading to Claude API, OpenAI, etc.) will be implemented as plugins in v0.3.5+ to maintain vendor neutrality. The core CLI provides generic export functionality.\n\n**SKILL.md Format:**\n\n```markdown\n---\nname: analyze-code\ndescription: Analyzes code quality and structure\nversion: 1.0.0\nauthor: Your Name\nrequires:\n - base-parser\n - ast-analyzer\n---\n\n# Code Analysis Skill\n\nThis skill analyzes code for quality metrics...\n\n## Usage\n\n1. Parse the code files\n2. Run static analysis\n3. 
Generate report\n```\n\n**Features:**\n- **Progressive Disclosure**: Load metadata during discovery, full content on-demand\n- **Lazy Loading**: Skills cached only when accessed\n- **Three-Tier Hierarchy**: Agent skills override tenant/system skills\n- **Dependency Resolution**: Automatic DAG resolution with cycle detection\n- **Skill Lifecycle**: Create, fork, and publish skills with lineage tracking\n- **Template System**: 5 pre-built templates (basic, data-analysis, code-generation, document-processing, api-integration)\n- **Vendor-Neutral Export**: Generic .zip format with Claude/OpenAI validation\n- **Usage Analytics**: Track performance, success rates, dashboard metrics (NEW in v0.3.0)\n- **Governance**: Approval workflows for team-wide skill publication (NEW in v0.3.0)\n- **Audit Logging**: Complete compliance tracking and reporting (NEW in v0.3.0)\n- **Skill Search**: Find skills by description with relevance scoring (NEW in v0.3.0)\n- **Comprehensive Tests**: 156 passing tests (31%+ overall coverage, 65-91% skills module)\n\n**Skill Tiers:**\n- **Agent** (`/workspace/.nexus/skills/`) - Personal skills (highest priority)\n- **Tenant** (`/shared/skills/`) - Team-shared skills\n- **System** (`/system/skills/`) - Built-in skills (lowest priority)\n\n## Technology Stack\n\n### Core\n- **Language**: Python 3.11+\n- **API Framework**: FastAPI\n- **Database**: PostgreSQL / SQLite (configurable via environment variable)\n- **Cache**: Redis (prod) / In-memory (dev)\n- **Vector DB**: Qdrant\n- **Object Storage**: S3-compatible, GCS, Azure Blob\n\n### AI/ML\n- **LLM Providers**: Anthropic Claude, OpenAI, Google Gemini\n- **Embeddings**: text-embedding-3-large, voyage-ai\n- **Parsing**: PyPDF2, pandas, openpyxl, Pillow\n\n### Infrastructure\n- **Orchestration**: Kubernetes (distributed mode)\n- **Monitoring**: Prometheus + Grafana\n- **Logging**: Structlog + Loki\n- **Admin UI**: Simple HTML/JS (jobs, memories, files, operations)\n\n## Performance Targets\n\n| Metric | 
Target | Impact |\n|--------|--------|--------|\n| Write Throughput | 500-1000 MB/s | 10-50\u00d7 vs direct backend |\n| Read Latency | <10 ms | 10-50\u00d7 vs remote storage |\n| Memory Search | <100 ms | Vector search across memories |\n| Storage Savings | 30-50% | CAS deduplication |\n| Job Resumability | 100% | Survives all restarts |\n| LLM Cache Hit Rate | 50-90% | Major cost savings |\n| Prompt Versioning | Full lineage | Track optimization history |\n| Training Data Dedup | 30-50% | CAS-based deduplication |\n| Prompt Optimization | Multi-candidate | Test multiple strategies in parallel |\n| Trace Storage | Full execution logs | Debug failures, analyze patterns |\n\n## Configuration\n\n### Local Mode\n\n```python\nimport nexus\n\n# Config via Python (useful for programmatic configuration)\nnx = nexus.connect(config={\n \"mode\": \"local\",\n \"data_dir\": \"./nexus-data\",\n \"cache_size_mb\": 100,\n \"enable_vector_search\": True\n})\n\n# Or let it auto-discover from nexus.yaml\nnx = nexus.connect()\n```\n\n### Self-Hosted Deployment\n\nFor organizations that want to run their own Nexus instance, create `config.yaml`:\n\n```yaml\nmode: server # local or server\n\ndatabase:\n url: postgresql://user:pass@localhost/nexus\n # or for SQLite: sqlite:///./nexus.db\n # Can also be set via the NEXUS_DATABASE_URL or POSTGRES_URL environment variable\n\ncache:\n type: redis # memory, redis\n url: redis://localhost:6379\n\nvector_db:\n type: qdrant\n url: http://localhost:6333\n\nbackends:\n - type: s3\n   bucket: my-company-files\n   region: us-east-1\n\n - type: gdrive\n   credentials_path: ./gdrive-creds.json\n\nauth:\n jwt_secret: your-secret-key\n token_expiry_hours: 24\n\nrate_limits:\n default: \"100/minute\"\n semantic_search: \"10/minute\"\n llm_read: \"50/hour\"\n```\n\nRun the server:\n\n```bash\nnexus server --config config.yaml\n```\n\n## Security\n\n### Multi-Layer Security Model\n\n1. **API Key Authentication**: Tenant and agent identification\n2. 
**Row-Level Security (RLS)**: Database-level tenant isolation\n3. **Type-Level Validation**: Fail-fast validation before database operations\n4. **UNIX-Style Permissions**: Owner, group, and mode bits (v0.3.0)\n5. **ACL Permissions**: Fine-grained access control lists (v0.3.0)\n6. **ReBAC (Relationship-Based Access Control)**: Zanzibar-style authorization (v0.3.0)\n\n### Type-Level Validation (NEW in v0.1.0)\n\nAll domain types have validation methods that are called automatically before database operations. This provides:\n\n- **Fail Fast**: Catch invalid data before expensive database operations\n- **Clear Error Messages**: Actionable feedback for developers and API consumers\n- **Data Integrity**: Prevent invalid data from entering the database\n- **Consistent Validation**: Same rules across all code paths\n\n```python\nfrom nexus.core.metadata import FileMetadata\nfrom nexus.core.exceptions import ValidationError\n\n# Validation happens automatically on put()\ntry:\n metadata = FileMetadata(\n path=\"/data/file.txt\", # Must start with /\n backend_name=\"local\",\n physical_path=\"/storage/file.txt\",\n size=1024, # Must be >= 0\n )\n store.put(metadata) # Validates before DB operation\nexcept ValidationError as e:\n print(f\"Validation failed: {e}\")\n # Example: \"size cannot be negative, got -1\"\n```\n\n**Validation Rules:**\n- Paths must start with `/` and not contain null bytes\n- File sizes and ref counts must be non-negative\n- Required fields (path, backend_name, physical_path, etc.) 
must not be empty\n- Content hashes must be valid 64-character SHA-256 hex strings\n- Metadata keys must be \u2264 255 characters\n\n### Example: Multi-Tenancy Isolation\n\n```sql\n-- RLS automatically filters queries by tenant\nSET LOCAL app.current_tenant_id = '<tenant_uuid>';\n\n-- All queries auto-filtered, even with bugs\nSELECT * FROM file_paths WHERE path = '/data';\n-- Returns only rows for current tenant\n```\n\n## Testing\n\n```bash\n# Run all tests\npytest\n\n# Run with coverage\npytest --cov=nexus --cov-report=html\n\n# Run specific test file\npytest tests/test_filesystem.py\n\n# Run integration tests\npytest tests/integration/ -v\n\n# Run performance tests\npytest tests/performance/ --benchmark-only\n```\n\n## Documentation\n\n- [Plugin Development Guide](./docs/PLUGIN_DEVELOPMENT.md) - Create your own Nexus plugins\n- [Plugin System Overview](./docs/PLUGIN_SYSTEM.md) - Plugin architecture and design\n- [PostgreSQL Setup Guide](./docs/POSTGRESQL_SETUP.md) - Configure PostgreSQL for production\n- [SQL Views for Work Detection](./docs/SQL_VIEWS_FOR_WORK_DETECTION.md) - Work queue patterns\n- [API Reference](./docs/api/) - Detailed API documentation\n- [Getting Started](./docs/getting-started/) - Quick start guides\n- [Deployment Guide](./docs/deployment/) - Production deployment\n\n## Contributing\n\nWe welcome contributions! 
Please see [CONTRIBUTING.md](./CONTRIBUTING.md) for details.\n\n```bash\n# Fork the repo and clone\ngit clone https://github.com/yourusername/nexus.git\ncd nexus\n\n# Create a feature branch\ngit checkout -b feature/your-feature\n\n# Make changes and test\nuv pip install -e \".[dev,test]\"\npytest\n\n# Format and lint\nruff format .\nruff check .\n\n# Commit and push\ngit commit -am \"Add your feature\"\ngit push origin feature/your-feature\n```\n\n## License\n\nApache 2.0 License - see [LICENSE](./LICENSE) for details.\n\n\n## Roadmap\n\n### v0.1.0 - Local Mode Foundation (Current)\n- [x] Core embedded filesystem (read/write/delete)\n- [x] SQLite metadata store\n- [x] Local filesystem backend\n- [x] Basic file operations (list, glob, grep)\n- [x] Virtual path routing\n- [x] Directory operations (mkdir, rmdir, is_directory)\n- [x] Basic CLI interface with Click and Rich\n- [x] Metadata export/import (JSONL format)\n- [x] SQL views for ready work detection\n- [x] In-memory caching\n- [x] Batch operations (avoid N+1 queries)\n- [x] Type-level validation\n\n### v0.2.0 - FUSE Mount & Content-Aware Operations (Current)\n- [x] **FUSE filesystem mount** - Mount Nexus to local path (e.g., `/mnt/nexus`)\n- [x] **Smart read mode** - Return parsed text for binary files (PDFs, Excel, etc.)\n- [x] **Virtual file views** - Auto-generate `.txt` and `.md` views for binary files\n- [x] **Content parser framework** - Extensible parser system for document types (MarkItDown)\n- [x] **PDF parser** - Extract text and markdown from PDFs\n- [x] **Excel/CSV parser** - Parse spreadsheets to structured data\n- [x] **Content-aware file access** - Access parsed content via virtual views\n- [x] **Document type detection** - Auto-detect MIME types and route to parsers\n- [x] **Mount CLI commands** - `nexus mount`, `nexus unmount`\n- [x] **Mount modes** - Binary, text, and smart modes\n- [x] **.raw directory** - Access original binary files\n- [x] **Background daemon mode** - Run mount in 
background with `--daemon`\n- [x] **All FUSE operations** - read, write, create, delete, mkdir, rmdir, rename, truncate\n- [x] **Unit tests** - Comprehensive test coverage for FUSE operations\n- [x] **rclone-style CLI commands** - `sync`, `copy`, `move`, `tree`, `size` with progress bars\n- [ ] **Background parsing** - Async content parsing on write\n- [x] **FUSE performance optimizations** - Caching (TTL/LRU), cache invalidation, metrics\n- [ ] **Image OCR parser** - Extract text from images (PNG, JPEG)\n\n### v0.3.0 - File Permissions & Skills System\n\n**Permissions (Complete):**\n- [x] **UNIX-style file permissions** (owner, group, mode)\n- [x] **Permission operations** (chmod, chown, chgrp)\n- [x] **ACL (Access Control List)** support\n- [x] **CLI commands** (getfacl, setfacl)\n- [x] **Database schema** for permissions and ACL entries\n- [x] **Comprehensive tests** (91 passing tests)\n- [x] **ReBAC (Relationship-Based Access Control)** - Zanzibar-style authorization\n- [x] **Relationship types** - member-of, owner-of, viewer-of, editor-of, parent-of\n- [x] **Permission inheritance via relationships** - Team ownership, group membership\n- [x] **Relationship graph queries** - Graph traversal with cycle detection\n- [x] **Namespaced tuples** - (subject, relation, object) authorization model\n- [x] **Check API** - Fast permission checks with 5-minute TTL caching\n- [x] **Expand API** - Discover all subjects with specific permissions\n- [x] **Relationship management** - Create, delete, query relationships via CLI\n- [x] **Expiring tuples** - Temporary permissions with automatic cleanup\n- [x] **Comprehensive ReBAC tests** (14 passing tests, 100% pass rate)\n\n**Permissions (Remaining):**\n- [ ] **Default permission policies** per namespace\n- [ ] **Permission inheritance** for new files\n- [ ] **Permission checking** in all file operations\n- [ ] **Permission migration** for existing files\n\n**Skills System (Core - Vendor Neutral):**\n- [x] **SKILL.md parser** - 
Parse Anthropic-compatible SKILL.md with frontmatter\n- [x] **Skill registry** - Progressive disclosure, lazy loading, three-tier hierarchy\n- [x] **Skill discovery** - Scan `/workspace/.nexus/skills/`, `/shared/skills/`, `/system/skills/`\n- [x] **Dependency resolution** - Automatic DAG resolution with cycle detection\n- [x] **Skill export** - Export to generic formats (validate, pack, size check)\n- [x] **Skill templates** - 5 pre-built templates (basic, data-analysis, code-generation, document-processing, api-integration)\n- [x] **Skill lifecycle** - Create, fork, publish workflows with lineage tracking\n- [x] **Comprehensive tests** - 156 passing tests (31%+ overall coverage, 65-91% skills module)\n- [x] **Skill analytics** - Usage tracking, success rates, execution time, dashboard metrics\n- [x] **Skill search** - Text-based search across skill descriptions with relevance scoring\n- [x] **Skill governance** - Approval workflow for org-wide skills (submit, approve, reject)\n- [x] **Audit trails** - Log all skill operations, compliance reporting, query by filters\n- [ ] **Skill versioning** - CAS-backed version control with history tracking\n- [ ] **Semantic skill search** - Vector-based search across skill descriptions\n- [x] **CLI commands** - `list`, `create`, `fork`, `publish`, `search`, `info`, `export`, `validate`, `size` (see issue #88)\n\n**Note**: External integrations (Claude API upload/download, OpenAI, etc.) will be implemented as **plugins** in v0.3.5+ to maintain vendor neutrality. 
Core Nexus provides generic skill export (`nexus skills export --format claude`), while `nexus-plugin-anthropic` handles API-specific operations.\n\n### v0.3.5 - Plugin System & External Integrations\n- [ ] **Plugin discovery** - Entry point-based plugin discovery\n- [ ] **Plugin registry** - Register and manage installed plugins\n- [ ] **Plugin CLI namespace** - `nexus <plugin-name> <command>` pattern\n- [ ] **Plugin hooks** - Lifecycle hooks (before_write, after_read, etc.)\n- [ ] **Plugin configuration** - Per-plugin config in `~/.nexus/plugins/<name>/`\n- [ ] **Plugin manager** - `nexus plugins list/install/uninstall/info`\n- [ ] **First-party plugins:**\n - [ ] `nexus-plugin-anthropic` - Claude API integration (upload/download skills)\n - [ ] `nexus-plugin-openai` - OpenAI API integration\n - [ ] `nexus-plugin-skill-seekers` - Integration with Skill_Seekers scraper\n\n### v0.4.0 - AI Integration\n- [ ] LLM provider abstraction\n- [ ] Anthropic Claude integration\n- [ ] OpenAI integration\n- [ ] Basic KV cache for prompts\n- [ ] Semantic search (vector embeddings)\n- [ ] LLM-powered document reading\n\n### v0.5.0 - Agent Workspaces\n- [ ] Agent workspace structure\n- [ ] File-based configuration (.nexus/)\n- [ ] Custom command system (markdown)\n- [ ] Basic agent memory storage\n- [ ] Memory consolidation\n- [ ] Memory reflection phase (ACE-inspired: extract insights from execution trajectories)\n- [ ] Strategy/playbook organization (ACE-inspired: organize memories as reusable strategies)\n\n### v0.6.0 - Server Mode (Self-Hosted & Managed)\n- [ ] FastAPI REST API\n- [ ] API key authentication\n- [ ] Multi-tenancy support\n- [ ] PostgreSQL support\n- [ ] Redis caching\n- [ ] Docker deployment\n- [ ] Batch/transaction APIs (atomic multi-operation updates)\n- [ ] Optimistic locking for concurrent writes\n- [ ] Auto-scaling configuration (for hosted deployments)\n\n### v0.7.0 - Extended Features & Event System\n- [ ] S3 backend support\n- [ ] Google Drive 
backend\n- [ ] Job system with checkpointing\n- [ ] OAuth token management\n- [ ] MCP server implementation\n- [ ] Webhook/event system (file changes, memory updates, job events)\n- [ ] Watch API for real-time updates (streaming changes to clients)\n- [ ] Server-Sent Events (SSE) support for live monitoring\n- [ ] Simple admin UI (jobs, memories, files, operation logs)\n- [ ] Operation logs table (track storage operations for debugging)\n\n### v0.8.0 - Advanced AI Features & Rich Query\n- [ ] Advanced KV cache with context tracking\n- [ ] Memory versioning and lineage\n- [ ] Multi-agent memory sharing\n- [ ] Enhanced semantic search\n- [ ] Importance-based memory preservation (ACE-inspired: prevent brevity bias in consolidation)\n- [ ] Context-aware memory retrieval (include execution context in search)\n- [ ] Automated strategy extraction (LLM-powered extraction from successful trajectories)\n- [ ] Rich memory query language (filter by metadata, importance, task type, date ranges, etc.)\n- [ ] Memory query builder API (fluent interface for complex queries)\n- [ ] Combined vector + metadata search (hybrid search)\n\n### v0.9.0 - Production Readiness\n- [ ] Monitoring and observability\n- [ ] Performance optimization\n- [ ] Comprehensive testing\n- [ ] Security hardening\n- [ ] Documentation completion\n- [ ] Optional OpenTelemetry export (for framework integration)\n\n### v0.9.5 - Prompt Engineering & Optimization\n- [ ] Prompt version control with lineage tracking\n- [ ] Training dataset storage with CAS deduplication\n- [ ] Evaluation metrics time series (performance tracking)\n- [ ] Frozen inference snapshots (immutable program state)\n- [ ] Experiment tracking export (MLflow, W&B integration)\n- [ ] Prompt diff viewer (compare versions)\n- [ ] Regression detection alerts (performance drops)\n- [ ] Multi-candidate pool management (concurrent prompt testing)\n- [ ] Execution trace storage (detailed run logs for debugging)\n- [ ] Per-example evaluation results 
(granular performance tracking)\n- [ ] Optimization run grouping (experiment management)\n- [ ] Multi-objective tradeoff analysis (accuracy vs latency vs cost)\n\n### v0.10.0 - Production Infrastructure & Auto-Scaling\n- [ ] Automatic infrastructure scaling\n- [ ] Redis distributed locks (for large deployments)\n- [ ] PostgreSQL replication (for high availability)\n- [ ] Kubernetes deployment templates\n- [ ] Multi-region load balancing\n- [ ] Automatic migration from single-node to distributed\n\n### v1.0.0 - Production Release\n- [ ] Complete feature set\n- [ ] Production-tested\n- [ ] Comprehensive documentation\n- [ ] Migration tools\n- [ ] Enterprise support\n\n## Support\n\n- **Issues**: [GitHub Issues](https://github.com/nexi-lab/nexus/issues)\n- **Discussions**: [GitHub Discussions](https://github.com/nexi-lab/nexus/discussions)\n- **Email**: support@nexus.example.com\n- **Slack**: [Join our community](https://nexus-community.slack.com)\n\n---\n\nBuilt with \u2764\ufe0f by the Nexus team\n",
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "AI-Native Distributed Filesystem Architecture",
"version": "0.2.5",
"project_urls": {
"Changelog": "https://github.com/nexi-lab/nexus/blob/main/CHANGELOG.md",
"Documentation": "https://github.com/nexi-lab/nexus/blob/main/README.md",
"Homepage": "https://github.com/nexi-lab/nexus",
"Issues": "https://github.com/nexi-lab/nexus/issues",
"Repository": "https://github.com/nexi-lab/nexus"
},
"split_keywords": [
"agents",
" ai",
" content-addressable",
" distributed",
" filesystem",
" llm",
" storage",
" vector-search"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "01451412880358d8579f1d2d4548c74a1ed98b31c1bdd48893aa8b7113a1fec2",
"md5": "cb601bfa2f996bf8b8fff9cdd029db9e",
"sha256": "8319648125260b7385e2a30c5bfcff8377875de18d69e54145f8607a3074b86e"
},
"downloads": -1,
"filename": "nexus_ai_fs-0.2.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "cb601bfa2f996bf8b8fff9cdd029db9e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 202727,
"upload_time": "2025-10-21T00:43:39",
"upload_time_iso_8601": "2025-10-21T00:43:39.896171Z",
"url": "https://files.pythonhosted.org/packages/01/45/1412880358d8579f1d2d4548c74a1ed98b31c1bdd48893aa8b7113a1fec2/nexus_ai_fs-0.2.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "35f8f941b3dc08dbad3aa52e67413d3b4c18e08abf1f60dfd4fd5191fa6cf1b4",
"md5": "f1c9f25b77b37ea78db51d029ed9d63f",
"sha256": "4fb5e4e15e72e7a210204cbeedf43805f6b80908886e64d15ed369db3fa0fa48"
},
"downloads": -1,
"filename": "nexus_ai_fs-0.2.5.tar.gz",
"has_sig": false,
"md5_digest": "f1c9f25b77b37ea78db51d029ed9d63f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 750695,
"upload_time": "2025-10-21T00:43:41",
"upload_time_iso_8601": "2025-10-21T00:43:41.362833Z",
"url": "https://files.pythonhosted.org/packages/35/f8/f941b3dc08dbad3aa52e67413d3b4c18e08abf1f60dfd4fd5191fa6cf1b4/nexus_ai_fs-0.2.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-21 00:43:41",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "nexi-lab",
"github_project": "nexus",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "nexus-ai-fs"
}