clearstone-sdk

Name: clearstone-sdk
Version: 0.1.2
Summary: The open-source reliability toolkit for AI agents. Add production-grade governance, observability, and debugging to any agent workflow.
Upload time: 2025-10-28 18:36:43
Requires Python: >=3.10
License: MIT
Keywords: ai, langchain, agent, governance, observability, testing, debugging, safety, guardrails
# Clearstone SDK

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![GitHub Stars](https://img.shields.io/github/stars/Sancauid/clearstone-sdk?style=social)](https://github.com/Sancauid/clearstone-sdk)

**Production-Grade Governance and Observability for AI Agent Systems.**

Clearstone is a comprehensive Python SDK that provides safety, governance, and observability for multi-agent AI workflows. It combines declarative Policy-as-Code with OpenTelemetry-aligned distributed tracing to help you build reliable, debuggable, and compliant AI systems.

---

## The Problem

Autonomous AI agents are powerful but operate in a high-stakes environment. Without robust guardrails and observability, they can be:

*   **Unsafe:** Accidentally executing destructive actions (e.g., deleting files).
*   **Costly:** Over-using expensive tools or LLM tokens.
*   **Non-compliant:** Mishandling sensitive data (PII).
*   **Unpredictable:** Difficult to debug when they fail.
*   **Opaque:** No visibility into what they're actually doing at runtime.

Clearstone provides the tools to manage these risks with declarative Policy-as-Code governance and production-ready distributed tracing.

## Key Features

### Policy Governance
*   ✅ **Declarative Policy-as-Code:** Write policies as simple Python functions using the `@Policy` decorator. No YAML or complex DSLs.
*   ✅ **Seamless LangChain Integration:** Drop the `PolicyCallbackHandler` into any LangChain agent to enforce policies at runtime.
*   ✅ **Rich Pre-Built Policy Library:** Get started in minutes with 17+ production-ready policies for cost control, RBAC, PII redaction, security alerts, and more.
*   ✅ **Local LLM Protection:** Built-in policies for system load monitoring and model server health checks—specifically designed for local-first AI workflows.
*   ✅ **Human-in-the-Loop Controls:** Pause agent execution for manual approval with the `PAUSE` action and `InterventionClient` for high-stakes decisions.
*   ✅ **Pre-Deploy Validation:** Catch buggy, slow, or non-deterministic policies *before* they reach production with the `PolicyValidator`.
*   ✅ **Line-by-Line Debugging:** Understand exactly why a policy made a decision with the `PolicyDebugger`'s execution trace.
*   ✅ **Performance Metrics:** Track policy execution times, identify bottlenecks, and analyze decision patterns with `PolicyMetrics`.
*   ✅ **Composable Logic:** Build complex rules from simple, reusable policies with `compose_and` and `compose_or` helpers.
*   ✅ **Exportable Audit Trails:** Generate JSON or CSV audit logs for every policy decision, perfect for compliance and analysis.
*   ✅ **Developer CLI:** Accelerate development by scaffolding new, well-structured policy files with the `clearstone new-policy` command.

### Observability & Tracing
*   ✅ **Production-Ready Tracing:** OpenTelemetry-aligned distributed tracing for complete agent execution visibility.
*   ✅ **Automatic Hierarchy Tracking:** Nested spans automatically establish parent-child relationships without manual configuration.
*   ✅ **High-Fidelity Capture:** Nanosecond-precision timing, input/output snapshots, and full error stack traces.
*   ✅ **Thread-Safe Persistence:** SQLite storage with Write-Ahead Logging (WAL) for concurrent-safe trace storage.
*   ✅ **Asynchronous Batching:** Non-blocking span capture with automatic batch writes for zero performance impact.
*   ✅ **Hybrid Serialization:** Smart JSON-first serialization with automatic pickle fallback for complex objects.
*   ✅ **Single-Line Setup:** Initialize the entire tracing system with one `TracerProvider` instantiation.

### AI-Native Testing & Backtesting
*   ✅ **Behavioral Assertions:** Declarative test functions for validating agent behavior (tool usage, execution order, costs, errors).
*   ✅ **Historical Backtesting:** Test new policies against real production traces to predict impact before deployment.
*   ✅ **Policy Test Harness:** Simulate policy enforcement on historical data with detailed impact reports and metrics.
*   ✅ **pytest Integration:** Seamlessly integrate behavioral tests into existing test workflows and CI/CD pipelines.
*   ✅ **Trace-Level Validation:** Assert on complete execution flows, not just individual operations or outputs.
*   ✅ **Comprehensive Reporting:** Track block rates, decision distributions, and identify problematic traces.

### Time-Travel Debugging
*   ✅ **Checkpoint System:** Capture complete agent state at any point in execution history.
*   ✅ **Agent Rehydration:** Dynamically restore agents from checkpoints with full state preservation.
*   ✅ **Deterministic Replay:** Mock non-deterministic functions (time, random) for reproducible debugging sessions.
*   ✅ **Interactive Debugging:** Drop into pdb at any historical execution point with full context.
*   ✅ **Pre-flight Mock Analysis:** See exactly which functions will be mocked and how many responses were recorded before debugging.
*   ✅ **Intelligent Error Handling:** Clear error messages when mock data is insufficient, with actionable guidance.
*   ✅ **Hybrid Serialization:** JSON metadata with pickle state for human-readable yet high-fidelity checkpoints.
*   ✅ **Upstream Span Tracking:** Automatically capture parent span hierarchy for complete execution context.

## Installation

The SDK requires Python 3.10+.

```bash
pip install clearstone-sdk
```

## 5-Minute Quickstart

See how easy it is to protect an agent from performing unauthorized actions.

#### 1. Define Your Policies

Create a file `my_app/policies.py`. Our policies will check a user's role before allowing access to a tool.

```python
# my_app/policies.py
from clearstone import Policy, ALLOW, BLOCK

@Policy(name="block_admin_tools_for_guests", priority=100)
def block_admin_tools_policy(context):
    """A high-priority policy to enforce Role-Based Access Control (RBAC)."""
    
    # Policies read data from the context's metadata
    role = context.metadata.get("role")
    tool_name = context.metadata.get("tool_name")

    if role == "guest" and tool_name == "admin_panel":
        return BLOCK(f"Role '{role}' is not authorized to access '{tool_name}'.")
    
    return ALLOW
```

#### 2. Integrate with Your Agent

In your main application file, initialize the engine and add the `PolicyCallbackHandler` to your agent call.

```python
# my_app/main.py
from clearstone import (
    create_context,
    context_scope,
    PolicyEngine,
    PolicyViolationError
)
from clearstone.integrations.langchain import PolicyCallbackHandler

# This import discovers and registers the policies we just wrote
import my_app.policies

# --- Setup Clearstone (do this once) ---
engine = PolicyEngine()
handler = PolicyCallbackHandler(engine)

def run_agent_with_tool(user_role: str):
    """Simulates running an agent for a user with a specific role."""
    print(f"\n--- Running agent for user with role: '{user_role}' ---")

    # 1. Create a context for this specific run
    context = create_context(
        user_id=f"user_{user_role}",
        agent_id="admin_agent_v1",
        metadata={"role": user_role}
    )

    try:
        # 2. Run the agent within the context scope and with the handler
        with context_scope(context):
            # In a real app, this would be: agent.invoke(..., callbacks=[handler])
            # We simulate the tool call for this example:
            print("Agent is attempting to access 'admin_panel' tool...")
            handler.on_tool_start(serialized={"name": "admin_panel"}, input_str="")
        
        print("✅ SUCCESS: Agent action was approved by all policies.")

    except PolicyViolationError as e:
        # 3. Handle policy violations gracefully
        print(f"❌ BLOCKED: The action was stopped by a policy.")
        print(f"   Reason: {e.decision.reason}")

# --- Run Scenarios ---
run_agent_with_tool("admin")
run_agent_with_tool("guest")
```

#### 3. Run and See the Result
```
--- Running agent for user with role: 'admin' ---
Agent is attempting to access 'admin_panel' tool...
✅ SUCCESS: Agent action was approved by all policies.

--- Running agent for user with role: 'guest' ---
Agent is attempting to access 'admin_panel' tool...
❌ BLOCKED: The action was stopped by a policy.
   Reason: Role 'guest' is not authorized to access 'admin_panel'.
```

## The Developer Toolkit

Clearstone is more than just an engine; it's a complete toolkit for policy governance.

#### 1. Explicit Policy Configuration
Control exactly which policies are active without relying on auto-discovery.
```python
from clearstone import PolicyEngine, Policy, ALLOW, BLOCK

@Policy(name="strict_policy", priority=100)
def strict_policy(context):
    if context.metadata.get("strict_mode"):
        return BLOCK("Strict mode enabled")
    return ALLOW

@Policy(name="lenient_policy", priority=100)
def lenient_policy(context):
    return ALLOW

# Production: use only the strict policy
prod_engine = PolicyEngine(policies=[strict_policy])

# Development: use only the lenient policy
dev_engine = PolicyEngine(policies=[lenient_policy])

# Testing: isolate specific policies
test_engine = PolicyEngine(policies=[strict_policy])
```

#### 2. Composing Policies
Build complex logic from simple, reusable parts.
```python
from clearstone import compose_and, compose_or
from clearstone.policies.common import token_limit_policy, cost_limit_policy

# This new policy only passes if BOTH underlying policies pass.
safe_and_cheap_policy = compose_and(token_limit_policy, cost_limit_policy)

# compose_or passes if EITHER underlying policy passes.
either_limit_policy = compose_or(token_limit_policy, cost_limit_policy)
```

#### 3. Validating Policies Before Deployment
Catch bugs before they reach production. The validator checks for slowness, non-determinism, and fragility.
```python
from clearstone import PolicyValidator

validator = PolicyValidator()
failures = validator.run_all_checks(my_buggy_policy)

if failures:
    print("Policy failed validation:", failures)
else:
    print("Policy is ready for production!")
```

#### 4. Debugging Policy Decisions
Understand *why* a policy made a specific decision with a line-by-line execution trace.
```python
from clearstone import PolicyDebugger

debugger = PolicyDebugger()
decision, trace = debugger.trace_evaluation(my_complex_policy, context)

# Print a human-readable report
print(debugger.format_trace(my_complex_policy, decision, trace))
```

#### 5. Performance Monitoring
Track policy performance and identify bottlenecks with real-time metrics.
```python
from clearstone import PolicyMetrics

metrics = PolicyMetrics()
engine = PolicyEngine(metrics=metrics)

# ... run agent ...

# Get performance summary
summary = metrics.summary()
print(f"Policy 'token_limit' avg latency: {summary['token_limit']['avg_latency_ms']:.4f}ms")

# Find slowest policies
slowest = metrics.get_slowest_policies(top_n=5)
for policy_name, stats in slowest:
    print(f"{policy_name}: {stats['avg_latency_ms']:.4f}ms")

# Find policies that block most often
top_blockers = metrics.get_top_blocking_policies(top_n=5)
```

#### 6. Human-in-the-Loop Interventions
Pause agent execution for manual approval on high-stakes operations like financial transactions or destructive actions.
```python
import dataclasses
from clearstone import (
  Policy, PolicyEngine, create_context, context_scope,
  ALLOW, PAUSE, InterventionClient
)
from clearstone.integrations.langchain import PolicyCallbackHandler, PolicyPauseError

@Policy(name="require_approval_for_large_spend", priority=100)
def approval_policy(context):
  amount = context.metadata.get("amount", 0)
  is_approved = context.metadata.get("is_approved", False)
  
  if amount > 1000 and not is_approved:
    return PAUSE(f"Transaction of ${amount} requires manual approval.")
  
  return ALLOW

def run_transaction(engine, context):
  handler = PolicyCallbackHandler(engine)
  
  try:
    with context_scope(context):
      handler.on_tool_start(serialized={"name": "execute_payment"}, input_str="")
    print("✅ Transaction successful")
    return True
  
  except PolicyPauseError as e:
    print(f"⏸️ Transaction paused: {e.decision.reason}")
    
    intervention_client = InterventionClient()
    intervention_client.request_intervention(e.decision)
    intervention_id = e.decision.metadata.get("intervention_id")
    
    if intervention_client.wait_for_approval(intervention_id):
      # User approved - retry with approval flag
      approved_context = dataclasses.replace(
        context, 
        metadata={**context.metadata, "is_approved": True}
      )
      return run_transaction(engine, approved_context)
    else:
      print("❌ Transaction rejected by user")
      return False

engine = PolicyEngine()
ctx = create_context("user-1", "finance-agent", amount=2500)
run_transaction(engine, ctx)
```

#### 7. Auditing and Exporting
The `PolicyEngine` automatically captures a detailed audit trail. You can analyze it or export it for compliance.
```python
from clearstone import AuditTrail

audit = AuditTrail()
engine = PolicyEngine(audit_trail=audit)

# ... run agent ...

# Get a quick summary
print(audit.summary())
# {'total_decisions': 50, 'blocks': 5, 'alerts': 12, 'block_rate': 0.1}

# Export for external analysis
audit.to_json("audit_log.json")
audit.to_csv("audit_log.csv")
```

## Distributed Tracing & Observability

Clearstone provides production-grade distributed tracing to understand exactly what your AI agents are doing at runtime.

#### Quick Start: Trace Your Agent
```python
from clearstone.observability import TracerProvider, SpanKind

# Initialize once at application startup
provider = TracerProvider(db_path="traces.db")
tracer = provider.get_tracer("my_agent", version="1.0")

# Trace operations with automatic hierarchy
with tracer.span("agent_execution", kind=SpanKind.INTERNAL) as root_span:
    # Nested spans automatically link to parents
    with tracer.span("llm_call", kind=SpanKind.CLIENT, attributes={"model": "gpt-4"}) as llm_span:
        result = call_llm()
    
    with tracer.span("tool_execution", attributes={"tool": "calculator"}):
        output = run_tool()

# Spans are automatically persisted to SQLite
# Retrieve traces for analysis
trace = provider.trace_store.get_trace(root_span.trace_id)
```

#### Key Capabilities

**Automatic Parent-Child Linking**
```python
# No manual span IDs needed - hierarchy is automatic
with tracer.span("parent_operation"):
    with tracer.span("child_operation"):
        with tracer.span("grandchild_operation"):
            pass  # Three-level hierarchy created automatically
```

**Rich Span Attributes**
```python
with tracer.span("llm_call", attributes={
    "model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 1000
}) as span:
    # Attributes are searchable in storage
    result = call_llm()
```

**Exception Tracking**
```python
try:
    with tracer.span("risky_operation") as span:
        raise ValueError("Something went wrong")
except ValueError:
    pass  # the span records the error, then re-raises it
# Span automatically captures:
# - status: ERROR
# - error_message: "Something went wrong"
# - error_stacktrace: full traceback
```

**Performance Characteristics**
- ⚡ **Non-blocking:** Span capture takes < 1μs
- 🔄 **Batched writes:** Groups 100 spans per transaction
- 🔒 **Thread-safe:** Multiple threads can trace concurrently
- 💾 **Efficient storage:** SQLite with WAL mode for concurrent reads
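
The batching model described above amounts to a lock-free enqueue on the hot path, drained by a background worker. The following is a simplified conceptual sketch of that pattern, not Clearstone's actual implementation; the class and parameter names here are hypothetical:

```python
import queue
import threading

class BatchingSpanWriter:
    """Sketch: non-blocking capture with batched background writes."""

    def __init__(self, write_batch, batch_size=100):
        self._queue = queue.Queue()
        self._write_batch = write_batch  # e.g. one SQLite transaction per call
        self._batch_size = batch_size
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def capture(self, span):
        # Called on the hot path: just an enqueue, no I/O.
        self._queue.put(span)

    def _drain(self):
        batch = []
        while True:
            span = self._queue.get()
            if span is None:  # sentinel: flush remaining spans and exit
                if batch:
                    self._write_batch(batch)
                return
            batch.append(span)
            if len(batch) >= self._batch_size:
                self._write_batch(batch)
                batch = []

    def shutdown(self):
        self._queue.put(None)
        self._worker.join()
```

The key property is that `capture` never touches the database; all writes happen on the worker thread in batches, which is what keeps tracing off the agent's critical path.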

## AI-Native Testing & Backtesting

Clearstone provides a powerful testing framework designed specifically for AI agents. Unlike traditional unit tests that check outputs, this framework validates **how** agents behave.

#### Quick Start: Test Agent Behavior

```python
from clearstone.observability import TracerProvider
from clearstone.testing import PolicyTestHarness, assert_tool_was_called, assert_no_errors_in_trace

# Step 1: Run your agent with tracing enabled
provider = TracerProvider(db_path="agent_traces.db")
tracer = provider.get_tracer("research_agent")

with tracer.span("research_workflow"):
    with tracer.span("search", attributes={"tool.name": "web_search"}):
        pass  # Your agent's search logic here

provider.shutdown()

# Step 2: Create behavioral assertions
harness = PolicyTestHarness("agent_traces.db")
traces = harness.load_traces()

# Step 3: Validate behavior
tool_check = assert_tool_was_called("web_search", times=1)
error_check = assert_no_errors_in_trace()

results = [
    harness.simulate_policy(tool_check, traces),
    harness.simulate_policy(error_check, traces)
]

# Step 4: Check results
for result in results:
    summary = result.summary()
    if summary["runs_blocked"] > 0:
        print(f"❌ Test failed: {result.policy_name}")
        print(f"   Blocked traces: {result.blocked_trace_ids}")
    else:
        print(f"✅ Test passed: {result.policy_name}")
```

#### Available Behavioral Assertions

**Tool Usage Validation**
```python
from clearstone.testing import assert_tool_was_called

# Assert tool was called at least once
policy = assert_tool_was_called("calculator")

# Assert exact number of calls
policy = assert_tool_was_called("web_search", times=3)
```

**Cost Control Testing**
```python
from clearstone.testing import assert_llm_cost_is_less_than

# Ensure agent stays within budget
policy = assert_llm_cost_is_less_than(0.50)  # Max $0.50 per run
```

**Error Detection**
```python
from clearstone.testing import assert_no_errors_in_trace

# Validate error-free execution
policy = assert_no_errors_in_trace()
```

**Execution Flow Validation**
```python
from clearstone.testing import assert_span_order

# Ensure correct workflow sequence
policy = assert_span_order(["plan", "search", "synthesize"])
```

#### Historical Backtesting

Test policy changes against production data before deployment:

```python
from clearstone import ALLOW, BLOCK
from clearstone.testing import PolicyTestHarness

# Load 100 historical traces from production
harness = PolicyTestHarness("production_traces.db")
traces = harness.load_traces(limit=100)

# Test a new policy against historical data
def new_cost_policy(trace):
    total_cost = sum(s.attributes.get("cost", 0) for s in trace.spans)
    if total_cost > 2.0:
        return BLOCK("Cost exceeds new limit")
    return ALLOW

# See impact before deploying
result = harness.simulate_policy(new_cost_policy, traces)
summary = result.summary()

print(f"Would block: {summary['runs_blocked']} / {summary['traces_analyzed']} runs")
print(f"Block rate: {summary['block_rate_percent']}")
```

#### pytest Integration

Integrate behavioral tests seamlessly into your test suite:

```python
# tests/test_agent_behavior.py
import pytest
from clearstone.observability import TracerProvider
from clearstone.testing import PolicyTestHarness, assert_tool_was_called

def test_research_agent_uses_search_correctly(tmp_path):
    db_path = tmp_path / "test_traces.db"
    
    # Run agent
    provider = TracerProvider(db_path=str(db_path))
    tracer = provider.get_tracer("test_agent")
    run_research_agent(tracer)  # Your agent function
    provider.shutdown()
    
    # Test behavior
    harness = PolicyTestHarness(str(db_path))
    traces = harness.load_traces()
    
    policy = assert_tool_was_called("search", times=2)
    result = harness.simulate_policy(policy, traces)
    
    assert result.summary()["runs_blocked"] == 0, "Agent should use search exactly twice"
```

**Key Benefits:**
- 🎯 **Behavior-Focused:** Test what agents *do*, not just what they *return*
- 📊 **Data-Driven:** Validate against real execution traces
- 🔄 **Regression Prevention:** Catch behavioral changes before deployment
- 🧪 **CI/CD Ready:** Seamlessly integrates with pytest workflows
- 📈 **Impact Analysis:** Understand policy changes with detailed metrics

## Time-Travel Debugging

Debug AI agents by traveling back to any point in their execution history. Clearstone's time-travel debugging captures complete agent state snapshots and allows you to replay and debug from those exact moments.

#### Quick Start: Create and Load a Checkpoint

```python
from clearstone.debugging import CheckpointManager, ReplayEngine
from clearstone.observability import TracerProvider

provider = TracerProvider(db_path="traces.db")
tracer = provider.get_tracer("my_agent", version="1.0")

with tracer.span("agent_workflow") as root_span:
  trace_id = root_span.trace_id
  
  with tracer.span("step_1") as span1:
    pass
  
  with tracer.span("step_2") as span2:
    span_id = span2.span_id

provider.shutdown()

trace = provider.trace_store.get_trace(trace_id)

# `agent` is any checkpointable object (see "Agent Requirements" below)
manager = CheckpointManager()
checkpoint = manager.create_checkpoint(agent, trace, span_id=span_id)

engine = ReplayEngine(checkpoint)
engine.start_debugging_session("run_next_step", input_data)
```

#### Key Capabilities

**Checkpoint Creation**
```python
from clearstone.debugging import CheckpointManager

manager = CheckpointManager(checkpoint_dir=".checkpoints")

checkpoint = manager.create_checkpoint(
  agent=my_agent,
  trace=execution_trace,
  span_id="span_xyz"
)
```

**Agent Rehydration**
```python
from clearstone.debugging import ReplayEngine

checkpoint = manager.load_checkpoint("t1_ckpt_abc123.ckpt")

engine = ReplayEngine(checkpoint)
```

**Interactive Debugging Session**
```python
# Re-run a method on the rehydrated agent, dropping into pdb with full context.
# Positional and keyword arguments after the method name are passed through to it.
engine.start_debugging_session("process_input", input_data)
```

**Deterministic Replay**

The `DeterministicExecutionContext` automatically mocks non-deterministic functions to ensure reproducible debugging:
- Time functions (`time.time`)
- Random number generation (`random.random`)
- LLM responses (replayed from trace)
- Tool outputs (replayed from trace)
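
The idea behind this kind of mocking can be illustrated in a few lines. This is a conceptual sketch of replaying recorded values, not the actual `DeterministicExecutionContext`:

```python
import random
import time
from contextlib import contextmanager

@contextmanager
def deterministic_context(recorded_times, recorded_randoms):
    """Sketch: replay recorded values instead of live non-determinism."""
    times = iter(recorded_times)
    randoms = iter(recorded_randoms)
    real_time, real_random = time.time, random.random
    time.time = lambda: next(times)        # each call yields the next recorded value
    random.random = lambda: next(randoms)
    try:
        yield
    finally:
        # Always restore the real functions, even if the replayed code raises.
        time.time, random.random = real_time, real_random

with deterministic_context([1700000000.0, 1700000001.5], [0.42]):
    assert time.time() == 1700000000.0
    assert random.random() == 0.42
```

Because every "non-deterministic" call returns a value recorded from the original run, the replayed execution follows the same path every time.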

**Checkpoint Serialization**

Checkpoints use a hybrid serialization approach:
- Metadata: JSON (human-readable, version info, timestamps)
- Agent state: Pickle (high-fidelity, preserves complex objects)
- Trace context: Full upstream span hierarchy included

**Key Benefits:**
- 🕰️ **Time Travel:** Jump to any point in agent execution history
- 🔍 **Full Context:** Complete state + parent span hierarchy
- 🎯 **Deterministic:** Reproducible replay with mocked externals
- 🐛 **Interactive:** Drop into pdb with real agent state
- 💾 **Portable:** Save checkpoints to disk, share with team
- 🔄 **Rehydration:** Dynamically restore any agent class

#### Agent Requirements

For agents to be checkpointable, they must implement:

```python
class MyAgent:
  def get_state(self):
    """Return a dictionary of all state to preserve."""
    return {"memory": self.memory, "config": self.config}
  
  def load_state(self, state):
    """Restore agent from a state dictionary."""
    self.memory = state["memory"]
    self.config = state["config"]
```

For simple agents, Clearstone will automatically capture `__dict__` if these methods aren't provided.
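
That fallback amounts to simple duck typing, roughly like this (a sketch of the idea, not Clearstone's code):

```python
def capture_state(agent):
    """Use the agent's own get_state() if present, else fall back to __dict__."""
    if hasattr(agent, "get_state") and callable(agent.get_state):
        return agent.get_state()
    return dict(vars(agent))  # shallow snapshot of instance attributes

class SimpleAgent:
    def __init__(self):
        self.memory = ["hello"]

assert capture_state(SimpleAgent()) == {"memory": ["hello"]}
```

Implementing `get_state`/`load_state` explicitly is still preferable for anything non-trivial, since `__dict__` is only a shallow snapshot.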

## Command-Line Interface (CLI)

Accelerate development with the `clearstone` CLI. The `new-policy` command scaffolds a boilerplate file with best practices.

```bash
# See all available commands
clearstone --help

# Create a new policy file
clearstone new-policy enforce_data_locality --priority=80 --dir=my_app/compliance

# Output: Creates my_app/compliance/enforce_data_locality_policy.py
```
```python
# my_app/compliance/enforce_data_locality_policy.py
from clearstone import Policy, ALLOW, BLOCK, Decision, PolicyContext
# ... boilerplate ...

@Policy(name="enforce_data_locality", priority=80)
def enforce_data_locality_policy(context: PolicyContext) -> Decision:
    """
    [TODO: Describe what this policy does.]
    """
    # [TODO: Implement your policy logic here.]
    return ALLOW
```

## For Local LLM Users

Clearstone includes specialized policies for local-first AI workflows, addressing the unique challenges of running large language models on local hardware:

### System Load Protection

Prevents system freezes by monitoring CPU and memory usage before allowing intensive operations:

```python
from clearstone.policies.common import system_load_policy

# Automatically blocks operations when:
# - CPU usage > 90% (configurable)
# - Memory usage > 95% (configurable)

context = create_context(
    "user", "agent",
    cpu_threshold_percent=85.0,      # Custom threshold
    memory_threshold_percent=90.0
)
```
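
The thresholds above boil down to a simple comparison. The sketch below shows that logic with the readings passed in explicitly; in a real policy they would come from a system monitor such as psutil, and the function name here is hypothetical:

```python
def check_system_load(cpu_percent, memory_percent,
                      cpu_threshold=90.0, memory_threshold=95.0):
    """Sketch of threshold logic; readings would come from a system monitor."""
    if cpu_percent > cpu_threshold:
        return f"BLOCK: CPU at {cpu_percent:.0f}% exceeds {cpu_threshold:.0f}% threshold"
    if memory_percent > memory_threshold:
        return f"BLOCK: memory at {memory_percent:.0f}% exceeds {memory_threshold:.0f}% threshold"
    return "ALLOW"

assert check_system_load(50.0, 60.0) == "ALLOW"
assert check_system_load(92.0, 60.0).startswith("BLOCK")
```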

### Model Health Check

Provides instant feedback when your local model server is down, avoiding mysterious 60-second timeouts:

```python
from clearstone.policies.common import model_health_check_policy

# Quick health check (0.5s timeout) before LLM calls
# Supports Ollama, LM Studio, and custom endpoints

context = create_context(
    "user", "agent",
    local_model_health_url="http://localhost:11434/api/tags",  # Ollama default
    health_check_timeout=1.0
)
```
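
The "fail fast" idea behind the health check can be sketched with the standard library: probe the endpoint with a short timeout before committing to a long LLM call. This is an illustration of the pattern, not Clearstone's implementation:

```python
import urllib.error
import urllib.request

def model_server_is_healthy(url, timeout=0.5):
    """Sketch: probe the model server quickly instead of waiting out a long LLM timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        return False  # refused, unreachable, or timed out: server is down

# A dead endpoint is reported within `timeout`, not after a 60-second hang:
# model_server_is_healthy("http://localhost:11434/api/tags")
```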

**Why This Matters:**
- ❌ No more system freezes from resource exhaustion
- ❌ No more waiting 60 seconds for timeout errors  
- ✅ Immediate, actionable error messages
- ✅ Prevents retry loops that make problems worse

See `examples/16_local_llm_protection.py` for a complete demonstration.

---

## Anonymous Usage Telemetry

To help improve Clearstone, the SDK collects anonymous usage statistics by default. This telemetry is:

- **Anonymous:** Only component initialization events are tracked (e.g., "PolicyEngine initialized")
- **Non-Identifying:** No user data, policy logic, or trace content is ever collected
- **Transparent:** All telemetry code is open source and auditable
- **Opt-Out:** Easy to disable at any time

### What We Collect

- Component initialization events (PolicyEngine, TracerProvider, etc.)
- SDK version and Python version
- Anonymous session ID (generated per-process)
- Anonymous user ID (persistent, stored in `~/.clearstone/config.json`)

### What We DON'T Collect

- Your policy logic or decisions
- Trace data or agent outputs
- User identifiers or credentials
- Any personally identifiable information (PII)
- File paths or environment variables

### How to Opt Out

**Option 1: Environment Variable (Recommended)**
```bash
export CLEARSTONE_TELEMETRY_DISABLED=1
```

**Option 2: Config File**

Edit or create `~/.clearstone/config.json`:
```json
{
  "telemetry": {
    "disabled": true
  }
}
```

The SDK checks for opt-out on every process start and respects your choice immediately.

---

## Contributing

Contributions are welcome! Please see our [Contributing Guide](docs/about/contributing.md) for details on how to submit pull requests, set up a development environment, and run tests.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

---

## Community & Support

Join our community to ask questions, share your projects, and get help from the team and other users.

*   **Discord:** [Join the Clearstone Community](https://discord.gg/VZAX4vk8dT)
*   **Twitter:** Follow [@clearstonedev](https://twitter.com/clearstonedev) for the latest news and updates.
*   **GitHub Issues:** [Report a bug or suggest a feature](https://github.com/Sancauid/clearstone-sdk/issues).
*   **Email:** For other inquiries, you can reach out to [pablo@clearstone.dev](mailto:pablo@clearstone.dev).

            

Create a context for this specific run\n    context = create_context(\n        user_id=f\"user_{user_role}\",\n        agent_id=\"admin_agent_v1\",\n        metadata={\"role\": user_role}\n    )\n\n    try:\n        # 2. Run the agent within the context scope and with the handler\n        with context_scope(context):\n            # In a real app, this would be: agent.invoke(..., callbacks=[handler])\n            # We simulate the tool call for this example:\n            print(\"Agent is attempting to access 'admin_panel' tool...\")\n            handler.on_tool_start(serialized={\"name\": \"admin_panel\"}, input_str=\"\")\n        \n        print(\"\u2705 SUCCESS: Agent action was approved by all policies.\")\n\n    except PolicyViolationError as e:\n        # 3. Handle policy violations gracefully\n        print(f\"\u274c BLOCKED: The action was stopped by a policy.\")\n        print(f\"   Reason: {e.decision.reason}\")\n\n# --- Run Scenarios ---\nrun_agent_with_tool(\"admin\")\nrun_agent_with_tool(\"guest\")\n```\n\n#### 3. Run and See the Result\n```\n--- Running agent for user with role: 'admin' ---\nAgent is attempting to access 'admin_panel' tool...\n\u2705 SUCCESS: Agent action was approved by all policies.\n\n--- Running agent for user with role: 'guest' ---\nAgent is attempting to access 'admin_panel' tool...\n\u274c BLOCKED: The action was stopped by a policy.\n   Reason: Role 'guest' is not authorized to access 'admin_panel'.\n```\n\n## The Developer Toolkit\n\nClearstone is more than just an engine; it's a complete toolkit for policy governance.\n\n#### 1. 
Explicit Policy Configuration\nControl exactly which policies are active without relying on auto-discovery.\n```python\nfrom clearstone import PolicyEngine, Policy, ALLOW, BLOCK\n\n@Policy(name=\"strict_policy\", priority=100)\ndef strict_policy(context):\n    if context.metadata.get(\"strict_mode\"):\n        return BLOCK(\"Strict mode enabled\")\n    return ALLOW\n\n@Policy(name=\"lenient_policy\", priority=100)\ndef lenient_policy(context):\n    return ALLOW\n\n# Production: use only the strict policy\nprod_engine = PolicyEngine(policies=[strict_policy])\n\n# Development: use only the lenient policy\ndev_engine = PolicyEngine(policies=[lenient_policy])\n\n# Testing: isolate specific policies\ntest_engine = PolicyEngine(policies=[strict_policy])\n```\n\n#### 2. Composing Policies\nBuild complex logic from simple, reusable parts.\n```python\nfrom clearstone import compose_and\nfrom clearstone.policies.common import token_limit_policy, cost_limit_policy\n\n# This new policy only passes if BOTH underlying policies pass.\nsafe_and_cheap_policy = compose_and(token_limit_policy, cost_limit_policy)\n```\n\n#### 3. Validating Policies Before Deployment\nCatch bugs before they reach production. The validator checks for slowness, non-determinism, and fragility.\n```python\nfrom clearstone import PolicyValidator\n\nvalidator = PolicyValidator()\nfailures = validator.run_all_checks(my_buggy_policy)\n\nif failures:\n    print(\"Policy failed validation:\", failures)\nelse:\n    print(\"Policy is ready for production!\")\n```\n\n#### 4. Debugging Policy Decisions\nUnderstand *why* a policy made a specific decision with a line-by-line execution trace.\n```python\nfrom clearstone import PolicyDebugger\n\ndebugger = PolicyDebugger()\ndecision, trace = debugger.trace_evaluation(my_complex_policy, context)\n\n# Print a human-readable report\nprint(debugger.format_trace(my_complex_policy, decision, trace))\n```\n\n#### 5. 
Performance Monitoring\nTrack policy performance and identify bottlenecks with real-time metrics.\n```python\nfrom clearstone import PolicyMetrics\n\nmetrics = PolicyMetrics()\nengine = PolicyEngine(metrics=metrics)\n\n# ... run agent ...\n\n# Get performance summary\nsummary = metrics.summary()\nprint(f\"Policy 'token_limit' avg latency: {summary['token_limit']['avg_latency_ms']:.4f}ms\")\n\n# Find slowest policies\nslowest = metrics.get_slowest_policies(top_n=5)\nfor policy_name, stats in slowest:\n    print(f\"{policy_name}: {stats['avg_latency_ms']:.4f}ms\")\n\n# Find policies that block most often\ntop_blockers = metrics.get_top_blocking_policies(top_n=5)\n```\n\n#### 6. Human-in-the-Loop Interventions\nPause agent execution for manual approval on high-stakes operations like financial transactions or destructive actions.\n```python\nimport dataclasses\nfrom clearstone import (\n  Policy, PolicyEngine, create_context, context_scope,\n  ALLOW, PAUSE, InterventionClient\n)\nfrom clearstone.integrations.langchain import PolicyCallbackHandler, PolicyPauseError\n\n@Policy(name=\"require_approval_for_large_spend\", priority=100)\ndef approval_policy(context):\n  amount = context.metadata.get(\"amount\", 0)\n  is_approved = context.metadata.get(\"is_approved\", False)\n  \n  if amount > 1000 and not is_approved:\n    return PAUSE(f\"Transaction of ${amount} requires manual approval.\")\n  \n  return ALLOW\n\ndef run_transaction(engine, context):\n  handler = PolicyCallbackHandler(engine)\n  \n  try:\n    with context_scope(context):\n      handler.on_tool_start(serialized={\"name\": \"execute_payment\"}, input_str=\"\")\n    print(\"\u2705 Transaction successful\")\n    return True\n  \n  except PolicyPauseError as e:\n    print(f\"\u23f8\ufe0f Transaction paused: {e.decision.reason}\")\n    \n    intervention_client = InterventionClient()\n    intervention_client.request_intervention(e.decision)\n    intervention_id = e.decision.metadata.get(\"intervention_id\")\n    
\n    if intervention_client.wait_for_approval(intervention_id):\n      # User approved - retry with approval flag\n      approved_context = dataclasses.replace(\n        context, \n        metadata={**context.metadata, \"is_approved\": True}\n      )\n      return run_transaction(engine, approved_context)\n    else:\n      print(\"\u274c Transaction rejected by user\")\n      return False\n\nengine = PolicyEngine()\nctx = create_context(\"user-1\", \"finance-agent\", amount=2500)\nrun_transaction(engine, ctx)\n```\n\n#### 7. Auditing and Exporting\nThe `PolicyEngine` automatically captures a detailed audit trail. You can analyze it or export it for compliance.\n```python\nfrom clearstone import AuditTrail\n\naudit = AuditTrail()\nengine = PolicyEngine(audit_trail=audit)\n\n# ... run agent ...\n\n# Get a quick summary\nprint(audit.summary())\n# {'total_decisions': 50, 'blocks': 5, 'alerts': 12, 'block_rate': 0.1}\n\n# Export for external analysis\naudit.to_json(\"audit_log.json\")\naudit.to_csv(\"audit_log.csv\")\n```\n\n## Distributed Tracing & Observability\n\nClearstone provides production-grade distributed tracing to understand exactly what your AI agents are doing at runtime.\n\n#### Quick Start: Trace Your Agent\n```python\nfrom clearstone.observability import TracerProvider, SpanKind\n\n# Initialize once at application startup\nprovider = TracerProvider(db_path=\"traces.db\")\ntracer = provider.get_tracer(\"my_agent\", version=\"1.0\")\n\n# Trace operations with automatic hierarchy\nwith tracer.span(\"agent_execution\", kind=SpanKind.INTERNAL) as root_span:\n    # Nested spans automatically link to parents\n    with tracer.span(\"llm_call\", kind=SpanKind.CLIENT, attributes={\"model\": \"gpt-4\"}) as llm_span:\n        result = call_llm()\n    \n    with tracer.span(\"tool_execution\", attributes={\"tool\": \"calculator\"}):\n        output = run_tool()\n\n# Spans are automatically persisted to SQLite\n# Retrieve traces for analysis\ntrace = 
provider.trace_store.get_trace(root_span.trace_id)\n```\n\n#### Key Capabilities\n\n**Automatic Parent-Child Linking**\n```python\n# No manual span IDs needed - hierarchy is automatic\nwith tracer.span(\"parent_operation\"):\n    with tracer.span(\"child_operation\"):\n        with tracer.span(\"grandchild_operation\"):\n            pass  # Three-level hierarchy created automatically\n```\n\n**Rich Span Attributes**\n```python\nwith tracer.span(\"llm_call\", attributes={\n    \"model\": \"gpt-4\",\n    \"temperature\": 0.7,\n    \"max_tokens\": 1000\n}) as span:\n    # Attributes are searchable in storage\n    result = call_llm()\n```\n\n**Exception Tracking**\n```python\nwith tracer.span(\"risky_operation\") as span:\n    raise ValueError(\"Something went wrong\")\n# Span automatically captures:\n# - status: ERROR\n# - error_message: \"Something went wrong\"\n# - error_stacktrace: full traceback\n```\n\n**Performance Characteristics**\n- \u26a1 **Non-blocking:** Span capture takes < 1\u03bcs\n- \ud83d\udd04 **Batched writes:** Groups 100 spans per transaction\n- \ud83d\udd12 **Thread-safe:** Multiple threads can trace concurrently\n- \ud83d\udcbe **Efficient storage:** SQLite with WAL mode for concurrent reads\n\n## AI-Native Testing & Backtesting\n\nClearstone provides a powerful testing framework designed specifically for AI agents. 
Unlike traditional unit tests that check outputs, this framework validates **how** agents behave.\n\n#### Quick Start: Test Agent Behavior\n\n```python\nfrom clearstone.observability import TracerProvider\nfrom clearstone.testing import PolicyTestHarness, assert_tool_was_called, assert_no_errors_in_trace\n\n# Step 1: Run your agent with tracing enabled\nprovider = TracerProvider(db_path=\"agent_traces.db\")\ntracer = provider.get_tracer(\"research_agent\")\n\nwith tracer.span(\"research_workflow\"):\n    with tracer.span(\"search\", attributes={\"tool.name\": \"web_search\"}):\n        pass  # Your agent's search logic here\n\nprovider.shutdown()\n\n# Step 2: Create behavioral assertions\nharness = PolicyTestHarness(\"agent_traces.db\")\ntraces = harness.load_traces()\n\n# Step 3: Validate behavior\ntool_check = assert_tool_was_called(\"web_search\", times=1)\nerror_check = assert_no_errors_in_trace()\n\nresults = [\n    harness.simulate_policy(tool_check, traces),\n    harness.simulate_policy(error_check, traces)\n]\n\n# Step 4: Check results\nfor result in results:\n    summary = result.summary()\n    if summary[\"runs_blocked\"] > 0:\n        print(f\"\u274c Test failed: {result.policy_name}\")\n        print(f\"   Blocked traces: {result.blocked_trace_ids}\")\n    else:\n        print(f\"\u2705 Test passed: {result.policy_name}\")\n```\n\n#### Available Behavioral Assertions\n\n**Tool Usage Validation**\n```python\nfrom clearstone.testing import assert_tool_was_called\n\n# Assert tool was called at least once\npolicy = assert_tool_was_called(\"calculator\")\n\n# Assert exact number of calls\npolicy = assert_tool_was_called(\"web_search\", times=3)\n```\n\n**Cost Control Testing**\n```python\nfrom clearstone.testing import assert_llm_cost_is_less_than\n\n# Ensure agent stays within budget\npolicy = assert_llm_cost_is_less_than(0.50)  # Max $0.50 per run\n```\n\n**Error Detection**\n```python\nfrom clearstone.testing import assert_no_errors_in_trace\n\n# Validate 
error-free execution\npolicy = assert_no_errors_in_trace()\n```\n\n**Execution Flow Validation**\n```python\nfrom clearstone.testing import assert_span_order\n\n# Ensure correct workflow sequence\npolicy = assert_span_order([\"plan\", \"search\", \"synthesize\"])\n```\n\n#### Historical Backtesting\n\nTest policy changes against production data before deployment:\n\n```python\nfrom clearstone import ALLOW, BLOCK\nfrom clearstone.testing import PolicyTestHarness\n\n# Load 100 historical traces from production\nharness = PolicyTestHarness(\"production_traces.db\")\ntraces = harness.load_traces(limit=100)\n\n# Test a new policy against historical data\ndef new_cost_policy(trace):\n    total_cost = sum(s.attributes.get(\"cost\", 0) for s in trace.spans)\n    if total_cost > 2.0:\n        return BLOCK(\"Cost exceeds new limit\")\n    return ALLOW\n\n# See impact before deploying\nresult = harness.simulate_policy(new_cost_policy, traces)\nsummary = result.summary()\n\nprint(f\"Would block: {summary['runs_blocked']} / {summary['traces_analyzed']} runs\")\nprint(f\"Block rate: {summary['block_rate_percent']}\")\n```\n\n#### pytest Integration\n\nIntegrate behavioral tests seamlessly into your test suite:\n\n```python\n# tests/test_agent_behavior.py\nimport pytest\nfrom clearstone.observability import TracerProvider\nfrom clearstone.testing import PolicyTestHarness, assert_tool_was_called\n\ndef test_research_agent_uses_search_correctly(tmp_path):\n    db_path = tmp_path / \"test_traces.db\"\n    \n    # Run agent\n    provider = TracerProvider(db_path=str(db_path))\n    tracer = provider.get_tracer(\"test_agent\")\n    run_research_agent(tracer)  # Your agent function\n    provider.shutdown()\n    \n    # Test behavior\n    harness = PolicyTestHarness(str(db_path))\n    traces = harness.load_traces()\n    \n    policy = assert_tool_was_called(\"search\", times=2)\n    result = harness.simulate_policy(policy, traces)\n    \n    assert result.summary()[\"runs_blocked\"] == 0, \"Agent should use search exactly 
twice\"\n```\n\n**Key Benefits:**\n- \ud83c\udfaf **Behavior-Focused:** Test what agents *do*, not just what they *return*\n- \ud83d\udcca **Data-Driven:** Validate against real execution traces\n- \ud83d\udd04 **Regression Prevention:** Catch behavioral changes before deployment\n- \ud83e\uddea **CI/CD Ready:** Seamlessly integrates with pytest workflows\n- \ud83d\udcc8 **Impact Analysis:** Understand policy changes with detailed metrics\n\n## Time-Travel Debugging\n\nDebug AI agents by traveling back to any point in their execution history. Clearstone's time-travel debugging captures complete agent state snapshots and allows you to replay and debug from those exact moments.\n\n#### Quick Start: Create and Load a Checkpoint\n\n```python\nfrom clearstone.debugging import CheckpointManager, ReplayEngine\nfrom clearstone.observability import TracerProvider\n\nprovider = TracerProvider(db_path=\"traces.db\")\ntracer = provider.get_tracer(\"my_agent\", version=\"1.0\")\n\nwith tracer.span(\"agent_workflow\") as root_span:\n  trace_id = root_span.trace_id\n  \n  with tracer.span(\"step_1\") as span1:\n    pass\n  \n  with tracer.span(\"step_2\") as span2:\n    span_id = span2.span_id\n\nprovider.shutdown()\n\ntrace = provider.trace_store.get_trace(trace_id)\n\nmanager = CheckpointManager()\ncheckpoint = manager.create_checkpoint(agent, trace, span_id=span_id)\n\nengine = ReplayEngine(checkpoint)\nengine.start_debugging_session(\"run_next_step\", input_data)\n```\n\n#### Key Capabilities\n\n**Checkpoint Creation**\n```python\nfrom clearstone.debugging import CheckpointManager\n\nmanager = CheckpointManager(checkpoint_dir=\".checkpoints\")\n\ncheckpoint = manager.create_checkpoint(\n  agent=my_agent,\n  trace=execution_trace,\n  span_id=\"span_xyz\"\n)\n```\n\n**Agent Rehydration**\n```python\nfrom clearstone.debugging import ReplayEngine\n\ncheckpoint = manager.load_checkpoint(\"t1_ckpt_abc123.ckpt\")\n\nengine = ReplayEngine(checkpoint)\n```\n\n**Interactive Debugging 
Session**\n```python\nengine.start_debugging_session(\n  function_to_replay=\"process_input\",\n  *args,\n  **kwargs\n)\n```\n\n**Deterministic Replay**\n\nThe `DeterministicExecutionContext` automatically mocks non-deterministic functions to ensure reproducible debugging:\n- Time functions (`time.time`)\n- Random number generation (`random.random`)\n- LLM responses (replayed from trace)\n- Tool outputs (replayed from trace)\n\n**Checkpoint Serialization**\n\nCheckpoints use a hybrid serialization approach:\n- Metadata: JSON (human-readable, version info, timestamps)\n- Agent state: Pickle (high-fidelity, preserves complex objects)\n- Trace context: Full upstream span hierarchy included\n\n**Key Benefits:**\n- \ud83d\udd70\ufe0f **Time Travel:** Jump to any point in agent execution history\n- \ud83d\udd0d **Full Context:** Complete state + parent span hierarchy\n- \ud83c\udfaf **Deterministic:** Reproducible replay with mocked externals\n- \ud83d\udc1b **Interactive:** Drop into pdb with real agent state\n- \ud83d\udcbe **Portable:** Save checkpoints to disk, share with team\n- \ud83d\udd04 **Rehydration:** Dynamically restore any agent class\n\n#### Agent Requirements\n\nFor agents to be checkpointable, they must implement:\n\n```python\nclass MyAgent:\n  def get_state(self):\n    \"\"\"Return a dictionary of all state to preserve.\"\"\"\n    return {\"memory\": self.memory, \"config\": self.config}\n  \n  def load_state(self, state):\n    \"\"\"Restore agent from a state dictionary.\"\"\"\n    self.memory = state[\"memory\"]\n    self.config = state[\"config\"]\n```\n\nFor simple agents, Clearstone will automatically capture `__dict__` if these methods aren't provided.\n\n## Command-Line Interface (CLI)\n\nAccelerate development with the `clearstone` CLI. 
The `new-policy` command scaffolds a boilerplate file with best practices.\n\n```bash\n# See all available commands\nclearstone --help\n\n# Create a new policy file\nclearstone new-policy enforce_data_locality --priority=80 --dir=my_app/compliance\n\n# Output: Creates my_app/compliance/enforce_data_locality_policy.py\n```\n```python\n# my_app/compliance/enforce_data_locality_policy.py\nfrom clearstone import Policy, ALLOW, BLOCK, Decision\n# ... boilerplate ...\n\n@Policy(name=\"enforce_data_locality\", priority=80)\ndef enforce_data_locality_policy(context: PolicyContext) -> Decision:\n    \"\"\"\n    [TODO: Describe what this policy does.]\n    \"\"\"\n    # [TODO: Implement your policy logic here.]\n    return ALLOW\n```\n\n## For Local LLM Users\n\nClearstone includes specialized policies designed specifically for local-first AI workflows. These address the unique challenges of running large language models on local hardware:\n\n### System Load Protection\n\nPrevents system freezes by monitoring CPU and memory usage before allowing intensive operations:\n\n```python\nfrom clearstone.policies.common import system_load_policy\n\n# Automatically blocks operations when:\n# - CPU usage > 90% (configurable)\n# - Memory usage > 95% (configurable)\n\ncontext = create_context(\n    \"user\", \"agent\",\n    cpu_threshold_percent=85.0,      # Custom threshold\n    memory_threshold_percent=90.0\n)\n```\n\n### Model Health Check\n\nProvides instant feedback when your local model server is down, avoiding mysterious 60-second timeouts:\n\n```python\nfrom clearstone.policies.common import model_health_check_policy\n\n# Quick health check (0.5s timeout) before LLM calls\n# Supports Ollama, LM Studio, and custom endpoints\n\ncontext = create_context(\n    \"user\", \"agent\",\n    local_model_health_url=\"http://localhost:11434/api/tags\",  # Ollama default\n    health_check_timeout=1.0\n)\n```\n\n**Why This Matters:**\n- \u274c No more system freezes from resource 
exhaustion\n- \u274c No more waiting 60 seconds for timeout errors  \n- \u2705 Immediate, actionable error messages\n- \u2705 Prevents retry loops that make problems worse\n\nSee `examples/16_local_llm_protection.py` for a complete demonstration.\n\n---\n\n## Anonymous Usage Telemetry\n\nTo help improve Clearstone, the SDK collects anonymous usage statistics by default. This telemetry is:\n\n- **Anonymous:** Only component initialization events are tracked (e.g., \"PolicyEngine initialized\")\n- **Non-Identifying:** No user data, policy logic, or trace content is ever collected\n- **Transparent:** All telemetry code is open source and auditable\n- **Opt-Out:** Easy to disable at any time\n\n### What We Collect\n\n- Component initialization events (PolicyEngine, TracerProvider, etc.)\n- SDK version and Python version\n- Anonymous session ID (generated per-process)\n- Anonymous user ID (persistent, stored in `~/.clearstone/config.json`)\n\n### What We DON'T Collect\n\n- Your policy logic or decisions\n- Trace data or agent outputs\n- User identifiers or credentials\n- Any personally identifiable information (PII)\n- File paths or environment variables\n\n### How to Opt Out\n\n**Option 1: Environment Variable (Recommended)**\n```bash\nexport CLEARSTONE_TELEMETRY_DISABLED=1\n```\n\n**Option 2: Config File**\n\nEdit or create `~/.clearstone/config.json`:\n```json\n{\n  \"telemetry\": {\n    \"disabled\": true\n  }\n}\n```\n\nThe SDK checks for opt-out on every process start and respects your choice immediately.\n\n---\n\n## Contributing\n\nContributions are welcome! Please see our [Contributing Guide](docs/about/contributing.md) for details on how to submit pull requests, set up a development environment, and run tests.\n\n## License\n\nThis project is licensed under the MIT License. 
See the [LICENSE](LICENSE) file for details.\n\n---\n\n## Community & Support\n\nJoin our community to ask questions, share your projects, and get help from the team and other users.\n\n*   **Discord:** [Join the Clearstone Community](https://discord.gg/VZAX4vk8dT)\n*   **Twitter:** Follow [@clearstonedev](https://twitter.com/clearstonedev) for the latest news and updates.\n*   **GitHub Issues:** [Report a bug or suggest a feature](https://github.com/Sancauid/clearstone-sdk/issues).\n*   **Email:** For other inquiries, you can reach out to [pablo@clearstone.dev](mailto:pablo@clearstone.dev).\n",
    "bugtrack_url": null,
    "license": "MIT License\n        \n        Copyright (c) 2025 Clearstone SDK\n        \n        Permission is hereby granted, free of charge, to any person obtaining a copy\n        of this software and associated documentation files (the \"Software\"), to deal\n        in the Software without restriction, including without limitation the rights\n        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n        copies of the Software, and to permit persons to whom the Software is\n        furnished to do so, subject to the following conditions:\n        \n        The above copyright notice and this permission notice shall be included in all\n        copies or substantial portions of the Software.\n        \n        THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\n        SOFTWARE.\n        \n        ",
    "summary": "The open-source reliability toolkit for AI agents. Add production-grade governance, observability, and debugging to any agent workflow.",
    "version": "0.1.2",
    "project_urls": {
        "Bug Tracker": "https://github.com/Sancauid/clearstone-sdk/issues",
        "Changelog": "https://github.com/Sancauid/clearstone-sdk/blob/main/CHANGELOG.md",
        "Documentation": "https://sancauid.github.io/clearstone-sdk/",
        "Homepage": "https://github.com/Sancauid/clearstone-sdk"
    },
    "split_keywords": [
        "ai",
        " langchain",
        " agent",
        " governance",
        " observability",
        " testing",
        " debugging",
        " safety",
        " guardrails"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "29f507b5647d40c083c8bb3ccb5d1f5bb6049e42a920c0f194eae6ac17222bb4",
                "md5": "f3044be3b87ce1af0d65916f8b28d772",
                "sha256": "90fe8564513a391d63df5460b78cd88da458e582be4474789fbf3ab197c3072d"
            },
            "downloads": -1,
            "filename": "clearstone_sdk-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f3044be3b87ce1af0d65916f8b28d772",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 60445,
            "upload_time": "2025-10-28T18:36:42",
            "upload_time_iso_8601": "2025-10-28T18:36:42.362178Z",
            "url": "https://files.pythonhosted.org/packages/29/f5/07b5647d40c083c8bb3ccb5d1f5bb6049e42a920c0f194eae6ac17222bb4/clearstone_sdk-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d10594aefdcb3a2101b20abf799934e4b7f72b909494938d9449f9fbf7cc0c3c",
                "md5": "420471f677381d39343fc4809cf1e271",
                "sha256": "a8aafad62b649843b1edf0e68b15481dd01c3eaba425f30d635e9e84d7eb504f"
            },
            "downloads": -1,
            "filename": "clearstone_sdk-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "420471f677381d39343fc4809cf1e271",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 56871,
            "upload_time": "2025-10-28T18:36:43",
            "upload_time_iso_8601": "2025-10-28T18:36:43.326437Z",
            "url": "https://files.pythonhosted.org/packages/d1/05/94aefdcb3a2101b20abf799934e4b7f72b909494938d9449f9fbf7cc0c3c/clearstone_sdk-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-28 18:36:43",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Sancauid",
    "github_project": "clearstone-sdk",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "clearstone-sdk"
}
        