# Open Agent SDK
> Lightweight Python SDK for local/self-hosted LLMs via OpenAI-compatible endpoints
[PyPI](https://pypi.org/project/open-agent-sdk/)
[Python 3.10+](https://www.python.org/downloads/)
[License: MIT](https://opensource.org/licenses/MIT)
## Overview
Open Agent SDK provides a clean, streaming API for working with OpenAI-compatible local model servers, making it easy to build AI agents with your own hardware.
**Use Case**: Build powerful AI agents using local Qwen/Llama/Mistral models without cloud API costs or data privacy concerns.
**Solution**: A drop-in-style API that works with LM Studio, Ollama, llama.cpp, and any OpenAI-compatible endpoint, complete with streaming, tool-call aggregation, and a helper for returning tool results to the model.
## Supported Providers
### ✅ Supported (OpenAI-Compatible Endpoints)
- **LM Studio** - `http://localhost:1234/v1`
- **Ollama** - `http://localhost:11434/v1`
- **llama.cpp server** - OpenAI-compatible mode
- **vLLM** - OpenAI-compatible API
- **Text Generation WebUI** - OpenAI extension
- **Any OpenAI-compatible local endpoint**
- **Local gateways proxying cloud models** - e.g., Ollama or custom gateways that route to cloud providers
### ❌ Not Supported (Use Official SDKs)
- **Claude/OpenAI direct** - Use their official SDKs, unless you proxy through a local OpenAI-compatible gateway
- **Cloud provider SDKs** - Bedrock, Vertex, Azure, etc. (proxied via local gateway is fine)
## Quick Start
### Installation
```bash
pip install open-agent-sdk
```
For development:
```bash
git clone https://github.com/slb350/open-agent-sdk.git
cd open-agent-sdk
pip install -e .
```
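If you want the optional `tiktoken`-based token estimator in a development checkout, the same extras syntax should work (assuming the `[context]` extra from the Context Management section below is defined in `pyproject.toml`):
```bash
pip install -e ".[context]"
```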
### Simple Query (LM Studio)
```python
import asyncio
from open_agent import query, AgentOptions
async def main():
    options = AgentOptions(
        system_prompt="You are a professional copy editor",
        model="qwen2.5-32b-instruct",
        base_url="http://localhost:1234/v1",
        max_turns=1,
        temperature=0.1
    )

    result = query(prompt="Analyze this text...", options=options)

    response_text = ""
    async for msg in result:
        if hasattr(msg, 'content'):
            for block in msg.content:
                if hasattr(block, 'text'):
                    response_text += block.text

    print(response_text)

asyncio.run(main())
```
### Multi-Turn Conversation (Ollama)
```python
import asyncio

from open_agent import Client, AgentOptions, TextBlock, ToolUseBlock
from open_agent.config import get_base_url

def run_my_tool(name: str, params: dict) -> dict:
    # Replace with your tool execution logic
    return {"result": f"stub output for {name}"}

async def main():
    options = AgentOptions(
        system_prompt="You are a helpful assistant",
        model="kimi-k2:1t-cloud",  # Use your available Ollama model
        base_url=get_base_url(provider="ollama"),
        max_turns=10
    )

    async with Client(options) as client:
        await client.query("What's the capital of France?")

        async for msg in client.receive_messages():
            if isinstance(msg, TextBlock):
                print(f"Assistant: {msg.text}")
            elif isinstance(msg, ToolUseBlock):
                print(f"Tool used: {msg.name}")
                tool_result = run_my_tool(msg.name, msg.input)
                await client.add_tool_result(msg.id, tool_result)

asyncio.run(main())
```
See `examples/tool_use_agent.py` for progressively richer patterns (manual loop, helper function, and reusable agent class) demonstrating `add_tool_result()` in context.
### Function Calling with Tools
Define tools using the `@tool` decorator for clean, type-safe function calling:
```python
from open_agent import tool, Client, AgentOptions, TextBlock, ToolUseBlock
# Define tools
@tool("get_weather", "Get current weather", {"location": str, "units": str})
async def get_weather(args):
    return {
        "temperature": 72,
        "conditions": "sunny",
        "units": args["units"]
    }

@tool("calculate", "Perform calculation", {"a": float, "b": float, "op": str})
async def calculate(args):
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b}
    result = ops[args["op"]](args["a"], args["b"])
    return {"result": result}

# Enable automatic tool execution (recommended)
options = AgentOptions(
    system_prompt="You are a helpful assistant with access to tools.",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[get_weather, calculate],
    auto_execute_tools=True,  # 🔥 Tools execute automatically
    max_tool_iterations=10    # Safety limit for tool loops
)

async with Client(options) as client:
    await client.query("What's 25 + 17?")

    # Simply iterate - tools execute automatically!
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            print(f"Tool called: {block.name}")
        elif isinstance(block, TextBlock):
            print(f"Response: {block.text}")
```
**Advanced: Manual Tool Execution**
For custom execution logic or result interception:
```python
# Disable auto-execution
options = AgentOptions(
    system_prompt="You are a helpful assistant with access to tools.",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[get_weather, calculate],
    auto_execute_tools=False  # Manual mode
)

async with Client(options) as client:
    await client.query("What's 25 + 17?")

    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            # You execute the tool manually
            tool = {"calculate": calculate, "get_weather": get_weather}[block.name]
            result = await tool.execute(block.input)

            # Return result to agent
            await client.add_tool_result(block.id, result)

    # Continue conversation
    await client.query("")
```
**Key Features:**
- **Automatic execution** (v0.3.0+) - Tools run automatically with safety limits
- **Type-safe schemas** - Simple Python types (`str`, `int`, `float`, `bool`) or full JSON Schema
- **OpenAI-compatible** - Works with any OpenAI function calling endpoint
- **Clean decorator API** - Similar to Claude SDK's `@tool`
- **Hook integration** - PreToolUse/PostToolUse hooks work in both modes
See `examples/calculator_tools.py` and `examples/simple_tool.py` for complete examples.
## Context Management
Local models have fixed context windows (typically 8k-32k tokens). The SDK provides **opt-in utilities** for manual history management: no silent mutations, you stay in control.
### Token Estimation & Truncation
```python
from open_agent import Client, AgentOptions
from open_agent.context import estimate_tokens, truncate_messages
async with Client(options) as client:
    # Long conversation...
    for i in range(50):
        await client.query(f"Question {i}")
        async for msg in client.receive_messages():
            pass

    # Check token usage
    tokens = estimate_tokens(client.history)
    print(f"Context size: ~{tokens} tokens")

    # Manually truncate when needed
    if tokens > 28000:
        client.message_history = truncate_messages(client.history, keep=10)
```
### Recommended Patterns
**1. Stateless Agents** (Best for single-task agents):
```python
# Process each task independently - no history accumulation
for task in tasks:
    async with Client(options) as client:
        await client.query(task)
        # Client disposed, fresh context for next task
```
**2. Manual Truncation** (At natural breakpoints):
```python
from open_agent.context import truncate_messages
async with Client(options) as client:
    for task in tasks:
        await client.query(task)
        # Truncate after each major task
        client.message_history = truncate_messages(client.history, keep=5)
```
**3. External Memory** (RAG-lite for research agents):
```python
# Store important facts in database, keep conversation context small
database = {}
async with Client(options) as client:
    await client.query("Research topic X")
    # Save response to database (collect `response` from receive_messages())
    database["topic_x"] = extract_facts(response)

    # Clear history, query database when needed
    client.message_history = truncate_messages(client.history, keep=0)
```
### Why Manual?
The SDK **intentionally** does not auto-compact history because:
- **Domain-specific needs**: Copy editors need different strategies than research agents
- **Token accuracy varies**: Each model family has different tokenizers
- **Risk of breaking context**: Silently removing messages could break tool chains
- **Natural limits exist**: Compaction doesn't bypass model context windows
### Installing Token Estimation
For better token estimation accuracy (optional):
```bash
pip install open-agent-sdk[context] # Adds tiktoken
```
Without `tiktoken`, estimation falls back to a character-based approximation (~75-85% accurate).
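The fallback is in the spirit of the common ~4-characters-per-token rule of thumb. A minimal sketch of the idea (not the SDK's exact heuristic):
```python
def rough_token_estimate(messages: list[dict]) -> int:
    # ~4 characters per token is a rough rule of thumb for English text;
    # the SDK's actual character-based fallback may weight content differently.
    chars = sum(len(str(m.get("content") or "")) for m in messages)
    return chars // 4
```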
See `examples/context_management.py` for complete patterns and usage.
## Lifecycle Hooks
Monitor and control agent behavior at key execution points with Pythonic lifecycle hooks: no subprocess overhead or JSON protocols.
### Quick Example
```python
from open_agent import (
    AgentOptions, Client, ToolUseBlock,
    PreToolUseEvent, PostToolUseEvent, UserPromptSubmitEvent,
    HookDecision,
    HOOK_PRE_TOOL_USE, HOOK_POST_TOOL_USE, HOOK_USER_PROMPT_SUBMIT
)

# Security gate - block dangerous operations
async def security_gate(event: PreToolUseEvent) -> HookDecision | None:
    if event.tool_name == "delete_file":
        return HookDecision(
            continue_=False,
            reason="Delete operations require approval"
        )
    return None  # Allow by default

# Audit logger - track all tool executions
async def audit_logger(event: PostToolUseEvent) -> None:
    print(f"Tool executed: {event.tool_name} -> {event.tool_result}")
    return None

# Input sanitizer - validate user prompts
async def sanitize_input(event: UserPromptSubmitEvent) -> HookDecision | None:
    if "DELETE" in event.prompt.upper():
        return HookDecision(
            continue_=False,
            reason="Dangerous keywords detected"
        )
    return None

# Register hooks in AgentOptions
options = AgentOptions(
    system_prompt="You are a helpful assistant",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[my_file_tool, my_search_tool],
    hooks={
        HOOK_PRE_TOOL_USE: [security_gate],
        HOOK_POST_TOOL_USE: [audit_logger],
        HOOK_USER_PROMPT_SUBMIT: [sanitize_input],
    }
)

async with Client(options) as client:
    await client.query("Write to /etc/config")  # UserPromptSubmit fires
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):  # PreToolUse fires
            tool = lookup_tool(block.name)   # hypothetical helper: map name -> Tool
            result = await tool.execute(block.input)
            await client.add_tool_result(block.id, result)  # PostToolUse fires
```
### Hook Types
**PreToolUse** - Fires before tool execution (or yielding to user)
- **Block operations**: Return `HookDecision(continue_=False, reason="...")`
- **Modify inputs**: Return `HookDecision(modified_input={...}, reason="...")`
- **Allow**: Return `None`
**PostToolUse** - Fires after tool result added to history
- **Observational only** (tool already executed)
- Use for audit logging, metrics, result validation
- Return `None` (decision ignored for PostToolUse)
**UserPromptSubmit** - Fires before sending prompt to API
- **Block prompts**: Return `HookDecision(continue_=False, reason="...")`
- **Modify prompts**: Return `HookDecision(modified_prompt="...", reason="...")`
- **Allow**: Return `None`
### Common Patterns
**Pattern 1: Redirect to Sandbox**
```python
async def redirect_to_sandbox(event: PreToolUseEvent) -> HookDecision | None:
    """Redirect file operations to safe directory."""
    if event.tool_name == "file_writer":
        path = event.tool_input.get("path", "")
        if not path.startswith("/tmp/"):
            safe_path = f"/tmp/sandbox/{path.lstrip('/')}"
            return HookDecision(
                modified_input={"path": safe_path, "content": event.tool_input.get("content", "")},
                reason="Redirected to sandbox"
            )
    return None
```
**Pattern 2: Compliance Audit Log**
```python
from datetime import datetime

audit_log = []

async def compliance_logger(event: PostToolUseEvent) -> None:
    """Log all tool executions for compliance."""
    audit_log.append({
        "timestamp": datetime.now(),
        "tool": event.tool_name,
        "input": event.tool_input,
        "result": str(event.tool_result)[:100],
        "user": get_current_user()  # your app's identity helper
    })
    return None
```
**Pattern 3: Safety Instructions**
```python
async def add_safety_warning(event: UserPromptSubmitEvent) -> HookDecision | None:
    """Add safety instructions to risky prompts."""
    if "write" in event.prompt.lower() or "delete" in event.prompt.lower():
        safe_prompt = event.prompt + " (Please confirm this is safe before proceeding)"
        return HookDecision(
            modified_prompt=safe_prompt,
            reason="Added safety warning"
        )
    return None
```
### Hook Execution Flow
- Hooks run **sequentially** in the order registered
- **First non-None decision wins** (short-circuit behavior)
- Hooks run **inline on event loop** (spawn tasks for heavy work)
- Works with both **Client** and **query()** function
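To make the dispatch rules concrete, here is a minimal sketch of how a conforming dispatcher could work (an illustration of the documented behavior, not the SDK's actual internals):
```python
from typing import Any, Awaitable, Callable, Optional

HookFn = Callable[[Any], Awaitable[Optional[Any]]]

async def dispatch(hooks: list[HookFn], event: Any) -> Optional[Any]:
    for hook in hooks:                # sequential, in registration order
        decision = await hook(event)  # runs inline on the event loop
        if decision is not None:
            return decision           # first non-None decision wins
    return None                       # all hooks returned None -> allow
```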
### Breaking Change (v0.2.4)
`Client.add_tool_result()` is now async to support PostToolUse hooks:
```python
# Old (v0.2.3 and earlier)
client.add_tool_result(tool_id, result)
# New (v0.2.4+)
await client.add_tool_result(tool_id, result)
```
### Why Hooks?
- **Security gates**: Block dangerous operations before they execute
- **Audit logging**: Track all tool executions for compliance
- **Input validation**: Sanitize user prompts before processing
- **Monitoring**: Observe agent behavior in production
- **Control flow**: Modify tool inputs or redirect operations
See `examples/hooks_example.py` for 4 comprehensive patterns (security, audit, sanitization, combined).
## Interrupt Capability
Cancel long-running operations cleanly without corrupting client state. Perfect for timeouts, user cancellations, or conditional interruptions.
### Quick Example
```python
from open_agent import Client, AgentOptions
import asyncio

async def main():
    options = AgentOptions(
        system_prompt="You are a helpful assistant.",
        model="qwen2.5-32b-instruct",
        base_url="http://localhost:1234/v1"
    )

    async with Client(options) as client:
        await client.query("Write a detailed 1000-word essay...")

        # Timeout after 5 seconds
        try:
            async def collect_messages():
                async for block in client.receive_messages():
                    print(block.text, end="", flush=True)

            await asyncio.wait_for(collect_messages(), timeout=5.0)
        except asyncio.TimeoutError:
            await client.interrupt()  # Clean cancellation
            print("\n⚠️ Operation timed out!")

        # Client is still usable after interrupt
        await client.query("Short question?")
        async for block in client.receive_messages():
            print(block.text)

asyncio.run(main())
```
### Common Patterns
**1. Timeout-Based Interruption**
```python
try:
    await asyncio.wait_for(process_messages(client), timeout=10.0)
except asyncio.TimeoutError:
    await client.interrupt()
    print("Operation timed out")
```
**2. Conditional Interruption**
```python
# Stop if response contains specific content
full_text = ""
async for block in client.receive_messages():
    full_text += block.text
    if "error" in full_text.lower():
        await client.interrupt()
        break
```
**3. User Cancellation (from separate task)**
```python
async def stream_task():
    await client.query("Long task...")
    async for block in client.receive_messages():
        print(block.text, end="")

async def cancel_button_task():
    await asyncio.sleep(2.0)  # User waits 2 seconds
    await client.interrupt()  # User clicks cancel

# Run both concurrently
await asyncio.gather(stream_task(), cancel_button_task())
```
**4. Interrupt During Auto-Execution**
```python
options = AgentOptions(
    tools=[slow_tool, fast_tool],
    auto_execute_tools=True,
    max_tool_iterations=10
)

async with Client(options) as client:
    await client.query("Use tools...")

    tool_count = 0
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            tool_count += 1
            if tool_count >= 2:
                await client.interrupt()  # Stop after 2 tools
                break
```
### How It Works
When you call `client.interrupt()`:
1. **Active stream closure** - HTTP stream closed immediately (not just a flag)
2. **Clean state** - Client remains in valid state for reuse
3. **Partial output** - Text blocks flushed to history, incomplete tools skipped
4. **Idempotent** - Safe to call multiple times
5. **Concurrent-safe** - Can be called from separate asyncio tasks
### Example
See `examples/interrupt_demo.py` for 5 comprehensive patterns:
- Timeout-based interruption
- Conditional interruption
- Auto-execution interruption
- Concurrent interruption (simulated cancel button)
- Interrupt and retry
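The interrupt-and-retry idea, in miniature (a sketch built on the documented guarantee that the client stays usable after `interrupt()`; see the example file for the full version):
```python
import asyncio

async def ask_with_retry(client, prompt: str, timeout: float = 10.0, retries: int = 1) -> bool:
    for attempt in range(retries + 1):
        await client.query(prompt)
        try:
            async def drain():
                async for block in client.receive_messages():
                    print(block.text, end="", flush=True)
            await asyncio.wait_for(drain(), timeout=timeout)
            return True  # completed within the time budget
        except asyncio.TimeoutError:
            await client.interrupt()  # client remains valid for the next attempt
    return False  # every attempt timed out
```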
## 🚀 Practical Examples
We've included two production-ready agents that demonstrate real-world usage:
### 📝 Git Commit Agent
**[examples/git_commit_agent.py](examples/git_commit_agent.py)**
Analyzes your staged git changes and writes professional commit messages following conventional commit format.
```bash
# Stage your changes
git add .
# Run the agent
python examples/git_commit_agent.py
# Output:
# ✓ Found staged changes in 3 file(s)
# 🤖 Analyzing changes and generating commit message...
#
# 📝 Suggested commit message:
# feat(auth): Add OAuth2 integration with refresh tokens
#
# - Implement token refresh mechanism
# - Add secure cookie storage for tokens
# - Update login flow to support OAuth2 providers
# - Add tests for token expiration handling
```
**Features:**
- Analyzes the diff to determine commit type (feat/fix/docs/etc.)
- Writes clear, descriptive commit messages
- Interactive mode: accept, edit, or regenerate
- Follows conventional commit standards
### 📊 Log Analyzer Agent
**[examples/log_analyzer_agent.py](examples/log_analyzer_agent.py)**
Intelligently analyzes application logs to identify patterns, errors, and provide actionable insights.
```bash
# Analyze a log file
python examples/log_analyzer_agent.py /var/log/app.log
# Analyze with a specific time window
python examples/log_analyzer_agent.py app.log --since "2025-10-15T00:00:00" --until "2025-10-15T12:00:00"
# Interactive mode for drilling down
python examples/log_analyzer_agent.py app.log --interactive
```
**Features:**
- Automatic error pattern detection
- Time-based analysis (peak error times)
- Root cause suggestions
- Interactive mode for investigating specific issues
- Supports multiple log formats (JSON, Apache, syslog, etc.)
- Time range filtering with `--since` / `--until`
**Sample Output:**
```
📊 Log Summary:
  Total entries: 45,231
  Errors: 127 (0.3%)
  Warnings: 892

🔴 Top Error Patterns:
  - Connection Error: 67 occurrences
  - NullPointerException: 23 occurrences
  - Timeout Error: 19 occurrences

⏰ Peak error time: 2025-10-15T14:00:00
   Errors in that hour: 43

🤖 ANALYSIS REPORT:
Main Issues (Priority Order):
1. Database connection pool exhaustion during peak hours
2. Unhandled null values in user authentication flow
3. External API timeouts affecting payment processing
Recommendations:
1. Increase connection pool size from 10 to 25
2. Add null checks in AuthService.validateUser() method
3. Implement circuit breaker for payment API with 30s timeout
```
### Why These Examples?
These agents demonstrate:
- **Practical Value**: Solve real problems developers face daily
- **Tool Integration**: Show how to integrate with system commands (git, file I/O)
- **Multi-turn Conversations**: Interactive modes for complex analysis
- **Structured Output**: Parse and format LLM responses for actionable results
- **Privacy-First**: Keep your code and logs local while getting AI assistance
## Configuration
Open Agent SDK uses config helpers to provide flexible configuration via environment variables, provider shortcuts, or explicit parameters:
### Environment Variables (Recommended)
```bash
export OPEN_AGENT_BASE_URL="http://localhost:1234/v1"
export OPEN_AGENT_MODEL="qwen/qwen3-30b-a3b-2507"
```
```python
from open_agent import AgentOptions
from open_agent.config import get_model, get_base_url
# Config helpers read from environment
options = AgentOptions(
    system_prompt="...",
    model=get_model(),       # Reads OPEN_AGENT_MODEL
    base_url=get_base_url()  # Reads OPEN_AGENT_BASE_URL
)
```
### Provider Shortcuts
```python
from open_agent.config import get_base_url
# Use built-in defaults for common providers
options = AgentOptions(
    system_prompt="...",
    model="llama3.1:70b",
    base_url=get_base_url(provider="ollama")  # → http://localhost:11434/v1
)
```
**Available providers**: `lmstudio`, `ollama`, `llamacpp`, `vllm`
### Fallback Values
```python
# Provide fallbacks when env vars not set
options = AgentOptions(
    system_prompt="...",
    model=get_model("qwen2.5-32b-instruct"),    # Fallback model
    base_url=get_base_url(provider="lmstudio")  # Fallback URL
)
```
**Configuration Priority:**
1. Environment variable (default behaviour)
2. Fallback value passed to the config helper
3. Provider default (for `base_url` only)
Need to force a specific model even when `OPEN_AGENT_MODEL` is set? Call `get_model("model-name", prefer_env=False)` to ignore the environment variable for that lookup.
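For example:
```python
from open_agent.config import get_model

# Returns "qwen2.5-32b-instruct" even when OPEN_AGENT_MODEL is set,
# because prefer_env=False skips the environment lookup.
model = get_model("qwen2.5-32b-instruct", prefer_env=False)
```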
**Benefits:**
- Switch between dev/prod by changing environment variables
- No hardcoded URLs or model names
- Per-agent overrides when needed
See [docs/configuration.md](docs/configuration.md) for complete guide.
## Why Not Just Use OpenAI Client?
**Without open-agent-sdk** (raw OpenAI client):
```python
from openai import AsyncOpenAI
client = AsyncOpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")
response = await client.chat.completions.create(
    model="qwen2.5-32b-instruct",
    messages=[{"role": "system", "content": system_prompt},
              {"role": "user", "content": user_prompt}],
    stream=True
)

async for chunk in response:
    # Complex parsing of chunks
    # Extract delta content
    # Handle tool calls manually
    # Track conversation state yourself
    ...
```
**With open-agent-sdk**:
```python
from open_agent import query, AgentOptions
options = AgentOptions(
    system_prompt=system_prompt,
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1"
)

result = query(prompt=user_prompt, options=options)
async for msg in result:
    # Clean message types (TextBlock, ToolUseBlock)
    # Automatic streaming and tool call handling
    ...
```
**Value**: Familiar patterns + Less boilerplate + Easy migration
## API Reference
### AgentOptions
```python
class AgentOptions:
    system_prompt: str                          # System prompt
    model: str                                  # Model name (required)
    base_url: str                               # OpenAI-compatible endpoint URL (required)
    tools: list[Tool] = []                      # Tool instances for function calling
    hooks: dict[str, list[HookHandler]] = None  # Lifecycle hooks for monitoring/control
    auto_execute_tools: bool = False            # Enable automatic tool execution (v0.3.0+)
    max_tool_iterations: int = 5                # Max tool calls per query in auto mode
    max_turns: int = 1                          # Max conversation turns
    max_tokens: int | None = 4096               # Tokens to generate (None uses provider default)
    temperature: float = 0.7                    # Sampling temperature
    timeout: float = 60.0                       # Request timeout in seconds
    api_key: str = "not-needed"                 # Most local servers don't need this
```
**Note**: Use config helpers (`get_model()`, `get_base_url()`) for environment variable and provider support.
### query()
Simple single-turn query function.
```python
async def query(prompt: str, options: AgentOptions) -> AsyncGenerator
```
Returns an async generator yielding messages.
### Client
Multi-turn conversation client with tool monitoring.
```python
async with Client(options) as client:    # options: AgentOptions
    await client.query(prompt)           # prompt: str
    async for msg in client.receive_messages():
        ...  # Process messages
```
### Message Types
- `TextBlock` - Text content from model
- `ToolUseBlock` - Tool calls from model (has `id`, `name`, `input` fields)
- `ToolResultBlock` - Tool execution results to send back to model
- `ToolUseError` - Tool call parsing error (malformed JSON, missing fields)
- `AssistantMessage` - Full message wrapper
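A typical receive loop branches on these block types (a sketch, assuming `ToolUseError` is importable from the package root like the other block types):
```python
from open_agent import TextBlock, ToolUseBlock, ToolUseError

async def drain(client):
    async for msg in client.receive_messages():
        if isinstance(msg, TextBlock):
            print(msg.text, end="")
        elif isinstance(msg, ToolUseBlock):
            print(f"\n[tool call] {msg.name}({msg.input}) id={msg.id}")
        elif isinstance(msg, ToolUseError):
            # The shape of the error object is an assumption; it reports a
            # malformed tool call (bad JSON, missing fields)
            print(f"\n[tool call error] {msg}")
```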
### Tool System
```python
@tool(name: str, description: str, input_schema: dict)
async def my_tool(args: dict) -> Any:
    """Tool handler function"""
    return result

# Tool class
class Tool:
    name: str
    description: str
    input_schema: dict[str, type] | dict[str, Any]
    handler: Callable[[dict], Awaitable[Any]]

    async def execute(arguments: dict) -> Any
    def to_openai_format() -> dict
```
**Schema formats:**
- Simple: `{"param": str, "count": int}` - All parameters required
- JSON Schema: Full schema with `type`, `properties`, `required`, etc.
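For example, the same tool declared both ways; the JSON Schema form lets you mark a parameter optional (a sketch, assuming the decorator passes full schemas through unchanged):
```python
from open_agent import tool

# Simple form: both parameters required
@tool("search", "Search documents", {"query": str, "limit": int})
async def search_simple(args):
    return {"hits": []}

# JSON Schema form: "limit" is optional, defaulted in the handler
@tool("search_v2", "Search documents", {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer", "minimum": 1},
    },
    "required": ["query"],
})
async def search_full(args):
    return {"hits": [], "limit": args.get("limit", 10)}
```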
### Hooks System
```python
# Event types
@dataclass
class PreToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class PostToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_result: Any
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class UserPromptSubmitEvent:
    prompt: str
    history: list[dict[str, Any]]

# Hook decision
@dataclass
class HookDecision:
    continue_: bool = True
    modified_input: dict[str, Any] | None = None
    modified_prompt: str | None = None
    reason: str | None = None

# Hook handler signature
HookHandler = Callable[[HookEvent], Awaitable[HookDecision | None]]

# Hook constants
HOOK_PRE_TOOL_USE = "pre_tool_use"
HOOK_POST_TOOL_USE = "post_tool_use"
HOOK_USER_PROMPT_SUBMIT = "user_prompt_submit"
```
**Hook behavior:**
- Return `None` to allow by default
- Return `HookDecision(continue_=False)` to block
- Return `HookDecision(modified_input={...})` to modify (PreToolUse)
- Return `HookDecision(modified_prompt="...")` to modify (UserPromptSubmit)
- Raise exception to abort entirely
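The last rule is the escape hatch: raising from a hook aborts the operation outright rather than blocking it with a reason. A sketch (the tool name is hypothetical):
```python
async def hard_gate(event) -> None:
    # Unlike HookDecision(continue_=False), an exception aborts entirely
    if event.tool_name == "wipe_database":  # hypothetical tool name
        raise RuntimeError("wipe_database must never run")
    return None
```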
## Recommended Models
**Local models** (LM Studio, Ollama, llama.cpp):
- **GPT-OSS-120B** - Best in class for speed and quality
- **Qwen 3 30B** - Excellent instruction following, good for most tasks
- **GPT-OSS-20B** - Solid all-around performance
- **Mistral 7B** - Fast and efficient for simple agents
**Cloud-proxied via local gateway** (Ollama cloud provider, custom gateway):
- **kimi-k2:1t-cloud** - Tested and working via Ollama gateway
- **deepseek-v3.1:671b-cloud** - High-quality reasoning model
- **qwen3-coder:480b-cloud** - Code-focused models
- Your `base_url` still points to localhost gateway (e.g., `http://localhost:11434/v1`)
- Gateway handles authentication and routing to cloud provider
- Useful when you need larger models than your hardware can run locally
**Architecture guidance:**
- Prefer MoE (Mixture of Experts) models over dense when available - significantly faster
- Start with 7B-30B models for most agent tasks - they're fast and capable
- Test models with your specific use case - the LLM landscape changes rapidly
## Project Structure
```
open-agent-sdk/
├── open_agent/
│   ├── __init__.py            # query, Client, AgentOptions exports
│   ├── client.py              # Streaming query(), Client, tool helper
│   ├── config.py              # Env/provider helpers
│   ├── context.py             # Token estimation and truncation utilities
│   ├── hooks.py               # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│   ├── tools.py               # Tool decorator and schema conversion
│   ├── types.py               # Dataclasses for options and blocks
│   └── utils.py               # OpenAI client + ToolCallAggregator
├── docs/
│   ├── configuration.md
│   ├── provider-compatibility.md
│   └── technical-design.md
├── examples/
│   ├── git_commit_agent.py    # 🌟 Practical: Git commit message generator
│   ├── log_analyzer_agent.py  # 🌟 Practical: Log file analyzer
│   ├── calculator_tools.py    # Function calling with @tool decorator
│   ├── simple_tool.py         # Minimal tool usage example
│   ├── tool_use_agent.py      # Complete tool use patterns
│   ├── context_management.py  # Manual history management patterns
│   ├── hooks_example.py       # Lifecycle hooks patterns (security, audit, sanitization)
│   ├── interrupt_demo.py      # Interrupt capability patterns (timeout, conditional, concurrent)
│   ├── simple_lmstudio.py     # Basic usage with LM Studio
│   ├── ollama_chat.py         # Multi-turn chat example
│   ├── config_examples.py     # Configuration patterns
│   └── simple_with_env.py     # Environment variable config
├── tests/
│   ├── integration/           # Integration-style tests using fakes
│   │   └── test_client_behaviour.py  # Streaming, multi-turn, tool flow coverage
│   ├── test_agent_options.py
│   ├── test_auto_execution.py # Automatic tool execution
│   ├── test_client.py
│   ├── test_config.py
│   ├── test_context.py        # Context utilities (token estimation, truncation)
│   ├── test_hooks.py          # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│   ├── test_interrupt.py      # Interrupt capability (timeout, concurrent, reuse)
│   ├── test_query.py
│   ├── test_tools.py          # Tool decorator and schema conversion
│   └── test_utils.py
├── CHANGELOG.md
├── pyproject.toml
└── README.md
```
## Examples
### 🌟 Practical Agents (Production-Ready)
- **`git_commit_agent.py`** – Analyzes git diffs and writes professional commit messages
- **`log_analyzer_agent.py`** – Parses logs, finds patterns, suggests fixes with interactive mode
- **`tool_use_agent.py`** – Complete tool use patterns: manual, helper, and agent class
### Core SDK Usage
- `simple_lmstudio.py` – Minimal streaming query with hard-coded config (simplest quickstart)
- `simple_with_env.py` – Using environment variables with config helpers and fallbacks
- `config_examples.py` – Comprehensive reference: provider shortcuts, priority, and all config patterns
- `ollama_chat.py` – Multi-turn chat loop with Ollama, including tool-call logging
- `context_management.py` – Manual history management patterns (stateless, truncation, token monitoring, RAG-lite)
- `hooks_example.py` – Lifecycle hooks patterns (security gates, audit logging, input sanitization, combined)
### Integration Tests
Located in `tests/integration/`:
- `test_client_behaviour.py` – Fake AsyncOpenAI client covering streaming, multi-turn history, and tool-call flows without hitting real servers
## Development Status
**Released v0.1.0** – Core functionality is complete and available on PyPI. Multi-turn conversations, tool monitoring, and streaming are fully implemented.
### Roadmap
- [x] Project planning and architecture
- [x] Core `query()` and `Client` class
- [x] Tool monitoring + `Client.add_tool_result()` helper
- [x] Tool use example (`examples/tool_use_agent.py`)
- [x] PyPI release - Published as `open-agent-sdk`
- [ ] Provider compatibility matrix expansion
- [ ] Additional agent examples
### Tested Providers
- ✅ **Ollama** - Fully validated with `kimi-k2:1t-cloud` (cloud-proxied model)
- ✅ **LM Studio** - Fully validated with `qwen/qwen3-30b` model
- ✅ **llama.cpp** - Fully validated with TinyLlama 1.1B model
See [docs/provider-compatibility.md](docs/provider-compatibility.md) for detailed test results.
## Documentation
- [docs/technical-design.md](docs/technical-design.md) - Architecture details
- [docs/configuration.md](docs/configuration.md) - Configuration guide
- [docs/provider-compatibility.md](docs/provider-compatibility.md) - Provider test results
- [examples/](examples/) - Usage examples
## Testing
Integration-style tests run entirely against lightweight fakes, so they are safe to execute locally and in pre-commit:
```bash
python -m pytest tests/integration
```
Add `-k` or a specific path when you want to target a subset of the unit tests (`tests/test_client.py`, etc.). If you use a virtual environment, prefix commands with `./venv/bin/python -m`.
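For example (the `-k` substring is hypothetical):
```bash
python -m pytest tests/test_client.py        # one unit-test module
python -m pytest tests/integration -k tool   # filter tests by name substring
```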
## Pre-commit Hooks
Install hooks once per clone:
```bash
pip install pre-commit
pre-commit install
```
Running `pre-commit run --all-files` will execute formatting checks and the integration tests (`python -m pytest tests/integration`) before you push changes.
## Requirements
- Python 3.10+
- openai 1.0+ (for AsyncOpenAI client)
- pydantic 2.0+ (for types, optional)
- Some servers require a dummy `api_key`; set any non-empty string if needed
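For those servers, any placeholder satisfies the check (the endpoint URL here is hypothetical):
```python
from open_agent import AgentOptions

options = AgentOptions(
    system_prompt="...",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:8000/v1",  # hypothetical server that wants a key
    api_key="sk-local-placeholder",       # any non-empty string works
)
```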
## License
MIT License - see [LICENSE](LICENSE) for details.
## Acknowledgments
- API design inspired by [claude-agent-sdk](https://github.com/anthropics/claude-agent-sdk-python)
- Built for local/open-source LLM enthusiasts
---
**Status**: Alpha - API stabilizing, feedback welcome
Star ⭐ this repo if you're building AI agents with local models!
Raw data
{
"_id": null,
"home_page": null,
"name": "open-agent-sdk",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "llm, ai, agent, local, openai, ollama, lmstudio, llamacpp",
"author": "Open Agent SDK Contributors",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/df/0b/c0ce4b0702ef43fe93d17b518d13c5ff5e846dba24290c3026974d1b9733/open_agent_sdk-0.4.1.tar.gz",
"platform": null,
"description": "# Open Agent SDK\n\n> Lightweight Python SDK for local/self-hosted LLMs via OpenAI-compatible endpoints\n\n[](https://pypi.org/project/open-agent-sdk/)\n[](https://www.python.org/downloads/)\n[](https://opensource.org/licenses/MIT)\n\n## Overview\n\nOpen Agent SDK provides a clean, streaming API for working with OpenAI-compatible local model servers, making it easy to build AI agents with your own hardware.\n\n**Use Case**: Build powerful AI agents using local Qwen/Llama/Mistral models without cloud API costs or data privacy concerns.\n\n**Solution**: Drop-in similar API that works with LM Studio, Ollama, llama.cpp, and any OpenAI-compatible endpoint\u2014complete with streaming, tool call aggregation, and a helper for returning tool results back to the model.\n\n## Supported Providers\n\n### \u2705 Supported (OpenAI-Compatible Endpoints)\n\n- **LM Studio** - `http://localhost:1234/v1`\n- **Ollama** - `http://localhost:11434/v1`\n- **llama.cpp server** - OpenAI-compatible mode\n- **vLLM** - OpenAI-compatible API\n- **Text Generation WebUI** - OpenAI extension\n- **Any OpenAI-compatible local endpoint**\n- **Local gateways proxying cloud models** - e.g., Ollama or custom gateways that route to cloud providers\n\n### \u274c Not Supported (Use Official SDKs)\n\n- **Claude/OpenAI direct** - Use their official SDKs, unless you proxy through a local OpenAI-compatible gateway\n- **Cloud provider SDKs** - Bedrock, Vertex, Azure, etc. (proxied via local gateway is fine)\n\n## Quick Start\n\n### Installation\n\n```bash\npip install open-agent-sdk\n```\n\nFor development:\n\n```bash\ngit clone https://github.com/slb350/open-agent-sdk.git\ncd open-agent-sdk\npip install -e .\n```\n\n### Simple Query (LM Studio)\n\n```python\nimport asyncio\nfrom open_agent import query, AgentOptions\n\nasync def main():\n options = AgentOptions(\n system_prompt=\"You are a professional copy editor\",\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\",\n max_turns=1,\n temperature=0.1\n )\n\n result = query(prompt=\"Analyze this text...\", options=options)\n\n response_text = \"\"\n async for msg in result:\n if hasattr(msg, 'content'):\n for block in msg.content:\n if hasattr(block, 'text'):\n response_text += block.text\n\n print(response_text)\n\nasyncio.run(main())\n```\n\n### Multi-Turn Conversation (Ollama)\n\n```python\nfrom open_agent import Client, AgentOptions, TextBlock, ToolUseBlock\nfrom open_agent.config import get_base_url\n\ndef run_my_tool(name: str, params: dict) -> dict:\n # Replace with your tool execution logic\n return {\"result\": f\"stub output for {name}\"}\n\nasync def main():\n options = AgentOptions(\n system_prompt=\"You are a helpful assistant\",\n model=\"kimi-k2:1t-cloud\", # Use your available Ollama model\n base_url=get_base_url(provider=\"ollama\"),\n max_turns=10\n )\n\n async with Client(options) as client:\n await client.query(\"What's the capital of France?\")\n\n async for msg in client.receive_messages():\n if isinstance(msg, TextBlock):\n print(f\"Assistant: {msg.text}\")\n elif isinstance(msg, ToolUseBlock):\n print(f\"Tool used: {msg.name}\")\n tool_result = run_my_tool(msg.name, msg.input)\n client.add_tool_result(msg.id, tool_result)\n\nasyncio.run(main())\n```\n\nSee `examples/tool_use_agent.py` for progressively richer patterns (manual loop, helper function, and reusable agent class) demonstrating `add_tool_result()` in context.\n\n### Function Calling with Tools\n\nDefine tools using the `@tool` decorator for clean, type-safe 
function calling:\n\n```python\nfrom open_agent import tool, Client, AgentOptions, TextBlock, ToolUseBlock\n\n# Define tools\n@tool(\"get_weather\", \"Get current weather\", {\"location\": str, \"units\": str})\nasync def get_weather(args):\n return {\n \"temperature\": 72,\n \"conditions\": \"sunny\",\n \"units\": args[\"units\"]\n }\n\n@tool(\"calculate\", \"Perform calculation\", {\"a\": float, \"b\": float, \"op\": str})\nasync def calculate(args):\n ops = {\"+\": lambda a, b: a + b, \"-\": lambda a, b: a - b}\n result = ops[args[\"op\"]](args[\"a\"], args[\"b\"])\n return {\"result\": result}\n\n# Enable automatic tool execution (recommended)\noptions = AgentOptions(\n system_prompt=\"You are a helpful assistant with access to tools.\",\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\",\n tools=[get_weather, calculate],\n auto_execute_tools=True, # \ud83d\udd25 Tools execute automatically\n max_tool_iterations=10 # Safety limit for tool loops\n)\n\nasync with Client(options) as client:\n await client.query(\"What's 25 + 17?\")\n\n # Simply iterate - tools execute automatically!\n async for block in client.receive_messages():\n if isinstance(block, ToolUseBlock):\n print(f\"Tool called: {block.name}\")\n elif isinstance(block, TextBlock):\n print(f\"Response: {block.text}\")\n```\n\n**Advanced: Manual Tool Execution**\n\nFor custom execution logic or result interception:\n\n```python\n# Disable auto-execution\noptions = AgentOptions(\n system_prompt=\"You are a helpful assistant with access to tools.\",\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\",\n tools=[get_weather, calculate],\n auto_execute_tools=False # Manual mode\n)\n\nasync with Client(options) as client:\n await client.query(\"What's 25 + 17?\")\n\n async for block in client.receive_messages():\n if isinstance(block, ToolUseBlock):\n # You execute the tool manually\n tool = {\"calculate\": calculate, \"get_weather\": get_weather}[block.name]\n result = await tool.execute(block.input)\n\n # Return result to agent\n await client.add_tool_result(block.id, result)\n\n # Continue conversation\n await client.query(\"\")\n```\n\n**Key Features:**\n- **Automatic execution** (v0.3.0+) - Tools run automatically with safety limits\n- **Type-safe schemas** - Simple Python types (`str`, `int`, `float`, `bool`) or full JSON Schema\n- **OpenAI-compatible** - Works with any OpenAI function calling endpoint\n- **Clean decorator API** - Similar to Claude SDK's `@tool`\n- **Hook integration** - PreToolUse/PostToolUse hooks work in both modes\n\nSee `examples/calculator_tools.py` and `examples/simple_tool.py` for complete examples.\n\n## Context Management\n\nLocal models have fixed context windows (typically 8k-32k tokens). The SDK provides **opt-in utilities** for manual history management\u2014no silent mutations, you stay in control.\n\n### Token Estimation & Truncation\n\n```python\nfrom open_agent import Client, AgentOptions\nfrom open_agent.context import estimate_tokens, truncate_messages\n\nasync with Client(options) as client:\n # Long conversation...\n for i in range(50):\n await client.query(f\"Question {i}\")\n async for msg in client.receive_messages():\n pass\n\n # Check token usage\n tokens = estimate_tokens(client.history)\n print(f\"Context size: ~{tokens} tokens\")\n\n # Manually truncate when needed\n if tokens > 28000:\n client.message_history = truncate_messages(client.history, keep=10)\n```\n\n### Recommended Patterns\n\n**1. 
Stateless Agents** (Best for single-task agents):\n```python\n# Process each task independently - no history accumulation\nfor task in tasks:\n async with Client(options) as client:\n await client.query(task)\n # Client disposed, fresh context for next task\n```\n\n**2. Manual Truncation** (At natural breakpoints):\n```python\nfrom open_agent.context import truncate_messages\n\nasync with Client(options) as client:\n for task in tasks:\n await client.query(task)\n # Truncate after each major task\n client.message_history = truncate_messages(client.history, keep=5)\n```\n\n**3. External Memory** (RAG-lite for research agents):\n```python\n# Store important facts in database, keep conversation context small\ndatabase = {}\nasync with Client(options) as client:\n await client.query(\"Research topic X\")\n # Save response to database\n database[\"topic_x\"] = extract_facts(response)\n\n # Clear history, query database when needed\n client.message_history = truncate_messages(client.history, keep=0)\n```\n\n### Why Manual?\n\nThe SDK **intentionally** does not auto-compact history because:\n- **Domain-specific needs**: Copy editors need different strategies than research agents\n- **Token accuracy varies**: Each model family has different tokenizers\n- **Risk of breaking context**: Silently removing messages could break tool chains\n- **Natural limits exist**: Compaction doesn't bypass model context windows\n\n### Installing Token Estimation\n\nFor better token estimation accuracy (optional):\n\n```bash\npip install open-agent-sdk[context] # Adds tiktoken\n```\n\nWithout `tiktoken`, falls back to character-based approximation (~75-85% accurate).\n\nSee `examples/context_management.py` for complete patterns and usage.\n\n## Lifecycle Hooks\n\nMonitor and control agent behavior at key execution points with Pythonic lifecycle hooks\u2014no subprocess overhead or JSON protocols.\n\n### Quick Example\n\n```python\nfrom open_agent import (\n AgentOptions, Client,\n PreToolUseEvent, PostToolUseEvent, UserPromptSubmitEvent,\n HookDecision,\n HOOK_PRE_TOOL_USE, HOOK_POST_TOOL_USE, HOOK_USER_PROMPT_SUBMIT\n)\n\n# Security gate - block dangerous operations\nasync def security_gate(event: PreToolUseEvent) -> HookDecision | None:\n if event.tool_name == \"delete_file\":\n return HookDecision(\n continue_=False,\n reason=\"Delete operations require approval\"\n )\n return None # Allow by default\n\n# Audit logger - track all tool executions\nasync def audit_logger(event: PostToolUseEvent) -> None:\n print(f\"Tool executed: {event.tool_name} -> {event.tool_result}\")\n return None\n\n# Input sanitizer - validate user prompts\nasync def sanitize_input(event: UserPromptSubmitEvent) -> HookDecision | None:\n if \"DELETE\" in event.prompt.upper():\n return HookDecision(\n continue_=False,\n reason=\"Dangerous keywords detected\"\n )\n return None\n\n# Register hooks in AgentOptions\noptions = AgentOptions(\n system_prompt=\"You are a helpful assistant\",\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\",\n tools=[my_file_tool, my_search_tool],\n hooks={\n HOOK_PRE_TOOL_USE: [security_gate],\n HOOK_POST_TOOL_USE: [audit_logger],\n HOOK_USER_PROMPT_SUBMIT: [sanitize_input],\n }\n)\n\nasync with Client(options) as client:\n await client.query(\"Write to /etc/config\") # UserPromptSubmit fires\n async for block in client.receive_messages():\n if isinstance(block, ToolUseBlock): # PreToolUse fires\n result = await tool.execute(block.input)\n await client.add_tool_result(block.id, result) # 
PostToolUse fires\n```\n\n### Hook Types\n\n**PreToolUse** - Fires before tool execution (or yielding to user)\n- **Block operations**: Return `HookDecision(continue_=False, reason=\"...\")`\n- **Modify inputs**: Return `HookDecision(modified_input={...}, reason=\"...\")`\n- **Allow**: Return `None`\n\n**PostToolUse** - Fires after tool result added to history\n- **Observational only** (tool already executed)\n- Use for audit logging, metrics, result validation\n- Return `None` (decision ignored for PostToolUse)\n\n**UserPromptSubmit** - Fires before sending prompt to API\n- **Block prompts**: Return `HookDecision(continue_=False, reason=\"...\")`\n- **Modify prompts**: Return `HookDecision(modified_prompt=\"...\", reason=\"...\")`\n- **Allow**: Return `None`\n\n### Common Patterns\n\n**Pattern 1: Redirect to Sandbox**\n\n```python\nasync def redirect_to_sandbox(event: PreToolUseEvent) -> HookDecision | None:\n \"\"\"Redirect file operations to safe directory.\"\"\"\n if event.tool_name == \"file_writer\":\n path = event.tool_input.get(\"path\", \"\")\n if not path.startswith(\"/tmp/\"):\n safe_path = f\"/tmp/sandbox/{path.lstrip('/')}\"\n return HookDecision(\n modified_input={\"path\": safe_path, \"content\": event.tool_input.get(\"content\", \"\")},\n reason=\"Redirected to sandbox\"\n )\n return None\n```\n\n**Pattern 2: Compliance Audit Log**\n\n```python\naudit_log = []\n\nasync def compliance_logger(event: PostToolUseEvent) -> None:\n \"\"\"Log all tool executions for compliance.\"\"\"\n audit_log.append({\n \"timestamp\": datetime.now(),\n \"tool\": event.tool_name,\n \"input\": event.tool_input,\n \"result\": str(event.tool_result)[:100],\n \"user\": get_current_user()\n })\n return None\n```\n\n**Pattern 3: Safety Instructions**\n\n```python\nasync def add_safety_warning(event: UserPromptSubmitEvent) -> HookDecision | None:\n \"\"\"Add safety instructions to risky prompts.\"\"\"\n if \"write\" in event.prompt.lower() or \"delete\" in event.prompt.lower():\n safe_prompt = event.prompt + \" (Please confirm this is safe before proceeding)\"\n return HookDecision(\n modified_prompt=safe_prompt,\n reason=\"Added safety warning\"\n )\n return None\n```\n\n### Hook Execution Flow\n\n- Hooks run **sequentially** in the order registered\n- **First non-None decision wins** (short-circuit behavior)\n- Hooks run **inline on event loop** (spawn tasks for heavy work)\n- Works with both **Client** and **query()** function\n\n### Breaking Change (v0.2.4)\n\n`Client.add_tool_result()` is now async to support PostToolUse hooks:\n\n```python\n# Old (v0.2.3 and earlier)\nclient.add_tool_result(tool_id, result)\n\n# New (v0.2.4+)\nawait client.add_tool_result(tool_id, result)\n```\n\n### Why Hooks?\n\n- **Security gates**: Block dangerous operations before they execute\n- **Audit logging**: Track all tool executions for compliance\n- **Input validation**: Sanitize user prompts before processing\n- **Monitoring**: Observe agent behavior in production\n- **Control flow**: Modify tool inputs or redirect operations\n\nSee `examples/hooks_example.py` for 4 comprehensive patterns (security, audit, sanitization, combined).\n\n## Interrupt Capability\n\nCancel long-running operations cleanly without corrupting client state. 
Perfect for timeouts, user cancellations, or conditional interruptions.\n\n### Quick Example\n\n```python\nfrom open_agent import Client, AgentOptions\nimport asyncio\n\nasync def main():\n options = AgentOptions(\n system_prompt=\"You are a helpful assistant.\",\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\"\n )\n\n async with Client(options) as client:\n await client.query(\"Write a detailed 1000-word essay...\")\n\n # Timeout after 5 seconds\n try:\n async def collect_messages():\n async for block in client.receive_messages():\n print(block.text, end=\"\", flush=True)\n\n await asyncio.wait_for(collect_messages(), timeout=5.0)\n except asyncio.TimeoutError:\n await client.interrupt() # Clean cancellation\n print(\"\\n\u26a0\ufe0f Operation timed out!\")\n\n # Client is still usable after interrupt\n await client.query(\"Short question?\")\n async for block in client.receive_messages():\n print(block.text)\n```\n\n### Common Patterns\n\n**1. Timeout-Based Interruption**\n\n```python\ntry:\n await asyncio.wait_for(process_messages(client), timeout=10.0)\nexcept asyncio.TimeoutError:\n await client.interrupt()\n print(\"Operation timed out\")\n```\n\n**2. Conditional Interruption**\n\n```python\n# Stop if response contains specific content\nfull_text = \"\"\nasync for block in client.receive_messages():\n full_text += block.text\n if \"error\" in full_text.lower():\n await client.interrupt()\n break\n```\n\n**3. User Cancellation (from separate task)**\n\n```python\nasync def stream_task():\n await client.query(\"Long task...\")\n async for block in client.receive_messages():\n print(block.text, end=\"\")\n\nasync def cancel_button_task():\n await asyncio.sleep(2.0) # User waits 2 seconds\n await client.interrupt() # User clicks cancel\n\n# Run both concurrently\nawait asyncio.gather(stream_task(), cancel_button_task())\n```\n\n**4. Interrupt During Auto-Execution**\n\n```python\noptions = AgentOptions(\n tools=[slow_tool, fast_tool],\n auto_execute_tools=True,\n max_tool_iterations=10\n)\n\nasync with Client(options) as client:\n await client.query(\"Use tools...\")\n\n tool_count = 0\n async for block in client.receive_messages():\n if isinstance(block, ToolUseBlock):\n tool_count += 1\n if tool_count >= 2:\n await client.interrupt() # Stop after 2 tools\n break\n```\n\n### How It Works\n\nWhen you call `client.interrupt()`:\n1. **Active stream closure** - HTTP stream closed immediately (not just a flag)\n2. **Clean state** - Client remains in valid state for reuse\n3. **Partial output** - Text blocks flushed to history, incomplete tools skipped\n4. **Idempotent** - Safe to call multiple times\n5. 
**Concurrent-safe** - Can be called from separate asyncio tasks\n\n### Example\n\nSee `examples/interrupt_demo.py` for 5 comprehensive patterns:\n- Timeout-based interruption\n- Conditional interruption\n- Auto-execution interruption\n- Concurrent interruption (simulated cancel button)\n- Interrupt and retry\n\n## \ud83d\ude80 Practical Examples\n\nWe've included two production-ready agents that demonstrate real-world usage:\n\n### \ud83d\udcdd Git Commit Agent\n**[examples/git_commit_agent.py](examples/git_commit_agent.py)**\n\nAnalyzes your staged git changes and writes professional commit messages following conventional commit format.\n\n```bash\n# Stage your changes\ngit add .\n\n# Run the agent\npython examples/git_commit_agent.py\n\n# Output:\n# \u2713 Found staged changes in 3 file(s)\n# \ud83e\udd16 Analyzing changes and generating commit message...\n#\n# \ud83d\udcdd Suggested commit message:\n# feat(auth): Add OAuth2 integration with refresh tokens\n#\n# - Implement token refresh mechanism\n# - Add secure cookie storage for tokens\n# - Update login flow to support OAuth2 providers\n# - Add tests for token expiration handling\n```\n\n**Features:**\n- Analyzes diff to determine commit type (feat/fix/docs/etc)\n- Writes clear, descriptive commit messages\n- Interactive mode: accept, edit, or regenerate\n- Follows conventional commit standards\n\n### \ud83d\udcca Log Analyzer Agent\n**[examples/log_analyzer_agent.py](examples/log_analyzer_agent.py)**\n\nIntelligently analyzes application logs to identify patterns, errors, and provide actionable insights.\n\n```bash\n# Analyze a log file\npython examples/log_analyzer_agent.py /var/log/app.log\n\n# Analyze with a specific time window\npython examples/log_analyzer_agent.py app.log --since \"2025-10-15T00:00:00\" --until \"2025-10-15T12:00:00\"\n\n# Interactive mode for drilling down\npython examples/log_analyzer_agent.py app.log --interactive\n```\n\n**Features:**\n- Automatic error pattern detection\n- Time-based analysis (peak error times)\n- Root cause suggestions\n- Interactive mode for investigating specific issues\n- Supports multiple log formats (JSON, Apache, syslog, etc)\n- Time range filtering with `--since` / `--until`\n\n**Sample Output:**\n```\n\ud83d\udcca Log Summary:\n Total entries: 45,231\n Errors: 127 (0.3%)\n Warnings: 892\n\n\ud83d\udd34 Top Error Patterns:\n - Connection Error: 67 occurrences\n - NullPointerException: 23 occurrences\n - Timeout Error: 19 occurrences\n\n\u23f0 Peak error time: 2025-10-15T14:00:00\n Errors in that hour: 43\n\n\ud83e\udd16 ANALYSIS REPORT:\nMain Issues (Priority Order):\n1. Database connection pool exhaustion during peak hours\n2. Unhandled null values in user authentication flow\n3. External API timeouts affecting payment processing\n\nRecommendations:\n1. Increase connection pool size from 10 to 25\n2. Add null checks in AuthService.validateUser() method\n3. 
Implement circuit breaker for payment API with 30s timeout\n```\n\n### Why These Examples?\n\nThese agents demonstrate:\n- **Practical Value**: Solve real problems developers face daily\n- **Tool Integration**: Show how to integrate with system commands (git, file I/O)\n- **Multi-turn Conversations**: Interactive modes for complex analysis\n- **Structured Output**: Parse and format LLM responses for actionable results\n- **Privacy-First**: Keep your code and logs local while getting AI assistance\n\n## Configuration\n\nOpen Agent SDK uses config helpers to provide flexible configuration via environment variables, provider shortcuts, or explicit parameters:\n\n### Environment Variables (Recommended)\n\n```bash\nexport OPEN_AGENT_BASE_URL=\"http://localhost:1234/v1\"\nexport OPEN_AGENT_MODEL=\"qwen/qwen3-30b-a3b-2507\"\n```\n\n```python\nfrom open_agent import AgentOptions\nfrom open_agent.config import get_model, get_base_url\n\n# Config helpers read from environment\noptions = AgentOptions(\n system_prompt=\"...\",\n model=get_model(), # Reads OPEN_AGENT_MODEL\n base_url=get_base_url() # Reads OPEN_AGENT_BASE_URL\n)\n```\n\n### Provider Shortcuts\n\n```python\nfrom open_agent.config import get_base_url\n\n# Use built-in defaults for common providers\noptions = AgentOptions(\n system_prompt=\"...\",\n model=\"llama3.1:70b\",\n base_url=get_base_url(provider=\"ollama\") # \u2192 http://localhost:11434/v1\n)\n```\n\n**Available providers**: `lmstudio`, `ollama`, `llamacpp`, `vllm`\n\n### Fallback Values\n\n```python\n# Provide fallbacks when env vars not set\noptions = AgentOptions(\n system_prompt=\"...\",\n model=get_model(\"qwen2.5-32b-instruct\"), # Fallback model\n base_url=get_base_url(provider=\"lmstudio\") # Fallback URL\n)\n```\n\n**Configuration Priority:**\n- Environment variable (default behaviour)\n- Fallback value passed to the config helper\n- Provider default (for `base_url` only)\n\nNeed to force a specific model even when `OPEN_AGENT_MODEL` is set? 
Call `get_model(\"model-name\", prefer_env=False)` to ignore the environment variable for that lookup.\n\n**Benefits:**\n- Switch between dev/prod by changing environment variables\n- No hardcoded URLs or model names\n- Per-agent overrides when needed\n\nSee [docs/configuration.md](docs/configuration.md) for complete guide.\n\n## Why Not Just Use OpenAI Client?\n\n**Without open-agent-sdk** (raw OpenAI client):\n```python\nfrom openai import AsyncOpenAI\n\nclient = AsyncOpenAI(base_url=\"http://localhost:1234/v1\", api_key=\"not-needed\")\nresponse = await client.chat.completions.create(\n model=\"qwen2.5-32b-instruct\",\n messages=[{\"role\": \"system\", \"content\": system_prompt},\n {\"role\": \"user\", \"content\": user_prompt}],\n stream=True\n)\n\nasync for chunk in response:\n # Complex parsing of chunks\n # Extract delta content\n # Handle tool calls manually\n # Track conversation state yourself\n```\n\n**With open-agent-sdk**:\n```python\nfrom open_agent import query, AgentOptions\n\noptions = AgentOptions(\n system_prompt=system_prompt,\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\"\n)\n\nresult = query(prompt=user_prompt, options=options)\nasync for msg in result:\n # Clean message types (TextBlock, ToolUseBlock)\n # Automatic streaming and tool call handling\n```\n\n**Value**: Familiar patterns + Less boilerplate + Easy migration\n\n## API Reference\n\n### AgentOptions\n\n```python\nclass AgentOptions:\n system_prompt: str # System prompt\n model: str # Model name (required)\n base_url: str # OpenAI-compatible endpoint URL (required)\n tools: list[Tool] = [] # Tool instances for function calling\n hooks: dict[str, list[HookHandler]] = None # Lifecycle hooks for monitoring/control\n auto_execute_tools: bool = False # Enable automatic tool execution (v0.3.0+)\n max_tool_iterations: int = 5 # Max tool calls per query in auto mode\n max_turns: int = 1 # Max conversation turns\n max_tokens: int | None = 4096 # Tokens to generate (None uses provider default)\n temperature: float = 0.7 # Sampling temperature\n timeout: float = 60.0 # Request timeout in seconds\n api_key: str = \"not-needed\" # Most local servers don't need this\n```\n\n**Note**: Use config helpers (`get_model()`, `get_base_url()`) for environment variable and provider support.\n\n### query()\n\nSimple single-turn query function.\n\n```python\nasync def query(prompt: str, options: AgentOptions) -> AsyncGenerator\n```\n\nReturns an async generator yielding messages.\n\n### Client\n\nMulti-turn conversation client with tool monitoring.\n\n```python\nasync with Client(options: AgentOptions) as client:\n await client.query(prompt: str)\n async for msg in client.receive_messages():\n # Process messages\n```\n\n### Message Types\n\n- `TextBlock` - Text content from model\n- `ToolUseBlock` - Tool calls from model (has `id`, `name`, `input` fields)\n- `ToolResultBlock` - Tool execution results to send back to model\n- `ToolUseError` - Tool call parsing error (malformed JSON, missing fields)\n- `AssistantMessage` - Full message wrapper\n\n### Tool System\n\n```python\n@tool(name: str, description: str, input_schema: dict)\nasync def my_tool(args: dict) -> Any:\n \"\"\"Tool handler function\"\"\"\n return result\n\n# Tool class\nclass Tool:\n name: str\n description: str\n input_schema: dict[str, type] | dict[str, Any]\n handler: Callable[[dict], Awaitable[Any]]\n\n async def execute(arguments: dict) -> Any\n def to_openai_format() -> dict\n```\n\n**Schema formats:**\n- Simple: `{\"param\": str, 
\"count\": int}` - All parameters required\n- JSON Schema: Full schema with `type`, `properties`, `required`, etc.\n\n### Hooks System\n\n```python\n# Event types\n@dataclass\nclass PreToolUseEvent:\n tool_name: str\n tool_input: dict[str, Any]\n tool_use_id: str\n history: list[dict[str, Any]]\n\n@dataclass\nclass PostToolUseEvent:\n tool_name: str\n tool_input: dict[str, Any]\n tool_result: Any\n tool_use_id: str\n history: list[dict[str, Any]]\n\n@dataclass\nclass UserPromptSubmitEvent:\n prompt: str\n history: list[dict[str, Any]]\n\n# Hook decision\n@dataclass\nclass HookDecision:\n continue_: bool = True\n modified_input: dict[str, Any] | None = None\n modified_prompt: str | None = None\n reason: str | None = None\n\n# Hook handler signature\nHookHandler = Callable[[HookEvent], Awaitable[HookDecision | None]]\n\n# Hook constants\nHOOK_PRE_TOOL_USE = \"pre_tool_use\"\nHOOK_POST_TOOL_USE = \"post_tool_use\"\nHOOK_USER_PROMPT_SUBMIT = \"user_prompt_submit\"\n```\n\n**Hook behavior:**\n- Return `None` to allow by default\n- Return `HookDecision(continue_=False)` to block\n- Return `HookDecision(modified_input={...})` to modify (PreToolUse)\n- Return `HookDecision(modified_prompt=\"...\")` to modify (UserPromptSubmit)\n- Raise exception to abort entirely\n\n## Recommended Models\n\n**Local models** (LM Studio, Ollama, llama.cpp):\n- **GPT-OSS-120B** - Best in class for speed and quality\n- **Qwen 3 30B** - Excellent instruction following, good for most tasks\n- **GPT-OSS-20B** - Solid all-around performance\n- **Mistral 7B** - Fast and efficient for simple agents\n\n**Cloud-proxied via local gateway** (Ollama cloud provider, custom gateway):\n- **kimi-k2:1t-cloud** - Tested and working via Ollama gateway\n- **deepseek-v3.1:671b-cloud** - High-quality reasoning model\n- **qwen3-coder:480b-cloud** - Code-focused models\n- Your `base_url` still points to localhost gateway (e.g., `http://localhost:11434/v1`)\n- Gateway handles authentication and routing to cloud provider\n- Useful when you need larger models than your hardware can run locally\n\n**Architecture guidance:**\n- Prefer MoE (Mixture of Experts) models over dense when available - significantly faster\n- Start with 7B-30B models for most agent tasks - they're fast and capable\n- Test models with your specific use case - the LLM landscape changes rapidly\n\n## Project Structure\n\n```\nopen-agent-sdk/\n\u251c\u2500\u2500 open_agent/\n\u2502 \u251c\u2500\u2500 __init__.py # query, Client, AgentOptions exports\n\u2502 \u251c\u2500\u2500 client.py # Streaming query(), Client, tool helper\n\u2502 \u251c\u2500\u2500 config.py # Env/provider helpers\n\u2502 \u251c\u2500\u2500 context.py # Token estimation and truncation utilities\n\u2502 \u251c\u2500\u2500 hooks.py # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)\n\u2502 \u251c\u2500\u2500 tools.py # Tool decorator and schema conversion\n\u2502 \u251c\u2500\u2500 types.py # Dataclasses for options and blocks\n\u2502 \u2514\u2500\u2500 utils.py # OpenAI client + ToolCallAggregator\n\u251c\u2500\u2500 docs/\n\u2502 \u251c\u2500\u2500 configuration.md\n\u2502 \u251c\u2500\u2500 provider-compatibility.md\n\u2502 \u2514\u2500\u2500 technical-design.md\n\u251c\u2500\u2500 examples/\n\u2502 \u251c\u2500\u2500 git_commit_agent.py # \ud83c\udf1f Practical: Git commit message generator\n\u2502 \u251c\u2500\u2500 log_analyzer_agent.py # \ud83c\udf1f Practical: Log file analyzer\n\u2502 \u251c\u2500\u2500 calculator_tools.py # Function calling with @tool decorator\n\u2502 
### Hooks System

```python
# Event types
@dataclass
class PreToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class PostToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_result: Any
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class UserPromptSubmitEvent:
    prompt: str
    history: list[dict[str, Any]]

# Hook decision
@dataclass
class HookDecision:
    continue_: bool = True
    modified_input: dict[str, Any] | None = None
    modified_prompt: str | None = None
    reason: str | None = None

# Hook handler signature
HookHandler = Callable[[HookEvent], Awaitable[HookDecision | None]]

# Hook constants
HOOK_PRE_TOOL_USE = "pre_tool_use"
HOOK_POST_TOOL_USE = "post_tool_use"
HOOK_USER_PROMPT_SUBMIT = "user_prompt_submit"
```

**Hook behavior** (a sketch follows this list):
- Return `None` to allow by default
- Return `HookDecision(continue_=False)` to block
- Return `HookDecision(modified_input={...})` to modify the tool input (PreToolUse)
- Return `HookDecision(modified_prompt="...")` to modify the prompt (UserPromptSubmit)
- Raise an exception to abort entirely
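Putting those pieces together, a hedged sketch of a PreToolUse security gate. The import path is an assumption based on `open_agent/hooks.py` in the project structure below, and the sandbox rule is purely illustrative; `examples/hooks_example.py` has the canonical patterns:

```python
from open_agent import AgentOptions
from open_agent.hooks import HOOK_PRE_TOOL_USE, HookDecision, PreToolUseEvent

async def sandbox_gate(event: PreToolUseEvent) -> HookDecision | None:
    # Block any tool call that tries to touch a path outside the sandbox.
    path = str(event.tool_input.get("path", ""))
    if path and not path.startswith("/workspace/"):
        return HookDecision(continue_=False, reason="path outside sandbox")
    return None  # returning None allows the call by default

options = AgentOptions(
    system_prompt="You are a file assistant",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    hooks={HOOK_PRE_TOOL_USE: [sandbox_gate]},
)
```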
## Recommended Models

**Local models** (LM Studio, Ollama, llama.cpp):
- **GPT-OSS-120B** - Best in class for speed and quality
- **Qwen 3 30B** - Excellent instruction following, good for most tasks
- **GPT-OSS-20B** - Solid all-around performance
- **Mistral 7B** - Fast and efficient for simple agents

**Cloud-proxied via local gateway** (Ollama cloud provider, custom gateway):
- **kimi-k2:1t-cloud** - Tested and working via the Ollama gateway
- **deepseek-v3.1:671b-cloud** - High-quality reasoning model
- **qwen3-coder:480b-cloud** - Code-focused model
- Your `base_url` still points to the localhost gateway (e.g., `http://localhost:11434/v1`)
- The gateway handles authentication and routing to the cloud provider
- Useful when you need larger models than your hardware can run locally

**Architecture guidance:**
- Prefer MoE (Mixture of Experts) models over dense ones when available - significantly faster
- Start with 7B-30B models for most agent tasks - they're fast and capable
- Test models with your specific use case - the LLM landscape changes rapidly

## Project Structure

```
open-agent-sdk/
├── open_agent/
│   ├── __init__.py              # query, Client, AgentOptions exports
│   ├── client.py                # Streaming query(), Client, tool helper
│   ├── config.py                # Env/provider helpers
│   ├── context.py               # Token estimation and truncation utilities
│   ├── hooks.py                 # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│   ├── tools.py                 # Tool decorator and schema conversion
│   ├── types.py                 # Dataclasses for options and blocks
│   └── utils.py                 # OpenAI client + ToolCallAggregator
├── docs/
│   ├── configuration.md
│   ├── provider-compatibility.md
│   └── technical-design.md
├── examples/
│   ├── git_commit_agent.py      # 🌟 Practical: Git commit message generator
│   ├── log_analyzer_agent.py    # 🌟 Practical: Log file analyzer
│   ├── calculator_tools.py      # Function calling with @tool decorator
│   ├── simple_tool.py           # Minimal tool usage example
│   ├── tool_use_agent.py        # Complete tool use patterns
│   ├── context_management.py    # Manual history management patterns
│   ├── hooks_example.py         # Lifecycle hooks patterns (security, audit, sanitization)
│   ├── interrupt_demo.py        # Interrupt capability patterns (timeout, conditional, concurrent)
│   ├── simple_lmstudio.py       # Basic usage with LM Studio
│   ├── ollama_chat.py           # Multi-turn chat example
│   ├── config_examples.py       # Configuration patterns
│   └── simple_with_env.py       # Environment variable config
├── tests/
│   ├── integration/             # Integration-style tests using fakes
│   │   └── test_client_behaviour.py  # Streaming, multi-turn, tool flow coverage
│   ├── test_agent_options.py
│   ├── test_auto_execution.py   # Automatic tool execution
│   ├── test_client.py
│   ├── test_config.py
│   ├── test_context.py          # Context utilities (token estimation, truncation)
│   ├── test_hooks.py            # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│   ├── test_interrupt.py        # Interrupt capability (timeout, concurrent, reuse)
│   ├── test_query.py
│   ├── test_tools.py            # Tool decorator and schema conversion
│   └── test_utils.py
├── CHANGELOG.md
├── pyproject.toml
└── README.md
```

## Examples

### 🌟 Practical Agents (Production-Ready)
- **`git_commit_agent.py`** – Analyzes git diffs and writes professional commit messages
- **`log_analyzer_agent.py`** – Parses logs, finds patterns, suggests fixes with interactive mode
- **`tool_use_agent.py`** – Complete tool use patterns: manual, helper, and agent class

### Core SDK Usage
- `simple_lmstudio.py` – Minimal streaming query with hard-coded config (simplest quickstart)
- `simple_with_env.py` – Using environment variables with config helpers and fallbacks
- `config_examples.py` – Comprehensive reference: provider shortcuts, priority, and all config patterns
- `ollama_chat.py` – Multi-turn chat loop with Ollama, including tool-call logging
- `context_management.py` – Manual history management patterns (stateless, truncation, token monitoring, RAG-lite)
- `hooks_example.py` – Lifecycle hooks patterns (security gates, audit logging, input sanitization, combined)

### Integration Tests
Located in `tests/integration/`:
- `test_client_behaviour.py` – Fake AsyncOpenAI client covering streaming, multi-turn history, and tool-call flows without hitting real servers

## Development Status

**Released** – Core functionality is complete and available on PyPI (current release: v0.4.1). Multi-turn conversations, tool monitoring, and streaming are fully implemented.

### Roadmap

- [x] Project planning and architecture
- [x] Core `query()` and `Client` class
- [x] Tool monitoring + `Client.add_tool_result()` helper
- [x] Tool use example (`examples/tool_use_agent.py`)
- [x] PyPI release - published as `open-agent-sdk`
- [ ] Provider compatibility matrix expansion
- [ ] Additional agent examples

### Tested Providers

- ✅ **Ollama** - Fully validated with `kimi-k2:1t-cloud` (cloud-proxied model)
- ✅ **LM Studio** - Fully validated with the `qwen/qwen3-30b` model
- ✅ **llama.cpp** - Fully validated with the TinyLlama 1.1B model

See [docs/provider-compatibility.md](docs/provider-compatibility.md) for detailed test results.

## Documentation

- [docs/technical-design.md](docs/technical-design.md) - Architecture details
- [docs/configuration.md](docs/configuration.md) - Configuration guide
- [docs/provider-compatibility.md](docs/provider-compatibility.md) - Provider test results
- [examples/](examples/) - Usage examples

## Testing

Integration-style tests run entirely against lightweight fakes, so they are safe to execute locally and in pre-commit:

```bash
python -m pytest tests/integration
```

Add `-k` or a specific path to target a subset of the unit tests (`tests/test_client.py`, etc.). If you use a virtual environment, run the tests with `./venv/bin/python -m pytest` instead.

## Pre-commit Hooks

Install hooks once per clone:

```bash
pip install pre-commit
pre-commit install
```

Running `pre-commit run --all-files` executes the formatting checks and the integration tests (`python -m pytest tests/integration`) before you push changes.

## Requirements

- Python 3.10+
- openai 1.0+ (for the AsyncOpenAI client)
- pydantic 2.0+ (for types, optional)

Some servers require a dummy `api_key`; set any non-empty string if needed.

## License

MIT License - see [LICENSE](LICENSE) for details.

## Acknowledgments

- API design inspired by [claude-agent-sdk](https://github.com/anthropics/claude-agent-sdk-python)
- Built for local/open-source LLM enthusiasts

---

**Status**: Alpha - API stabilizing, feedback welcome

Star ⭐ this repo if you're building AI agents with local models!
"bugtrack_url": null,
"license": "MIT",
"summary": "Lightweight Python SDK for local/self-hosted LLMs via OpenAI-compatible endpoints",
"version": "0.4.1",
"project_urls": {
"Documentation": "https://github.com/slb350/open-agent-sdk/tree/main/docs",
"Homepage": "https://github.com/slb350/open-agent-sdk",
"Issues": "https://github.com/slb350/open-agent-sdk/issues",
"Repository": "https://github.com/slb350/open-agent-sdk"
},
"split_keywords": [
"llm",
" ai",
" agent",
" local",
" openai",
" ollama",
" lmstudio",
" llamacpp"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "3b89d5cc764754dc98fa0c5393628fe2d4400f4248d9264370e4a1f7cafbbb60",
"md5": "99fab7d7f9ba1663aa89c7e1f72aacfd",
"sha256": "49762a2eb29cd3125334226148ec1be99d20d027912fe0823c67f16dc3f9f9b4"
},
"downloads": -1,
"filename": "open_agent_sdk-0.4.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "99fab7d7f9ba1663aa89c7e1f72aacfd",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 33491,
"upload_time": "2025-10-18T08:18:19",
"upload_time_iso_8601": "2025-10-18T08:18:19.926934Z",
"url": "https://files.pythonhosted.org/packages/3b/89/d5cc764754dc98fa0c5393628fe2d4400f4248d9264370e4a1f7cafbbb60/open_agent_sdk-0.4.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "df0bc0ce4b0702ef43fe93d17b518d13c5ff5e846dba24290c3026974d1b9733",
"md5": "e205828a76918349747b605958e117ca",
"sha256": "e22493faf1b38d64d114eff579283689430a16162e35d5e19c52b44607b5a22d"
},
"downloads": -1,
"filename": "open_agent_sdk-0.4.1.tar.gz",
"has_sig": false,
"md5_digest": "e205828a76918349747b605958e117ca",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 67711,
"upload_time": "2025-10-18T08:18:20",
"upload_time_iso_8601": "2025-10-18T08:18:20.866738Z",
"url": "https://files.pythonhosted.org/packages/df/0b/c0ce4b0702ef43fe93d17b518d13c5ff5e846dba24290c3026974d1b9733/open_agent_sdk-0.4.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-18 08:18:20",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "slb350",
"github_project": "open-agent-sdk",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "open-agent-sdk"
}