# Open Agent SDK
> Lightweight Python SDK for local/self-hosted LLMs via OpenAI-compatible endpoints
[PyPI](https://pypi.org/project/open-agent-sdk/)
[Python 3.10+](https://www.python.org/downloads/)
[License: MIT](https://opensource.org/licenses/MIT)
## Overview
Open Agent SDK provides a clean, streaming API for working with OpenAI-compatible local model servers, making it easy to build AI agents with your own hardware.
**Use Case**: Build powerful AI agents using local Qwen/Llama/Mistral models without cloud API costs or data privacy concerns.
**Solution**: A drop-in-style API that works with LM Studio, Ollama, llama.cpp, and any OpenAI-compatible endpoint, complete with streaming, tool-call aggregation, and a helper for returning tool results to the model.
## Supported Providers
### ✅ Supported (OpenAI-Compatible Endpoints)
- **LM Studio** - `http://localhost:1234/v1`
- **Ollama** - `http://localhost:11434/v1`
- **llama.cpp server** - OpenAI-compatible mode
- **vLLM** - OpenAI-compatible API
- **Text Generation WebUI** - OpenAI extension
- **Any OpenAI-compatible local endpoint**
- **Local gateways proxying cloud models** - e.g., Ollama or custom gateways that route to cloud providers
### ❌ Not Supported (Use Official SDKs)
- **Claude/OpenAI direct** - Use their official SDKs, unless you proxy through a local OpenAI-compatible gateway
- **Cloud provider SDKs** - Bedrock, Vertex, Azure, etc. (proxied via local gateway is fine)
## Quick Start
### Installation
```bash
pip install open-agent-sdk
```
For development:
```bash
git clone https://github.com/slb350/open-agent-sdk.git
cd open-agent-sdk
pip install -e .
```
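If you want the optional `tiktoken`-based token estimator in a development checkout, the same extras syntax should work (assuming the `[context]` extra from the Context Management section below is defined in `pyproject.toml`):
```bash
pip install -e ".[context]"
```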
### Simple Query (LM Studio)
```python
import asyncio
from open_agent import query, AgentOptions
async def main():
    options = AgentOptions(
        system_prompt="You are a professional copy editor",
        model="qwen2.5-32b-instruct",
        base_url="http://localhost:1234/v1",
        max_turns=1,
        temperature=0.1
    )

    result = query(prompt="Analyze this text...", options=options)

    response_text = ""
    async for msg in result:
        if hasattr(msg, 'content'):
            for block in msg.content:
                if hasattr(block, 'text'):
                    response_text += block.text

    print(response_text)

asyncio.run(main())
```
### Multi-Turn Conversation (Ollama)
```python
import asyncio

from open_agent import Client, AgentOptions, TextBlock, ToolUseBlock
from open_agent.config import get_base_url

def run_my_tool(name: str, params: dict) -> dict:
    # Replace with your tool execution logic
    return {"result": f"stub output for {name}"}

async def main():
    options = AgentOptions(
        system_prompt="You are a helpful assistant",
        model="kimi-k2:1t-cloud",  # Use your available Ollama model
        base_url=get_base_url(provider="ollama"),
        max_turns=10
    )

    async with Client(options) as client:
        await client.query("What's the capital of France?")

        async for msg in client.receive_messages():
            if isinstance(msg, TextBlock):
                print(f"Assistant: {msg.text}")
            elif isinstance(msg, ToolUseBlock):
                print(f"Tool used: {msg.name}")
                tool_result = run_my_tool(msg.name, msg.input)
                await client.add_tool_result(msg.id, tool_result)

asyncio.run(main())
```
See `examples/tool_use_agent.py` for progressively richer patterns (manual loop, helper function, and reusable agent class) demonstrating `add_tool_result()` in context.
### Function Calling with Tools
Define tools using the `@tool` decorator for clean, type-safe function calling:
```python
from open_agent import tool, Client, AgentOptions, TextBlock, ToolUseBlock
# Define tools
@tool("get_weather", "Get current weather", {"location": str, "units": str})
async def get_weather(args):
    return {
        "temperature": 72,
        "conditions": "sunny",
        "units": args["units"]
    }

@tool("calculate", "Perform calculation", {"a": float, "b": float, "op": str})
async def calculate(args):
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b}
    result = ops[args["op"]](args["a"], args["b"])
    return {"result": result}

# Enable automatic tool execution (recommended)
options = AgentOptions(
    system_prompt="You are a helpful assistant with access to tools.",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[get_weather, calculate],
    auto_execute_tools=True,  # 🔥 Tools execute automatically
    max_tool_iterations=10    # Safety limit for tool loops
)

async with Client(options) as client:
    await client.query("What's 25 + 17?")

    # Simply iterate - tools execute automatically!
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            print(f"Tool called: {block.name}")
        elif isinstance(block, TextBlock):
            print(f"Response: {block.text}")
```
**Advanced: Manual Tool Execution**
For custom execution logic or result interception:
```python
# Disable auto-execution
options = AgentOptions(
    system_prompt="You are a helpful assistant with access to tools.",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[get_weather, calculate],
    auto_execute_tools=False  # Manual mode
)

async with Client(options) as client:
    await client.query("What's 25 + 17?")

    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            # You execute the tool manually
            tool = {"calculate": calculate, "get_weather": get_weather}[block.name]
            result = await tool.execute(block.input)

            # Return result to agent
            await client.add_tool_result(block.id, result)

    # Continue conversation
    await client.query("")
```
**Key Features:**
- **Automatic execution** (v0.3.0+) - Tools run automatically with safety limits
- **Type-safe schemas** - Simple Python types (`str`, `int`, `float`, `bool`) or full JSON Schema
- **OpenAI-compatible** - Works with any OpenAI function calling endpoint
- **Clean decorator API** - Similar to Claude SDK's `@tool`
- **Hook integration** - PreToolUse/PostToolUse hooks work in both modes
See `examples/calculator_tools.py` and `examples/simple_tool.py` for complete examples.
## Context Management
Local models have fixed context windows (typically 8k-32k tokens). The SDK provides **opt-in utilities** for manual history management: no silent mutations, you stay in control.
### Token Estimation & Truncation
```python
from open_agent import Client, AgentOptions
from open_agent.context import estimate_tokens, truncate_messages
async with Client(options) as client:
    # Long conversation...
    for i in range(50):
        await client.query(f"Question {i}")
        async for msg in client.receive_messages():
            pass

    # Check token usage
    tokens = estimate_tokens(client.history)
    print(f"Context size: ~{tokens} tokens")

    # Manually truncate when needed
    if tokens > 28000:
        client.message_history = truncate_messages(client.history, keep=10)
```
### Recommended Patterns
**1. Stateless Agents** (Best for single-task agents):
```python
# Process each task independently - no history accumulation
for task in tasks:
    async with Client(options) as client:
        await client.query(task)
        # Client disposed, fresh context for next task
```
**2. Manual Truncation** (At natural breakpoints):
```python
from open_agent.context import truncate_messages
async with Client(options) as client:
    for task in tasks:
        await client.query(task)
        # Truncate after each major task
        client.message_history = truncate_messages(client.history, keep=5)
```
**3. External Memory** (RAG-lite for research agents):
```python
# Store important facts in database, keep conversation context small
database = {}
async with Client(options) as client:
    await client.query("Research topic X")
    # Save response to database (collect `response` from receive_messages())
    database["topic_x"] = extract_facts(response)

    # Clear history, query database when needed
    client.message_history = truncate_messages(client.history, keep=0)
```
### Why Manual?
The SDK **intentionally** does not auto-compact history because:
- **Domain-specific needs**: Copy editors need different strategies than research agents
- **Token accuracy varies**: Each model family has different tokenizers
- **Risk of breaking context**: Silently removing messages could break tool chains
- **Natural limits exist**: Compaction doesn't bypass model context windows
### Installing Token Estimation
For better token estimation accuracy (optional):
```bash
pip install open-agent-sdk[context] # Adds tiktoken
```
Without `tiktoken`, estimation falls back to a character-based approximation (~75-85% accurate).
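The fallback is in the spirit of the common ~4-characters-per-token rule of thumb. A minimal sketch of the idea (not the SDK's exact heuristic):
```python
def rough_token_estimate(messages: list[dict]) -> int:
    # ~4 characters per token is a rough rule of thumb for English text;
    # the SDK's actual character-based fallback may weight content differently.
    chars = sum(len(str(m.get("content") or "")) for m in messages)
    return chars // 4
```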
See `examples/context_management.py` for complete patterns and usage.
## Lifecycle Hooks
Monitor and control agent behavior at key execution points with Pythonic lifecycle hooks: no subprocess overhead or JSON protocols.
### Quick Example
```python
from open_agent import (
    AgentOptions, Client, ToolUseBlock,
    PreToolUseEvent, PostToolUseEvent, UserPromptSubmitEvent,
    HookDecision,
    HOOK_PRE_TOOL_USE, HOOK_POST_TOOL_USE, HOOK_USER_PROMPT_SUBMIT
)

# Security gate - block dangerous operations
async def security_gate(event: PreToolUseEvent) -> HookDecision | None:
    if event.tool_name == "delete_file":
        return HookDecision(
            continue_=False,
            reason="Delete operations require approval"
        )
    return None  # Allow by default

# Audit logger - track all tool executions
async def audit_logger(event: PostToolUseEvent) -> None:
    print(f"Tool executed: {event.tool_name} -> {event.tool_result}")
    return None

# Input sanitizer - validate user prompts
async def sanitize_input(event: UserPromptSubmitEvent) -> HookDecision | None:
    if "DELETE" in event.prompt.upper():
        return HookDecision(
            continue_=False,
            reason="Dangerous keywords detected"
        )
    return None

# Register hooks in AgentOptions
options = AgentOptions(
    system_prompt="You are a helpful assistant",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    tools=[my_file_tool, my_search_tool],
    hooks={
        HOOK_PRE_TOOL_USE: [security_gate],
        HOOK_POST_TOOL_USE: [audit_logger],
        HOOK_USER_PROMPT_SUBMIT: [sanitize_input],
    }
)

async with Client(options) as client:
    await client.query("Write to /etc/config")  # UserPromptSubmit fires
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):  # PreToolUse fires
            tool = lookup_tool(block.name)   # hypothetical helper: map name -> Tool
            result = await tool.execute(block.input)
            await client.add_tool_result(block.id, result)  # PostToolUse fires
```
### Hook Types
**PreToolUse** - Fires before tool execution (or yielding to user)
- **Block operations**: Return `HookDecision(continue_=False, reason="...")`
- **Modify inputs**: Return `HookDecision(modified_input={...}, reason="...")`
- **Allow**: Return `None`
**PostToolUse** - Fires after tool result added to history
- **Observational only** (tool already executed)
- Use for audit logging, metrics, result validation
- Return `None` (decision ignored for PostToolUse)
**UserPromptSubmit** - Fires before sending prompt to API
- **Block prompts**: Return `HookDecision(continue_=False, reason="...")`
- **Modify prompts**: Return `HookDecision(modified_prompt="...", reason="...")`
- **Allow**: Return `None`
### Common Patterns
**Pattern 1: Redirect to Sandbox**
```python
async def redirect_to_sandbox(event: PreToolUseEvent) -> HookDecision | None:
    """Redirect file operations to safe directory."""
    if event.tool_name == "file_writer":
        path = event.tool_input.get("path", "")
        if not path.startswith("/tmp/"):
            safe_path = f"/tmp/sandbox/{path.lstrip('/')}"
            return HookDecision(
                modified_input={"path": safe_path, "content": event.tool_input.get("content", "")},
                reason="Redirected to sandbox"
            )
    return None
```
**Pattern 2: Compliance Audit Log**
```python
from datetime import datetime

audit_log = []

async def compliance_logger(event: PostToolUseEvent) -> None:
    """Log all tool executions for compliance."""
    audit_log.append({
        "timestamp": datetime.now(),
        "tool": event.tool_name,
        "input": event.tool_input,
        "result": str(event.tool_result)[:100],
        "user": get_current_user()  # your app's identity helper
    })
    return None
```
**Pattern 3: Safety Instructions**
```python
async def add_safety_warning(event: UserPromptSubmitEvent) -> HookDecision | None:
    """Add safety instructions to risky prompts."""
    if "write" in event.prompt.lower() or "delete" in event.prompt.lower():
        safe_prompt = event.prompt + " (Please confirm this is safe before proceeding)"
        return HookDecision(
            modified_prompt=safe_prompt,
            reason="Added safety warning"
        )
    return None
```
### Hook Execution Flow
- Hooks run **sequentially** in the order registered
- **First non-None decision wins** (short-circuit behavior)
- Hooks run **inline on event loop** (spawn tasks for heavy work)
- Works with both **Client** and **query()** function
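To make the dispatch rules concrete, here is a minimal sketch of how a conforming dispatcher could work (an illustration of the documented behavior, not the SDK's actual internals):
```python
from typing import Any, Awaitable, Callable, Optional

HookFn = Callable[[Any], Awaitable[Optional[Any]]]

async def dispatch(hooks: list[HookFn], event: Any) -> Optional[Any]:
    for hook in hooks:                # sequential, in registration order
        decision = await hook(event)  # runs inline on the event loop
        if decision is not None:
            return decision           # first non-None decision wins
    return None                       # all hooks returned None -> allow
```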
### Breaking Change (v0.2.4)
`Client.add_tool_result()` is now async to support PostToolUse hooks:
```python
# Old (v0.2.3 and earlier)
client.add_tool_result(tool_id, result)
# New (v0.2.4+)
await client.add_tool_result(tool_id, result)
```
### Why Hooks?
- **Security gates**: Block dangerous operations before they execute
- **Audit logging**: Track all tool executions for compliance
- **Input validation**: Sanitize user prompts before processing
- **Monitoring**: Observe agent behavior in production
- **Control flow**: Modify tool inputs or redirect operations
See `examples/hooks_example.py` for 4 comprehensive patterns (security, audit, sanitization, combined).
## Interrupt Capability
Cancel long-running operations cleanly without corrupting client state. Perfect for timeouts, user cancellations, or conditional interruptions.
### Quick Example
```python
from open_agent import Client, AgentOptions
import asyncio

async def main():
    options = AgentOptions(
        system_prompt="You are a helpful assistant.",
        model="qwen2.5-32b-instruct",
        base_url="http://localhost:1234/v1"
    )

    async with Client(options) as client:
        await client.query("Write a detailed 1000-word essay...")

        # Timeout after 5 seconds
        try:
            async def collect_messages():
                async for block in client.receive_messages():
                    print(block.text, end="", flush=True)

            await asyncio.wait_for(collect_messages(), timeout=5.0)
        except asyncio.TimeoutError:
            await client.interrupt()  # Clean cancellation
            print("\n⚠️ Operation timed out!")

        # Client is still usable after interrupt
        await client.query("Short question?")
        async for block in client.receive_messages():
            print(block.text)

asyncio.run(main())
```
### Common Patterns
**1. Timeout-Based Interruption**
```python
try:
    await asyncio.wait_for(process_messages(client), timeout=10.0)
except asyncio.TimeoutError:
    await client.interrupt()
    print("Operation timed out")
```
**2. Conditional Interruption**
```python
# Stop if response contains specific content
full_text = ""
async for block in client.receive_messages():
    full_text += block.text
    if "error" in full_text.lower():
        await client.interrupt()
        break
```
**3. User Cancellation (from separate task)**
```python
async def stream_task():
    await client.query("Long task...")
    async for block in client.receive_messages():
        print(block.text, end="")

async def cancel_button_task():
    await asyncio.sleep(2.0)  # User waits 2 seconds
    await client.interrupt()  # User clicks cancel

# Run both concurrently
await asyncio.gather(stream_task(), cancel_button_task())
```
**4. Interrupt During Auto-Execution**
```python
options = AgentOptions(
    tools=[slow_tool, fast_tool],
    auto_execute_tools=True,
    max_tool_iterations=10
)

async with Client(options) as client:
    await client.query("Use tools...")

    tool_count = 0
    async for block in client.receive_messages():
        if isinstance(block, ToolUseBlock):
            tool_count += 1
            if tool_count >= 2:
                await client.interrupt()  # Stop after 2 tools
                break
```
### How It Works
When you call `client.interrupt()`:
1. **Active stream closure** - HTTP stream closed immediately (not just a flag)
2. **Clean state** - Client remains in valid state for reuse
3. **Partial output** - Text blocks flushed to history, incomplete tools skipped
4. **Idempotent** - Safe to call multiple times
5. **Concurrent-safe** - Can be called from separate asyncio tasks
### Example
See `examples/interrupt_demo.py` for 5 comprehensive patterns:
- Timeout-based interruption
- Conditional interruption
- Auto-execution interruption
- Concurrent interruption (simulated cancel button)
- Interrupt and retry
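The interrupt-and-retry idea, in miniature (a sketch built on the documented guarantee that the client stays usable after `interrupt()`; see the example file for the full version):
```python
import asyncio

async def ask_with_retry(client, prompt: str, timeout: float = 10.0, retries: int = 1) -> bool:
    for attempt in range(retries + 1):
        await client.query(prompt)
        try:
            async def drain():
                async for block in client.receive_messages():
                    print(block.text, end="", flush=True)
            await asyncio.wait_for(drain(), timeout=timeout)
            return True  # completed within the time budget
        except asyncio.TimeoutError:
            await client.interrupt()  # client remains valid for the next attempt
    return False  # every attempt timed out
```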
## 🚀 Practical Examples
We've included two production-ready agents that demonstrate real-world usage:
### 📝 Git Commit Agent
**[examples/git_commit_agent.py](examples/git_commit_agent.py)**
Analyzes your staged git changes and writes professional commit messages following conventional commit format.
```bash
# Stage your changes
git add .
# Run the agent
python examples/git_commit_agent.py
# Output:
# ✓ Found staged changes in 3 file(s)
# 🤖 Analyzing changes and generating commit message...
#
# 📝 Suggested commit message:
# feat(auth): Add OAuth2 integration with refresh tokens
#
# - Implement token refresh mechanism
# - Add secure cookie storage for tokens
# - Update login flow to support OAuth2 providers
# - Add tests for token expiration handling
```
**Features:**
- Analyzes the diff to determine commit type (feat/fix/docs/etc.)
- Writes clear, descriptive commit messages
- Interactive mode: accept, edit, or regenerate
- Follows conventional commit standards
### 📊 Log Analyzer Agent
**[examples/log_analyzer_agent.py](examples/log_analyzer_agent.py)**
Intelligently analyzes application logs to identify patterns, errors, and provide actionable insights.
```bash
# Analyze a log file
python examples/log_analyzer_agent.py /var/log/app.log
# Analyze with a specific time window
python examples/log_analyzer_agent.py app.log --since "2025-10-15T00:00:00" --until "2025-10-15T12:00:00"
# Interactive mode for drilling down
python examples/log_analyzer_agent.py app.log --interactive
```
**Features:**
- Automatic error pattern detection
- Time-based analysis (peak error times)
- Root cause suggestions
- Interactive mode for investigating specific issues
- Supports multiple log formats (JSON, Apache, syslog, etc.)
- Time range filtering with `--since` / `--until`
**Sample Output:**
```
📊 Log Summary:
  Total entries: 45,231
  Errors: 127 (0.3%)
  Warnings: 892

🔴 Top Error Patterns:
  - Connection Error: 67 occurrences
  - NullPointerException: 23 occurrences
  - Timeout Error: 19 occurrences

⏰ Peak error time: 2025-10-15T14:00:00
   Errors in that hour: 43

🤖 ANALYSIS REPORT:
Main Issues (Priority Order):
1. Database connection pool exhaustion during peak hours
2. Unhandled null values in user authentication flow
3. External API timeouts affecting payment processing
Recommendations:
1. Increase connection pool size from 10 to 25
2. Add null checks in AuthService.validateUser() method
3. Implement circuit breaker for payment API with 30s timeout
```
### Why These Examples?
These agents demonstrate:
- **Practical Value**: Solve real problems developers face daily
- **Tool Integration**: Show how to integrate with system commands (git, file I/O)
- **Multi-turn Conversations**: Interactive modes for complex analysis
- **Structured Output**: Parse and format LLM responses for actionable results
- **Privacy-First**: Keep your code and logs local while getting AI assistance
## Configuration
Open Agent SDK uses config helpers to provide flexible configuration via environment variables, provider shortcuts, or explicit parameters:
### Environment Variables (Recommended)
```bash
export OPEN_AGENT_BASE_URL="http://localhost:1234/v1"
export OPEN_AGENT_MODEL="qwen/qwen3-30b-a3b-2507"
```
```python
from open_agent import AgentOptions
from open_agent.config import get_model, get_base_url
# Config helpers read from environment
options = AgentOptions(
    system_prompt="...",
    model=get_model(),       # Reads OPEN_AGENT_MODEL
    base_url=get_base_url()  # Reads OPEN_AGENT_BASE_URL
)
```
### Provider Shortcuts
```python
from open_agent.config import get_base_url
# Use built-in defaults for common providers
options = AgentOptions(
    system_prompt="...",
    model="llama3.1:70b",
    base_url=get_base_url(provider="ollama")  # → http://localhost:11434/v1
)
```
**Available providers**: `lmstudio`, `ollama`, `llamacpp`, `vllm`
### Fallback Values
```python
# Provide fallbacks when env vars not set
options = AgentOptions(
    system_prompt="...",
    model=get_model("qwen2.5-32b-instruct"),    # Fallback model
    base_url=get_base_url(provider="lmstudio")  # Fallback URL
)
```
**Configuration Priority:**
1. Environment variable (default behaviour)
2. Fallback value passed to the config helper
3. Provider default (for `base_url` only)
Need to force a specific model even when `OPEN_AGENT_MODEL` is set? Call `get_model("model-name", prefer_env=False)` to ignore the environment variable for that lookup.
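For example:
```python
from open_agent.config import get_model

# Returns "qwen2.5-32b-instruct" even when OPEN_AGENT_MODEL is set,
# because prefer_env=False skips the environment lookup.
model = get_model("qwen2.5-32b-instruct", prefer_env=False)
```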
**Benefits:**
- Switch between dev/prod by changing environment variables
- No hardcoded URLs or model names
- Per-agent overrides when needed
See [docs/configuration.md](docs/configuration.md) for complete guide.
## Why Not Just Use OpenAI Client?
**Without open-agent-sdk** (raw OpenAI client):
```python
from openai import AsyncOpenAI
client = AsyncOpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")
response = await client.chat.completions.create(
    model="qwen2.5-32b-instruct",
    messages=[{"role": "system", "content": system_prompt},
              {"role": "user", "content": user_prompt}],
    stream=True
)

async for chunk in response:
    # Complex parsing of chunks
    # Extract delta content
    # Handle tool calls manually
    # Track conversation state yourself
    ...
```
**With open-agent-sdk**:
```python
from open_agent import query, AgentOptions
options = AgentOptions(
    system_prompt=system_prompt,
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1"
)

result = query(prompt=user_prompt, options=options)
async for msg in result:
    # Clean message types (TextBlock, ToolUseBlock)
    # Automatic streaming and tool call handling
    ...
```
**Value**: Familiar patterns + Less boilerplate + Easy migration
## API Reference
### AgentOptions
```python
class AgentOptions:
    system_prompt: str                          # System prompt
    model: str                                  # Model name (required)
    base_url: str                               # OpenAI-compatible endpoint URL (required)
    tools: list[Tool] = []                      # Tool instances for function calling
    hooks: dict[str, list[HookHandler]] = None  # Lifecycle hooks for monitoring/control
    auto_execute_tools: bool = False            # Enable automatic tool execution (v0.3.0+)
    max_tool_iterations: int = 5                # Max tool calls per query in auto mode
    max_turns: int = 1                          # Max conversation turns
    max_tokens: int | None = 4096               # Tokens to generate (None uses provider default)
    temperature: float = 0.7                    # Sampling temperature
    timeout: float = 60.0                       # Request timeout in seconds
    api_key: str = "not-needed"                 # Most local servers don't need this
```
**Note**: Use config helpers (`get_model()`, `get_base_url()`) for environment variable and provider support.
### query()
Simple single-turn query function.
```python
async def query(prompt: str, options: AgentOptions) -> AsyncGenerator
```
Returns an async generator yielding messages.
### Client
Multi-turn conversation client with tool monitoring.
```python
async with Client(options) as client:    # options: AgentOptions
    await client.query(prompt)           # prompt: str
    async for msg in client.receive_messages():
        ...  # Process messages
```
### Message Types
- `TextBlock` - Text content from model
- `ToolUseBlock` - Tool calls from model (has `id`, `name`, `input` fields)
- `ToolResultBlock` - Tool execution results to send back to model
- `ToolUseError` - Tool call parsing error (malformed JSON, missing fields)
- `AssistantMessage` - Full message wrapper
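A typical receive loop branches on these block types (a sketch, assuming `ToolUseError` is importable from the package root like the other block types):
```python
from open_agent import TextBlock, ToolUseBlock, ToolUseError

async def drain(client):
    async for msg in client.receive_messages():
        if isinstance(msg, TextBlock):
            print(msg.text, end="")
        elif isinstance(msg, ToolUseBlock):
            print(f"\n[tool call] {msg.name}({msg.input}) id={msg.id}")
        elif isinstance(msg, ToolUseError):
            # The shape of the error object is an assumption; it reports a
            # malformed tool call (bad JSON, missing fields)
            print(f"\n[tool call error] {msg}")
```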
### Tool System
```python
@tool(name: str, description: str, input_schema: dict)
async def my_tool(args: dict) -> Any:
    """Tool handler function"""
    return result

# Tool class
class Tool:
    name: str
    description: str
    input_schema: dict[str, type] | dict[str, Any]
    handler: Callable[[dict], Awaitable[Any]]

    async def execute(arguments: dict) -> Any
    def to_openai_format() -> dict
```
**Schema formats:**
- Simple: `{"param": str, "count": int}` - All parameters required
- JSON Schema: Full schema with `type`, `properties`, `required`, etc.
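For example, the same tool declared both ways; the JSON Schema form lets you mark a parameter optional (a sketch, assuming the decorator passes full schemas through unchanged):
```python
from open_agent import tool

# Simple form: both parameters required
@tool("search", "Search documents", {"query": str, "limit": int})
async def search_simple(args):
    return {"hits": []}

# JSON Schema form: "limit" is optional, defaulted in the handler
@tool("search_v2", "Search documents", {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer", "minimum": 1},
    },
    "required": ["query"],
})
async def search_full(args):
    return {"hits": [], "limit": args.get("limit", 10)}
```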
### Hooks System
```python
# Event types
@dataclass
class PreToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class PostToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_result: Any
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class UserPromptSubmitEvent:
    prompt: str
    history: list[dict[str, Any]]

# Hook decision
@dataclass
class HookDecision:
    continue_: bool = True
    modified_input: dict[str, Any] | None = None
    modified_prompt: str | None = None
    reason: str | None = None

# Hook handler signature
HookHandler = Callable[[HookEvent], Awaitable[HookDecision | None]]

# Hook constants
HOOK_PRE_TOOL_USE = "pre_tool_use"
HOOK_POST_TOOL_USE = "post_tool_use"
HOOK_USER_PROMPT_SUBMIT = "user_prompt_submit"
```
**Hook behavior:**
- Return `None` to allow by default
- Return `HookDecision(continue_=False)` to block
- Return `HookDecision(modified_input={...})` to modify (PreToolUse)
- Return `HookDecision(modified_prompt="...")` to modify (UserPromptSubmit)
- Raise exception to abort entirely
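The last rule is the escape hatch: raising from a hook aborts the operation outright rather than blocking it with a reason. A sketch (the tool name is hypothetical):
```python
async def hard_gate(event) -> None:
    # Unlike HookDecision(continue_=False), an exception aborts entirely
    if event.tool_name == "wipe_database":  # hypothetical tool name
        raise RuntimeError("wipe_database must never run")
    return None
```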
## Recommended Models
**Local models** (LM Studio, Ollama, llama.cpp):
- **GPT-OSS-120B** - Best in class for speed and quality
- **Qwen 3 30B** - Excellent instruction following, good for most tasks
- **GPT-OSS-20B** - Solid all-around performance
- **Mistral 7B** - Fast and efficient for simple agents
**Cloud-proxied via local gateway** (Ollama cloud provider, custom gateway):
- **kimi-k2:1t-cloud** - Tested and working via Ollama gateway
- **deepseek-v3.1:671b-cloud** - High-quality reasoning model
- **qwen3-coder:480b-cloud** - Code-focused models
- Your `base_url` still points to localhost gateway (e.g., `http://localhost:11434/v1`)
- Gateway handles authentication and routing to cloud provider
- Useful when you need larger models than your hardware can run locally
**Architecture guidance:**
- Prefer MoE (Mixture of Experts) models over dense when available - significantly faster
- Start with 7B-30B models for most agent tasks - they're fast and capable
- Test models with your specific use case - the LLM landscape changes rapidly
## Project Structure
```
open-agent-sdk/
├── open_agent/
│   ├── __init__.py            # query, Client, AgentOptions exports
│   ├── client.py              # Streaming query(), Client, tool helper
│   ├── config.py              # Env/provider helpers
│   ├── context.py             # Token estimation and truncation utilities
│   ├── hooks.py               # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│   ├── tools.py               # Tool decorator and schema conversion
│   ├── types.py               # Dataclasses for options and blocks
│   └── utils.py               # OpenAI client + ToolCallAggregator
├── docs/
│   ├── configuration.md
│   ├── provider-compatibility.md
│   └── technical-design.md
├── examples/
│   ├── git_commit_agent.py    # 🌟 Practical: Git commit message generator
│   ├── log_analyzer_agent.py  # 🌟 Practical: Log file analyzer
│   ├── calculator_tools.py    # Function calling with @tool decorator
│   ├── simple_tool.py         # Minimal tool usage example
│   ├── tool_use_agent.py      # Complete tool use patterns
│   ├── context_management.py  # Manual history management patterns
│   ├── hooks_example.py       # Lifecycle hooks patterns (security, audit, sanitization)
│   ├── interrupt_demo.py      # Interrupt capability patterns (timeout, conditional, concurrent)
│   ├── simple_lmstudio.py     # Basic usage with LM Studio
│   ├── ollama_chat.py         # Multi-turn chat example
│   ├── config_examples.py     # Configuration patterns
│   └── simple_with_env.py     # Environment variable config
├── tests/
│   ├── integration/           # Integration-style tests using fakes
│   │   └── test_client_behaviour.py  # Streaming, multi-turn, tool flow coverage
│   ├── test_agent_options.py
│   ├── test_auto_execution.py # Automatic tool execution
│   ├── test_client.py
│   ├── test_config.py
│   ├── test_context.py        # Context utilities (token estimation, truncation)
│   ├── test_hooks.py          # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│   ├── test_interrupt.py      # Interrupt capability (timeout, concurrent, reuse)
│   ├── test_query.py
│   ├── test_tools.py          # Tool decorator and schema conversion
│   └── test_utils.py
├── CHANGELOG.md
├── pyproject.toml
└── README.md
```
## Examples
### 🌟 Practical Agents (Production-Ready)
- **`git_commit_agent.py`** – Analyzes git diffs and writes professional commit messages
- **`log_analyzer_agent.py`** – Parses logs, finds patterns, suggests fixes with interactive mode
- **`tool_use_agent.py`** – Complete tool use patterns: manual, helper, and agent class
### Core SDK Usage
- `simple_lmstudio.py` – Minimal streaming query with hard-coded config (simplest quickstart)
- `simple_with_env.py` – Using environment variables with config helpers and fallbacks
- `config_examples.py` – Comprehensive reference: provider shortcuts, priority, and all config patterns
- `ollama_chat.py` – Multi-turn chat loop with Ollama, including tool-call logging
- `context_management.py` – Manual history management patterns (stateless, truncation, token monitoring, RAG-lite)
- `hooks_example.py` – Lifecycle hooks patterns (security gates, audit logging, input sanitization, combined)
### Integration Tests
Located in `tests/integration/`:
- `test_client_behaviour.py` – Fake AsyncOpenAI client covering streaming, multi-turn history, and tool-call flows without hitting real servers
## Development Status
**Released v0.1.0** – Core functionality is complete and available on PyPI. Multi-turn conversations, tool monitoring, and streaming are fully implemented.
### Roadmap
- [x] Project planning and architecture
- [x] Core `query()` and `Client` class
- [x] Tool monitoring + `Client.add_tool_result()` helper
- [x] Tool use example (`examples/tool_use_agent.py`)
- [x] PyPI release - Published as `open-agent-sdk`
- [ ] Provider compatibility matrix expansion
- [ ] Additional agent examples
### Tested Providers
- ✅ **Ollama** - Fully validated with `kimi-k2:1t-cloud` (cloud-proxied model)
- ✅ **LM Studio** - Fully validated with `qwen/qwen3-30b` model
- ✅ **llama.cpp** - Fully validated with TinyLlama 1.1B model
See [docs/provider-compatibility.md](docs/provider-compatibility.md) for detailed test results.
## Documentation
- [docs/technical-design.md](docs/technical-design.md) - Architecture details
- [docs/configuration.md](docs/configuration.md) - Configuration guide
- [docs/provider-compatibility.md](docs/provider-compatibility.md) - Provider test results
- [examples/](examples/) - Usage examples
## Testing
Integration-style tests run entirely against lightweight fakes, so they are safe to execute locally and in pre-commit:
```bash
python -m pytest tests/integration
```
Add `-k` or a specific path when you want to target a subset of the unit tests (`tests/test_client.py`, etc.). If you use a virtual environment, prefix commands with `./venv/bin/python -m`.
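For example (the `-k` substring is hypothetical):
```bash
python -m pytest tests/test_client.py        # one unit-test module
python -m pytest tests/integration -k tool   # filter tests by name substring
```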
## Pre-commit Hooks
Install hooks once per clone:
```bash
pip install pre-commit
pre-commit install
```
Running `pre-commit run --all-files` will execute formatting checks and the integration tests (`python -m pytest tests/integration`) before you push changes.
## Requirements
- Python 3.10+
- openai 1.0+ (for AsyncOpenAI client)
- pydantic 2.0+ (for types, optional)
- Some servers require a dummy `api_key`; set any non-empty string if needed
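For those servers, any placeholder satisfies the check (the endpoint URL here is hypothetical):
```python
from open_agent import AgentOptions

options = AgentOptions(
    system_prompt="...",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:8000/v1",  # hypothetical server that wants a key
    api_key="sk-local-placeholder",       # any non-empty string works
)
```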
## License
MIT License - see [LICENSE](LICENSE) for details.
## Acknowledgments
- API design inspired by [claude-agent-sdk](https://github.com/anthropics/claude-agent-sdk-python)
- Built for local/open-source LLM enthusiasts
---
**Status**: Alpha - API stabilizing, feedback welcome
Star ⭐ this repo if you're building AI agents with local models!
Raw data
{
"_id": null,
"home_page": null,
"name": "open-agent-sdk",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "llm, ai, agent, local, openai, ollama, lmstudio, llamacpp",
"author": "Open Agent SDK Contributors",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/df/0b/c0ce4b0702ef43fe93d17b518d13c5ff5e846dba24290c3026974d1b9733/open_agent_sdk-0.4.1.tar.gz",
"platform": null,
"description": "# Open Agent SDK\n\n> Lightweight Python SDK for local/self-hosted LLMs via OpenAI-compatible endpoints\n\n[](https://pypi.org/project/open-agent-sdk/)\n[](https://www.python.org/downloads/)\n[](https://opensource.org/licenses/MIT)\n\n## Overview\n\nOpen Agent SDK provides a clean, streaming API for working with OpenAI-compatible local model servers, making it easy to build AI agents with your own hardware.\n\n**Use Case**: Build powerful AI agents using local Qwen/Llama/Mistral models without cloud API costs or data privacy concerns.\n\n**Solution**: Drop-in similar API that works with LM Studio, Ollama, llama.cpp, and any OpenAI-compatible endpoint\u2014complete with streaming, tool call aggregation, and a helper for returning tool results back to the model.\n\n## Supported Providers\n\n### \u2705 Supported (OpenAI-Compatible Endpoints)\n\n- **LM Studio** - `http://localhost:1234/v1`\n- **Ollama** - `http://localhost:11434/v1`\n- **llama.cpp server** - OpenAI-compatible mode\n- **vLLM** - OpenAI-compatible API\n- **Text Generation WebUI** - OpenAI extension\n- **Any OpenAI-compatible local endpoint**\n- **Local gateways proxying cloud models** - e.g., Ollama or custom gateways that route to cloud providers\n\n### \u274c Not Supported (Use Official SDKs)\n\n- **Claude/OpenAI direct** - Use their official SDKs, unless you proxy through a local OpenAI-compatible gateway\n- **Cloud provider SDKs** - Bedrock, Vertex, Azure, etc. (proxied via local gateway is fine)\n\n## Quick Start\n\n### Installation\n\n```bash\npip install open-agent-sdk\n```\n\nFor development:\n\n```bash\ngit clone https://github.com/slb350/open-agent-sdk.git\ncd open-agent-sdk\npip install -e .\n```\n\n### Simple Query (LM Studio)\n\n```python\nimport asyncio\nfrom open_agent import query, AgentOptions\n\nasync def main():\n options = AgentOptions(\n system_prompt=\"You are a professional copy editor\",\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\",\n max_turns=1,\n temperature=0.1\n )\n\n result = query(prompt=\"Analyze this text...\", options=options)\n\n response_text = \"\"\n async for msg in result:\n if hasattr(msg, 'content'):\n for block in msg.content:\n if hasattr(block, 'text'):\n response_text += block.text\n\n print(response_text)\n\nasyncio.run(main())\n```\n\n### Multi-Turn Conversation (Ollama)\n\n```python\nfrom open_agent import Client, AgentOptions, TextBlock, ToolUseBlock\nfrom open_agent.config import get_base_url\n\ndef run_my_tool(name: str, params: dict) -> dict:\n # Replace with your tool execution logic\n return {\"result\": f\"stub output for {name}\"}\n\nasync def main():\n options = AgentOptions(\n system_prompt=\"You are a helpful assistant\",\n model=\"kimi-k2:1t-cloud\", # Use your available Ollama model\n base_url=get_base_url(provider=\"ollama\"),\n max_turns=10\n )\n\n async with Client(options) as client:\n await client.query(\"What's the capital of France?\")\n\n async for msg in client.receive_messages():\n if isinstance(msg, TextBlock):\n print(f\"Assistant: {msg.text}\")\n elif isinstance(msg, ToolUseBlock):\n print(f\"Tool used: {msg.name}\")\n tool_result = run_my_tool(msg.name, msg.input)\n client.add_tool_result(msg.id, tool_result)\n\nasyncio.run(main())\n```\n\nSee `examples/tool_use_agent.py` for progressively richer patterns (manual loop, helper function, and reusable agent class) demonstrating `add_tool_result()` in context.\n\n### Function Calling with Tools\n\nDefine tools using the `@tool` decorator for clean, type-safe 
function calling:\n\n```python\nfrom open_agent import tool, Client, AgentOptions, TextBlock, ToolUseBlock\n\n# Define tools\n@tool(\"get_weather\", \"Get current weather\", {\"location\": str, \"units\": str})\nasync def get_weather(args):\n return {\n \"temperature\": 72,\n \"conditions\": \"sunny\",\n \"units\": args[\"units\"]\n }\n\n@tool(\"calculate\", \"Perform calculation\", {\"a\": float, \"b\": float, \"op\": str})\nasync def calculate(args):\n ops = {\"+\": lambda a, b: a + b, \"-\": lambda a, b: a - b}\n result = ops[args[\"op\"]](args[\"a\"], args[\"b\"])\n return {\"result\": result}\n\n# Enable automatic tool execution (recommended)\noptions = AgentOptions(\n system_prompt=\"You are a helpful assistant with access to tools.\",\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\",\n tools=[get_weather, calculate],\n auto_execute_tools=True, # \ud83d\udd25 Tools execute automatically\n max_tool_iterations=10 # Safety limit for tool loops\n)\n\nasync with Client(options) as client:\n await client.query(\"What's 25 + 17?\")\n\n # Simply iterate - tools execute automatically!\n async for block in client.receive_messages():\n if isinstance(block, ToolUseBlock):\n print(f\"Tool called: {block.name}\")\n elif isinstance(block, TextBlock):\n print(f\"Response: {block.text}\")\n```\n\n**Advanced: Manual Tool Execution**\n\nFor custom execution logic or result interception:\n\n```python\n# Disable auto-execution\noptions = AgentOptions(\n system_prompt=\"You are a helpful assistant with access to tools.\",\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\",\n tools=[get_weather, calculate],\n auto_execute_tools=False # Manual mode\n)\n\nasync with Client(options) as client:\n await client.query(\"What's 25 + 17?\")\n\n async for block in client.receive_messages():\n if isinstance(block, ToolUseBlock):\n # You execute the tool manually\n tool = {\"calculate\": calculate, \"get_weather\": get_weather}[block.name]\n result = await tool.execute(block.input)\n\n # Return result to agent\n await client.add_tool_result(block.id, result)\n\n # Continue conversation\n await client.query(\"\")\n```\n\n**Key Features:**\n- **Automatic execution** (v0.3.0+) - Tools run automatically with safety limits\n- **Type-safe schemas** - Simple Python types (`str`, `int`, `float`, `bool`) or full JSON Schema\n- **OpenAI-compatible** - Works with any OpenAI function calling endpoint\n- **Clean decorator API** - Similar to Claude SDK's `@tool`\n- **Hook integration** - PreToolUse/PostToolUse hooks work in both modes\n\nSee `examples/calculator_tools.py` and `examples/simple_tool.py` for complete examples.\n\n## Context Management\n\nLocal models have fixed context windows (typically 8k-32k tokens). The SDK provides **opt-in utilities** for manual history management\u2014no silent mutations, you stay in control.\n\n### Token Estimation & Truncation\n\n```python\nfrom open_agent import Client, AgentOptions\nfrom open_agent.context import estimate_tokens, truncate_messages\n\nasync with Client(options) as client:\n # Long conversation...\n for i in range(50):\n await client.query(f\"Question {i}\")\n async for msg in client.receive_messages():\n pass\n\n # Check token usage\n tokens = estimate_tokens(client.history)\n print(f\"Context size: ~{tokens} tokens\")\n\n # Manually truncate when needed\n if tokens > 28000:\n client.message_history = truncate_messages(client.history, keep=10)\n```\n\n### Recommended Patterns\n\n**1. 
Stateless Agents** (Best for single-task agents):\n```python\n# Process each task independently - no history accumulation\nfor task in tasks:\n async with Client(options) as client:\n await client.query(task)\n # Client disposed, fresh context for next task\n```\n\n**2. Manual Truncation** (At natural breakpoints):\n```python\nfrom open_agent.context import truncate_messages\n\nasync with Client(options) as client:\n for task in tasks:\n await client.query(task)\n # Truncate after each major task\n client.message_history = truncate_messages(client.history, keep=5)\n```\n\n**3. External Memory** (RAG-lite for research agents):\n```python\n# Store important facts in database, keep conversation context small\ndatabase = {}\nasync with Client(options) as client:\n await client.query(\"Research topic X\")\n # Save response to database\n database[\"topic_x\"] = extract_facts(response)\n\n # Clear history, query database when needed\n client.message_history = truncate_messages(client.history, keep=0)\n```\n\n### Why Manual?\n\nThe SDK **intentionally** does not auto-compact history because:\n- **Domain-specific needs**: Copy editors need different strategies than research agents\n- **Token accuracy varies**: Each model family has different tokenizers\n- **Risk of breaking context**: Silently removing messages could break tool chains\n- **Natural limits exist**: Compaction doesn't bypass model context windows\n\n### Installing Token Estimation\n\nFor better token estimation accuracy (optional):\n\n```bash\npip install open-agent-sdk[context] # Adds tiktoken\n```\n\nWithout `tiktoken`, falls back to character-based approximation (~75-85% accurate).\n\nSee `examples/context_management.py` for complete patterns and usage.\n\n## Lifecycle Hooks\n\nMonitor and control agent behavior at key execution points with Pythonic lifecycle hooks\u2014no subprocess overhead or JSON protocols.\n\n### Quick Example\n\n```python\nfrom open_agent import (\n AgentOptions, Client,\n PreToolUseEvent, PostToolUseEvent, UserPromptSubmitEvent,\n HookDecision,\n HOOK_PRE_TOOL_USE, HOOK_POST_TOOL_USE, HOOK_USER_PROMPT_SUBMIT\n)\n\n# Security gate - block dangerous operations\nasync def security_gate(event: PreToolUseEvent) -> HookDecision | None:\n if event.tool_name == \"delete_file\":\n return HookDecision(\n continue_=False,\n reason=\"Delete operations require approval\"\n )\n return None # Allow by default\n\n# Audit logger - track all tool executions\nasync def audit_logger(event: PostToolUseEvent) -> None:\n print(f\"Tool executed: {event.tool_name} -> {event.tool_result}\")\n return None\n\n# Input sanitizer - validate user prompts\nasync def sanitize_input(event: UserPromptSubmitEvent) -> HookDecision | None:\n if \"DELETE\" in event.prompt.upper():\n return HookDecision(\n continue_=False,\n reason=\"Dangerous keywords detected\"\n )\n return None\n\n# Register hooks in AgentOptions\noptions = AgentOptions(\n system_prompt=\"You are a helpful assistant\",\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\",\n tools=[my_file_tool, my_search_tool],\n hooks={\n HOOK_PRE_TOOL_USE: [security_gate],\n HOOK_POST_TOOL_USE: [audit_logger],\n HOOK_USER_PROMPT_SUBMIT: [sanitize_input],\n }\n)\n\nasync with Client(options) as client:\n await client.query(\"Write to /etc/config\") # UserPromptSubmit fires\n async for block in client.receive_messages():\n if isinstance(block, ToolUseBlock): # PreToolUse fires\n result = await tool.execute(block.input)\n await client.add_tool_result(block.id, result) # 
PostToolUse fires\n```\n\n### Hook Types\n\n**PreToolUse** - Fires before tool execution (or yielding to user)\n- **Block operations**: Return `HookDecision(continue_=False, reason=\"...\")`\n- **Modify inputs**: Return `HookDecision(modified_input={...}, reason=\"...\")`\n- **Allow**: Return `None`\n\n**PostToolUse** - Fires after tool result added to history\n- **Observational only** (tool already executed)\n- Use for audit logging, metrics, result validation\n- Return `None` (decision ignored for PostToolUse)\n\n**UserPromptSubmit** - Fires before sending prompt to API\n- **Block prompts**: Return `HookDecision(continue_=False, reason=\"...\")`\n- **Modify prompts**: Return `HookDecision(modified_prompt=\"...\", reason=\"...\")`\n- **Allow**: Return `None`\n\n### Common Patterns\n\n**Pattern 1: Redirect to Sandbox**\n\n```python\nasync def redirect_to_sandbox(event: PreToolUseEvent) -> HookDecision | None:\n \"\"\"Redirect file operations to safe directory.\"\"\"\n if event.tool_name == \"file_writer\":\n path = event.tool_input.get(\"path\", \"\")\n if not path.startswith(\"/tmp/\"):\n safe_path = f\"/tmp/sandbox/{path.lstrip('/')}\"\n return HookDecision(\n modified_input={\"path\": safe_path, \"content\": event.tool_input.get(\"content\", \"\")},\n reason=\"Redirected to sandbox\"\n )\n return None\n```\n\n**Pattern 2: Compliance Audit Log**\n\n```python\naudit_log = []\n\nasync def compliance_logger(event: PostToolUseEvent) -> None:\n \"\"\"Log all tool executions for compliance.\"\"\"\n audit_log.append({\n \"timestamp\": datetime.now(),\n \"tool\": event.tool_name,\n \"input\": event.tool_input,\n \"result\": str(event.tool_result)[:100],\n \"user\": get_current_user()\n })\n return None\n```\n\n**Pattern 3: Safety Instructions**\n\n```python\nasync def add_safety_warning(event: UserPromptSubmitEvent) -> HookDecision | None:\n \"\"\"Add safety instructions to risky prompts.\"\"\"\n if \"write\" in event.prompt.lower() or \"delete\" in event.prompt.lower():\n safe_prompt = event.prompt + \" (Please confirm this is safe before proceeding)\"\n return HookDecision(\n modified_prompt=safe_prompt,\n reason=\"Added safety warning\"\n )\n return None\n```\n\n### Hook Execution Flow\n\n- Hooks run **sequentially** in the order registered\n- **First non-None decision wins** (short-circuit behavior)\n- Hooks run **inline on event loop** (spawn tasks for heavy work)\n- Works with both **Client** and **query()** function\n\n### Breaking Change (v0.2.4)\n\n`Client.add_tool_result()` is now async to support PostToolUse hooks:\n\n```python\n# Old (v0.2.3 and earlier)\nclient.add_tool_result(tool_id, result)\n\n# New (v0.2.4+)\nawait client.add_tool_result(tool_id, result)\n```\n\n### Why Hooks?\n\n- **Security gates**: Block dangerous operations before they execute\n- **Audit logging**: Track all tool executions for compliance\n- **Input validation**: Sanitize user prompts before processing\n- **Monitoring**: Observe agent behavior in production\n- **Control flow**: Modify tool inputs or redirect operations\n\nSee `examples/hooks_example.py` for 4 comprehensive patterns (security, audit, sanitization, combined).\n\n## Interrupt Capability\n\nCancel long-running operations cleanly without corrupting client state. 
Perfect for timeouts, user cancellations, or conditional interruptions.\n\n### Quick Example\n\n```python\nfrom open_agent import Client, AgentOptions\nimport asyncio\n\nasync def main():\n options = AgentOptions(\n system_prompt=\"You are a helpful assistant.\",\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\"\n )\n\n async with Client(options) as client:\n await client.query(\"Write a detailed 1000-word essay...\")\n\n # Timeout after 5 seconds\n try:\n async def collect_messages():\n async for block in client.receive_messages():\n print(block.text, end=\"\", flush=True)\n\n await asyncio.wait_for(collect_messages(), timeout=5.0)\n except asyncio.TimeoutError:\n await client.interrupt() # Clean cancellation\n print(\"\\n\u26a0\ufe0f Operation timed out!\")\n\n # Client is still usable after interrupt\n await client.query(\"Short question?\")\n async for block in client.receive_messages():\n print(block.text)\n```\n\n### Common Patterns\n\n**1. Timeout-Based Interruption**\n\n```python\ntry:\n await asyncio.wait_for(process_messages(client), timeout=10.0)\nexcept asyncio.TimeoutError:\n await client.interrupt()\n print(\"Operation timed out\")\n```\n\n**2. Conditional Interruption**\n\n```python\n# Stop if response contains specific content\nfull_text = \"\"\nasync for block in client.receive_messages():\n full_text += block.text\n if \"error\" in full_text.lower():\n await client.interrupt()\n break\n```\n\n**3. User Cancellation (from separate task)**\n\n```python\nasync def stream_task():\n await client.query(\"Long task...\")\n async for block in client.receive_messages():\n print(block.text, end=\"\")\n\nasync def cancel_button_task():\n await asyncio.sleep(2.0) # User waits 2 seconds\n await client.interrupt() # User clicks cancel\n\n# Run both concurrently\nawait asyncio.gather(stream_task(), cancel_button_task())\n```\n\n**4. Interrupt During Auto-Execution**\n\n```python\noptions = AgentOptions(\n tools=[slow_tool, fast_tool],\n auto_execute_tools=True,\n max_tool_iterations=10\n)\n\nasync with Client(options) as client:\n await client.query(\"Use tools...\")\n\n tool_count = 0\n async for block in client.receive_messages():\n if isinstance(block, ToolUseBlock):\n tool_count += 1\n if tool_count >= 2:\n await client.interrupt() # Stop after 2 tools\n break\n```\n\n### How It Works\n\nWhen you call `client.interrupt()`:\n1. **Active stream closure** - HTTP stream closed immediately (not just a flag)\n2. **Clean state** - Client remains in valid state for reuse\n3. **Partial output** - Text blocks flushed to history, incomplete tools skipped\n4. **Idempotent** - Safe to call multiple times\n5. 
**Concurrent-safe** - Can be called from separate asyncio tasks\n\n### Example\n\nSee `examples/interrupt_demo.py` for 5 comprehensive patterns:\n- Timeout-based interruption\n- Conditional interruption\n- Auto-execution interruption\n- Concurrent interruption (simulated cancel button)\n- Interrupt and retry\n\n## \ud83d\ude80 Practical Examples\n\nWe've included two production-ready agents that demonstrate real-world usage:\n\n### \ud83d\udcdd Git Commit Agent\n**[examples/git_commit_agent.py](examples/git_commit_agent.py)**\n\nAnalyzes your staged git changes and writes professional commit messages following conventional commit format.\n\n```bash\n# Stage your changes\ngit add .\n\n# Run the agent\npython examples/git_commit_agent.py\n\n# Output:\n# \u2713 Found staged changes in 3 file(s)\n# \ud83e\udd16 Analyzing changes and generating commit message...\n#\n# \ud83d\udcdd Suggested commit message:\n# feat(auth): Add OAuth2 integration with refresh tokens\n#\n# - Implement token refresh mechanism\n# - Add secure cookie storage for tokens\n# - Update login flow to support OAuth2 providers\n# - Add tests for token expiration handling\n```\n\n**Features:**\n- Analyzes diff to determine commit type (feat/fix/docs/etc)\n- Writes clear, descriptive commit messages\n- Interactive mode: accept, edit, or regenerate\n- Follows conventional commit standards\n\n### \ud83d\udcca Log Analyzer Agent\n**[examples/log_analyzer_agent.py](examples/log_analyzer_agent.py)**\n\nIntelligently analyzes application logs to identify patterns, errors, and provide actionable insights.\n\n```bash\n# Analyze a log file\npython examples/log_analyzer_agent.py /var/log/app.log\n\n# Analyze with a specific time window\npython examples/log_analyzer_agent.py app.log --since \"2025-10-15T00:00:00\" --until \"2025-10-15T12:00:00\"\n\n# Interactive mode for drilling down\npython examples/log_analyzer_agent.py app.log --interactive\n```\n\n**Features:**\n- Automatic error pattern detection\n- Time-based analysis (peak error times)\n- Root cause suggestions\n- Interactive mode for investigating specific issues\n- Supports multiple log formats (JSON, Apache, syslog, etc)\n- Time range filtering with `--since` / `--until`\n\n**Sample Output:**\n```\n\ud83d\udcca Log Summary:\n Total entries: 45,231\n Errors: 127 (0.3%)\n Warnings: 892\n\n\ud83d\udd34 Top Error Patterns:\n - Connection Error: 67 occurrences\n - NullPointerException: 23 occurrences\n - Timeout Error: 19 occurrences\n\n\u23f0 Peak error time: 2025-10-15T14:00:00\n Errors in that hour: 43\n\n\ud83e\udd16 ANALYSIS REPORT:\nMain Issues (Priority Order):\n1. Database connection pool exhaustion during peak hours\n2. Unhandled null values in user authentication flow\n3. External API timeouts affecting payment processing\n\nRecommendations:\n1. Increase connection pool size from 10 to 25\n2. Add null checks in AuthService.validateUser() method\n3. 
Implement circuit breaker for payment API with 30s timeout\n```\n\n### Why These Examples?\n\nThese agents demonstrate:\n- **Practical Value**: Solve real problems developers face daily\n- **Tool Integration**: Show how to integrate with system commands (git, file I/O)\n- **Multi-turn Conversations**: Interactive modes for complex analysis\n- **Structured Output**: Parse and format LLM responses for actionable results\n- **Privacy-First**: Keep your code and logs local while getting AI assistance\n\n## Configuration\n\nOpen Agent SDK uses config helpers to provide flexible configuration via environment variables, provider shortcuts, or explicit parameters:\n\n### Environment Variables (Recommended)\n\n```bash\nexport OPEN_AGENT_BASE_URL=\"http://localhost:1234/v1\"\nexport OPEN_AGENT_MODEL=\"qwen/qwen3-30b-a3b-2507\"\n```\n\n```python\nfrom open_agent import AgentOptions\nfrom open_agent.config import get_model, get_base_url\n\n# Config helpers read from environment\noptions = AgentOptions(\n system_prompt=\"...\",\n model=get_model(), # Reads OPEN_AGENT_MODEL\n base_url=get_base_url() # Reads OPEN_AGENT_BASE_URL\n)\n```\n\n### Provider Shortcuts\n\n```python\nfrom open_agent.config import get_base_url\n\n# Use built-in defaults for common providers\noptions = AgentOptions(\n system_prompt=\"...\",\n model=\"llama3.1:70b\",\n base_url=get_base_url(provider=\"ollama\") # \u2192 http://localhost:11434/v1\n)\n```\n\n**Available providers**: `lmstudio`, `ollama`, `llamacpp`, `vllm`\n\n### Fallback Values\n\n```python\n# Provide fallbacks when env vars not set\noptions = AgentOptions(\n system_prompt=\"...\",\n model=get_model(\"qwen2.5-32b-instruct\"), # Fallback model\n base_url=get_base_url(provider=\"lmstudio\") # Fallback URL\n)\n```\n\n**Configuration Priority:**\n- Environment variable (default behaviour)\n- Fallback value passed to the config helper\n- Provider default (for `base_url` only)\n\nNeed to force a specific model even when `OPEN_AGENT_MODEL` is set? 
Call `get_model(\"model-name\", prefer_env=False)` to ignore the environment variable for that lookup.\n\n**Benefits:**\n- Switch between dev/prod by changing environment variables\n- No hardcoded URLs or model names\n- Per-agent overrides when needed\n\nSee [docs/configuration.md](docs/configuration.md) for complete guide.\n\n## Why Not Just Use OpenAI Client?\n\n**Without open-agent-sdk** (raw OpenAI client):\n```python\nfrom openai import AsyncOpenAI\n\nclient = AsyncOpenAI(base_url=\"http://localhost:1234/v1\", api_key=\"not-needed\")\nresponse = await client.chat.completions.create(\n model=\"qwen2.5-32b-instruct\",\n messages=[{\"role\": \"system\", \"content\": system_prompt},\n {\"role\": \"user\", \"content\": user_prompt}],\n stream=True\n)\n\nasync for chunk in response:\n # Complex parsing of chunks\n # Extract delta content\n # Handle tool calls manually\n # Track conversation state yourself\n```\n\n**With open-agent-sdk**:\n```python\nfrom open_agent import query, AgentOptions\n\noptions = AgentOptions(\n system_prompt=system_prompt,\n model=\"qwen2.5-32b-instruct\",\n base_url=\"http://localhost:1234/v1\"\n)\n\nresult = query(prompt=user_prompt, options=options)\nasync for msg in result:\n # Clean message types (TextBlock, ToolUseBlock)\n # Automatic streaming and tool call handling\n```\n\n**Value**: Familiar patterns + Less boilerplate + Easy migration\n\n## API Reference\n\n### AgentOptions\n\n```python\nclass AgentOptions:\n system_prompt: str # System prompt\n model: str # Model name (required)\n base_url: str # OpenAI-compatible endpoint URL (required)\n tools: list[Tool] = [] # Tool instances for function calling\n hooks: dict[str, list[HookHandler]] = None # Lifecycle hooks for monitoring/control\n auto_execute_tools: bool = False # Enable automatic tool execution (v0.3.0+)\n max_tool_iterations: int = 5 # Max tool calls per query in auto mode\n max_turns: int = 1 # Max conversation turns\n max_tokens: int | None = 4096 # Tokens to generate (None uses provider default)\n temperature: float = 0.7 # Sampling temperature\n timeout: float = 60.0 # Request timeout in seconds\n api_key: str = \"not-needed\" # Most local servers don't need this\n```\n\n**Note**: Use config helpers (`get_model()`, `get_base_url()`) for environment variable and provider support.\n\n### query()\n\nSimple single-turn query function.\n\n```python\nasync def query(prompt: str, options: AgentOptions) -> AsyncGenerator\n```\n\nReturns an async generator yielding messages.\n\n### Client\n\nMulti-turn conversation client with tool monitoring.\n\n```python\nasync with Client(options: AgentOptions) as client:\n await client.query(prompt: str)\n async for msg in client.receive_messages():\n # Process messages\n```\n\n### Message Types\n\n- `TextBlock` - Text content from model\n- `ToolUseBlock` - Tool calls from model (has `id`, `name`, `input` fields)\n- `ToolResultBlock` - Tool execution results to send back to model\n- `ToolUseError` - Tool call parsing error (malformed JSON, missing fields)\n- `AssistantMessage` - Full message wrapper\n\n### Tool System\n\n```python\n@tool(name: str, description: str, input_schema: dict)\nasync def my_tool(args: dict) -> Any:\n \"\"\"Tool handler function\"\"\"\n return result\n\n# Tool class\nclass Tool:\n name: str\n description: str\n input_schema: dict[str, type] | dict[str, Any]\n handler: Callable[[dict], Awaitable[Any]]\n\n async def execute(arguments: dict) -> Any\n def to_openai_format() -> dict\n```\n\n**Schema formats:**\n- Simple: `{\"param\": str, 
\"count\": int}` - All parameters required\n- JSON Schema: Full schema with `type`, `properties`, `required`, etc.\n\n### Hooks System\n\n```python\n# Event types\n@dataclass\nclass PreToolUseEvent:\n tool_name: str\n tool_input: dict[str, Any]\n tool_use_id: str\n history: list[dict[str, Any]]\n\n@dataclass\nclass PostToolUseEvent:\n tool_name: str\n tool_input: dict[str, Any]\n tool_result: Any\n tool_use_id: str\n history: list[dict[str, Any]]\n\n@dataclass\nclass UserPromptSubmitEvent:\n prompt: str\n history: list[dict[str, Any]]\n\n# Hook decision\n@dataclass\nclass HookDecision:\n continue_: bool = True\n modified_input: dict[str, Any] | None = None\n modified_prompt: str | None = None\n reason: str | None = None\n\n# Hook handler signature\nHookHandler = Callable[[HookEvent], Awaitable[HookDecision | None]]\n\n# Hook constants\nHOOK_PRE_TOOL_USE = \"pre_tool_use\"\nHOOK_POST_TOOL_USE = \"post_tool_use\"\nHOOK_USER_PROMPT_SUBMIT = \"user_prompt_submit\"\n```\n\n**Hook behavior:**\n- Return `None` to allow by default\n- Return `HookDecision(continue_=False)` to block\n- Return `HookDecision(modified_input={...})` to modify (PreToolUse)\n- Return `HookDecision(modified_prompt=\"...\")` to modify (UserPromptSubmit)\n- Raise exception to abort entirely\n\n## Recommended Models\n\n**Local models** (LM Studio, Ollama, llama.cpp):\n- **GPT-OSS-120B** - Best in class for speed and quality\n- **Qwen 3 30B** - Excellent instruction following, good for most tasks\n- **GPT-OSS-20B** - Solid all-around performance\n- **Mistral 7B** - Fast and efficient for simple agents\n\n**Cloud-proxied via local gateway** (Ollama cloud provider, custom gateway):\n- **kimi-k2:1t-cloud** - Tested and working via Ollama gateway\n- **deepseek-v3.1:671b-cloud** - High-quality reasoning model\n- **qwen3-coder:480b-cloud** - Code-focused models\n- Your `base_url` still points to localhost gateway (e.g., `http://localhost:11434/v1`)\n- Gateway handles authentication and routing to cloud provider\n- Useful when you need larger models than your hardware can run locally\n\n**Architecture guidance:**\n- Prefer MoE (Mixture of Experts) models over dense when available - significantly faster\n- Start with 7B-30B models for most agent tasks - they're fast and capable\n- Test models with your specific use case - the LLM landscape changes rapidly\n\n## Project Structure\n\n```\nopen-agent-sdk/\n\u251c\u2500\u2500 open_agent/\n\u2502 \u251c\u2500\u2500 __init__.py # query, Client, AgentOptions exports\n\u2502 \u251c\u2500\u2500 client.py # Streaming query(), Client, tool helper\n\u2502 \u251c\u2500\u2500 config.py # Env/provider helpers\n\u2502 \u251c\u2500\u2500 context.py # Token estimation and truncation utilities\n\u2502 \u251c\u2500\u2500 hooks.py # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)\n\u2502 \u251c\u2500\u2500 tools.py # Tool decorator and schema conversion\n\u2502 \u251c\u2500\u2500 types.py # Dataclasses for options and blocks\n\u2502 \u2514\u2500\u2500 utils.py # OpenAI client + ToolCallAggregator\n\u251c\u2500\u2500 docs/\n\u2502 \u251c\u2500\u2500 configuration.md\n\u2502 \u251c\u2500\u2500 provider-compatibility.md\n\u2502 \u2514\u2500\u2500 technical-design.md\n\u251c\u2500\u2500 examples/\n\u2502 \u251c\u2500\u2500 git_commit_agent.py # \ud83c\udf1f Practical: Git commit message generator\n\u2502 \u251c\u2500\u2500 log_analyzer_agent.py # \ud83c\udf1f Practical: Log file analyzer\n\u2502 \u251c\u2500\u2500 calculator_tools.py # Function calling with @tool decorator\n\u2502 
### Hooks System

```python
# Event types
@dataclass
class PreToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class PostToolUseEvent:
    tool_name: str
    tool_input: dict[str, Any]
    tool_result: Any
    tool_use_id: str
    history: list[dict[str, Any]]

@dataclass
class UserPromptSubmitEvent:
    prompt: str
    history: list[dict[str, Any]]

# Hook decision
@dataclass
class HookDecision:
    continue_: bool = True
    modified_input: dict[str, Any] | None = None
    modified_prompt: str | None = None
    reason: str | None = None

# Hook handler signature
HookHandler = Callable[[HookEvent], Awaitable[HookDecision | None]]

# Hook constants
HOOK_PRE_TOOL_USE = "pre_tool_use"
HOOK_POST_TOOL_USE = "post_tool_use"
HOOK_USER_PROMPT_SUBMIT = "user_prompt_submit"
```

**Hook behavior** (a sketch follows this list):
- Return `None` to allow by default
- Return `HookDecision(continue_=False)` to block
- Return `HookDecision(modified_input={...})` to modify the tool input (PreToolUse)
- Return `HookDecision(modified_prompt="...")` to modify the prompt (UserPromptSubmit)
- Raise an exception to abort entirely
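Putting those pieces together, a hedged sketch of a PreToolUse security gate. The import path is an assumption based on `open_agent/hooks.py` in the project structure below, and the sandbox rule is purely illustrative; `examples/hooks_example.py` has the canonical patterns:

```python
from open_agent import AgentOptions
from open_agent.hooks import HOOK_PRE_TOOL_USE, HookDecision, PreToolUseEvent

async def sandbox_gate(event: PreToolUseEvent) -> HookDecision | None:
    # Block any tool call that tries to touch a path outside the sandbox.
    path = str(event.tool_input.get("path", ""))
    if path and not path.startswith("/workspace/"):
        return HookDecision(continue_=False, reason="path outside sandbox")
    return None  # returning None allows the call by default

options = AgentOptions(
    system_prompt="You are a file assistant",
    model="qwen2.5-32b-instruct",
    base_url="http://localhost:1234/v1",
    hooks={HOOK_PRE_TOOL_USE: [sandbox_gate]},
)
```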
## Recommended Models

**Local models** (LM Studio, Ollama, llama.cpp):
- **GPT-OSS-120B** - Best in class for speed and quality
- **Qwen 3 30B** - Excellent instruction following, good for most tasks
- **GPT-OSS-20B** - Solid all-around performance
- **Mistral 7B** - Fast and efficient for simple agents

**Cloud-proxied via local gateway** (Ollama cloud provider, custom gateway):
- **kimi-k2:1t-cloud** - Tested and working via the Ollama gateway
- **deepseek-v3.1:671b-cloud** - High-quality reasoning model
- **qwen3-coder:480b-cloud** - Code-focused model
- Your `base_url` still points to the localhost gateway (e.g., `http://localhost:11434/v1`)
- The gateway handles authentication and routing to the cloud provider
- Useful when you need larger models than your hardware can run locally

**Architecture guidance:**
- Prefer MoE (Mixture of Experts) models over dense ones when available - significantly faster
- Start with 7B-30B models for most agent tasks - they're fast and capable
- Test models with your specific use case - the LLM landscape changes rapidly

## Project Structure

```
open-agent-sdk/
├── open_agent/
│   ├── __init__.py              # query, Client, AgentOptions exports
│   ├── client.py                # Streaming query(), Client, tool helper
│   ├── config.py                # Env/provider helpers
│   ├── context.py               # Token estimation and truncation utilities
│   ├── hooks.py                 # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│   ├── tools.py                 # Tool decorator and schema conversion
│   ├── types.py                 # Dataclasses for options and blocks
│   └── utils.py                 # OpenAI client + ToolCallAggregator
├── docs/
│   ├── configuration.md
│   ├── provider-compatibility.md
│   └── technical-design.md
├── examples/
│   ├── git_commit_agent.py      # 🌟 Practical: Git commit message generator
│   ├── log_analyzer_agent.py    # 🌟 Practical: Log file analyzer
│   ├── calculator_tools.py      # Function calling with @tool decorator
│   ├── simple_tool.py           # Minimal tool usage example
│   ├── tool_use_agent.py        # Complete tool use patterns
│   ├── context_management.py    # Manual history management patterns
│   ├── hooks_example.py         # Lifecycle hooks patterns (security, audit, sanitization)
│   ├── interrupt_demo.py        # Interrupt capability patterns (timeout, conditional, concurrent)
│   ├── simple_lmstudio.py       # Basic usage with LM Studio
│   ├── ollama_chat.py           # Multi-turn chat example
│   ├── config_examples.py       # Configuration patterns
│   └── simple_with_env.py       # Environment variable config
├── tests/
│   ├── integration/             # Integration-style tests using fakes
│   │   └── test_client_behaviour.py  # Streaming, multi-turn, tool flow coverage
│   ├── test_agent_options.py
│   ├── test_auto_execution.py   # Automatic tool execution
│   ├── test_client.py
│   ├── test_config.py
│   ├── test_context.py          # Context utilities (token estimation, truncation)
│   ├── test_hooks.py            # Lifecycle hooks (PreToolUse, PostToolUse, UserPromptSubmit)
│   ├── test_interrupt.py        # Interrupt capability (timeout, concurrent, reuse)
│   ├── test_query.py
│   ├── test_tools.py            # Tool decorator and schema conversion
│   └── test_utils.py
├── CHANGELOG.md
├── pyproject.toml
└── README.md
```

## Examples

### 🌟 Practical Agents (Production-Ready)
- **`git_commit_agent.py`** – Analyzes git diffs and writes professional commit messages
- **`log_analyzer_agent.py`** – Parses logs, finds patterns, suggests fixes with interactive mode
- **`tool_use_agent.py`** – Complete tool use patterns: manual, helper, and agent class

### Core SDK Usage
- `simple_lmstudio.py` – Minimal streaming query with hard-coded config (simplest quickstart)
- `simple_with_env.py` – Using environment variables with config helpers and fallbacks
- `config_examples.py` – Comprehensive reference: provider shortcuts, priority, and all config patterns
- `ollama_chat.py` – Multi-turn chat loop with Ollama, including tool-call logging
- `context_management.py` – Manual history management patterns (stateless, truncation, token monitoring, RAG-lite)
- `hooks_example.py` – Lifecycle hooks patterns (security gates, audit logging, input sanitization, combined)

### Integration Tests
Located in `tests/integration/`:
- `test_client_behaviour.py` – Fake AsyncOpenAI client covering streaming, multi-turn history, and tool-call flows without hitting real servers

## Development Status

**Released** – Core functionality is complete and available on PyPI (current release: v0.4.1). Multi-turn conversations, tool monitoring, and streaming are fully implemented.

### Roadmap

- [x] Project planning and architecture
- [x] Core `query()` and `Client` class
- [x] Tool monitoring + `Client.add_tool_result()` helper
- [x] Tool use example (`examples/tool_use_agent.py`)
- [x] PyPI release - published as `open-agent-sdk`
- [ ] Provider compatibility matrix expansion
- [ ] Additional agent examples

### Tested Providers

- ✅ **Ollama** - Fully validated with `kimi-k2:1t-cloud` (cloud-proxied model)
- ✅ **LM Studio** - Fully validated with the `qwen/qwen3-30b` model
- ✅ **llama.cpp** - Fully validated with the TinyLlama 1.1B model

See [docs/provider-compatibility.md](docs/provider-compatibility.md) for detailed test results.

## Documentation

- [docs/technical-design.md](docs/technical-design.md) - Architecture details
- [docs/configuration.md](docs/configuration.md) - Configuration guide
- [docs/provider-compatibility.md](docs/provider-compatibility.md) - Provider test results
- [examples/](examples/) - Usage examples

## Testing

Integration-style tests run entirely against lightweight fakes, so they are safe to execute locally and in pre-commit:

```bash
python -m pytest tests/integration
```

Add `-k` or a specific path to target a subset of the unit tests (`tests/test_client.py`, etc.). If you use a virtual environment, run the tests with `./venv/bin/python -m pytest` instead.

## Pre-commit Hooks

Install hooks once per clone:

```bash
pip install pre-commit
pre-commit install
```

Running `pre-commit run --all-files` executes the formatting checks and the integration tests (`python -m pytest tests/integration`) before you push changes.

## Requirements

- Python 3.10+
- openai 1.0+ (for the AsyncOpenAI client)
- pydantic 2.0+ (for types, optional)

Some servers require a dummy `api_key`; set any non-empty string if needed.

## License

MIT License - see [LICENSE](LICENSE) for details.

## Acknowledgments

- API design inspired by [claude-agent-sdk](https://github.com/anthropics/claude-agent-sdk-python)
- Built for local/open-source LLM enthusiasts

---

**Status**: Alpha - API stabilizing, feedback welcome

Star ⭐ this repo if you're building AI agents with local models!
"bugtrack_url": null,
"license": "MIT",
"summary": "Lightweight Python SDK for local/self-hosted LLMs via OpenAI-compatible endpoints",
"version": "0.4.1",
"project_urls": {
"Documentation": "https://github.com/slb350/open-agent-sdk/tree/main/docs",
"Homepage": "https://github.com/slb350/open-agent-sdk",
"Issues": "https://github.com/slb350/open-agent-sdk/issues",
"Repository": "https://github.com/slb350/open-agent-sdk"
},
"split_keywords": [
"llm",
" ai",
" agent",
" local",
" openai",
" ollama",
" lmstudio",
" llamacpp"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "3b89d5cc764754dc98fa0c5393628fe2d4400f4248d9264370e4a1f7cafbbb60",
"md5": "99fab7d7f9ba1663aa89c7e1f72aacfd",
"sha256": "49762a2eb29cd3125334226148ec1be99d20d027912fe0823c67f16dc3f9f9b4"
},
"downloads": -1,
"filename": "open_agent_sdk-0.4.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "99fab7d7f9ba1663aa89c7e1f72aacfd",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 33491,
"upload_time": "2025-10-18T08:18:19",
"upload_time_iso_8601": "2025-10-18T08:18:19.926934Z",
"url": "https://files.pythonhosted.org/packages/3b/89/d5cc764754dc98fa0c5393628fe2d4400f4248d9264370e4a1f7cafbbb60/open_agent_sdk-0.4.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "df0bc0ce4b0702ef43fe93d17b518d13c5ff5e846dba24290c3026974d1b9733",
"md5": "e205828a76918349747b605958e117ca",
"sha256": "e22493faf1b38d64d114eff579283689430a16162e35d5e19c52b44607b5a22d"
},
"downloads": -1,
"filename": "open_agent_sdk-0.4.1.tar.gz",
"has_sig": false,
"md5_digest": "e205828a76918349747b605958e117ca",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 67711,
"upload_time": "2025-10-18T08:18:20",
"upload_time_iso_8601": "2025-10-18T08:18:20.866738Z",
"url": "https://files.pythonhosted.org/packages/df/0b/c0ce4b0702ef43fe93d17b518d13c5ff5e846dba24290c3026974d1b9733/open_agent_sdk-0.4.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-18 08:18:20",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "slb350",
"github_project": "open-agent-sdk",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "open-agent-sdk"
}