fastal-langgraph-toolkit

- **Name**: fastal-langgraph-toolkit
- **Version**: 0.2.0
- **Summary**: Common utilities and tools for LangGraph agents by Fastal
- **Author email**: Stefano Capezzone <stefano@capezzone.it>
- **Upload time**: 2025-07-27 21:46:47
- **Requires Python**: >=3.10
- **License**: MIT
- **Keywords**: agents, ai, langchain, langgraph, toolkit

# Fastal LangGraph Toolkit

[![CI](https://github.com/FastalGroup/fastal-langgraph-toolkit/actions/workflows/test.yml/badge.svg)](https://github.com/FastalGroup/fastal-langgraph-toolkit/actions/workflows/test.yml)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![PyPI version](https://badge.fury.io/py/fastal-langgraph-toolkit.svg)](https://badge.fury.io/py/fastal-langgraph-toolkit)

**Production-ready toolkit for building enterprise LangGraph agents with multi-provider support and intelligent conversation management.**

## 🏢 About

The Fastal LangGraph Toolkit was originally developed internally by the **Fastal Group** to support enterprise-grade agentic application implementations across multiple client projects. After proving its effectiveness in production environments, we've open-sourced this toolkit to contribute to the broader LangGraph community.

### Why This Toolkit?

Building production LangGraph agents involves solving common challenges that recur across advanced research and development projects:
- **Multi-provider Management**: Support for multiple LLM/embedding providers with seamless switching
- **Context Management**: Intelligent conversation summarization for long-running sessions
- **Memory Optimization**: Token-efficient context handling for cost control
- **Type Safety**: Proper state management with TypedDict integration
- **Configuration Injection**: Clean separation between business logic and framework concerns

This toolkit provides battle-tested solutions for these challenges, extracted from real enterprise implementations.

## ✨ Features

### 🔄 Multi-Provider Model Factory (Chat LLM & Embeddings)
The current version of the model factory supports the following providers; more will be added in future releases.

- **LLM Support**: OpenAI, Anthropic, Ollama, AWS Bedrock
- **Embeddings Support**: OpenAI, Ollama, AWS Bedrock  

Main features:
- **Configuration Injection**: Clean provider abstraction
- **Provider Health Checks**: Availability validation
- **Seamless Switching**: Change providers without code changes

### 🧠 Intelligent Conversation Summarization

The LangChain/LangGraph framework provides good support for managing both short-term and long-term memory in agents through the LangMem module. However, we found that automated summarization based solely on token counting is not sufficient for most real, complex agents. The solution included in this toolkit offers an alternative, more sophisticated method, based on the structure of the conversation and on the subject and content of the discussion.

Features:
- **Ready-to-Use LangGraph Node**: `summary_node()` method provides instant integration
- **Conversation Pair Counting**: Smart Human+AI message pair detection
- **ReAct Tool Filtering**: Automatic exclusion of tool calls from summaries
- **Configurable Thresholds**: Customizable trigger points for summarization
- **Context Preservation**: Keep recent conversations for continuity
- **Custom Prompts**: Domain-specific summarization templates
- **State Auto-Injection**: Seamless integration with existing states
- **Token Optimization**: Reduce context length for cost efficiency
- **Built-in Error Handling**: Robust error management with optional logging

### 💾 Memory Management
- **`SummarizableState`**: Type-safe base class for summary-enabled states
- **Automatic State Management**: No manual field initialization required
- **LangGraph Integration**: Native compatibility with LangGraph checkpointing
- **Clean Architecture**: Separation of concerns between summary and business logic

## 📦 Installation

### From PyPI (Recommended)
```bash
# Using uv (recommended)
uv add fastal-langgraph-toolkit

# Using pip
pip install fastal-langgraph-toolkit
```

### Development Installation
```bash
# Clone the repository
git clone https://github.com/FastalGroup/fastal-langgraph-toolkit.git
cd fastal-langgraph-toolkit

# Install in editable mode with uv
uv add --editable .

# Or with pip
pip install -e .
```

### Requirements
- **Python**: 3.10+ 
- **LangChain**: Core components for LLM integration
- **LangGraph**: State management and agent workflows
- **Pydantic**: Type validation and settings management

## 🚀 Quick Start

### Multi-Provider Model Factory

```python
from fastal.langgraph.toolkit import ModelFactory
from types import SimpleNamespace

# Configuration using SimpleNamespace (required)
config = SimpleNamespace(
    api_key="your-api-key",
    temperature=0.7,
    streaming=True  # Enable streaming for real-time responses
)

# Create LLM with different providers
openai_llm = ModelFactory.create_llm("openai", "gpt-4o", config)
claude_llm = ModelFactory.create_llm("anthropic", "claude-3-sonnet-20240229", config)
local_llm = ModelFactory.create_llm("ollama", "llama2", config)

# Create embeddings
embeddings = ModelFactory.create_embeddings("openai", "text-embedding-3-small", config)

# Check what's available in your environment
providers = ModelFactory.get_available_providers()
print(f"Available LLM providers: {providers['llm']}")
print(f"Available embedding providers: {providers['embeddings']}")
```

### Intelligent Conversation Summarization

#### Basic Setup
```python
from fastal.langgraph.toolkit import SummaryManager, SummaryConfig, SummarizableState
from langchain_core.messages import HumanMessage, AIMessage
from typing import Annotated
from langgraph.graph.message import add_messages

# 1. Define your state using SummarizableState (recommended)
class MyAgentState(SummarizableState):
    """Your agent state with automatic summary support"""
    messages: Annotated[list, add_messages]
    thread_id: str
    # summary and last_summarized_index are automatically provided

# 2. Create summary manager with default settings
llm = ModelFactory.create_llm("openai", "gpt-4o", config)
summary_manager = SummaryManager(llm)

# 3. Use ready-to-use summary node in your LangGraph workflow
from langgraph.graph import StateGraph
import logging

# Optional: Configure logging for summary operations
logger = logging.getLogger(__name__)
summary_manager.set_logger(logger)

# Add to your workflow
workflow = StateGraph(MyAgentState)
workflow.add_node("summary_check", summary_manager.summary_node)  # Ready-to-use!
workflow.set_entry_point("summary_check")
```

#### Advanced Configuration
```python
# Custom configuration for domain-specific needs
custom_config = SummaryConfig(
    pairs_threshold=20,  # Trigger summary after 20 conversation pairs
    recent_pairs_to_preserve=5,  # Keep last 5 pairs in full context
    max_summary_length=500,  # Max words in summary
    
    # Custom prompts for your domain
    new_summary_prompt="""
    Analyze this customer support conversation and create a concise summary focusing on:
    - Customer's main issue or request
    - Actions taken by the agent
    - Current status of the resolution
    - Any pending items or next steps
    
    Conversation to summarize:
    {messages_text}
    """,
    
    combine_summary_prompt="""
    Update the existing summary with new information from the recent conversation.
    
    Previous summary:
    {existing_summary}
    
    New conversation:
    {messages_text}
    
    Provide an updated comprehensive summary:
    """
)

summary_manager = SummaryManager(llm, custom_config)
```

#### Complete LangGraph Integration Example

**Simple Approach (Recommended):**
```python
from langgraph.graph import StateGraph
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver
from langchain_core.messages import HumanMessage, SystemMessage
import logging

logger = logging.getLogger(__name__)
db_url = "postgresql://..."  # your Postgres connection string for the checkpointer

class CustomerSupportAgent:
    def __init__(self):
        self.llm = ModelFactory.create_llm("openai", "gpt-4o", config)
        self.summary_manager = SummaryManager(self.llm, custom_config)
        # Optional: Configure logging for summary operations
        self.summary_manager.set_logger(logger)
        self.graph = self._create_graph()
    
    async def _agent_node(self, state: MyAgentState) -> dict:
        """Main agent logic with optimized context"""
        messages = state["messages"]
        last_idx = state.get("last_summarized_index", 0)
        summary = state.get("summary")
        
        # Use only recent messages + summary for context efficiency
        recent_messages = messages[last_idx:]
        
        if summary:
            system_msg = f"Previous conversation summary: {summary}\n\nContinue the conversation:"
            context = [SystemMessage(content=system_msg)] + recent_messages
        else:
            context = recent_messages
        
        response = await self.llm.ainvoke(context)
        return {"messages": [response]}
    
    def _create_graph(self):
        workflow = StateGraph(MyAgentState)
        
        # Use ready-to-use summary node from toolkit
        workflow.add_node("summary_check", self.summary_manager.summary_node)
        workflow.add_node("agent", self._agent_node)
        
        workflow.set_entry_point("summary_check")
        workflow.add_edge("summary_check", "agent")
        workflow.add_edge("agent", "__end__")
        
        return workflow
    
    async def process_message(self, message: str, thread_id: str):
        """Process user message with automatic summarization"""
        async with AsyncPostgresSaver.from_conn_string(db_url) as checkpointer:
            app = self.graph.compile(checkpointer=checkpointer)
            
            config = {"configurable": {"thread_id": thread_id}}
            input_state = {"messages": [HumanMessage(content=message)]}
            
            result = await app.ainvoke(input_state, config=config)
            return result["messages"][-1].content
```

**Advanced Approach (Custom Implementation):**
```python
# If you need custom logic in your summary node
class AdvancedCustomerSupportAgent:
    def __init__(self):
        self.llm = ModelFactory.create_llm("openai", "gpt-4o", config)
        self.summary_manager = SummaryManager(self.llm, custom_config)
        self.graph = self._create_graph()
    
    async def _custom_summary_node(self, state: MyAgentState) -> dict:
        """Custom summary node with additional business logic"""
        thread_id = state.get("thread_id", "")
        
        # Custom business logic before summarization
        if self._should_skip_summary(state):
            return {}
        
        # Use summary manager for the actual summarization
        if await self.summary_manager.should_create_summary(state):
            result = await self.summary_manager.process_summary(state)
            
            # Custom logging or analytics
            if result:
                logger.info(f"Summary created for customer thread {thread_id}")
                self._track_summary_analytics(state, result)
            
            return result
        
        return {}
    
    def _should_skip_summary(self, state):
        """Custom business logic to skip summarization"""
        # Example: Skip for priority customers or short sessions
        return False
    
    def _track_summary_analytics(self, state, result):
        """Custom analytics tracking"""
        pass
```

## 📋 API Reference

### ModelFactory

Main factory class for creating LLM and embedding instances across multiple providers.

#### `ModelFactory.create_llm(provider: str, model: str, config: SimpleNamespace) -> BaseChatModel`

Creates an LLM instance for the specified provider.

**Parameters:**
- `provider`: Provider name (`"openai"`, `"anthropic"`, `"ollama"`, `"bedrock"`)
- `model`: Model name (e.g., `"gpt-4o"`, `"claude-3-sonnet-20240229"`)
- `config`: Configuration object with provider-specific settings

**Returns:** LangChain `BaseChatModel` instance

**Example:**
```python
from types import SimpleNamespace
from fastal.langgraph.toolkit import ModelFactory

config = SimpleNamespace(api_key="sk-...", temperature=0.7, streaming=True)
llm = ModelFactory.create_llm("openai", "gpt-4o", config)
```

#### `ModelFactory.create_embeddings(provider: str, model: str, config: SimpleNamespace) -> Embeddings`

Creates an embeddings instance for the specified provider.

**Parameters:**
- `provider`: Provider name (`"openai"`, `"ollama"`, `"bedrock"`)
- `model`: Model name (e.g., `"text-embedding-3-small"`)
- `config`: Configuration object with provider-specific settings

**Returns:** LangChain `Embeddings` instance
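
**Example** (a minimal sketch; as with `create_llm`, the exact configuration fields depend on your provider setup):
```python
from types import SimpleNamespace
from fastal.langgraph.toolkit import ModelFactory

config = SimpleNamespace(api_key="sk-...")
embeddings = ModelFactory.create_embeddings("openai", "text-embedding-3-small", config)

# Standard LangChain Embeddings interface
vectors = embeddings.embed_documents(["LangGraph agents", "conversation summaries"])
```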

#### `ModelFactory.get_available_providers() -> dict`

Returns available providers in the current environment.

**Returns:** Dictionary with `"llm"` and `"embeddings"` keys containing available provider lists

### SummaryManager

Manages intelligent conversation summarization with configurable thresholds and custom prompts.

#### `SummaryManager(llm: BaseChatModel, config: SummaryConfig | None = None)`

Initialize summary manager with LLM and optional configuration.

**Parameters:**
- `llm`: LangChain LLM instance for generating summaries
- `config`: Optional `SummaryConfig` instance (uses defaults if None)

#### `async should_create_summary(state: dict) -> bool`

Determines if summarization is needed based on conversation pairs threshold.

**Parameters:**
- `state`: Current agent state containing messages and summary info

**Returns:** `True` if summary should be created, `False` otherwise

#### `async process_summary(state: dict) -> dict`

Creates or updates conversation summary and returns state updates.

**Parameters:**
- `state`: Current agent state

**Returns:** Dictionary with `summary` and `last_summarized_index` fields
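
**Example** (a sketch of the check-then-summarize pattern; this is the same flow the built-in `summary_node()` runs for you):
```python
if await summary_manager.should_create_summary(state):
    updates = await summary_manager.process_summary(state)
    # updates looks like {"summary": "...", "last_summarized_index": 14}  # values illustrative
else:
    updates = {}
```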

#### `count_conversation_pairs(messages: list, start_index: int = 0) -> int`

Counts Human+AI conversation pairs, excluding tool calls.

**Parameters:**
- `messages`: List of LangChain messages
- `start_index`: Starting index for counting (default: 0)

**Returns:** Number of complete conversation pairs
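
**Example** (a sketch that counts only the pairs added since the last summary, using the state fields described below):
```python
start = state.get("last_summarized_index", 0)
pairs = summary_manager.count_conversation_pairs(state["messages"], start_index=start)
print(f"Unsummarized conversation pairs: {pairs}")
```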

#### `async summary_node(state: dict) -> dict`

**Ready-to-use LangGraph node for conversation summarization.**

This method provides a complete LangGraph node that can be directly added to workflows. It handles the entire summary workflow internally and provides optional logging.

**Parameters:**
- `state`: LangGraph state (will be auto-injected with summary fields if missing)

**Returns:** Empty dict if no summary needed, or dict with summary fields if created

**Example:**
```python
# In your LangGraph workflow
summary_manager = SummaryManager(llm, config)
summary_manager.set_logger(logger)  # Optional logging

workflow.add_node("summary_check", summary_manager.summary_node)
workflow.set_entry_point("summary_check")
```

#### `set_logger(logger)`

Set logger for summary_node logging (optional).

**Parameters:**
- `logger`: Logger instance for summary_node operations

**Note:** When a logger is configured, `summary_node()` will automatically log when summaries are created.

### SummaryConfig

Configuration class for customizing summarization behavior.

#### `SummaryConfig(**kwargs)`

**Parameters:**
- `pairs_threshold: int = 10` - Trigger summary after N conversation pairs
- `recent_pairs_to_preserve: int = 3` - Keep N recent pairs in context
- `max_summary_length: int = 200` - Maximum words in summary
- `new_summary_prompt: str` - Template for creating new summaries
- `combine_summary_prompt: str` - Template for updating existing summaries
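
A construction sketch (only the fields you override need to be passed; everything else falls back to the defaults listed above):
```python
config = SummaryConfig(
    pairs_threshold=15,            # summarize a bit later than the default 10
    recent_pairs_to_preserve=4,    # keep slightly more recent context
)
summary_manager = SummaryManager(llm, config)
```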

**Default Prompts:**
```python
# Default new summary prompt
new_summary_prompt = """
Analyze the conversation and create a concise summary highlighting:
- Main topics discussed
- Key decisions or conclusions
- Important context for future interactions

Conversation:
{messages_text}

Summary:
"""

# Default combine summary prompt  
combine_summary_prompt = """
Existing Summary: {existing_summary}

New Conversation: {messages_text}

Create an updated summary that combines the essential information:
"""
```

### SummarizableState

Base TypedDict class for states that support automatic summarization.

#### Inheritance Usage
```python
from fastal.langgraph.toolkit import SummarizableState
from typing import Annotated
from langgraph.graph.message import add_messages

class MyAgentState(SummarizableState):
    """Your custom state with summary support"""
    messages: Annotated[list, add_messages]
    thread_id: str
    # summary: str | None - automatically provided
    # last_summarized_index: int - automatically provided
```

**Provided Fields:**
- `summary: str | None` - Current conversation summary
- `last_summarized_index: int` - Index of the first message not covered by the latest summary
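
A node can read the provided fields directly to build a compact context; a short sketch mirroring the agent node in the integration example above:
```python
async def my_node(state: MyAgentState) -> dict:
    start = state.get("last_summarized_index", 0)   # 0 until the first summary exists
    summary = state.get("summary")                  # None until the first summary exists
    recent_messages = state["messages"][start:]
    # ... build the LLM context from `summary` + `recent_messages` ...
    return {}
```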

## ⚙️ Configuration

### SimpleNamespace Requirement

The toolkit requires configuration objects (not dictionaries) for type safety and dot notation access:

```python
from types import SimpleNamespace

# ✅ Correct - SimpleNamespace
config = SimpleNamespace(
    api_key="sk-...",
    base_url="https://api.openai.com/v1",  # Optional
    temperature=0.7,                        # Optional
    streaming=True                          # Optional
)

# ❌ Incorrect - Dictionary
config = {"api_key": "sk-...", "temperature": 0.7}
```
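
If your settings already live in a dictionary (for example, loaded from a file or the environment), a plain `SimpleNamespace(**...)` unpack is enough to satisfy the toolkit, as in this sketch:
```python
import os
from types import SimpleNamespace

settings = {"api_key": os.getenv("OPENAI_API_KEY"), "temperature": 0.7, "streaming": True}
config = SimpleNamespace(**settings)
```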

### Provider-Specific Configuration

#### OpenAI
```python
openai_config = SimpleNamespace(
    api_key="sk-...",              # Required (or set OPENAI_API_KEY)
    base_url="https://api.openai.com/v1",  # Optional
    organization="org-...",         # Optional
    temperature=0.7,               # Optional
    streaming=True,                # Optional
    max_tokens=1000                # Optional
)
```

#### Anthropic
```python
anthropic_config = SimpleNamespace(
    api_key="sk-ant-...",          # Required (or set ANTHROPIC_API_KEY)
    temperature=0.7,               # Optional
    streaming=True,                # Optional
    max_tokens=1000                # Optional
)
```

#### Ollama (Local)
```python
ollama_config = SimpleNamespace(
    base_url="http://localhost:11434",  # Optional (default)
    temperature=0.7,                    # Optional
    streaming=True                      # Optional
)
```

#### AWS Bedrock
```python
bedrock_config = SimpleNamespace(
    region="us-east-1",            # Optional (uses AWS config)
    aws_access_key_id="...",       # Optional (uses AWS config)
    aws_secret_access_key="...",   # Optional (uses AWS config)
    temperature=0.7,               # Optional
    streaming=True                 # Optional
)
```

### Environment Variables Helper

```python
from fastal.langgraph.toolkit.models.config import get_default_config

# Automatically uses environment variables
openai_config = get_default_config("openai")     # Uses OPENAI_API_KEY
anthropic_config = get_default_config("anthropic") # Uses ANTHROPIC_API_KEY
```

## 🎯 Advanced Examples

### Enterprise Multi-Provider Setup

```python
from fastal.langgraph.toolkit import ModelFactory
from types import SimpleNamespace
import os

class EnterpriseAgentConfig:
    """Enterprise configuration with fallback providers"""
    
    def __init__(self):
        self.primary_llm = self._setup_primary_llm()
        self.fallback_llm = self._setup_fallback_llm()
        self.embeddings = self._setup_embeddings()
    
    def _setup_primary_llm(self):
        """Primary: OpenAI GPT-4"""
        if os.getenv("OPENAI_API_KEY"):
            config = SimpleNamespace(
                api_key=os.getenv("OPENAI_API_KEY"),
                temperature=0.1,
                streaming=True,
                max_tokens=2000
            )
            return ModelFactory.create_llm("openai", "gpt-4o", config)
        return None
    
    def _setup_fallback_llm(self):
        """Fallback: Anthropic Claude"""
        if os.getenv("ANTHROPIC_API_KEY"):
            config = SimpleNamespace(
                api_key=os.getenv("ANTHROPIC_API_KEY"),
                temperature=0.1,
                streaming=True,
                max_tokens=2000
            )
            return ModelFactory.create_llm("anthropic", "claude-3-sonnet-20240229", config)
        return None
    
    def _setup_embeddings(self):
        """Embeddings with local fallback"""
        # Try OpenAI first
        if os.getenv("OPENAI_API_KEY"):
            config = SimpleNamespace(api_key=os.getenv("OPENAI_API_KEY"))
            return ModelFactory.create_embeddings("openai", "text-embedding-3-small", config)
        
        # Fallback to local Ollama
        config = SimpleNamespace(base_url="http://localhost:11434")
        return ModelFactory.create_embeddings("ollama", "nomic-embed-text", config)
    
    def get_llm(self):
        """Get available LLM with fallback logic"""
        return self.primary_llm or self.fallback_llm
```

### Domain-Specific Summarization

```python
from fastal.langgraph.toolkit import SummaryManager, SummaryConfig

class CustomerServiceSummaryManager:
    """Specialized summary manager for customer service conversations"""
    
    def __init__(self, llm):
        # Customer service specific configuration
        self.config = SummaryConfig(
            pairs_threshold=8,  # Shorter conversations in support
            recent_pairs_to_preserve=3,
            max_summary_length=400,
            
            new_summary_prompt="""
            Analyze this customer service conversation and create a structured summary:

            **Customer Information:**
            - Name/Contact: [Extract if mentioned]
            - Account/Order: [Extract if mentioned]

            **Issue Summary:**
            - Problem: [Main issue described]
            - Category: [Technical/Billing/General/etc.]
            - Urgency: [High/Medium/Low based on language]

            **Actions Taken:**
            - Solutions attempted: [List what agent tried]
            - Information provided: [Key info given to customer]

            **Current Status:**
            - Resolution status: [Resolved/Pending/Escalated]
            - Next steps: [What needs to happen next]

            **Conversation:**
            {messages_text}

            **Structured Summary:**
            """,
            
            combine_summary_prompt="""
            Update the customer service summary with new conversation information:

            **Previous Summary:**
            {existing_summary}

            **New Conversation:**
            {messages_text}

            **Updated Summary:**
            Merge the information, updating status and adding new actions/developments:
            """
        )
        
        self.summary_manager = SummaryManager(llm, self.config)
    
    async def process_summary(self, state):
        """Process with customer service specific logic"""
        return await self.summary_manager.process_summary(state)
```

### Memory-Optimized Long Conversations

```python
from fastal.langgraph.toolkit import SummarizableState, SummaryManager, SummaryConfig
from langchain_core.messages import SystemMessage
from typing import Annotated
from langgraph.graph.message import add_messages

class OptimizedConversationState(SummarizableState):
    """State optimized for very long conversations"""
    messages: Annotated[list, add_messages]
    thread_id: str
    user_context: dict  # Additional user context
    conversation_metadata: dict  # Metadata for analytics

class LongConversationAgent:
    """Agent optimized for handling very long conversations"""
    
    def __init__(self, llm):
        # Aggressive summarization for memory efficiency
        config = SummaryConfig(
            pairs_threshold=5,    # Frequent summarization
            recent_pairs_to_preserve=2,  # Minimal recent context
            max_summary_length=600,  # Comprehensive summaries
        )
        
        self.summary_manager = SummaryManager(llm, config)
        self.llm = llm
    
    async def process_with_optimization(self, state: OptimizedConversationState):
        """Process message with aggressive memory optimization"""
        
        # Always check for summarization opportunities
        if await self.summary_manager.should_create_summary(state):
            # Create summary to optimize memory
            summary_update = await self.summary_manager.process_summary(state)
            state.update(summary_update)
        
        # Use only recent context + summary for LLM call
        messages = state["messages"]
        last_idx = state.get("last_summarized_index", 0)
        summary = state.get("summary")
        
        # Ultra-minimal context for cost efficiency
        recent_messages = messages[last_idx:]
        
        if summary:
            context = f"Context: {summary}\n\nContinue conversation:"
            context_msg = SystemMessage(content=context)
            llm_input = [context_msg] + recent_messages[-2:]  # Only last exchange
        else:
            llm_input = recent_messages[-4:]  # Minimal fallback
        
        response = await self.llm.ainvoke(llm_input)
        return {"messages": [response]}
```

### Token Usage Analytics

```python
import tiktoken
from collections import defaultdict

class TokenOptimizedSummaryManager:
    """Summary manager with token usage tracking and optimization"""
    
    def __init__(self, llm, config=None):
        self.summary_manager = SummaryManager(llm, config)
        self.tokenizer = tiktoken.get_encoding("cl100k_base")  # GPT-4 tokenizer
        self.token_stats = defaultdict(int)
    
    def count_tokens(self, text: str) -> int:
        """Count tokens in text"""
        return len(self.tokenizer.encode(text))
    
    async def process_with_analytics(self, state):
        """Process summary with token usage analytics"""
        messages = state["messages"]
        
        # Count tokens before summarization
        total_tokens_before = sum(
            self.count_tokens(str(msg.content)) for msg in messages
        )
        
        # Process summary
        result = await self.summary_manager.process_summary(state)
        
        if result:  # Summary was created
            summary = result.get("summary", "")
            last_idx = result.get("last_summarized_index", 0)
            
            # Count tokens after summarization
            remaining_messages = messages[last_idx:]
            remaining_tokens = sum(
                self.count_tokens(str(msg.content)) for msg in remaining_messages
            )
            summary_tokens = self.count_tokens(summary)
            total_tokens_after = remaining_tokens + summary_tokens
            
            # Track savings
            tokens_saved = total_tokens_before - total_tokens_after
            self.token_stats["tokens_saved"] += tokens_saved
            self.token_stats["summaries_created"] += 1
            
            print(f"💰 Token optimization: {tokens_saved} tokens saved "
                  f"({total_tokens_before} → {total_tokens_after})")
        
        return result
    
    def get_analytics(self):
        """Get token usage analytics"""
        return dict(self.token_stats)
```

## 🔧 Best Practices

### 1. State Design
```python
# ✅ Use SummarizableState for automatic summary support
class MyAgentState(SummarizableState):
    messages: Annotated[list, add_messages]
    thread_id: str

# ❌ Don't manually define summary fields
class BadAgentState(TypedDict):
    messages: Annotated[list, add_messages]
    thread_id: str
    summary: str | None  # Manual definition not needed
    last_summarized_index: int  # Manual definition not needed
```

### 2. Graph Architecture
```python
# ✅ Use ready-to-use summary node (Recommended)
summary_manager = SummaryManager(llm, config)
summary_manager.set_logger(logger)  # Optional logging

workflow.add_node("summary_check", summary_manager.summary_node)
workflow.set_entry_point("summary_check")  # Always check summary first
workflow.add_edge("summary_check", "agent")  # Then process
workflow.add_edge("tools", "agent")  # Tools return to agent, not summary

# ✅ Alternative: Custom summary node (if you need custom logic)
async def custom_summary_node(state):
    if await summary_manager.should_create_summary(state):
        return await summary_manager.process_summary(state)
    return {}

workflow.add_node("summary_check", custom_summary_node)

# ❌ Don't create summaries mid-conversation
# This would create summaries during tool execution
workflow.add_edge("tools", "summary_check")  # Wrong!
```

### 3. Configuration Management
```python
# ✅ Environment-based configuration
class ProductionConfig:
    def __init__(self):
        self.llm_config = SimpleNamespace(
            api_key=os.getenv("OPENAI_API_KEY"),
            temperature=0.1,  # Conservative for production
            streaming=True
        )
        
        self.summary_config = SummaryConfig(
            pairs_threshold=12,  # Longer thresholds for production
            recent_pairs_to_preserve=4,
            max_summary_length=300
        )

# ❌ Don't hardcode credentials
bad_config = SimpleNamespace(api_key="sk-hardcoded-key")  # Never do this!
```

### 4. Error Handling
```python
# ✅ Use built-in error handling (Recommended)
# The summary_node() method already includes robust error handling
summary_manager = SummaryManager(llm, config)
summary_manager.set_logger(logger)  # Automatic error logging

workflow.add_node("summary_check", summary_manager.summary_node)

# ✅ Custom error handling (if needed)
async def robust_summary_node(state):
    """Custom summary node with additional error handling"""
    try:
        if await summary_manager.should_create_summary(state):
            return await summary_manager.process_summary(state)
        return {}
    except Exception as e:
        logger.error(f"Summary creation failed: {e}")
        # Continue without summary rather than failing
        return {}
```

### 5. Performance Monitoring
```python
import time
from functools import wraps

# ✅ Built-in monitoring (Recommended)
# The summary_node() automatically logs performance when logger is configured
summary_manager = SummaryManager(llm, config)
summary_manager.set_logger(logger)  # Automatic performance logging

workflow.add_node("summary_check", summary_manager.summary_node)

# ✅ Custom performance monitoring (if needed)
def monitor_performance(func):
    """Decorator to monitor summary performance"""
    @wraps(func)
    async def wrapper(*args, **kwargs):
        start_time = time.time()
        result = await func(*args, **kwargs)
        duration = time.time() - start_time
        
        if result:  # Summary was created
            logger.info(f"Summary created in {duration:.2f}s")
        
        return result
    return wrapper

# Usage with custom node
@monitor_performance
async def monitored_summary_node(state):
    return await summary_manager.process_summary(state)
```

## 📊 Performance Considerations

### Token Efficiency
- **Without summarization**: ~50,000 tokens for 50-message conversation
- **With summarization**: ~8,000 tokens (84% reduction)
- **Cost savings**: Proportional to token reduction

### Response Time
- **Summary creation**: 2-5 seconds additional latency
- **Context processing**: 50-80% faster with summarized context
- **Overall impact**: Net positive for conversations >15 messages

### Memory Usage
- **State size**: Reduced by 70-90% with active summarization
- **Checkpointer storage**: Significantly smaller state objects
- **Database impact**: Reduced checkpoint table growth

## 🛠️ Troubleshooting

### Common Issues

#### 1. "SimpleNamespace required" Error
```python
# ❌ Cause: Using dictionary instead of SimpleNamespace
config = {"api_key": "sk-..."}

# ✅ Solution: Use SimpleNamespace
from types import SimpleNamespace
config = SimpleNamespace(api_key="sk-...")
```

#### 2. Summary Not Created
```python
# Check if threshold is reached
pairs = summary_manager.count_conversation_pairs(state["messages"])
print(f"Current pairs: {pairs}, Threshold: {config.pairs_threshold}")

# Check message types
for i, msg in enumerate(state["messages"]):
    print(f"{i}: {type(msg).__name__} - {hasattr(msg, 'tool_calls')}")
```

#### 3. Provider Not Available
```python
# Check available providers
providers = ModelFactory.get_available_providers()
print(f"Available: {providers}")

# Verify environment variables
import os
print(f"OpenAI key set: {bool(os.getenv('OPENAI_API_KEY'))}")
```

### Debug Mode
```python
# Enable debug logging for detailed output
import logging
logging.getLogger("fastal.langgraph.toolkit").setLevel(logging.DEBUG)
```

## License

MIT License
            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "fastal-langgraph-toolkit",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "agents, ai, langchain, langgraph, toolkit",
    "author": null,
    "author_email": "Stefano Capezzone <stefano@capezzone.it>",
    "download_url": "https://files.pythonhosted.org/packages/e2/8c/4d11f6ad5fb3d7e3488b5e147c44d321c2b1d868321d92d8c85b783f2323/fastal_langgraph_toolkit-0.2.0.tar.gz",
    "platform": null,
    "description": "# Fastal LangGraph Toolkit\n\n[![CI](https://github.com/FastalGroup/fastal-langgraph-toolkit/actions/workflows/test.yml/badge.svg)](https://github.com/FastalGroup/fastal-langgraph-toolkit/actions/workflows/test.yml)\n[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![PyPI version](https://badge.fury.io/py/fastal-langgraph-toolkit.svg)](https://badge.fury.io/py/fastal-langgraph-toolkit)\n\n**Production-ready toolkit for building enterprise LangGraph agents with multi-provider support and intelligent conversation management.**\n\n## \ud83c\udfe2 About\n\nThe Fastal LangGraph Toolkit was originally developed internally by the **Fastal Group** to support enterprise-grade agentic application implementations across multiple client projects. After proving its effectiveness in production environments, we've open-sourced this toolkit to contribute to the broader LangGraph community.\n\n### Why This Toolkit?\n\nBuilding production LangGraph agents involves solving common challenges in advanced research and development projects:\n- **Multi-provider Management**: Support for multiple LLM/embedding providers with seamless switching\n- **Context Management**: Intelligent conversation summarization for long-running sessions\n- **Memory Optimization**: Token-efficient context handling for cost control\n- **Type Safety**: Proper state management with TypedDict integration\n- **Configuration Injection**: Clean separation between business logic and framework concerns\n\nThis toolkit provides battle-tested solutions for these challenges, extracted from real enterprise implementations.\n\n## \u2728 Features\n\n### \ud83d\udd04 Multi-Provider Model Factory (Chat LLM & Embeddings)\nThe current version of the model factory supports the following providers, more providers will be added in future versions.\n\n- **LLM Support**: OpenAI, Anthropic, Ollama, AWS Bedrock\n- **Embeddings Support**: OpenAI, Ollama, AWS Bedrock  \n\nMain fetures:\n- **Configuration Injection**: Clean provider abstraction\n- **Provider Health Checks**: Availability validation\n- **Seamless Switching**: Change providers without code changes\n\n### \ud83e\udde0 Intelligent Conversation Summarization\n\nThe LangChain/LangGraph framework provides good support for managing both short-term and long-term memory in agents through the LangMem module. However, we found that automated summarization based solely on token counting is not a sufficient approach for most real and complex agents. 
The solution included in this kit offers an alternative and more sophisticated method, based on the structure of the conversation and a focus on the object and content of the discussions.\n\nFeatures:\n- **Ready-to-Use LangGraph Node**: `summary_node()` method provides instant integration\n- **Conversation Pair Counting**: Smart Human+AI message pair detection\n- **ReAct Tool Filtering**: Automatic exclusion of tool calls from summaries\n- **Configurable Thresholds**: Customizable trigger points for summarization\n- **Context Preservation**: Keep recent conversations for continuity\n- **Custom Prompts**: Domain-specific summarization templates\n- **State Auto-Injection**: Seamless integration with existing states\n- **Token Optimization**: Reduce context length for cost efficiency\n- **Built-in Error Handling**: Robust error management with optional logging\n\n### \ud83d\udcbe Memory Management\n- **`SummarizableState`**: Type-safe base class for summary-enabled states\n- **Automatic State Management**: No manual field initialization required\n- **LangGraph Integration**: Native compatibility with LangGraph checkpointing\n- **Clean Architecture**: Separation of concerns between summary and business logic\n\n## \ud83d\udce6 Installation\n\n### From PyPI (Recommended)\n```bash\n# Using uv (recommended)\nuv add fastal-langgraph-toolkit\n\n# Using pip\npip install fastal-langgraph-toolkit\n```\n\n### Development Installation\n```bash\n# Clone the repository\ngit clone https://github.com/fastal/langgraph-toolkit.git\ncd fastal-langgraph-toolkit\n\n# Install in editable mode with uv\nuv add --editable .\n\n# Or with pip\npip install -e .\n```\n\n### Requirements\n- **Python**: 3.10+ \n- **LangChain**: Core components for LLM integration\n- **LangGraph**: State management and agent workflows\n- **Pydantic**: Type validation and settings management\n\n## \ud83d\ude80 Quick Start\n\n### Multi-Provider Model Factory\n\n```python\nfrom fastal.langgraph.toolkit import ModelFactory\nfrom types import SimpleNamespace\n\n# Configuration using SimpleNamespace (required)\nconfig = SimpleNamespace(\n    api_key=\"your-api-key\",\n    temperature=0.7,\n    streaming=True  # Enable streaming for real-time responses\n)\n\n# Create LLM with different providers\nopenai_llm = ModelFactory.create_llm(\"openai\", \"gpt-4o\", config)\nclaude_llm = ModelFactory.create_llm(\"anthropic\", \"claude-3-sonnet-20240229\", config)\nlocal_llm = ModelFactory.create_llm(\"ollama\", \"llama2\", config)\n\n# Create embeddings\nembeddings = ModelFactory.create_embeddings(\"openai\", \"text-embedding-3-small\", config)\n\n# Check what's available in your environment\nproviders = ModelFactory.get_available_providers()\nprint(f\"Available LLM providers: {providers['llm']}\")\nprint(f\"Available embedding providers: {providers['embeddings']}\")\n```\n\n### Intelligent Conversation Summarization\n\n#### Basic Setup\n```python\nfrom fastal.langgraph.toolkit import SummaryManager, SummaryConfig, SummarizableState\nfrom langchain_core.messages import HumanMessage, AIMessage\nfrom typing import Annotated\nfrom langgraph.graph.message import add_messages\n\n# 1. Define your state using SummarizableState (recommended)\nclass MyAgentState(SummarizableState):\n    \"\"\"Your agent state with automatic summary support\"\"\"\n    messages: Annotated[list, add_messages]\n    thread_id: str\n    # summary and last_summarized_index are automatically provided\n\n# 2. 
Create summary manager with default settings\nllm = ModelFactory.create_llm(\"openai\", \"gpt-4o\", config)\nsummary_manager = SummaryManager(llm)\n\n# 3. Use ready-to-use summary node in your LangGraph workflow\nfrom langgraph.graph import StateGraph\nimport logging\n\n# Optional: Configure logging for summary operations\nlogger = logging.getLogger(__name__)\nsummary_manager.set_logger(logger)\n\n# Add to your workflow\nworkflow = StateGraph(MyAgentState)\nworkflow.add_node(\"summary_check\", summary_manager.summary_node)  # Ready-to-use!\nworkflow.set_entry_point(\"summary_check\")\n```\n\n#### Advanced Configuration\n```python\n# Custom configuration for domain-specific needs\ncustom_config = SummaryConfig(\n    pairs_threshold=20,  # Trigger summary after 20 conversation pairs\n    recent_pairs_to_preserve=5,  # Keep last 5 pairs in full context\n    max_summary_length=500,  # Max words in summary\n    \n    # Custom prompts for your domain\n    new_summary_prompt=\"\"\"\n    Analyze this customer support conversation and create a concise summary focusing on:\n    - Customer's main issue or request\n    - Actions taken by the agent\n    - Current status of the resolution\n    - Any pending items or next steps\n    \n    Conversation to summarize:\n    {messages_text}\n    \"\"\",\n    \n    combine_summary_prompt=\"\"\"\n    Update the existing summary with new information from the recent conversation.\n    \n    Previous summary:\n    {existing_summary}\n    \n    New conversation:\n    {messages_text}\n    \n    Provide an updated comprehensive summary:\n    \"\"\"\n)\n\nsummary_manager = SummaryManager(llm, custom_config)\n```\n\n#### Complete LangGraph Integration Example\n\n**Simple Approach (Recommended):**\n```python\nfrom langgraph.graph import StateGraph\nfrom langgraph.checkpoint.postgres.aio import AsyncPostgresSaver\nimport logging\n\nlogger = logging.getLogger(__name__)\n\nclass CustomerSupportAgent:\n    def __init__(self):\n        self.llm = ModelFactory.create_llm(\"openai\", \"gpt-4o\", config)\n        self.summary_manager = SummaryManager(self.llm, custom_config)\n        # Optional: Configure logging for summary operations\n        self.summary_manager.set_logger(logger)\n        self.graph = self._create_graph()\n    \n    async def _agent_node(self, state: MyAgentState) -> dict:\n        \"\"\"Main agent logic with optimized context\"\"\"\n        messages = state[\"messages\"]\n        last_idx = state.get(\"last_summarized_index\", 0)\n        summary = state.get(\"summary\")\n        \n        # Use only recent messages + summary for context efficiency\n        recent_messages = messages[last_idx:]\n        \n        if summary:\n            system_msg = f\"Previous conversation summary: {summary}\\n\\nContinue the conversation:\"\n            context = [SystemMessage(content=system_msg)] + recent_messages\n        else:\n            context = recent_messages\n        \n        response = await self.llm.ainvoke(context)\n        return {\"messages\": [response]}\n    \n    def _create_graph(self):\n        workflow = StateGraph(MyAgentState)\n        \n        # Use ready-to-use summary node from toolkit\n        workflow.add_node(\"summary_check\", self.summary_manager.summary_node)\n        workflow.add_node(\"agent\", self._agent_node)\n        \n        workflow.set_entry_point(\"summary_check\")\n        workflow.add_edge(\"summary_check\", \"agent\")\n        workflow.add_edge(\"agent\", \"__end__\")\n        \n        return workflow\n    \n    async 
def process_message(self, message: str, thread_id: str):\n        \"\"\"Process user message with automatic summarization\"\"\"\n        async with AsyncPostgresSaver.from_conn_string(db_url) as checkpointer:\n            app = self.graph.compile(checkpointer=checkpointer)\n            \n            config = {\"configurable\": {\"thread_id\": thread_id}}\n            input_state = {\"messages\": [HumanMessage(content=message)]}\n            \n            result = await app.ainvoke(input_state, config=config)\n            return result[\"messages\"][-1].content\n```\n\n**Advanced Approach (Custom Implementation):**\n```python\n# If you need custom logic in your summary node\nclass AdvancedCustomerSupportAgent:\n    def __init__(self):\n        self.llm = ModelFactory.create_llm(\"openai\", \"gpt-4o\", config)\n        self.summary_manager = SummaryManager(self.llm, custom_config)\n        self.graph = self._create_graph()\n    \n    async def _custom_summary_node(self, state: MyAgentState) -> dict:\n        \"\"\"Custom summary node with additional business logic\"\"\"\n        thread_id = state.get(\"thread_id\", \"\")\n        \n        # Custom business logic before summarization\n        if self._should_skip_summary(state):\n            return {}\n        \n        # Use summary manager for the actual summarization\n        if await self.summary_manager.should_create_summary(state):\n            result = await self.summary_manager.process_summary(state)\n            \n            # Custom logging or analytics\n            if result:\n                logger.info(f\"Summary created for customer thread {thread_id}\")\n                self._track_summary_analytics(state, result)\n            \n            return result\n        \n        return {}\n    \n    def _should_skip_summary(self, state):\n        \"\"\"Custom business logic to skip summarization\"\"\"\n        # Example: Skip for priority customers or short sessions\n        return False\n    \n    def _track_summary_analytics(self, state, result):\n        \"\"\"Custom analytics tracking\"\"\"\n        pass\n```\n\n## \ud83d\udccb API Reference\n\n### ModelFactory\n\nMain factory class for creating LLM and embedding instances across multiple providers.\n\n#### `ModelFactory.create_llm(provider: str, model: str, config: SimpleNamespace) -> BaseChatModel`\n\nCreates an LLM instance for the specified provider.\n\n**Parameters:**\n- `provider`: Provider name (`\"openai\"`, `\"anthropic\"`, `\"ollama\"`, `\"bedrock\"`)\n- `model`: Model name (e.g., `\"gpt-4o\"`, `\"claude-3-sonnet-20240229\"`)\n- `config`: Configuration object with provider-specific settings\n\n**Returns:** LangChain `BaseChatModel` instance\n\n**Example:**\n```python\nfrom types import SimpleNamespace\nfrom fastal.langgraph.toolkit import ModelFactory\n\nconfig = SimpleNamespace(api_key=\"sk-...\", temperature=0.7, streaming=True)\nllm = ModelFactory.create_llm(\"openai\", \"gpt-4o\", config)\n```\n\n#### `ModelFactory.create_embeddings(provider: str, model: str, config: SimpleNamespace) -> Embeddings`\n\nCreates an embeddings instance for the specified provider.\n\n**Parameters:**\n- `provider`: Provider name (`\"openai\"`, `\"ollama\"`, `\"bedrock\"`)\n- `model`: Model name (e.g., `\"text-embedding-3-small\"`)\n- `config`: Configuration object with provider-specific settings\n\n**Returns:** LangChain `Embeddings` instance\n\n#### `ModelFactory.get_available_providers() -> dict`\n\nReturns available providers in the current environment.\n\n**Returns:** Dictionary 
with `\"llm\"` and `\"embeddings\"` keys containing available provider lists\n\n### SummaryManager\n\nManages intelligent conversation summarization with configurable thresholds and custom prompts.\n\n#### `SummaryManager(llm: BaseChatModel, config: SummaryConfig | None = None)`\n\nInitialize summary manager with LLM and optional configuration.\n\n**Parameters:**\n- `llm`: LangChain LLM instance for generating summaries\n- `config`: Optional `SummaryConfig` instance (uses defaults if None)\n\n#### `async should_create_summary(state: dict) -> bool`\n\nDetermines if summarization is needed based on conversation pairs threshold.\n\n**Parameters:**\n- `state`: Current agent state containing messages and summary info\n\n**Returns:** `True` if summary should be created, `False` otherwise\n\n#### `async process_summary(state: dict) -> dict`\n\nCreates or updates conversation summary and returns state updates.\n\n**Parameters:**\n- `state`: Current agent state\n\n**Returns:** Dictionary with `summary` and `last_summarized_index` fields\n\n#### `count_conversation_pairs(messages: list, start_index: int = 0) -> int`\n\nCounts Human+AI conversation pairs, excluding tool calls.\n\n**Parameters:**\n- `messages`: List of LangChain messages\n- `start_index`: Starting index for counting (default: 0)\n\n**Returns:** Number of complete conversation pairs\n\n#### `async summary_node(state: dict) -> dict`\n\n**Ready-to-use LangGraph node for conversation summarization.**\n\nThis method provides a complete LangGraph node that can be directly added to workflows. It handles the entire summary workflow internally and provides optional logging.\n\n**Parameters:**\n- `state`: LangGraph state (will be auto-injected with summary fields if missing)\n\n**Returns:** Empty dict if no summary needed, or dict with summary fields if created\n\n**Example:**\n```python\n# In your LangGraph workflow\nsummary_manager = SummaryManager(llm, config)\nsummary_manager.set_logger(logger)  # Optional logging\n\nworkflow.add_node(\"summary_check\", summary_manager.summary_node)\nworkflow.set_entry_point(\"summary_check\")\n```\n\n#### `set_logger(logger)`\n\nSet logger for summary_node logging (optional).\n\n**Parameters:**\n- `logger`: Logger instance for summary_node operations\n\n**Note:** When a logger is configured, `summary_node()` will automatically log when summaries are created.\n\n### SummaryConfig\n\nConfiguration class for customizing summarization behavior.\n\n#### `SummaryConfig(**kwargs)`\n\n**Parameters:**\n- `pairs_threshold: int = 10` - Trigger summary after N conversation pairs\n- `recent_pairs_to_preserve: int = 3` - Keep N recent pairs in context\n- `max_summary_length: int = 200` - Maximum words in summary\n- `new_summary_prompt: str` - Template for creating new summaries\n- `combine_summary_prompt: str` - Template for updating existing summaries\n\n**Default Prompts:**\n```python\n# Default new summary prompt\nnew_summary_prompt = \"\"\"\nAnalyze the conversation and create a concise summary highlighting:\n- Main topics discussed\n- Key decisions or conclusions\n- Important context for future interactions\n\nConversation:\n{messages_text}\n\nSummary:\n\"\"\"\n\n# Default combine summary prompt  \ncombine_summary_prompt = \"\"\"\nExisting Summary: {existing_summary}\n\nNew Conversation: {messages_text}\n\nCreate an updated summary that combines the essential information:\n\"\"\"\n```\n\n### SummarizableState\n\nBase TypedDict class for states that support automatic summarization.\n\n#### Inheritance 
Usage\n```python\nfrom fastal.langgraph.toolkit import SummarizableState\nfrom typing import Annotated\nfrom langgraph.graph.message import add_messages\n\nclass MyAgentState(SummarizableState):\n    \"\"\"Your custom state with summary support\"\"\"\n    messages: Annotated[list, add_messages]\n    thread_id: str\n    # summary: str | None - automatically provided\n    # last_summarized_index: int - automatically provided\n```\n\n**Provided Fields:**\n- `summary: str | None` - Current conversation summary\n- `last_summarized_index: int` - Index of first message NOT in last summary\n\n## \u2699\ufe0f Configuration\n\n### SimpleNamespace Requirement\n\nThe toolkit requires configuration objects (not dictionaries) for type safety and dot notation access:\n\n```python\nfrom types import SimpleNamespace\n\n# \u2705 Correct - SimpleNamespace\nconfig = SimpleNamespace(\n    api_key=\"sk-...\",\n    base_url=\"https://api.openai.com/v1\",  # Optional\n    temperature=0.7,                        # Optional\n    streaming=True                          # Optional\n)\n\n# \u274c Incorrect - Dictionary\nconfig = {\"api_key\": \"sk-...\", \"temperature\": 0.7}\n```\n\n### Provider-Specific Configuration\n\n#### OpenAI\n```python\nopenai_config = SimpleNamespace(\n    api_key=\"sk-...\",              # Required (or set OPENAI_API_KEY)\n    base_url=\"https://api.openai.com/v1\",  # Optional\n    organization=\"org-...\",         # Optional\n    temperature=0.7,               # Optional\n    streaming=True,                # Optional\n    max_tokens=1000                # Optional\n)\n```\n\n#### Anthropic\n```python\nanthropic_config = SimpleNamespace(\n    api_key=\"sk-ant-...\",          # Required (or set ANTHROPIC_API_KEY)\n    temperature=0.7,               # Optional\n    streaming=True,                # Optional\n    max_tokens=1000                # Optional\n)\n```\n\n#### Ollama (Local)\n```python\nollama_config = SimpleNamespace(\n    base_url=\"http://localhost:11434\",  # Optional (default)\n    temperature=0.7,                    # Optional\n    streaming=True                      # Optional\n)\n```\n\n#### AWS Bedrock\n```python\nbedrock_config = SimpleNamespace(\n    region=\"us-east-1\",            # Optional (uses AWS config)\n    aws_access_key_id=\"...\",       # Optional (uses AWS config)\n    aws_secret_access_key=\"...\",   # Optional (uses AWS config)\n    temperature=0.7,               # Optional\n    streaming=True                 # Optional\n)\n```\n\n### Environment Variables Helper\n\n```python\nfrom fastal.langgraph.toolkit.models.config import get_default_config\n\n# Automatically uses environment variables\nopenai_config = get_default_config(\"openai\")     # Uses OPENAI_API_KEY\nanthropic_config = get_default_config(\"anthropic\") # Uses ANTHROPIC_API_KEY\n```\n\n## \ud83c\udfaf Advanced Examples\n\n### Enterprise Multi-Provider Setup\n\n```python\nfrom fastal.langgraph.toolkit import ModelFactory\nfrom types import SimpleNamespace\nimport os\n\nclass EnterpriseAgentConfig:\n    \"\"\"Enterprise configuration with fallback providers\"\"\"\n    \n    def __init__(self):\n        self.primary_llm = self._setup_primary_llm()\n        self.fallback_llm = self._setup_fallback_llm()\n        self.embeddings = self._setup_embeddings()\n    \n    def _setup_primary_llm(self):\n        \"\"\"Primary: OpenAI GPT-4\"\"\"\n        if os.getenv(\"OPENAI_API_KEY\"):\n            config = SimpleNamespace(\n                api_key=os.getenv(\"OPENAI_API_KEY\"),\n                
temperature=0.1,\n                streaming=True,\n                max_tokens=2000\n            )\n            return ModelFactory.create_llm(\"openai\", \"gpt-4o\", config)\n        return None\n    \n    def _setup_fallback_llm(self):\n        \"\"\"Fallback: Anthropic Claude\"\"\"\n        if os.getenv(\"ANTHROPIC_API_KEY\"):\n            config = SimpleNamespace(\n                api_key=os.getenv(\"ANTHROPIC_API_KEY\"),\n                temperature=0.1,\n                streaming=True,\n                max_tokens=2000\n            )\n            return ModelFactory.create_llm(\"anthropic\", \"claude-3-sonnet-20240229\", config)\n        return None\n    \n    def _setup_embeddings(self):\n        \"\"\"Embeddings with local fallback\"\"\"\n        # Try OpenAI first\n        if os.getenv(\"OPENAI_API_KEY\"):\n            config = SimpleNamespace(api_key=os.getenv(\"OPENAI_API_KEY\"))\n            return ModelFactory.create_embeddings(\"openai\", \"text-embedding-3-small\", config)\n        \n        # Fallback to local Ollama\n        config = SimpleNamespace(base_url=\"http://localhost:11434\")\n        return ModelFactory.create_embeddings(\"ollama\", \"nomic-embed-text\", config)\n    \n    def get_llm(self):\n        \"\"\"Get available LLM with fallback logic\"\"\"\n        return self.primary_llm or self.fallback_llm\n```\n\n### Domain-Specific Summarization\n\n```python\nfrom fastal.langgraph.toolkit import SummaryManager, SummaryConfig\n\nclass CustomerServiceSummaryManager:\n    \"\"\"Specialized summary manager for customer service conversations\"\"\"\n    \n    def __init__(self, llm):\n        # Customer service specific configuration\n        self.config = SummaryConfig(\n            pairs_threshold=8,  # Shorter conversations in support\n            recent_pairs_to_preserve=3,\n            max_summary_length=400,\n            \n            new_summary_prompt=\"\"\"\n            Analyze this customer service conversation and create a structured summary:\n\n            **Customer Information:**\n            - Name/Contact: [Extract if mentioned]\n            - Account/Order: [Extract if mentioned]\n\n            **Issue Summary:**\n            - Problem: [Main issue described]\n            - Category: [Technical/Billing/General/etc.]\n            - Urgency: [High/Medium/Low based on language]\n\n            **Actions Taken:**\n            - Solutions attempted: [List what agent tried]\n            - Information provided: [Key info given to customer]\n\n            **Current Status:**\n            - Resolution status: [Resolved/Pending/Escalated]\n            - Next steps: [What needs to happen next]\n\n            **Conversation:**\n            {messages_text}\n\n            **Structured Summary:**\n            \"\"\",\n            \n            combine_summary_prompt=\"\"\"\n            Update the customer service summary with new conversation information:\n\n            **Previous Summary:**\n            {existing_summary}\n\n            **New Conversation:**\n            {messages_text}\n\n            **Updated Summary:**\n            Merge the information, updating status and adding new actions/developments:\n            \"\"\"\n        )\n        \n        self.summary_manager = SummaryManager(llm, self.config)\n    \n    async def process_summary(self, state):\n        \"\"\"Process with customer service specific logic\"\"\"\n        return await self.summary_manager.process_summary(state)\n```\n\n### Memory-Optimized Long Conversations\n\n```python\nfrom 
### Memory-Optimized Long Conversations

```python
from fastal.langgraph.toolkit import SummarizableState, SummaryConfig, SummaryManager
from typing import Annotated
from langchain_core.messages import SystemMessage
from langgraph.graph.message import add_messages

class OptimizedConversationState(SummarizableState):
    """State optimized for very long conversations"""
    messages: Annotated[list, add_messages]
    thread_id: str
    user_context: dict  # Additional user context
    conversation_metadata: dict  # Metadata for analytics

class LongConversationAgent:
    """Agent optimized for handling very long conversations"""

    def __init__(self, llm):
        # Aggressive summarization for memory efficiency
        config = SummaryConfig(
            pairs_threshold=5,           # Frequent summarization
            recent_pairs_to_preserve=2,  # Minimal recent context
            max_summary_length=600,      # Comprehensive summaries
        )

        self.summary_manager = SummaryManager(llm, config)
        self.llm = llm

    async def process_with_optimization(self, state: OptimizedConversationState):
        """Process message with aggressive memory optimization"""

        # Always check for summarization opportunities
        if await self.summary_manager.should_create_summary(state):
            # Create summary to optimize memory
            summary_update = await self.summary_manager.process_summary(state)
            state.update(summary_update)

        # Use only recent context + summary for the LLM call
        messages = state["messages"]
        last_idx = state.get("last_summarized_index", 0)
        summary = state.get("summary")

        # Ultra-minimal context for cost efficiency
        recent_messages = messages[last_idx:]

        if summary:
            context = f"Context: {summary}\n\nContinue conversation:"
            context_msg = SystemMessage(content=context)
            llm_input = [context_msg] + recent_messages[-2:]  # Only last exchange
        else:
            llm_input = recent_messages[-4:]  # Minimal fallback

        response = await self.llm.ainvoke(llm_input)
        return {"messages": [response]}
```
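A short sketch of driving this agent directly, outside a compiled graph, purely for illustration (the `llm` is assumed to come from `ModelFactory` as above):

```python
from langchain_core.messages import HumanMessage

agent = LongConversationAgent(llm)
state = {"messages": [], "thread_id": "demo-thread"}

async def handle_user_turn(text: str) -> str:
    """Append the user message, let the agent summarize and respond, return the reply."""
    state["messages"].append(HumanMessage(content=text))
    update = await agent.process_with_optimization(state)
    state["messages"].extend(update["messages"])
    return state["messages"][-1].content
```

In a real deployment the same logic would run inside a LangGraph node, with `add_messages` and a checkpointer handling message accumulation instead of the manual list shown here.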
### Token Usage Analytics

```python
import tiktoken
from collections import defaultdict

from fastal.langgraph.toolkit import SummaryManager

class TokenOptimizedSummaryManager:
    """Summary manager with token usage tracking and optimization"""

    def __init__(self, llm, config=None):
        self.summary_manager = SummaryManager(llm, config)
        self.tokenizer = tiktoken.get_encoding("cl100k_base")  # GPT-4 tokenizer
        self.token_stats = defaultdict(int)

    def count_tokens(self, text: str) -> int:
        """Count tokens in text"""
        return len(self.tokenizer.encode(text))

    async def process_with_analytics(self, state):
        """Process summary with token usage analytics"""
        messages = state["messages"]

        # Count tokens before summarization
        total_tokens_before = sum(
            self.count_tokens(str(msg.content)) for msg in messages
        )

        # Process summary
        result = await self.summary_manager.process_summary(state)

        if result:  # Summary was created
            summary = result.get("summary", "")
            last_idx = result.get("last_summarized_index", 0)

            # Count tokens after summarization
            remaining_messages = messages[last_idx:]
            remaining_tokens = sum(
                self.count_tokens(str(msg.content)) for msg in remaining_messages
            )
            summary_tokens = self.count_tokens(summary)
            total_tokens_after = remaining_tokens + summary_tokens

            # Track savings
            tokens_saved = total_tokens_before - total_tokens_after
            self.token_stats["tokens_saved"] += tokens_saved
            self.token_stats["summaries_created"] += 1

            print(f"💰 Token optimization: {tokens_saved} tokens saved "
                  f"({total_tokens_before} → {total_tokens_after})")

        return result

    def get_analytics(self):
        """Get token usage analytics"""
        return dict(self.token_stats)
```

## 🔧 Best Practices

### 1. State Design
```python
# ✅ Use SummarizableState for automatic summary support
class MyAgentState(SummarizableState):
    messages: Annotated[list, add_messages]
    thread_id: str

# ❌ Don't manually define summary fields
class BadAgentState(TypedDict):
    messages: Annotated[list, add_messages]
    thread_id: str
    summary: str | None  # Manual definition not needed
    last_summarized_index: int  # Manual definition not needed
```

### 2. Graph Architecture
```python
# ✅ Use the ready-to-use summary node (Recommended)
summary_manager = SummaryManager(llm, config)
summary_manager.set_logger(logger)  # Optional logging

workflow.add_node("summary_check", summary_manager.summary_node)
workflow.set_entry_point("summary_check")  # Always check summary first
workflow.add_edge("summary_check", "agent")  # Then process
workflow.add_edge("tools", "agent")  # Tools return to agent, not summary

# ✅ Alternative: custom summary node (if you need custom logic)
async def custom_summary_node(state):
    if await summary_manager.should_create_summary(state):
        return await summary_manager.process_summary(state)
    return {}

workflow.add_node("summary_check", custom_summary_node)

# ❌ Don't create summaries mid-conversation
# This would create summaries during tool execution
workflow.add_edge("tools", "summary_check")  # Wrong!
```
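For reference, here is one way the recommended layout could be assembled end to end. This is a sketch under assumptions: `MyAgentState` from the state-design example above, a `tools` list, an LLM bound to those tools as `llm_with_tools`, and an in-memory checkpointer for brevity.

```python
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph
from langgraph.prebuilt import ToolNode, tools_condition

workflow = StateGraph(MyAgentState)

async def agent_node(state: MyAgentState):
    """Illustrative agent node: call the tool-bound model on the current messages."""
    response = await llm_with_tools.ainvoke(state["messages"])
    return {"messages": [response]}

workflow.add_node("summary_check", summary_manager.summary_node)
workflow.add_node("agent", agent_node)
workflow.add_node("tools", ToolNode(tools))

workflow.set_entry_point("summary_check")
workflow.add_edge("summary_check", "agent")
workflow.add_conditional_edges("agent", tools_condition)  # route to "tools" or end
workflow.add_edge("tools", "agent")

app = workflow.compile(checkpointer=MemorySaver())
```

With this layout the summary check runs once per user turn at the graph entry point, so tool loops between `agent` and `tools` never trigger summarization mid-turn.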
### 3. Configuration Management
```python
# ✅ Environment-based configuration
class ProductionConfig:
    def __init__(self):
        self.llm_config = SimpleNamespace(
            api_key=os.getenv("OPENAI_API_KEY"),
            temperature=0.1,  # Conservative for production
            streaming=True
        )

        self.summary_config = SummaryConfig(
            pairs_threshold=12,  # Longer thresholds for production
            recent_pairs_to_preserve=4,
            max_summary_length=300
        )

# ❌ Don't hardcode credentials
bad_config = SimpleNamespace(api_key="sk-hardcoded-key")  # Never do this!
```
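A short sketch of how such a config object might be consumed at startup (the model name is illustrative; `ModelFactory` and `SummaryManager` imports as in the earlier examples):

```python
config = ProductionConfig()

llm = ModelFactory.create_llm("openai", "gpt-4o", config.llm_config)
summary_manager = SummaryManager(llm, config.summary_config)
summary_manager.set_logger(logger)  # Reuse the application's logger
```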
### 4. Error Handling
```python
# ✅ Use built-in error handling (Recommended)
# The summary_node() method already includes robust error handling
summary_manager = SummaryManager(llm, config)
summary_manager.set_logger(logger)  # Automatic error logging

workflow.add_node("summary_check", summary_manager.summary_node)

# ✅ Custom error handling (if needed)
async def robust_summary_node(state):
    """Custom summary node with additional error handling"""
    try:
        if await summary_manager.should_create_summary(state):
            return await summary_manager.process_summary(state)
        return {}
    except Exception as e:
        logger.error(f"Summary creation failed: {e}")
        # Continue without summary rather than failing
        return {}
```

### 5. Performance Monitoring
```python
import time
from functools import wraps

# ✅ Built-in monitoring (Recommended)
# The summary_node() automatically logs performance when a logger is configured
summary_manager = SummaryManager(llm, config)
summary_manager.set_logger(logger)  # Automatic performance logging

workflow.add_node("summary_check", summary_manager.summary_node)

# ✅ Custom performance monitoring (if needed)
def monitor_performance(func):
    """Decorator to monitor summary performance"""
    @wraps(func)
    async def wrapper(*args, **kwargs):
        start_time = time.time()
        result = await func(*args, **kwargs)
        duration = time.time() - start_time

        if result:  # Summary was created
            logger.info(f"Summary created in {duration:.2f}s")

        return result
    return wrapper

# Usage with custom node
@monitor_performance
async def monitored_summary_node(state):
    return await summary_manager.process_summary(state)
```

## 📊 Performance Considerations

### Token Efficiency
- **Without summarization**: ~50,000 tokens for a 50-message conversation
- **With summarization**: ~8,000 tokens (84% reduction)
- **Cost savings**: Proportional to token reduction

### Response Time
- **Summary creation**: 2-5 seconds of additional latency when a summary is generated
- **Context processing**: 50-80% faster with summarized context
- **Overall impact**: Net positive for conversations longer than about 15 messages

### Memory Usage
- **State size**: Reduced by 70-90% with active summarization
- **Checkpointer storage**: Significantly smaller state objects
- **Database impact**: Reduced checkpoint table growth

## 🛠️ Troubleshooting

### Common Issues

#### 1. "SimpleNamespace required" Error
```python
# ❌ Cause: Using a dictionary instead of SimpleNamespace
config = {"api_key": "sk-..."}

# ✅ Solution: Use SimpleNamespace
from types import SimpleNamespace
config = SimpleNamespace(api_key="sk-...")
```

#### 2. Summary Not Created
```python
# Check whether the threshold has been reached
pairs = summary_manager.count_conversation_pairs(state["messages"])
print(f"Current pairs: {pairs}, Threshold: {config.pairs_threshold}")

# Check message types and tool-call flags
for i, msg in enumerate(state["messages"]):
    print(f"{i}: {type(msg).__name__} - has tool_calls: {hasattr(msg, 'tool_calls')}")
```

#### 3. Provider Not Available
```python
# Check available providers
providers = ModelFactory.get_available_providers()
print(f"Available: {providers}")

# Verify environment variables
import os
print(f"OpenAI key set: {bool(os.getenv('OPENAI_API_KEY'))}")
```

### Debug Mode
```python
# Enable debug logging for detailed output
import logging
logging.getLogger("fastal.langgraph.toolkit").setLevel(logging.DEBUG)
```

## License

MIT License
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Common utilities and tools for LangGraph agents by Fastal",
    "version": "0.2.0",
    "project_urls": {
        "Homepage": "https://github.com/FastalGroup/fastal-langgraph-toolkit",
        "Issues": "https://github.com/FastalGroup/fastal-langgraph-toolkit/issues",
        "Repository": "https://github.com/FastalGroup/fastal-langgraph-toolkit.git"
    },
    "split_keywords": [
        "agents",
        " ai",
        " langchain",
        " langgraph",
        " toolkit"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "bc526d9d433dc571c90f4e2279c71f2f9360cb32ffd3d7279d08d0fc2dc1f7f1",
                "md5": "bb95f0440295eb69616b1ea672c86be6",
                "sha256": "82b7f52e4a60adf4f598d20ccd4fea424497127588049919766056cfb7b2cb91"
            },
            "downloads": -1,
            "filename": "fastal_langgraph_toolkit-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "bb95f0440295eb69616b1ea672c86be6",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 33442,
            "upload_time": "2025-07-27T21:46:46",
            "upload_time_iso_8601": "2025-07-27T21:46:46.549213Z",
            "url": "https://files.pythonhosted.org/packages/bc/52/6d9d433dc571c90f4e2279c71f2f9360cb32ffd3d7279d08d0fc2dc1f7f1/fastal_langgraph_toolkit-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "e28c4d11f6ad5fb3d7e3488b5e147c44d321c2b1d868321d92d8c85b783f2323",
                "md5": "464c270e697895d3f6fdb851f0ad4e9b",
                "sha256": "b0560059670e41a595ebc0f6f7e3cc9094af2a1dcdf8aa6beb35ce314b699d73"
            },
            "downloads": -1,
            "filename": "fastal_langgraph_toolkit-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "464c270e697895d3f6fdb851f0ad4e9b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 132599,
            "upload_time": "2025-07-27T21:46:47",
            "upload_time_iso_8601": "2025-07-27T21:46:47.738150Z",
            "url": "https://files.pythonhosted.org/packages/e2/8c/4d11f6ad5fb3d7e3488b5e147c44d321c2b1d868321d92d8c85b783f2323/fastal_langgraph_toolkit-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-27 21:46:47",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "FastalGroup",
    "github_project": "fastal-langgraph-toolkit",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "fastal-langgraph-toolkit"
}
        