| Field | Value |
| --- | --- |
| Name | bestehorn-llmmanager |
| Version | 0.2.1 |
| home_page | None |
| Summary | A comprehensive Python library for managing AWS Bedrock Converse API interactions with multi-model support, intelligent retry logic, and parallel processing capabilities |
| upload_time | 2025-08-19 09:37:52 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | <4.0,>=3.8 |
| license | MIT |
| keywords | aws, bedrock, llm, ai, machine-learning, amazon, claude, anthropic |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | |
# LLMManager by Markus Bestehorn
[PyPI](https://pypi.org/project/bestehorn-llmmanager/)
[CI](https://github.com/Bestehorn/LLMManager/actions/workflows/ci.yml)
[Coverage](https://codecov.io/gh/Bestehorn/LLMManager)
[License: MIT](https://opensource.org/licenses/MIT)
[Code style: black](https://github.com/psf/black)
**A comprehensive Python library that simplifies and enhances AWS Bedrock Converse API interactions with intelligent message building, automatic file format detection, multi-modal support, streaming capabilities, caching, and advanced reliability features.**
## 🚀 Quick Start
### Prerequisites
- Python 3.8+
- AWS credentials configured (AWS CLI, environment variables, or IAM roles)
### Installation & Basic Usage
```bash
# Install the package
pip install bestehorn-llmmanager
```

```python
# Simple example - get started in 5 lines
from bestehorn_llmmanager import LLMManager, create_user_message

manager = LLMManager(models=["Claude 3 Haiku"], regions=["us-east-1"])
message = create_user_message().add_text("Explain quantum computing").build()
response = manager.converse(messages=[message])
print(response.get_content())
```
## 🎯 Key Simplifications Over Native AWS Bedrock Converse API
This library was built to simplify the [standard AWS Bedrock Converse API](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html).
While some of these simplifications come at the cost of flexibility, for most use cases the code becomes cleaner, less error-prone, and easier to understand when it is based on the abstractions provided in this package:
### 🏗️ **Fluent MessageBuilder with Automatic Format Detection**
- **Native API**: Manual message dictionary construction with complex nested structures resulting in complex JSON
- **LLMManager**: Intuitive fluent interface with automatic file type detection and validation that hides the intricacies of building message JSON for Bedrock's converse API.
### 🔄 **Intelligent Multi-Region Failover**
- **Native API**: Single region, manual error handling required, i.e., you have to handle different types of errors in your code and implement retry logic.
- **LLMManager**: Automatic failover across multiple AWS regions with configurable retry strategies, e.g., if your request fails in one AWS region, it is moved to another region.
### ⚡ **Built-in Parallel Processing**
- **Native API**: Sequential processing only
- **LLMManager**: Concurrent execution across multiple models and regions; this is particularly important for any kind of batch processing to reduce inference time.
### 🛡️ **Enhanced Error Handling & Reliability**
- **Native API**: Basic error responses, no automatic retries
- **LLMManager**: Comprehensive error handling, exponential backoff, and feature fallback
### 📊 **Rich Response Management**
- **Native API**: Raw response dictionaries, i.e., you have to parse and understand the JSON structure from Bedrock.
- **LLMManager**: Structured response objects with metadata, utilities, and validation that provide data access through typed functions.
### 🌊 **Real-Time Streaming Support**
- **Native API**: Basic streaming with manual chunk handling
- **LLMManager**: Enhanced streaming with automatic retry, recovery, and real-time display utilities
### 💾 **Intelligent Caching**
- **Native API**: Manual cache point management required
- **LLMManager**: Automatic cache optimization with configurable strategies for cost reduction
I have implemented this set of functions over time while working on projects and demos using Amazon Bedrock.
One of the key use cases that started this project was: I have a prompt and I do not care where it gets executed; I just want to make sure I get a response, without having to handle transient issues such as malformed responses from LLMs or throttling.
The library has helped me avoid common mistakes and sources of error when using the [standard AWS Bedrock Converse API](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html).
⚠️ **Important**: The code in this repository comes with no guarantees and no assumption of liability. It is not intended for production use, but rather as inspiration for implementing your own logic for the Bedrock converse API in production environments.
Quick note on the package name: As there are other packages (public & private) named "LLMManager", I have chosen to prefix this package's name with my last name to ensure the name is unique.
If this is an issue, feel free to contact me and propose another name.
## ✨ Features
### 🏗️ **Fluent MessageBuilder**
- **Intuitive API**: Chain methods to build complex multi-modal messages
- **Automatic Format Detection**: Intelligent file type detection from content and filenames
- **Multi-Modal Support**: Seamlessly combine text, images, documents, and videos
- **Built-in Validation**: Automatic content validation and size limit enforcement
- **Type Safety**: Comprehensive enums for formats, roles, and content types
### 🔄 **Advanced Reliability**
- **Multi-Model Support**: Work with multiple LLM models simultaneously with automatic fallback
- **Multi-Region Failover**: Automatic failover across AWS regions with intelligent routing
- **Intelligent Retry Logic**: Exponential backoff with configurable retry strategies
- **Feature Fallback**: Graceful degradation when advanced features aren't supported
### ⚡ **Performance & Scalability**
- **Parallel Processing**: Execute multiple requests concurrently across regions
- **Load Balancing**: Intelligent distribution across available resources
- **Connection Pooling**: Efficient resource management for high-throughput scenarios
- **Streaming Support**: Real-time response streaming for long-form content
### 🌊 **Streaming Capabilities**
- **Real-Time Output**: See responses as they're generated with `converse_stream()`
- **Automatic Recovery**: Stream interruption detection and intelligent retry
- **Rich Metadata**: Streaming performance metrics and token usage tracking
- **Display Utilities**: Built-in utilities for streaming visualization
### 💾 **Caching Support**
- **Automatic Optimization**: Intelligent cache point placement based on content patterns
- **Cost Reduction**: Up to 90% reduction in token costs for repetitive prompts
- **Simple Configuration**: Enable with `CacheConfig(enabled=True)`
- **Flexible Strategies**: Conservative, Aggressive, or Custom caching approaches
- **MessageBuilder Integration**: Seamless cache control with `add_cache_point()`
### 🛡️ **Security & Authentication**
- **Flexible Authentication**: Support for AWS profiles, credentials, IAM roles, and auto-detection
- **Response Validation**: Optional content validation with custom validation functions
- **Guardrail Integration**: Full support for AWS Bedrock guardrails
- **Secure File Handling**: Safe processing of uploaded files with size and format validation
### 📊 **Comprehensive Monitoring**
- **Rich Response Objects**: Detailed response metadata with performance metrics
- **Execution Statistics**: Request timing, token usage, and success rates
- **Error Tracking**: Comprehensive error logging with retry attempt details
- **Validation Reporting**: Detailed validation results and failure analysis
## Installation
### From PyPI (Recommended)
```bash
pip install bestehorn-llmmanager
```
### From Source (Development)
For development or integration into other projects:
```bash
git clone https://github.com/Bestehorn/LLMManager.git
cd LLMManager
pip install -e .
```
### With Development Dependencies
```bash
pip install -e .[dev]
```
## 🚀 Quick Start Examples
### MessageBuilder vs. Native API Comparison
**❌ With Native AWS Bedrock Converse API:**
```python
import boto3
import base64

# Complex manual message construction
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

# Manual dictionary construction - error-prone and verbose
with open("document.pdf", "rb") as f:
    doc_bytes = f.read()

messages = [
    {
        "role": "user",
        "content": [
            {"text": "Analyze this document:"},
            {
                "document": {
                    "name": "document.pdf",
                    "format": "pdf",  # Must specify format manually
                    "source": {"bytes": doc_bytes}
                }
            }
        ]
    }
]

# Basic API call with no error handling or retry logic
try:
    response = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",  # Must specify exact model ID
        messages=messages
    )
    content = response['output']['message']['content'][0]['text']
except Exception as e:
    print(f"Error: {e}")  # Limited error information
```
In the example above, you have to load the document yourself and construct the message JSON before passing it to Bedrock's converse API in a single region.
Once the response returns, you have to know exactly where the answer is located (which is simple in this straightforward example); for more complex usage scenarios, you will need correspondingly complex logic to access the right part of the response.
Furthermore, you have to know the exact model ID of the model you want to access.
If you want to access the model directly (as above), you have to find the right model ID in the [AWS documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
For cross-region inference (CRIS), this becomes even more complex, as you have to choose the right inference profile ID for the CRIS profile you want to use, e.g., different CRIS profiles for the model in the US or the EU, listed on the [AWS documentation page](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html#inference-profiles-support-system).
If the request to Bedrock fails, you get a relatively generic Exception object that you have to handle yourself; there is no automated retry with this approach, nor any other way to handle errors gracefully.
The code below covers all of this and more using the LLM Manager:
**✅ With LLMManager and MessageBuilder:**
```python
from bestehorn_llmmanager import LLMManager, create_user_message

# Simple initialization with friendly model names and multi-region support
manager = LLMManager(
    models=["Claude 3 Haiku", "Claude 3 Sonnet"],    # Friendly names, automatic fallback
    regions=["us-east-1", "us-west-2", "eu-west-1"]  # Multi-region with automatic failover
)

# Fluent message building with automatic format detection
message = create_user_message()\
    .add_text("Analyze this document:")\
    .add_local_document("document.pdf")\
    .build()  # Automatically detects PDF format from file extension

# Enhanced API call with comprehensive error handling
response = manager.converse(messages=[message])

if response.success:
    print(response.get_content())
    print(f"Used model: {response.model_used} in region: {response.region_used}")
    print(f"Duration: {response.total_duration_ms}ms")
else:
    print(f"Request failed after {len(response.attempts)} attempts")
    print(f"Last error: {response.get_last_error()}")
```
First and foremost, you simply select the models you want to use via the models parameter, and the LLM Manager will use either CRIS or direct model access (with a preference for CRIS).
You do not have to read documentation pages or look up model IDs, because the LLM Manager has a built-in parsing mechanism for these documentation pages that loads the model IDs and inference profiles automatically.
The list of regions adds flexibility that the default converse API does not offer, namely executing prompts across several regions.
Loading the PDF document happens simply by pointing to the local file, and the content format is detected automatically.
As a response from Bedrock, you always get a BedrockResponse object, which makes it easy to determine whether the request was successful.
Accessing the response text does not require knowing the underlying JSON structure and reduces manual JSON parsing.
### 🏗️ MessageBuilder: Intelligent Multi-Modal Message Construction
The MessageBuilder provides a fluent, type-safe interface for building complex messages with automatic format detection:
#### Basic Text Messages
```python
from bestehorn_llmmanager import create_user_message, create_assistant_message

# Simple text message
message = create_user_message().add_text("Hello, how are you?").build()

# Multi-paragraph text
message = create_user_message()\
    .add_text("First paragraph of my question.")\
    .add_text("Second paragraph with more details.")\
    .build()
```
#### Multi-Modal Messages with Automatic Format Detection
```python
# Combine text, images, and documents in one fluent chain
message = (
    create_user_message()
    .add_text("Please analyze this data visualization and the underlying data:")
    .add_local_image("charts/sales_chart.png")    # Auto-detects PNG format
    .add_local_document("data/sales_data.xlsx")   # Auto-detects Excel format
    .add_text("What trends do you notice and what recommendations do you have?")
    .build()
)

# The MessageBuilder automatically:
# - Detects file formats from extensions and content
# - Validates file sizes and formats
# - Handles file reading and encoding
# - Creates proper AWS Bedrock message structure
```
#### File Format Detection Capabilities
```python
from bestehorn_llmmanager import ImageFormatEnum, DocumentFormatEnum, VideoFormatEnum

# Automatic detection from file extensions
message = (
    create_user_message()
    .add_local_image("photo.jpg")       # Detects JPEG
    .add_local_image("diagram.png")     # Detects PNG
    .add_local_document("report.pdf")   # Detects PDF
    .add_local_document("data.csv")     # Detects CSV
    .build()
)

# Manual format specification when needed
message = create_user_message()\
    .add_image_bytes(image_data, format=ImageFormatEnum.WEBP)\
    .add_document_bytes(doc_data, format=DocumentFormatEnum.DOCX, name="Proposal")\
    .build()

# Supported formats:
# Images: JPEG, PNG, GIF, WEBP
# Documents: PDF, CSV, DOC, DOCX, XLS, XLSX, HTML, TXT, MD
# Videos: MP4, MOV, AVI, WEBM, MKV
```
### 🌊 Streaming Responses
**❌ Native API:** Basic streaming support
```python
# Native API streaming is limited
response = bedrock.converse_stream(...)
for chunk in response['stream']:
    if 'contentBlockDelta' in chunk:
        print(chunk['contentBlockDelta'].get('text', ''), end='')
```
**✅ LLMManager Advantage:** Enhanced streaming with MessageBuilder
```python
# Create streaming request with MessageBuilder
message = create_user_message()\
    .add_text("Write a detailed explanation of machine learning algorithms.")\
    .add_text("Include examples and use cases for each algorithm.")\
    .build()

# Stream response with enhanced error handling
try:
    stream_response = manager.converse_stream(messages=[message])

    print("Streaming response:")
    full_content = ""

    for chunk in stream_response:
        if chunk.get("contentBlockDelta"):
            delta = chunk["contentBlockDelta"]
            if "text" in delta:
                text_chunk = delta["text"]
                print(text_chunk, end="", flush=True)
                full_content += text_chunk

        # Handle tool use in streaming
        elif chunk.get("contentBlockStart"):
            block_start = chunk["contentBlockStart"]
            if "toolUse" in block_start:
                print(f"\n[Tool use started: {block_start['toolUse']['name']}]")

    print(f"\n\nStream completed. Total characters: {len(full_content)}")

except Exception as e:
    print(f"Streaming error: {e}")
    # Automatic fallback to non-streaming if needed
    fallback_response = manager.converse(messages=[message])
    print(f"Fallback response: {fallback_response.get_content()}")
```
### 💾 Caching for Cost Optimization
**❌ Native API:** Manual cache point management
```python
# With native API, manual cache point construction required
messages = [{
    "role": "user",
    "content": [
        {"text": "Analyze these images..."},
        {"image": {"format": "jpeg", "source": {"bytes": image_bytes}}},
        {"cachePoint": {"type": "default"}},  # Manual placement
        {"text": "What do you see?"}
    ]
}]
```
**✅ LLMManager Advantage:** Automatic cache optimization
```python
from bestehorn_llmmanager.bedrock.models.cache_structures import CacheConfig, CacheStrategy

# Enable caching for dramatic cost reduction
cache_config = CacheConfig(
    enabled=True,
    strategy=CacheStrategy.CONSERVATIVE  # Automatic optimization
)

manager = LLMManager(
    models=["Claude 3 Haiku"],
    regions=["us-east-1"],
    cache_config=cache_config
)

# Build message with automatic cache optimization
message = (
    create_user_message()
    .add_text("Analyze these architectural images...")  # Cached after first use
    .add_local_image("building1.jpg")                   # Cached
    .add_local_image("building2.jpg")                   # Cached
    .add_cache_point()                                   # Optional explicit control
    .add_text("Focus on the Gothic elements")            # Unique per request
    .build()
)

# First request: writes to cache
response1 = manager.converse(messages=[message])

# Subsequent requests: reads from cache (90% cost reduction!)
for i in range(9):
    unique_message = (
        create_user_message()
        .add_text("Analyze these architectural images...")
        .add_local_image("building1.jpg")
        .add_local_image("building2.jpg")
        .add_cache_point()
        .add_text(f"Focus on aspect {i+2}")  # Different question each time
        .build()
    )

    response = manager.converse(messages=[unique_message])
    print(f"Cache efficiency: {response.get_cache_efficiency()}")
```
### ⚡ Parallel Processing with MessageBuilder
**❌ Native API:** Sequential processing only
```python
# With native API, you must process requests one by one
results = []
for question in questions:
    response = bedrock.converse(modelId="...", messages=[{"role": "user", "content": [{"text": question}]}])
    results.append(response)  # Slow, sequential processing
```
**✅ LLMManager Advantage:** Concurrent multi-region processing
```python
from bestehorn_llmmanager import ParallelLLMManager, create_user_message
from bestehorn_llmmanager.bedrock.models.parallel_structures import BedrockConverseRequest

# Initialize parallel manager with multiple regions for high availability
parallel_manager = ParallelLLMManager(
    models=["Claude 3 Haiku", "Claude 3 Sonnet"],
    regions=["us-east-1", "us-west-2", "eu-west-1"]
)

# Create multiple requests using MessageBuilder
questions = ["What is AI?", "Explain machine learning", "How does neural network training work?"]

requests = []
for i, question in enumerate(questions):
    message = create_user_message().add_text(question).build()
    requests.append(BedrockConverseRequest(
        request_id=f"question-{i}",
        messages=[message]
    ))

# Execute all requests in parallel across multiple regions
parallel_response = parallel_manager.converse_parallel(
    requests=requests,
    target_regions_per_request=2  # Use 2 regions per request for redundancy
)

# Get comprehensive results
print(f"Success rate: {parallel_response.get_success_rate():.1%}")
print(f"Total duration: {parallel_response.total_duration_ms}ms")
print(f"Average per request: {parallel_response.parallel_execution_stats.average_request_duration_ms:.1f}ms")

# Access individual results
for request_id, response in parallel_response.get_successful_responses().items():
    print(f"{request_id}: {response.get_content()}")
    print(f"  Model: {response.model_used}, Region: {response.region_used}")
```
### 🔄 Advanced MessageBuilder Patterns
#### Conversation Context Management
```python
from bestehorn_llmmanager import create_user_message, create_assistant_message

# Build a multi-turn conversation with context
conversation = []

# Initial user message with image
user_msg1 = create_user_message()\
    .add_text("What's in this image?")\
    .add_local_image("photo.jpg")\
    .build()
conversation.append(user_msg1)

# Simulate assistant response (or use actual response)
assistant_msg1 = create_assistant_message()\
    .add_text("I can see a beautiful landscape with mountains and a lake.")\
    .build()
conversation.append(assistant_msg1)

# Follow-up question maintaining context
user_msg2 = create_user_message()\
    .add_text("What time of day do you think this photo was taken? Please be specific about the lighting conditions.")\
    .build()
conversation.append(user_msg2)

# Process the entire conversation
response = manager.converse(messages=conversation)
```
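In a real application, the simulated assistant turn can be replaced with the model's actual reply. A minimal sketch reusing the calls shown above:

```python
# Ask the first question on its own, then feed the real answer back into the history.
first_response = manager.converse(messages=[user_msg1])

assistant_msg1 = create_assistant_message()\
    .add_text(first_response.get_content())\
    .build()

conversation = [user_msg1, assistant_msg1, user_msg2]
response = manager.converse(messages=conversation)
```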
#### Batch Document Processing
```python
import os
from pathlib import Path

# Process multiple documents with detailed analysis
documents_dir = Path("documents")
document_files = list(documents_dir.glob("*.pdf"))

# Create parallel requests for document analysis
requests = []
for doc_file in document_files:
    message = create_user_message()\
        .add_text("Please provide a comprehensive analysis of this document, including:")\
        .add_text("1. Main topics and themes")\
        .add_text("2. Key findings or conclusions")\
        .add_text("3. Important data or statistics mentioned")\
        .add_text("4. Any recommendations or action items")\
        .add_local_document(str(doc_file), name=doc_file.stem)\
        .build()

    requests.append(BedrockConverseRequest(
        request_id=f"doc-analysis-{doc_file.stem}",
        messages=[message]
    ))

# Process all documents in parallel
parallel_response = parallel_manager.converse_parallel(requests=requests)

# Generate summary report
for request_id, response in parallel_response.get_successful_responses().items():
    doc_name = request_id.replace("doc-analysis-", "")
    print(f"\n=== Analysis of {doc_name} ===")
    print(response.get_content())
    print(f"Processing time: {response.total_duration_ms}ms")
```
#### Error Handling and Validation
```python
from bestehorn_llmmanager.bedrock.exceptions import RequestValidationError, LLMManagerError

try:
    # Build message with potential validation issues
    message = (
        create_user_message()
        .add_text("Analyze this large file:")
        .add_local_document("very_large_file.pdf", max_size_mb=10.0)  # Increased limit
        .build()
    )

    response = manager.converse(messages=[message])

    if response.success:
        print(f"Analysis complete: {response.get_content()[:200]}...")

        # Check for warnings (non-fatal issues)
        warnings = response.get_warnings()
        if warnings:
            print(f"Warnings: {warnings}")

    else:
        # Detailed error analysis
        print(f"Request failed after {len(response.attempts)} attempts")
        print(f"Models tried: {[attempt.model_used for attempt in response.attempts]}")
        print(f"Regions tried: {[attempt.region_used for attempt in response.attempts]}")
        print(f"Final error: {response.get_last_error()}")

except RequestValidationError as e:
    print(f"Message validation failed: {e}")
    if hasattr(e, 'validation_errors'):
        for error in e.validation_errors:
            print(f"  - {error}")

except FileNotFoundError as e:
    print(f"File not found: {e}")

except LLMManagerError as e:
    print(f"LLM Manager error: {e}")
```
### With Authentication Configuration
```python
from bestehorn_llmmanager import LLMManager
from bestehorn_llmmanager.bedrock.models.llm_manager_structures import (
    AuthConfig, AuthenticationType
)

# Configure authentication
auth_config = AuthConfig(
    auth_type=AuthenticationType.PROFILE,
    profile_name="my-aws-profile"
)

manager = LLMManager(
    models=["Claude 3 Haiku"],
    regions=["us-east-1"],
    auth_config=auth_config
)
```
## Requirements
- Python 3.8+
- AWS credentials configured (AWS CLI, environment variables, or IAM roles)
- Internet access for initial model data download
### Dependencies
- `boto3>=1.28.0` - AWS SDK
- `beautifulsoup4>=4.12.0` - HTML parsing
- `requests>=2.31.0` - HTTP requests
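These are installed automatically with the package. If you want to pin them explicitly, e.g., in a constrained environment, the version bounds above translate directly into a pip command:

```bash
pip install "boto3>=1.28.0" "beautifulsoup4>=4.12.0" "requests>=2.31.0"
```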
## Configuration
### AWS Credentials
The library supports multiple authentication methods:
1. **AWS Profiles**: Use named profiles from `~/.aws/credentials`
2. **Environment Variables**: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` (see the example below)
3. **IAM Roles**: For EC2 instances or Lambda functions
4. **Default Credential Chain**: Standard AWS credential resolution
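For the environment-variable method, a minimal sketch with placeholder values (these are the standard AWS variable names, not specific to this library):

```bash
export AWS_ACCESS_KEY_ID="AKIA..."    # placeholder value
export AWS_SECRET_ACCESS_KEY="..."    # placeholder value
# With these set, LLMManager can be constructed without an explicit auth_config;
# credentials are then resolved through the default credential chain.
```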
### Model Data
The library automatically downloads and caches AWS Bedrock model information on first use. This requires internet connectivity initially but uses cached data for subsequent runs.
## 🔧 Advanced Usage
### 🔄 Custom Retry Configuration
**❌ Native API:** No automatic retry logic
```python
# With native API, you must implement your own retry logic
import time
import random

def retry_with_backoff(func, max_retries=3):
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as e:
            if attempt == max_retries - 1:
                raise e
            time.sleep(2 ** attempt + random.uniform(0, 1))
```
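Even with such a helper, every Bedrock call still has to be wrapped manually, and region or model failover remains your responsibility. A sketch of how the helper above would be used with the earlier native-API example:

```python
# Wrap the native converse call from the earlier example in the manual retry helper.
response = retry_with_backoff(
    lambda: bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        messages=messages
    )
)
```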
**✅ LLMManager Simplification:** Built-in intelligent retry with multiple strategies
```python
from bestehorn_llmmanager.bedrock.models.llm_manager_structures import (
    RetryConfig, RetryStrategy
)

# Configure sophisticated retry behavior
retry_config = RetryConfig(
    max_retries=5,                              # Maximum retry attempts
    retry_delay=1.0,                            # Initial delay (seconds)
    backoff_multiplier=2.0,                     # Exponential backoff
    max_retry_delay=60.0,                       # Maximum delay cap
    retry_strategy=RetryStrategy.REGION_FIRST,  # Try different regions first
    enable_feature_fallback=True                # Disable features if incompatible
)

manager = LLMManager(
    models=["Claude 3 Haiku", "Claude 3 Sonnet"],
    regions=["us-east-1", "us-west-2", "eu-west-1"],
    retry_config=retry_config
)

# The manager automatically handles:
# - Exponential backoff between retries
# - Region failover for high availability
# - Model fallback if primary model fails
# - Feature degradation for compatibility issues
# - Detailed retry statistics and logging
```
### 🛡️ Response Validation
**❌ Native API:** No response validation capabilities
```python
# With native API, manual validation is required
response = bedrock.converse(...)
content = response['output']['message']['content'][0]['text']

# Manual validation logic
if "inappropriate" in content.lower():
    # Handle inappropriate content manually
    pass
```
LLMs are inherently probabilistic in their responses, i.e., even with exactly the same prompt across two calls, the responses will differ.
When you need a specific format, e.g., a well-formed JSON response for downstream processing, there is still a chance of receiving a malformed response even if the prompt explicitly specifies the exact JSON format.
In such cases, it is often advisable to simply execute another request to obtain a compliant response from Bedrock.
With the default converse API, you have to handle this in your application code.
The LLM Manager handles this more elegantly: you point it to a validation function, and it retries automatically if the validation function determines that the response does not conform to the expected format.
**✅ LLMManager Advantage:** Comprehensive response validation system
```python
from bestehorn_llmmanager.bedrock.models.llm_manager_structures import (
    ResponseValidationConfig, ValidationResult
)

def custom_content_validator(response):
    """Custom validation function for response content."""
    content = response.get_content()

    # Check for empty responses
    if not content or len(content.strip()) < 10:
        return ValidationResult(
            success=False,
            error_message="Response too short or empty"
        )

    # Check for potentially harmful content
    harmful_keywords = ["violence", "illegal", "inappropriate"]
    if any(keyword in content.lower() for keyword in harmful_keywords):
        return ValidationResult(
            success=False,
            error_message="Response contains potentially harmful content",
            error_details={"flagged_content": content[:100]}
        )

    # Check for factual consistency (example)
    if "I don't know" in content and len(content) < 50:
        return ValidationResult(
            success=False,
            error_message="Response appears incomplete"
        )

    return ValidationResult(success=True)

# Configure validation
validation_config = ResponseValidationConfig(
    response_validation_function=custom_content_validator,
    response_validation_retries=3,  # Retry validation failures
    response_validation_delay=0.5   # Delay between validation retries
)

message = create_user_message().add_text("Tell me about AI safety").build()
response = manager.converse(
    messages=[message],
    response_validation_config=validation_config
)

# Check validation results
if response.had_validation_failures():
    print("Validation issues detected:")
    for error in response.get_validation_errors():
        print(f"  - {error['error_message']}")
```
In the example above, the validation config contains a custom validation function that checks the response content for several issues: response length, disallowed keywords, and completeness.
This function is passed as a parameter and invoked before the LLM Manager's converse() call returns.
In case of validation errors, the LLM Manager will try to obtain a valid response before returning.
If you need a JSON response, the same mechanism can be used to validate that the returned content parses as JSON and to let the LLM Manager retry until a compliant response is obtained.
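A minimal sketch of such a JSON-format validator, reusing the `ValidationResult` and `ResponseValidationConfig` classes from the example above (the validator itself is illustrative and not part of the library):

```python
import json

def json_format_validator(response):
    """Illustrative validator: accept only responses that parse as JSON."""
    content = response.get_content()
    try:
        json.loads(content)
    except (json.JSONDecodeError, TypeError):
        return ValidationResult(
            success=False,
            error_message="Response is not valid JSON"
        )
    return ValidationResult(success=True)

# The LLM Manager retries until the validator accepts the response
# or the configured validation retries are exhausted.
json_validation_config = ResponseValidationConfig(
    response_validation_function=json_format_validator,
    response_validation_retries=3,
    response_validation_delay=0.5
)
```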
Raw data
{
"_id": null,
"home_page": null,
"name": "bestehorn-llmmanager",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.8",
"maintainer_email": "Markus Bestehorn <markus.bestehorn@googlemail.com>",
"keywords": "aws, bedrock, llm, ai, machine-learning, amazon, claude, anthropic",
"author": null,
"author_email": "Markus Bestehorn <markus.bestehorn@googlemail.com>",
"download_url": "https://files.pythonhosted.org/packages/91/3e/66afefba5cd728e37fad71cfa477a003043b8f1adb3c05801994d3fca80b/bestehorn_llmmanager-0.2.1.tar.gz",
"platform": null,
"description": "# LLMManager by Markus Bestehorn\n\n[](https://pypi.org/project/bestehorn-llmmanager/)\n[](https://pypi.org/project/bestehorn-llmmanager/)\n[](https://github.com/Bestehorn/LLMManager/actions/workflows/ci.yml)\n[](https://codecov.io/gh/Bestehorn/LLMManager)\n[](https://opensource.org/licenses/MIT)\n[](https://github.com/psf/black)\n\n**A comprehensive Python library that simplifies and enhances AWS Bedrock Converse API interactions with intelligent message building, automatic file format detection, multi-modal support, streaming capabilities, caching, and advanced reliability features.**\n\n## \ud83d\ude80 Quick Start\n\n### Prerequisites\n- Python 3.8+\n- AWS credentials configured (AWS CLI, environment variables, or IAM roles)\n\n### Installation & Basic Usage\n\n```python\n# Install the package\npip install bestehorn-llmmanager\n\n# Simple example - get started in 5 lines\nfrom bestehorn_llmmanager import LLMManager, create_user_message\n\nmanager = LLMManager(models=[\"Claude 3 Haiku\"], regions=[\"us-east-1\"])\nmessage = create_user_message().add_text(\"Explain quantum computing\").build()\nresponse = manager.converse(messages=[message])\nprint(response.get_content())\n```\n\n## \ud83c\udfaf Key Simplifications Over Native AWS Bedrock Converse API\n\nThis library was built to simplify the [standard AWS Bedrock Converse API](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html). \nWhile some of these simplifications come at a loss of flexibility, for most use cases, the code becomes cleaner, less prone to errors and easier to understand when it is based on the abstractions provided in this package: \n\n### \ud83c\udfd7\ufe0f **Fluent MessageBuilder with Automatic Format Detection**\n- **Native API**: Manual message dictionary construction with complex nested structures resulting in complex JSON\n- **LLMManager**: Intuitive fluent interface with automatic file type detection and validation that hides the intricacies of building message JSON for Bedrock's converse API.\n\n### \ud83d\udd04 **Intelligent Multi-Region Failover**\n- **Native API**: Single region, manual error handling required, i.e., you have to handle different types of errors in your code and implement retry logic.\n- **LLMManager**: Automatic failover across multiple AWS regions with configurable retry strategies, e.g., if you request fails in one AWS region, it is moved to another region.\n\n### \u26a1 **Built-in Parallel Processing**\n- **Native API**: Sequential processing only\n- **LLMManager**: Concurrent execution across multiple models and regions; this is particularly important for any kind of batch processing to reduce inference time.\n\n### \ud83d\udee1\ufe0f **Enhanced Error Handling & Reliability**\n- **Native API**: Basic error responses, no automatic retries\n- **LLMManager**: Comprehensive error handling, exponential backoff, and feature fallback\n\n### \ud83d\udcca **Rich Response Management**\n- **Native API**: Raw response dictionaries, i.e., you have to parse and understand the JSON structure from Bedrock.\n- **LLMManager**: Structured response objects with metadata, utilities, and validation that provide data access through typed functions.\n\n### \ud83c\udf0a **Real-Time Streaming Support**\n- **Native API**: Basic streaming with manual chunk handling\n- **LLMManager**: Enhanced streaming with automatic retry, recovery, and real-time display utilities\n\n### \ud83d\udcbe **Intelligent Caching**\n- **Native API**: Manual cache 
point management required\n- **LLMManager**: Automatic cache optimization with configurable strategies for cost reduction\n\nI have implemented this set of functions over time while working on projects and demos using Amazon Bedrock.\nOne of the key use cases that started the implementation of this project has been: I have a prompt and I do not care where the prompt gets executed, I just want to make sure I get a response and I do not want to handle any temporal issues such as malformed responses from LLMs or throttling.\nIt helped me to avoid common mistakes and sources for errors when using the [standard AWS Bedrock Converse API](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/converse.html).\n\n\u26a0\ufe0f**Important**: The code in this repository comes with no guarentees or liabilities. It is not intended for production use, but rather as an inspiration for implementing your own logic for the Bedrock converse API in production environments.\n\nQuick note on the package name: As there are other packages (public & private) that are named \"LLMManager\", I have chosen to prefix the name of this package with my last name as I am sure there are no other packages with this name.\nIf this is an issue, feel free to contact me and propose another name.\n\n## \u2728 Features\n\n### \ud83c\udfd7\ufe0f **Fluent MessageBuilder**\n- **Intuitive API**: Chain methods to build complex multi-modal messages\n- **Automatic Format Detection**: Intelligent file type detection from content and filenames\n- **Multi-Modal Support**: Seamlessly combine text, images, documents, and videos\n- **Built-in Validation**: Automatic content validation and size limit enforcement\n- **Type Safety**: Comprehensive enums for formats, roles, and content types\n\n### \ud83d\udd04 **Advanced Reliability**\n- **Multi-Model Support**: Work with multiple LLM models simultaneously with automatic fallback\n- **Multi-Region Failover**: Automatic failover across AWS regions with intelligent routing\n- **Intelligent Retry Logic**: Exponential backoff with configurable retry strategies\n- **Feature Fallback**: Graceful degradation when advanced features aren't supported\n\n### \u26a1 **Performance & Scalability**\n- **Parallel Processing**: Execute multiple requests concurrently across regions\n- **Load Balancing**: Intelligent distribution across available resources\n- **Connection Pooling**: Efficient resource management for high-throughput scenarios\n- **Streaming Support**: Real-time response streaming for long-form content\n\n### \ud83c\udf0a **Streaming Capabilities**\n- **Real-Time Output**: See responses as they're generated with `converse_stream()`\n- **Automatic Recovery**: Stream interruption detection and intelligent retry\n- **Rich Metadata**: Streaming performance metrics and token usage tracking\n- **Display Utilities**: Built-in utilities for streaming visualization\n\n### \ud83d\udcbe **Caching Support**\n- **Automatic Optimization**: Intelligent cache point placement based on content patterns\n- **Cost Reduction**: Up to 90% reduction in token costs for repetitive prompts\n- **Simple Configuration**: Enable with `CacheConfig(enabled=True)`\n- **Flexible Strategies**: Conservative, Aggressive, or Custom caching approaches\n- **MessageBuilder Integration**: Seamless cache control with `add_cache_point()`\n\n### \ud83d\udee1\ufe0f **Security & Authentication**\n- **Flexible Authentication**: Support for AWS profiles, credentials, IAM roles, and auto-detection\n- 
**Response Validation**: Optional content validation with custom validation functions\n- **Guardrail Integration**: Full support for AWS Bedrock guardrails\n- **Secure File Handling**: Safe processing of uploaded files with size and format validation\n\n### \ud83d\udcca **Comprehensive Monitoring**\n- **Rich Response Objects**: Detailed response metadata with performance metrics\n- **Execution Statistics**: Request timing, token usage, and success rates\n- **Error Tracking**: Comprehensive error logging with retry attempt details\n- **Validation Reporting**: Detailed validation results and failure analysis\n\n## Installation\n\n### From PyPI (Recommended)\n\n```bash\npip install bestehorn-llmmanager\n```\n\n### From Source (Development)\n\nFor development or integration into other projects:\n\n```bash\ngit clone https://github.com/Bestehorn/LLMManager.git\ncd LLMManager\npip install -e .\n```\n\n### With Development Dependencies\n\n```bash\npip install -e .[dev]\n```\n\n## \ud83d\ude80 Quick Start Examples\n\n### MessageBuilder vs. Native API Comparison\n\n**\u274c With Native AWS Bedrock Converse API:**\n```python\nimport boto3\nimport base64\n\n# Complex manual message construction\nbedrock = boto3.client('bedrock-runtime', region_name='us-east-1')\n\n# Manual dictionary construction - error-prone and verbose\nwith open(\"document.pdf\", \"rb\") as f:\n doc_bytes = f.read()\n\nmessages = [\n {\n \"role\": \"user\",\n \"content\": [\n {\"text\": \"Analyze this document:\"},\n {\n \"document\": {\n \"name\": \"document.pdf\",\n \"format\": \"pdf\", # Must specify format manually\n \"source\": {\"bytes\": doc_bytes}\n }\n }\n ]\n }\n]\n\n# Basic API call with no error handling or retry logic\ntry:\n response = bedrock.converse(\n modelId=\"anthropic.claude-3-haiku-20240307-v1:0\", # Must specify exact model ID\n messages=messages\n )\n content = response['output']['message']['content'][0]['text']\nexcept Exception as e:\n print(f\"Error: {e}\") # Limited error information\n```\n\nIn this example above, you have to load the document and then construct the JSON for the message to get it passed to Bedrock's converse API in a single region.\nOnce the response returns, you have to spcifically know where the answer is (which is simple for this straightforward example) and for complex usage scenarios, you will need complex logic to access the right response.\nFurthermore, you have to specifically know the model ID of the model which you want to access.\nIf you want to access the model directly (as above), you will have to find the right model in the [AWS documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).\nFor cross-region inference (CRIS), this becomes even more complex as you have to choose the right inference/model ID for the CRIS profile you want to use, e.g., different CRIS profiles for the model in the US or EU from the [AWS documentation page](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html#inference-profiles-support-system).\nIf the request to Bedrock fails, you get a relatively generic Exception object which you have to handle and there is no automated retry with this approach or other ways to handle errors gracefully.\nThe code below covers all of this and more using the LLM Manager:\n\n**\u2705 With LLMManager and MessageBuilder:**\n```python\nfrom bestehorn_llmmanager import LLMManager, create_user_message\n\n# Simple initialization with friendly model names and multi-region support\nmanager = LLMManager(\n 
models=[\"Claude 3 Haiku\", \"Claude 3 Sonnet\"], # Friendly names, automatic fallback\n regions=[\"us-east-1\", \"us-west-2\", \"eu-west-1\"] # Multi-region with automatic failover\n)\n\n# Fluent message building with automatic format detection\nmessage = create_user_message()\\\n .add_text(\"Analyze this document:\")\\\n .add_local_document(\"document.pdf\")\\\n .build() # Automatically detects PDF format from file extension\n\n# Enhanced API call with comprehensive error handling\nresponse = manager.converse(messages=[message])\n\nif response.success:\n print(response.get_content())\n print(f\"Used model: {response.model_used} in region: {response.region_used}\")\n print(f\"Duration: {response.total_duration_ms}ms\")\nelse:\n print(f\"Request failed after {len(response.attempts)} attempts\")\n print(f\"Last error: {response.get_last_error()}\")\n```\n\nFirst an foremost, you simply select the models you want to use with the models parameter and the LLM Manager will use either CRIS or direct access (with preference to CRIS).\nYou do not have to read documentation pages or find model IDs to use model as the LLM Manager has a built-in parsing mechanism for these documentation pages to load the model IDs and inference profiles automatically.\nThe list of regions gives flexibility and the default converse API does not give the option to execute prompts in various regions. \nThe loading of the PDF document happens simply by pointing to the local file, the detection of the content will also happen automatically.\nAs a response from Bedrock, you always get a BedrockResponse object which allows you to easily determine if the request was sucessful or not.\nThe access to the response's text does not require knowing the JSON structure and reduces parsing of JSON. 
\n\n### \ud83c\udfd7\ufe0f MessageBuilder: Intelligent Multi-Modal Message Construction\n\nThe MessageBuilder provides a fluent, type-safe interface for building complex messages with automatic format detection:\n\n#### Basic Text Messages\n```python\nfrom bestehorn_llmmanager import create_user_message, create_assistant_message\n\n# Simple text message\nmessage = create_user_message().add_text(\"Hello, how are you?\").build()\n\n# Multi-paragraph text\nmessage = create_user_message()\\\n .add_text(\"First paragraph of my question.\")\\\n .add_text(\"Second paragraph with more details.\")\\\n .build()\n```\n\n#### Multi-Modal Messages with Automatic Format Detection\n```python\n# Combine text, images, and documents in one fluent chain\nmessage = create_user_message()\\\n .add_text(\"Please analyze this data visualization and the underlying data:\")\\\n .add_local_image(\"charts/sales_chart.png\") # Auto-detects PNG format\\\n .add_local_document(\"data/sales_data.xlsx\") # Auto-detects Excel format\\\n .add_text(\"What trends do you notice and what recommendations do you have?\")\\\n .build()\n\n# The MessageBuilder automatically:\n# - Detects file formats from extensions and content\n# - Validates file sizes and formats\n# - Handles file reading and encoding\n# - Creates proper AWS Bedrock message structure\n```\n\n#### File Format Detection Capabilities\n```python\nfrom bestehorn_llmmanager import ImageFormatEnum, DocumentFormatEnum, VideoFormatEnum\n\n# Automatic detection from file extensions\nmessage = create_user_message()\\\n .add_local_image(\"photo.jpg\") # Detects JPEG\\\n .add_local_image(\"diagram.png\") # Detects PNG\\\n .add_local_document(\"report.pdf\") # Detects PDF\\\n .add_local_document(\"data.csv\") # Detects CSV\\\n .build()\n\n# Manual format specification when needed\nmessage = create_user_message()\\\n .add_image_bytes(image_data, format=ImageFormatEnum.WEBP)\\\n .add_document_bytes(doc_data, format=DocumentFormatEnum.DOCX, name=\"Proposal\")\\\n .build()\n\n# Supported formats:\n# Images: JPEG, PNG, GIF, WEBP\n# Documents: PDF, CSV, DOC, DOCX, XLS, XLSX, HTML, TXT, MD \n# Videos: MP4, MOV, AVI, WEBM, MKV\n```\n\n### \ud83c\udf0a Streaming Responses\n\n**\u274c Native API:** Basic streaming support\n```python\n# Native API streaming is limited\nresponse = bedrock.converse_stream(...)\nfor chunk in response['stream']:\n if 'contentBlockDelta' in chunk:\n print(chunk['contentBlockDelta'].get('text', ''), end='')\n```\n\n**\u2705 LLMManager Advantage:** Enhanced streaming with MessageBuilder\n```python\n# Create streaming request with MessageBuilder\nmessage = create_user_message()\\\n .add_text(\"Write a detailed explanation of machine learning algorithms.\")\\\n .add_text(\"Include examples and use cases for each algorithm.\")\\\n .build()\n\n# Stream response with enhanced error handling\ntry:\n stream_response = manager.converse_stream(messages=[message])\n \n print(\"Streaming response:\")\n full_content = \"\"\n \n for chunk in stream_response:\n if chunk.get(\"contentBlockDelta\"):\n delta = chunk[\"contentBlockDelta\"]\n if \"text\" in delta:\n text_chunk = delta[\"text\"]\n print(text_chunk, end=\"\", flush=True)\n full_content += text_chunk\n \n # Handle tool use in streaming\n elif chunk.get(\"contentBlockStart\"):\n block_start = chunk[\"contentBlockStart\"]\n if \"toolUse\" in block_start:\n print(f\"\\n[Tool use started: {block_start['toolUse']['name']}]\")\n \n print(f\"\\n\\nStream completed. 
Total characters: {len(full_content)}\")\n \nexcept Exception as e:\n print(f\"Streaming error: {e}\")\n # Automatic fallback to non-streaming if needed\n fallback_response = manager.converse(messages=[message])\n print(f\"Fallback response: {fallback_response.get_content()}\")\n```\n\n### \ud83d\udcbe Caching for Cost Optimization\n\n**\u274c Native API:** Manual cache point management\n```python\n# With native API, manual cache point construction required\nmessages = [{\n \"role\": \"user\",\n \"content\": [\n {\"text\": \"Analyze these images...\"},\n {\"image\": {\"format\": \"jpeg\", \"source\": {\"bytes\": image_bytes}}},\n {\"cachePoint\": {\"type\": \"default\"}}, # Manual placement\n {\"text\": \"What do you see?\"}\n ]\n}]\n```\n\n**\u2705 LLMManager Advantage:** Automatic cache optimization\n```python\nfrom bestehorn_llmmanager.bedrock.models.cache_structures import CacheConfig, CacheStrategy\n\n# Enable caching for dramatic cost reduction\ncache_config = CacheConfig(\n enabled=True,\n strategy=CacheStrategy.CONSERVATIVE # Automatic optimization\n)\n\nmanager = LLMManager(\n models=[\"Claude 3 Haiku\"],\n regions=[\"us-east-1\"],\n cache_config=cache_config\n)\n\n# Build message with automatic cache optimization\nmessage = create_user_message()\\\n .add_text(\"Analyze these architectural images...\") # Cached after first use\\\n .add_local_image(\"building1.jpg\") # Cached\\\n .add_local_image(\"building2.jpg\") # Cached\\\n .add_cache_point() # Optional explicit control\\\n .add_text(\"Focus on the Gothic elements\") # Unique per request\\\n .build()\n\n# First request: writes to cache\nresponse1 = manager.converse(messages=[message])\n\n# Subsequent requests: reads from cache (90% cost reduction!)\nfor i in range(9):\n unique_message = create_user_message()\\\n .add_text(\"Analyze these architectural images...\")\\\n .add_local_image(\"building1.jpg\")\\\n .add_local_image(\"building2.jpg\")\\\n .add_cache_point()\\\n .add_text(f\"Focus on aspect {i+2}\") # Different question each time\\\n .build()\n \n response = manager.converse(messages=[unique_message])\n print(f\"Cache efficiency: {response.get_cache_efficiency()}\")\n```\n\n### \u26a1 Parallel Processing with MessageBuilder\n\n**\u274c Native API:** Sequential processing only\n```python\n# With native API, you must process requests one by one\nresults = []\nfor question in questions:\n response = bedrock.converse(modelId=\"...\", messages=[{\"role\": \"user\", \"content\": [{\"text\": question}]}])\n results.append(response) # Slow, sequential processing\n```\n\n**\u2705 LLMManager Advantage:** Concurrent multi-region processing\n```python\nfrom bestehorn_llmmanager import ParallelLLMManager, create_user_message\nfrom bestehorn_llmmanager.bedrock.models.parallel_structures import BedrockConverseRequest\n\n# Initialize parallel manager with multiple regions for high availability\nparallel_manager = ParallelLLMManager(\n models=[\"Claude 3 Haiku\", \"Claude 3 Sonnet\"],\n regions=[\"us-east-1\", \"us-west-2\", \"eu-west-1\"]\n)\n\n# Create multiple requests using MessageBuilder\nquestions = [\"What is AI?\", \"Explain machine learning\", \"How does neural network training work?\"]\n\nrequests = []\nfor i, question in enumerate(questions):\n message = create_user_message().add_text(question).build()\n requests.append(BedrockConverseRequest(\n request_id=f\"question-{i}\",\n messages=[message]\n ))\n\n# Execute all requests in parallel across multiple regions\nparallel_response = parallel_manager.converse_parallel(\n 
requests=requests,\n target_regions_per_request=2 # Use 2 regions per request for redundancy\n)\n\n# Get comprehensive results\nprint(f\"Success rate: {parallel_response.get_success_rate():.1%}\")\nprint(f\"Total duration: {parallel_response.total_duration_ms}ms\")\nprint(f\"Average per request: {parallel_response.parallel_execution_stats.average_request_duration_ms:.1f}ms\")\n\n# Access individual results\nfor request_id, response in parallel_response.get_successful_responses().items():\n print(f\"{request_id}: {response.get_content()}\")\n print(f\" Model: {response.model_used}, Region: {response.region_used}\")\n```\n\n### \ud83d\udd04 Advanced MessageBuilder Patterns\n\n#### Conversation Context Management\n```python\nfrom bestehorn_llmmanager import create_user_message, create_assistant_message\n\n# Build a multi-turn conversation with context\nconversation = []\n\n# Initial user message with image\nuser_msg1 = create_user_message()\\\n .add_text(\"What's in this image?\")\\\n .add_local_image(\"photo.jpg\")\\\n .build()\nconversation.append(user_msg1)\n\n# Simulate assistant response (or use actual response)\nassistant_msg1 = create_assistant_message()\\\n .add_text(\"I can see a beautiful landscape with mountains and a lake.\")\\\n .build()\nconversation.append(assistant_msg1)\n\n# Follow-up question maintaining context\nuser_msg2 = create_user_message()\\\n .add_text(\"What time of day do you think this photo was taken? Please be specific about the lighting conditions.\")\\\n .build()\nconversation.append(user_msg2)\n\n# Process the entire conversation\nresponse = manager.converse(messages=conversation)\n```\n\n#### Batch Document Processing\n```python\nimport os\nfrom pathlib import Path\n\n# Process multiple documents with detailed analysis\ndocuments_dir = Path(\"documents\")\ndocument_files = list(documents_dir.glob(\"*.pdf\"))\n\n# Create parallel requests for document analysis\nrequests = []\nfor doc_file in document_files:\n message = create_user_message()\\\n .add_text(f\"Please provide a comprehensive analysis of this document, including:\")\\\n .add_text(\"1. Main topics and themes\")\\\n .add_text(\"2. Key findings or conclusions\")\\\n .add_text(\"3. Important data or statistics mentioned\")\\\n .add_text(\"4. 
Any recommendations or action items\")\\\n .add_local_document(str(doc_file), name=doc_file.stem)\\\n .build()\n \n requests.append(BedrockConverseRequest(\n request_id=f\"doc-analysis-{doc_file.stem}\",\n messages=[message]\n ))\n\n# Process all documents in parallel\nparallel_response = parallel_manager.converse_parallel(requests=requests)\n\n# Generate summary report\nfor request_id, response in parallel_response.get_successful_responses().items():\n doc_name = request_id.replace(\"doc-analysis-\", \"\")\n print(f\"\\n=== Analysis of {doc_name} ===\")\n print(response.get_content())\n print(f\"Processing time: {response.total_duration_ms}ms\")\n```\n\n#### Error Handling and Validation\n```python\nfrom bestehorn_llmmanager.bedrock.exceptions import RequestValidationError, LLMManagerError\n\ntry:\n # Build message with potential validation issues\n message = create_user_message()\\\n .add_text(\"Analyze this large file:\")\\\n .add_local_document(\"very_large_file.pdf\", max_size_mb=10.0) # Increased limit\\\n .build()\n \n response = manager.converse(messages=[message])\n \n if response.success:\n print(f\"Analysis complete: {response.get_content()[:200]}...\")\n \n # Check for warnings (non-fatal issues)\n warnings = response.get_warnings()\n if warnings:\n print(f\"Warnings: {warnings}\")\n \n else:\n # Detailed error analysis\n print(f\"Request failed after {len(response.attempts)} attempts\")\n print(f\"Models tried: {[attempt.model_used for attempt in response.attempts]}\")\n print(f\"Regions tried: {[attempt.region_used for attempt in response.attempts]}\")\n print(f\"Final error: {response.get_last_error()}\")\n\nexcept RequestValidationError as e:\n print(f\"Message validation failed: {e}\")\n if hasattr(e, 'validation_errors'):\n for error in e.validation_errors:\n print(f\" - {error}\")\n \nexcept FileNotFoundError as e:\n print(f\"File not found: {e}\")\n \nexcept LLMManagerError as e:\n print(f\"LLM Manager error: {e}\")\n```\n\n### With Authentication Configuration\n\n```python\nfrom bestehorn_llmmanager import LLMManager\nfrom bestehorn_llmmanager.bedrock.models.llm_manager_structures import (\n AuthConfig, AuthenticationType\n)\n\n# Configure authentication\nauth_config = AuthConfig(\n auth_type=AuthenticationType.PROFILE,\n profile_name=\"my-aws-profile\"\n)\n\nmanager = LLMManager(\n models=[\"Claude 3 Haiku\"],\n regions=[\"us-east-1\"],\n auth_config=auth_config\n)\n```\n\n## Requirements\n\n- Python 3.8+\n- AWS credentials configured (AWS CLI, environment variables, or IAM roles)\n- Internet access for initial model data download\n\n### Dependencies\n\n- `boto3>=1.28.0` - AWS SDK\n- `beautifulsoup4>=4.12.0` - HTML parsing\n- `requests>=2.31.0` - HTTP requests\n\n## Configuration\n\n### AWS Credentials\n\nThe library supports multiple authentication methods:\n\n1. **AWS Profiles**: Use named profiles from `~/.aws/credentials`\n2. **Environment Variables**: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`\n3. **IAM Roles**: For EC2 instances or Lambda functions\n4. **Default Credential Chain**: Standard AWS credential resolution\n\n### Model Data\n\nThe library automatically downloads and caches AWS Bedrock model information on first use. 
This requires internet connectivity initially but uses cached data for subsequent runs.\n\n## \ud83d\udd27 Advanced Usage\n\n### \ud83d\udd04 Custom Retry Configuration\n\n**\u274c Native API:** No automatic retry logic\n```python\n# With native API, you must implement your own retry logic\nimport time\nimport random\n\ndef retry_with_backoff(func, max_retries=3):\n for attempt in range(max_retries):\n try:\n return func()\n except Exception as e:\n if attempt == max_retries - 1:\n raise e\n time.sleep(2 ** attempt + random.uniform(0, 1))\n```\n\n**\u2705 LLMManager Simplification:** Built-in intelligent retry with multiple strategies\n```python\nfrom bestehorn_llmmanager.bedrock.models.llm_manager_structures import (\n RetryConfig, RetryStrategy\n)\n\n# Configure sophisticated retry behavior\nretry_config = RetryConfig(\n max_retries=5, # Maximum retry attempts\n retry_delay=1.0, # Initial delay (seconds)\n backoff_multiplier=2.0, # Exponential backoff\n max_retry_delay=60.0, # Maximum delay cap\n retry_strategy=RetryStrategy.REGION_FIRST, # Try different regions first\n enable_feature_fallback=True # Disable features if incompatible\n)\n\nmanager = LLMManager(\n models=[\"Claude 3 Haiku\", \"Claude 3 Sonnet\"],\n regions=[\"us-east-1\", \"us-west-2\", \"eu-west-1\"],\n retry_config=retry_config\n)\n\n# The manager automatically handles:\n# - Exponential backoff between retries\n# - Region failover for high availability \n# - Model fallback if primary model fails\n# - Feature degradation for compatibility issues\n# - Detailed retry statistics and logging\n```\n\n### \ud83d\udee1\ufe0f Response Validation\n\n**\u274c Native API:** No response validation capabilities\n```python\n# With native API, manual validation is required\nresponse = bedrock.converse(...)\ncontent = response['output']['message']['content'][0]['text']\n\n# Manual validation logic\nif \"inappropriate\" in content.lower():\n # Handle inappropriate content manually\n pass\n```\n\nLLMs are inherently probabilistic in their responses, i.e., even with exactly the same prompt between to calls, responses from LLMs will differ.\nWhen you need a specific format, e.g., you want a formatted JSON response for down-stream processing, then even if the prompt explicitly specifies the exact format of the JSON response, there is still a chance you get a malformed response.\nIn such cases, it is often advisable to just execute another request to obtain a compliant response from Bedrock.\nWith the default converse API, you have to handle this in your application code.\nThe LLM Manager gives you a way to handle this more elegantly by poiting to a validation function and having the LLM Manager retry if the validation function determines the response does not conform to the expected format.\n\n**\u2705 LLMManager Advantage:** Comprehensive response validation system\n```python\nfrom bestehorn_llmmanager.bedrock.models.llm_manager_structures import (\n ResponseValidationConfig, ValidationResult\n)\n\ndef custom_content_validator(response):\n \"\"\"Custom validation function for response content.\"\"\"\n content = response.get_content()\n \n # Check for empty responses\n if not content or len(content.strip()) < 10:\n return ValidationResult(\n success=False,\n error_message=\"Response too short or empty\"\n )\n \n # Check for potentially harmful content\n harmful_keywords = [\"violence\", \"illegal\", \"inappropriate\"]\n if any(keyword in content.lower() for keyword in harmful_keywords):\n return ValidationResult(\n success=False,\n 
error_message=\"Response contains potentially harmful content\",\n error_details={\"flagged_content\": content[:100]}\n )\n \n # Check for factual consistency (example)\n if \"I don't know\" in content and len(content) < 50:\n return ValidationResult(\n success=False,\n error_message=\"Response appears incomplete\"\n )\n \n return ValidationResult(success=True)\n\n# Configure validation\nvalidation_config = ResponseValidationConfig(\n response_validation_function=custom_content_validator,\n response_validation_retries=3, # Retry validation failures\n response_validation_delay=0.5 # Delay between validation retries\n)\n\nmessage = create_user_message().add_text(\"Tell me about AI safety\").build()\nresponse = manager.converse(\n messages=[message],\n response_validation_config=validation_config\n)\n\n# Check validation results\nif response.had_validation_failures():\n print(\"Validation issues detected:\")\n for error in response.get_validation_errors():\n print(f\" - {error['error_message']}\")\n```\n\nIn the example above, the validation config contains a custom-validation function that checks the content of the message for various errors: length of the response, illegal keywowrds and completeness.\nThis function is then passed as a parameter and used before the call from the LLM Manager converse() function returns.\nIn case of errors, the LLM Manager will try to get a valid response before returning.\nIf you need a JSON\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A comprehensive Python library for managing AWS Bedrock Converse API interactions with multi-model support, intelligent retry logic, and parallel processing capabilities",
"version": "0.2.1",
"project_urls": {
"Bug Tracker": "https://github.com/Bestehorn/LLMManager/issues",
"Documentation": "https://github.com/Bestehorn/LLMManager/blob/main/README.md",
"Homepage": "https://github.com/Bestehorn/LLMManager",
"Repository": "https://github.com/Bestehorn/LLMManager.git"
},
"split_keywords": [
"aws",
" bedrock",
" llm",
" ai",
" machine-learning",
" amazon",
" claude",
" anthropic"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "baa7eb353312c4813431e69836ba118dd3211161d07cee19acf29a1141915ddc",
"md5": "5e80c5c81c7195c7e38691b17d6df9cc",
"sha256": "bfbedc5d1d5fbadee4aed35c223613c390f9fa4a8135a96cd9913a1f5768da13"
},
"downloads": -1,
"filename": "bestehorn_llmmanager-0.2.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5e80c5c81c7195c7e38691b17d6df9cc",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.8",
"size": 195811,
"upload_time": "2025-08-19T09:37:51",
"upload_time_iso_8601": "2025-08-19T09:37:51.110025Z",
"url": "https://files.pythonhosted.org/packages/ba/a7/eb353312c4813431e69836ba118dd3211161d07cee19acf29a1141915ddc/bestehorn_llmmanager-0.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "913e66afefba5cd728e37fad71cfa477a003043b8f1adb3c05801994d3fca80b",
"md5": "66e0e2dc53053933a4df90809eb6992a",
"sha256": "a901f669a201ac5f71e0734cda91921ce01e9270c34f64b5ca2d93ff72198bfc"
},
"downloads": -1,
"filename": "bestehorn_llmmanager-0.2.1.tar.gz",
"has_sig": false,
"md5_digest": "66e0e2dc53053933a4df90809eb6992a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.8",
"size": 190801,
"upload_time": "2025-08-19T09:37:52",
"upload_time_iso_8601": "2025-08-19T09:37:52.547051Z",
"url": "https://files.pythonhosted.org/packages/91/3e/66afefba5cd728e37fad71cfa477a003043b8f1adb3c05801994d3fca80b/bestehorn_llmmanager-0.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-19 09:37:52",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Bestehorn",
"github_project": "LLMManager",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"tox": true,
"lcname": "bestehorn-llmmanager"
}