# Nimble LLM Caller
A robust, multi-model LLM calling package with **intelligent context management**, file processing, and advanced prompt handling capabilities.
## 🚀 Key Features
### Core Capabilities
- **Multi-Model Support**: Call multiple LLM providers (OpenAI, Anthropic, Google, etc.) through LiteLLM
- **Intelligent Context Management**: Automatic context-size-aware request handling with model upshifting
- **File Processing**: Support for 29+ file types (PDF, Word, images, JSON, CSV, XML, YAML, etc.)
- **Batch Processing**: Submit multiple prompts to multiple models efficiently
- **Robust JSON Parsing**: Multiple fallback strategies for parsing LLM responses
- **Retry Logic**: Exponential backoff with jitter for handling rate limits and transient errors (see the sketch after this list)
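Retries happen inside the caller, so no user code is required. Purely as an illustration of the backoff-with-jitter pattern, here is a generic sketch (not this package's internal implementation; `call_with_backoff` is a hypothetical helper):

```python
import random
import time

def call_with_backoff(fn, max_retries=3, base_delay=1.0, max_delay=30.0):
    """Illustrative only: retry `fn` with exponential backoff and full jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise
            # Delay doubles per attempt; random jitter de-synchronizes clients
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))
```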
### Advanced Features
- **Context-Size-Aware Safe Submit**: Automatic overflow handling with model upshifting and content chunking
- **File Attachment Support**: Process and include files directly in LLM requests
- **Comprehensive Interaction Logging**: Detailed request/response tracking with metadata
- **Prompt Management**: JSON-based prompt templates with variable substitution
- **Document Assembly**: Built-in formatters for text, markdown, and LaTeX output
- **Graceful Degradation**: Fallback strategies for reliability
- **Full Backward Compatibility**: Existing code continues to work unchanged
## 📦 Installation
### Basic Installation
```bash
pip install nimble-llm-caller
```
### Enhanced Installation (Recommended)
```bash
# Install with enhanced file processing capabilities
pip install nimble-llm-caller[enhanced]
```
### All Features Installation
```bash
# Install with all optional dependencies
pip install nimble-llm-caller[all]
```
### Development Installation
```bash
# Clone the repository
git clone https://github.com/fredzannarbor/nimble-llm-caller.git
cd nimble-llm-caller
# Install in development mode with all features
pip install -e .[dev,enhanced]
# Run setup script
python setup_dev.py setup
```
### Installation Options Summary
| Installation | Command | Features |
|-------------|---------|----------|
| **Basic** | `pip install nimble-llm-caller` | Core LLM calling, basic context management |
| **Enhanced** | `pip install nimble-llm-caller[enhanced]` | + File processing (PDF, Word, images), advanced tokenization |
| **All** | `pip install nimble-llm-caller[all]` | + All optional features and dependencies |
| **Development** | `pip install -e .[dev,enhanced]` | + Testing, linting, documentation tools |
## ⚙️ Configuration
### 1. API Keys Setup
Set your API keys in environment variables:
```bash
# Required: At least one LLM provider
export OPENAI_API_KEY="your-openai-key"
export ANTHROPIC_API_KEY="your-anthropic-key"
export GOOGLE_API_KEY="your-google-key"
# Optional: For enhanced features
export LITELLM_LOG="INFO" # Enable LiteLLM logging
```
### 2. Environment File (.env)
Create a `.env` file in your project root:
```env
# LLM Provider API Keys
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
GOOGLE_API_KEY=your-google-key
# Optional Configuration
LITELLM_LOG=INFO
NIMBLE_LOG_LEVEL=INFO
NIMBLE_DEFAULT_MODEL=gpt-4o
NIMBLE_MAX_RETRIES=3
```
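The package reads keys from the process environment. If your application does not already load `.env` files, the separately installed `python-dotenv` package is one common way to populate the environment before creating a caller (a suggestion, not a hard dependency of this package):

```python
# Load .env into os.environ before the first LLM call
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
```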
### 3. Configuration File
Create a configuration file for advanced settings:
```python
# config.py
from nimble_llm_caller.models.context_config import ContextConfig, ContextStrategy
# Custom context configuration
context_config = ContextConfig(
    default_strategy=ContextStrategy.UPSHIFT,
    enable_chunking=True,
    chunk_overlap_tokens=100,
    max_cost_multiplier=3.0,
    enable_model_fallback=True
)
```
## 🚀 Quick Start
### Basic Usage (Backward Compatible)
```python
from nimble_llm_caller import LLMCaller, LLMRequest
# Traditional usage - still works!
caller = LLMCaller()
request = LLMRequest(
    prompt_key="summarize_text",
    model="gpt-4",
    substitutions={"text": "Your text here"}
)
response = caller.call(request)
print(f"Result: {response.content}")
```
### Enhanced Usage with Intelligent Context Management
```python
from nimble_llm_caller import EnhancedLLMCaller, LLMRequest, FileAttachment
# Enhanced caller with all intelligent features
caller = EnhancedLLMCaller(
    enable_context_management=True,
    enable_file_processing=True,
    enable_interaction_logging=True
)
# Request with file attachments and automatic context management
request = LLMRequest(
    prompt_key="analyze_document",
    model="gpt-4",
    file_attachments=[
        FileAttachment(file_path="document.pdf", content_type="application/pdf"),
        FileAttachment(file_path="data.csv", content_type="text/csv")
    ],
    substitutions={"analysis_type": "comprehensive"}
)
# Automatic context management, file processing, and logging
response = caller.call(request)
print(f"Analysis: {response.content}")
print(f"Files processed: {response.files_processed}")
print(f"Model used: {response.model} (original: {response.original_model})")
```
### Content Generation with File Processing
```python
from nimble_llm_caller import LLMContentGenerator
# Initialize with prompts and enhanced features
generator = LLMContentGenerator(
    prompt_file_path="prompts.json",
    enable_context_management=True,
    enable_file_processing=True
)
# Process multiple files with intelligent context handling
results = generator.call_batch(
    prompt_keys=["summarize_document", "extract_key_points"],
    models=["gpt-4o", "claude-3-sonnet"],
    shared_substitutions={
        "files": ["report.pdf", "data.xlsx", "presentation.pptx"]
    }
)
print(f"Success rate: {results.success_rate:.1f}%")
print(f"Total files processed: {sum(r.files_processed for r in results.responses)}")
```
## 📋 Usage Examples
### 1. Context-Size-Aware Processing
```python
from nimble_llm_caller import EnhancedLLMCaller, LLMRequest
caller = EnhancedLLMCaller(enable_context_management=True)
# Large content that might exceed context limits
large_content = "..." * 50000 # Very large text
request = LLMRequest(
    prompt_key="analyze_content",
    model="gpt-5-mini",  # Will automatically upshift if needed
    substitutions={"content": large_content}
)

# Automatic handling: upshift to a larger-context model or chunk the content
response = caller.call(request)
if response.upshift_reason:
    print(f"Upshifted from {response.original_model} to {response.model}")
    print(f"Reason: {response.upshift_reason}")

if response.was_chunked:
    print(f"Content was chunked: {response.chunk_info}")
```
### 2. File Processing with Multiple Formats
```python
from nimble_llm_caller import EnhancedLLMCaller, LLMRequest, FileAttachment
caller = EnhancedLLMCaller(
    enable_file_processing=True,
    enable_context_management=True
)
# Process multiple file types
files = [
    FileAttachment("report.pdf", content_type="application/pdf"),
    FileAttachment("data.xlsx", content_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"),
    FileAttachment("image.png", content_type="image/png"),
    FileAttachment("config.yaml", content_type="application/x-yaml")
]

request = LLMRequest(
    prompt_key="comprehensive_analysis",
    model="gpt-4o",  # Vision-capable model for images
    file_attachments=files
)
response = caller.call(request)
print(f"Processed {response.files_processed} files")
print(f"Analysis: {response.content}")
```
### 3. Interaction Logging and Monitoring
```python
from nimble_llm_caller import EnhancedLLMCaller
# Enable comprehensive logging
caller = EnhancedLLMCaller(
    enable_interaction_logging=True,
    log_file_path="llm_interactions.log",
    log_content=True,
    log_metadata=True
)
# Make requests - all interactions are logged
response = caller.call(request)
# Access recent interactions
recent = caller.interaction_logger.get_recent_interactions(count=5)
for interaction in recent:
    print(f"Request: {interaction.prompt_key} -> {interaction.model}")
    print(f"Duration: {interaction.duration_ms}ms")
    print(f"Tokens: {interaction.token_usage}")
# Get statistics
stats = caller.interaction_logger.get_statistics()
print(f"Total requests: {stats['total_requests']}")
print(f"Success rate: {stats['success_rate']:.1f}%")
print(f"Average duration: {stats['avg_duration_ms']:.1f}ms")
```
### 4. Custom Context Strategies
```python
from nimble_llm_caller import EnhancedLLMCaller, ContextConfig, ContextStrategy
# Custom context configuration
config = ContextConfig(
    default_strategy=ContextStrategy.CHUNK,  # Prefer chunking over upshifting
    enable_chunking=True,
    chunk_overlap_tokens=200,
    max_cost_multiplier=2.0,  # Limit cost increases
    enable_model_fallback=True
)

caller = EnhancedLLMCaller(
    enable_context_management=True,
    context_config=config
)
# Requests will use chunking strategy when context limits are exceeded
response = caller.call(large_request)
```
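When the `CHUNK` strategy fires, content is split into overlapping windows so context carries across chunk boundaries (the role of `chunk_overlap_tokens`). Below is a minimal sketch of the general technique; the package's actual chunker may differ, and `chunk_with_overlap` is a hypothetical helper:

```python
def chunk_with_overlap(tokens: list[str], chunk_size: int, overlap: int) -> list[list[str]]:
    """Split a token sequence into chunks that share `overlap` tokens."""
    assert 0 <= overlap < chunk_size
    step = chunk_size - overlap
    return [tokens[i:i + chunk_size] for i in range(0, max(len(tokens) - overlap, 1), step)]

# Example: 4-token windows, each sharing 2 tokens with its neighbor
chunks = chunk_with_overlap(list("abcdefghij"), chunk_size=4, overlap=2)
# [['a','b','c','d'], ['c','d','e','f'], ['e','f','g','h'], ['g','h','i','j']]
```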
### 5. Batch Processing with Context Management
```python
from nimble_llm_caller import LLMContentGenerator
generator = LLMContentGenerator(
    prompt_file_path="prompts.json",
    enable_context_management=True,
    enable_file_processing=True
)

# Batch process with automatic context handling
results = generator.call_batch(
    prompt_keys=["analyze_document", "extract_insights", "generate_summary"],
    models=["gpt-4o", "claude-3-sonnet", "gemini-1.5-pro"],
    shared_substitutions={
        "documents": ["doc1.pdf", "doc2.docx", "doc3.txt"]
    },
    parallel=True,
    max_concurrent=3
)
# Results include context management information
for response in results.responses:
    print(f"Prompt: {response.prompt_key}")
    print(f"Model: {response.model} (original: {response.original_model})")
    print(f"Strategy: {response.context_strategy_used}")
    print(f"Files: {response.files_processed}")
    print("---")
```
## 📝 Prompt Format
### Basic Prompt Structure
```json
{
  "prompt_keys": ["summarize_text", "analyze_document"],
  "summarize_text": {
    "messages": [
      {
        "role": "system",
        "content": "You are a professional summarizer."
      },
      {
        "role": "user",
        "content": "Summarize this text: {text}"
      }
    ],
    "params": {
      "temperature": 0.3,
      "max_tokens": 1000
    }
  }
}
```
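Placeholders such as `{text}` are filled from the request's `substitutions` mapping. Conceptually this is `str.format`-style templating; a minimal sketch of the idea (not the package's internal code):

```python
template = "Summarize this text: {text}"
substitutions = {"text": "Your text here"}

# Each {placeholder} is replaced by the matching substitution value
print(template.format(**substitutions))
# -> Summarize this text: Your text here
```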
### Enhanced Prompt with File Processing
```json
{
  "analyze_document": {
    "messages": [
      {
        "role": "system",
        "content": "You are a document analyst. Analyze the provided files and give insights."
      },
      {
        "role": "user",
        "content": "Please analyze the attached files and provide {analysis_type} analysis. Focus on: {focus_areas}"
      }
    ],
    "params": {
      "temperature": 0.2,
      "max_tokens": 2000
    },
    "supports_files": true,
    "supports_vision": true
  }
}
```
## 🔧 Advanced Configuration
### Context Management Settings
```python
from nimble_llm_caller.models.context_config import ContextConfig, ContextStrategy
# Fine-tune context management
config = ContextConfig(
    # Strategy when context limit is exceeded
    default_strategy=ContextStrategy.UPSHIFT,  # or CHUNK, TRUNCATE, ERROR

    # Chunking settings
    enable_chunking=True,
    chunk_overlap_tokens=100,
    max_chunks=10,

    # Model upshifting settings
    enable_model_upshifting=True,
    max_cost_multiplier=3.0,
    enable_model_fallback=True,

    # Safety margins
    context_buffer_tokens=500,
    enable_token_estimation=True
)
```
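The safety margin works against an estimated token count. When an exact tokenizer is unavailable, a characters-per-token heuristic is a common stand-in; the sketch below illustrates the idea only (our assumption, not the package's actual estimator):

```python
def rough_token_estimate(text: str) -> int:
    # Rule-of-thumb: roughly 4 characters per token for English prose
    return max(1, len(text) // 4)

def fits_in_context(text: str, context_window: int, buffer_tokens: int = 500) -> bool:
    # Leave `buffer_tokens` of headroom, mirroring context_buffer_tokens above
    return rough_token_estimate(text) + buffer_tokens <= context_window
```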
### File Processing Configuration
```python
from nimble_llm_caller.core.file_processor import FileProcessor
# Custom file processor
processor = FileProcessor(
    max_file_size_mb=50,
    supported_formats=[
        "pdf", "docx", "txt", "md", "json", "csv",
        "xlsx", "png", "jpg", "yaml", "xml"
    ],
    extract_metadata=True,
    preserve_formatting=True
)
```
### Logging Configuration
```python
from nimble_llm_caller.core.interaction_logger import InteractionLogger
# Custom interaction logger
logger = InteractionLogger(
    log_file_path="interactions.jsonl",
    log_content=True,
    log_metadata=True,
    async_logging=True,
    max_log_size_mb=100,
    max_files=10
)
```
## 🔍 Monitoring and Debugging
### Access Interaction Logs
```python
# Get recent interactions
recent = caller.interaction_logger.get_recent_interactions(count=10)
# Filter by model
gpt4_interactions = caller.interaction_logger.get_interactions_by_model("gpt-4o")
# Filter by time range
from datetime import datetime, timedelta
since = datetime.now() - timedelta(hours=1)
recent_hour = caller.interaction_logger.get_interactions_since(since)
```
### Performance Statistics
```python
stats = caller.interaction_logger.get_statistics()
print(f"""
Performance Statistics:
- Total Requests: {stats['total_requests']}
- Success Rate: {stats['success_rate']:.1f}%
- Average Duration: {stats['avg_duration_ms']:.1f}ms
- Total Tokens: {stats['total_tokens']}
- Average Cost: ${stats['avg_cost']:.4f}
""")
```
### Error Analysis
```python
# Get failed requests
failed = caller.interaction_logger.get_failed_interactions()
for failure in failed:
    print(f"Failed: {failure.prompt_key} -> {failure.error}")
    print(f"Model: {failure.model}, Duration: {failure.duration_ms}ms")
```
## 🔄 Migration Guide
### From v0.1.x to v0.2.x
Your existing code continues to work unchanged! New features are opt-in:
```python
# Old code (still works)
from nimble_llm_caller import LLMCaller, LLMRequest
caller = LLMCaller()
response = caller.call(request)
# New enhanced features (optional)
from nimble_llm_caller import EnhancedLLMCaller
caller = EnhancedLLMCaller(
    enable_context_management=True,
    enable_file_processing=True
)
```
See [MIGRATION.md](MIGRATION.md) for detailed migration instructions.
## 📚 Documentation
- **[Installation Guide](INSTALLATION.md)**: Detailed installation instructions
- **[Migration Guide](MIGRATION.md)**: Upgrading from previous versions
- **[API Reference](docs/api/)**: Complete API documentation
- **[Examples](examples/)**: Working code examples
- **[Configuration](docs/configuration.md)**: Advanced configuration options
## 🤝 Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🆘 Support
- **Issues**: [GitHub Issues](https://github.com/nimblebooks/nimble-llm-caller/issues)
- **Discussions**: [GitHub Discussions](https://github.com/nimblebooks/nimble-llm-caller/discussions)
- **Documentation**: [Full Documentation](https://nimble-llm-caller.readthedocs.io)
## 🏷️ Version
Current version: **0.2.2** - Intelligent Context Management Release
### Recent Updates
- 📖 **v0.2.2**: Improved README
- ✅ **v0.2.1**: Bug fixes for InteractionLogger
- 🚀 **v0.2.0**: Intelligent context management, file processing, enhanced logging
- 📦 **v0.1.0**: Initial release with basic LLM calling capabilities