# Esperanto 🌐
[PyPI](https://pypi.org/project/esperanto/) · [GitHub](https://github.com/lfnovo/esperanto) · [License: MIT](https://opensource.org/licenses/MIT)
Esperanto is a powerful Python library that provides a unified interface for interacting with various Large Language Model (LLM) providers. It simplifies working with different AI model APIs (LLMs, embedders, transcribers, and TTS engines) by offering a consistent interface while preserving provider-specific optimizations.
## Why Esperanto? 🚀
**🪶 Ultra-Lightweight Architecture**
- **Direct HTTP Communication**: All providers communicate directly via HTTP APIs using `httpx` - no bulky vendor SDKs required
- **Minimal Dependencies**: Unlike LangChain and similar frameworks, Esperanto has a tiny footprint with zero overhead layers
- **Production-Ready Performance**: Direct API calls mean faster response times and lower memory usage
**🔄 True Provider Flexibility**
- **Standardized Responses**: Switch between any provider (OpenAI ↔ Anthropic ↔ Google ↔ etc.) without changing a single line of code
- **Consistent Interface**: Same methods, same response objects, same patterns across all 15+ providers
- **Future-Proof**: Add new providers or change existing ones without refactoring your application
**⚡ Perfect for Production**
- **Prototyping to Production**: Start experimenting and deploy the same code to production
- **No Vendor Lock-in**: Test different providers, optimize costs, and maintain flexibility
- **Enterprise-Ready**: Direct HTTP calls, standardized error handling, and comprehensive async support
Whether you're building a quick prototype or a production application serving millions of requests, Esperanto gives you the performance of direct API calls with the convenience of a unified interface.
## Features ✨
- **Unified Interface**: Work with multiple LLM providers using a consistent API
- **Provider Support**:
- OpenAI (GPT-4o, o1, o3, o4, Whisper, TTS)
- OpenAI-Compatible (LM Studio, Ollama, vLLM, custom endpoints)
- Anthropic (Claude models)
- OpenRouter (Access to multiple models)
- xAI (Grok)
- Perplexity (Sonar models)
- Groq (Mixtral, Llama, Whisper)
- Google GenAI (Gemini LLM, Text-to-Speech, Embedding with native task optimization)
- Vertex AI (Google Cloud, LLM, Embedding, TTS)
- Ollama (local deployment of multiple models)
- Transformers (Universal local models - Qwen, CrossEncoder, BAAI, Jina, Mixedbread)
- ElevenLabs (Text-to-Speech, Speech-to-Text)
- Azure OpenAI (Chat, Embedding)
- Mistral (Mistral Large, Small, Embedding, etc.)
- DeepSeek (deepseek-chat)
- Voyage (Embeddings, Reranking)
- Jina (Advanced embedding models with task optimization, Reranking)
- **Embedding Support**: Multiple embedding providers for vector representations
- **Reranking Support**: Universal reranking interface for improving search relevance
- **Speech-to-Text Support**: Transcribe audio using multiple providers
- **Text-to-Speech Support**: Generate speech using multiple providers
- **Async Support**: Both synchronous and asynchronous API calls
- **Streaming**: Support for streaming responses
- **Structured Output**: JSON output formatting (where supported)
- **LangChain Integration**: Easy conversion to LangChain chat models
For detailed information about our providers, check out:
- [LLM Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/llm.md)
- [Embedding Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/embedding.md)
- [Reranking Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/rerank.md)
- [Speech-to-Text Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/speech_to_text.md)
- [Text-to-Speech Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/text_to_speech.md)
## Installation 🚀
Install Esperanto using pip:
```bash
pip install esperanto
```
### Optional Dependencies
**Transformers Provider**
If you plan to use the transformers provider, install with the transformers extra:
```bash
pip install "esperanto[transformers]"
```
This installs:
- `transformers` - Core Hugging Face library
- `torch` - PyTorch framework
- `tokenizers` - Fast tokenization
- `sentence-transformers` - CrossEncoder support
- `scikit-learn` - Advanced embedding features
- `numpy` - Numerical computations
**LangChain Integration**
If you plan to use any of the `.to_langchain()` methods, you need to install the correct LangChain SDKs manually:
```bash
# Core LangChain dependencies (required)
pip install "langchain>=0.3.8,<0.4.0" "langchain-core>=0.3.29,<0.4.0"
# Provider-specific LangChain packages (install only what you need)
pip install "langchain-openai>=0.2.9"
pip install "langchain-anthropic>=0.3.0"
pip install "langchain-google-genai>=2.1.2"
pip install "langchain-ollama>=0.2.0"
pip install "langchain-groq>=0.2.1"
pip install "langchain_mistralai>=0.2.1"
pip install "langchain_deepseek>=0.1.3"
pip install "langchain-google-vertexai>=2.0.24"
```
## Provider Support Matrix
| Provider | LLM Support | Embedding Support | Reranking Support | Speech-to-Text | Text-to-Speech | JSON Mode |
|--------------|-------------|------------------|-------------------|----------------|----------------|-----------|
| OpenAI | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
| OpenAI-Compatible | ✅ | ❌ | ❌ | ❌ | ❌ | ⚠️* |
| Anthropic | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Groq | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ |
| Google (GenAI) | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
| Vertex AI | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ |
| Ollama | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Perplexity | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Transformers | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ElevenLabs | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |
| Azure OpenAI | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| Mistral | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| DeepSeek | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Voyage | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Jina | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| xAI | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| OpenRouter | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
*⚠️ OpenAI-Compatible: JSON mode support depends on the specific endpoint implementation
## Quick Start 🏃‍♂️
You can use Esperanto in two ways: directly with provider-specific classes or through the AI Factory.
## AIFactory - Smart Model Management 🏭
The `AIFactory` is Esperanto's intelligent model management system that provides significant performance benefits through its **singleton cache architecture**.
### 🚀 **Singleton Cache Benefits**
AIFactory automatically caches model instances based on their configuration. This means:
- **No duplicate model creation** - same provider + model + config = same instance returned
- **Faster subsequent calls** - cached instances are returned immediately
- **Memory efficient** - prevents memory bloat from multiple identical models
- **Connection reuse** - HTTP clients and configurations are preserved
### 💡 **How It Works**
```python
from esperanto.factory import AIFactory
# First call - creates new model instance
model1 = AIFactory.create_language("openai", "gpt-4", temperature=0.7)
# Second call with same config - returns cached instance (instant!)
model2 = AIFactory.create_language("openai", "gpt-4", temperature=0.7)
# They're the exact same object
assert model1 is model2 # True!
# Different config - creates new instance
model3 = AIFactory.create_language("openai", "gpt-4", temperature=0.9)
assert model1 is not model3 # True - different config
```
### 🎯 **Perfect for Production**
This caching is especially powerful in production scenarios:
```python
# In a web application
def handle_chat_request(messages):
    # This model is cached - no recreation overhead!
    model = AIFactory.create_language("anthropic", "claude-3-sonnet-20240229")
    return model.chat_complete(messages)

def handle_embedding_request(texts):
    # This embedding model is also cached
    embedder = AIFactory.create_embedding("openai", "text-embedding-3-small")
    return embedder.embed(texts)

# Multiple calls to these functions reuse the same model instances
# = Better performance + Lower memory usage
```
### 🔍 **Cache Key Strategy**
The cache key includes:
- **Provider name** (e.g., "openai", "anthropic")
- **Model name** (e.g., "gpt-4", "claude-3-sonnet")
- **All configuration parameters** (temperature, max_tokens, etc.)
Only models with **identical configurations** share the same cache entry.
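For intuition, here is a minimal sketch of configuration-based keying (an illustration of the concept, not Esperanto's actual internals):

```python
# Hypothetical illustration of configuration-based cache keying.
# Esperanto's real implementation may differ.
def make_cache_key(provider: str, model: str, **config) -> tuple:
    # Sort config items so keyword order doesn't change the key
    return (provider, model, tuple(sorted(config.items())))

key_a = make_cache_key("openai", "gpt-4", temperature=0.7)
key_b = make_cache_key("openai", "gpt-4", temperature=0.7)
key_c = make_cache_key("openai", "gpt-4", temperature=0.9)

assert key_a == key_b  # identical config -> same cache entry
assert key_a != key_c  # different temperature -> separate instance
```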
### Using AI Factory
The AI Factory provides a convenient way to create model instances and discover available providers:
```python
from esperanto.factory import AIFactory
# Get available providers for each model type
providers = AIFactory.get_available_providers()
print(providers)
# Output:
# {
# 'language': ['openai', 'openai-compatible', 'anthropic', 'google', 'groq', 'ollama', 'openrouter', 'xai', 'perplexity', 'azure', 'mistral', 'deepseek'],
# 'embedding': ['openai', 'google', 'ollama', 'vertex', 'transformers', 'voyage', 'mistral', 'azure', 'jina'],
# 'reranker': ['jina', 'voyage', 'transformers'],
# 'speech_to_text': ['openai', 'groq', 'elevenlabs'],
# 'text_to_speech': ['openai', 'elevenlabs', 'google', 'vertex']
# }
# Create model instances
model = AIFactory.create_language(
    "openai",
    "gpt-3.5-turbo",
    config={"structured": {"type": "json"}}
)  # Language model
embedder = AIFactory.create_embedding("openai", "text-embedding-3-small") # Embedding model
reranker = AIFactory.create_reranker("transformers", "cross-encoder/ms-marco-MiniLM-L-6-v2") # Universal reranker model
transcriber = AIFactory.create_speech_to_text("openai", "whisper-1") # Speech-to-text model
speaker = AIFactory.create_text_to_speech("openai", "tts-1") # Text-to-speech model
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the capital of France?"},
]
response = model.chat_complete(messages)
# Use the embedding instance created above
texts = ["Hello, world!", "Another text"]
# Synchronous usage
embeddings = embedder.embed(texts)
# Async usage (await must run inside an async function)
embeddings = await embedder.aembed(texts)
```
### Using Provider-Specific Classes
Here's a simple example to get you started:
```python
from esperanto.providers.llm.openai import OpenAILanguageModel
from esperanto.providers.llm.anthropic import AnthropicLanguageModel
# Initialize a provider with structured output
model = OpenAILanguageModel(
    api_key="your-api-key",
    model_name="gpt-4",  # Optional, defaults to gpt-4
    structured={"type": "json"}  # Optional, for JSON output
)

# Simple chat completion
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "List three colors in JSON format"}
]
# Synchronous call
response = model.chat_complete(messages)
print(response.choices[0].message.content) # Will be in JSON format
# Async call
async def get_response():
    response = await model.achat_complete(messages)
    print(response.choices[0].message.content)  # Will be in JSON format
```
## Standardized Responses
All providers in Esperanto return standardized response objects, making it easy to work with different models without changing your code.
### LLM Responses
```python
from esperanto.factory import AIFactory
model = AIFactory.create_language(
    "openai",
    "gpt-3.5-turbo",
    config={"structured": {"type": "json"}}
)
messages = [{"role": "user", "content": "Hello!"}]
# All LLM responses follow this structure
response = model.chat_complete(messages)
print(response.choices[0].message.content) # The actual response text
print(response.choices[0].message.role) # 'assistant'
print(response.model) # The model used
print(response.usage.total_tokens) # Token usage information
print(response.content) # Shortcut for response.choices[0].message.content
# For streaming responses (requires a model created with streaming enabled)
for chunk in model.chat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)

# Async streaming (inside an async function)
async for chunk in model.achat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)
```
### Embedding Responses
```python
from esperanto.factory import AIFactory
model = AIFactory.create_embedding("openai", "text-embedding-3-small")
texts = ["Hello, world!", "Another text"]
# All embedding responses follow this structure
response = model.embed(texts)
print(response.data[0].embedding) # Vector for first text
print(response.data[0].index) # Index of the text (0)
print(response.model) # The model used
print(response.usage.total_tokens) # Token usage information
```
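Because every provider returns this same shape, downstream code stays provider-agnostic. For example, a plain-Python cosine similarity over the two vectors returned above:

```python
import math

vec_a = response.data[0].embedding
vec_b = response.data[1].embedding

# Cosine similarity between the two texts' vectors
dot = sum(a * b for a, b in zip(vec_a, vec_b))
norm_a = math.sqrt(sum(a * a for a in vec_a))
norm_b = math.sqrt(sum(b * b for b in vec_b))
print(dot / (norm_a * norm_b))
```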
### Reranking Responses
```python
from esperanto.factory import AIFactory
reranker = AIFactory.create_reranker("transformers", "BAAI/bge-reranker-base")
query = "What is machine learning?"
documents = [
    "Machine learning is a subset of artificial intelligence.",
    "The weather is nice today.",
    "Python is a programming language used in ML."
]
# All reranking responses follow this structure
response = reranker.rerank(query, documents, top_k=2)
print(response.results[0].document) # Highest ranked document
print(response.results[0].relevance_score) # Normalized 0-1 relevance score
print(response.results[0].index) # Original document index
print(response.model) # The model used
```
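A typical follow-up is to keep only the top-ranked documents, for instance as context for an LLM prompt. This sketch uses only the response fields shown above:

```python
# Results come back ordered by relevance, so this preserves ranked order
top_docs = [result.document for result in response.results]
context = "\n\n".join(top_docs)
print(context)
```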
### Task-Aware Embeddings 🎯
Esperanto supports advanced task-aware embeddings that optimize vector representations for specific use cases. This works across **all embedding providers** through a universal interface:
```python
from esperanto.factory import AIFactory
from esperanto.common_types.task_type import EmbeddingTaskType
# Task-optimized embeddings work with ANY provider
model = AIFactory.create_embedding(
    provider="jina",  # Also works with: "openai", "google", "transformers", etc.
    model_name="jina-embeddings-v3",
    config={
        "task_type": EmbeddingTaskType.RETRIEVAL_QUERY,  # Optimize for search queries
        "late_chunking": True,  # Better long-context handling
        "output_dimensions": 512  # Control vector size
    }
)
# Generate optimized embeddings
query = "What is machine learning?"
embeddings = model.embed([query])
```
**Universal Task Types** (see the retrieval pairing sketch after this list):
- `RETRIEVAL_QUERY` - Optimize for search queries
- `RETRIEVAL_DOCUMENT` - Optimize for document storage
- `SIMILARITY` - General text similarity
- `CLASSIFICATION` - Text classification tasks
- `CLUSTERING` - Document clustering
- `CODE_RETRIEVAL` - Code search optimization
- `QUESTION_ANSWERING` - Optimize for Q&A tasks
- `FACT_VERIFICATION` - Optimize for fact checking
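For retrieval use cases the two retrieval task types are typically paired: documents are embedded once with `RETRIEVAL_DOCUMENT`, and queries at search time with `RETRIEVAL_QUERY`. A sketch using the same factory calls and model as above:

```python
from esperanto.factory import AIFactory
from esperanto.common_types.task_type import EmbeddingTaskType

# Index side: optimize vectors for stored documents
doc_embedder = AIFactory.create_embedding(
    provider="jina",
    model_name="jina-embeddings-v3",
    config={"task_type": EmbeddingTaskType.RETRIEVAL_DOCUMENT},
)

# Query side: optimize vectors for the search query itself
query_embedder = AIFactory.create_embedding(
    provider="jina",
    model_name="jina-embeddings-v3",
    config={"task_type": EmbeddingTaskType.RETRIEVAL_QUERY},
)

doc_response = doc_embedder.embed(["Esperanto unifies AI provider APIs."])
query_response = query_embedder.embed(["What does Esperanto do?"])
```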
**Provider Support:**
- **Jina**: Native API support for all features
- **Google**: Native task type translation to Gemini API
- **OpenAI**: Task optimization via intelligent text prefixes
- **Transformers**: Local emulation with task-specific processing
- **Others**: Graceful degradation with consistent interface
The standardized response objects ensure consistency across different providers, making it easy to:
- Switch between providers without changing your application code
- Handle responses in a uniform way
- Access common attributes like token usage and model information
## Provider Configuration 🔧
### OpenAI
```python
from esperanto.providers.llm.openai import OpenAILanguageModel
model = OpenAILanguageModel(
    api_key="your-api-key",  # Or set OPENAI_API_KEY env var
    model_name="gpt-4",  # Optional
    temperature=0.7,  # Optional
    max_tokens=850,  # Optional
    streaming=False,  # Optional
    top_p=0.9,  # Optional
    structured={"type": "json"},  # Optional, for JSON output
    base_url=None,  # Optional, for custom endpoint
    organization=None  # Optional, for org-specific API
)
```
### OpenAI-Compatible Endpoints
Use any OpenAI-compatible endpoint (LM Studio, Ollama, vLLM, custom deployments) with the same interface:
```python
from esperanto.factory import AIFactory
# Using factory config
model = AIFactory.create_language(
    "openai-compatible",
    "your-model-name",  # Use any model name supported by your endpoint
    config={
        "base_url": "http://localhost:1234/v1",  # Your endpoint URL (required)
        "api_key": "your-api-key"  # Your API key (optional)
    }
)
# Or set environment variables
# OPENAI_COMPATIBLE_BASE_URL=http://localhost:1234/v1
# OPENAI_COMPATIBLE_API_KEY=your-api-key # Optional for endpoints that don't require auth
model = AIFactory.create_language("openai-compatible", "your-model-name")
# Works with any OpenAI-compatible endpoint
messages = [{"role": "user", "content": "Hello!"}]
response = model.chat_complete(messages)
print(response.content)
# Streaming support
for chunk in model.chat_complete(messages, stream=True):
    print(chunk.choices[0].delta.content, end="", flush=True)
```
**Common Use Cases:**
- **LM Studio**: Local model serving with GUI
- **Ollama**: `ollama serve` with OpenAI compatibility
- **vLLM**: High-performance inference server
- **Custom Deployments**: Any server implementing OpenAI chat completions API
**Features:**
- ✅ **Streaming**: Real-time response streaming
- ✅ **Pass-through Model Names**: Use any model name your endpoint supports
- ✅ **Graceful Degradation**: Automatically handles varying feature support
- ✅ **Error Handling**: Clear error messages for troubleshooting
- ⚠️ **JSON Mode**: Depends on endpoint implementation
### Perplexity
Perplexity uses an OpenAI-compatible API but includes additional parameters for controlling search behavior.
```python
from esperanto.providers.llm.perplexity import PerplexityLanguageModel
model = PerplexityLanguageModel(
    api_key="your-api-key",  # Or set PERPLEXITY_API_KEY env var
    model_name="llama-3-sonar-large-32k-online",  # Recommended default
    temperature=0.7,  # Optional
    max_tokens=850,  # Optional
    streaming=False,  # Optional
    top_p=0.9,  # Optional
    structured={"type": "json"},  # Optional, for JSON output

    # Perplexity-specific parameters
    search_domain_filter=["example.com", "-excluded.com"],  # Optional, limit search domains
    return_images=False,  # Optional, include images in search results
    return_related_questions=True,  # Optional, return related questions
    search_recency_filter="week",  # Optional, filter search by time ('day', 'week', 'month', 'year')
    web_search_options={"search_context_size": "high"}  # Optional, control search context ('low', 'medium', 'high')
)
```
## Streaming Responses 🌊
Enable streaming to receive responses token by token:
```python
# Enable streaming
model = OpenAILanguageModel(api_key="your-api-key", streaming=True)
# Synchronous streaming
for chunk in model.chat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)

# Async streaming (inside an async function)
async for chunk in model.achat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)
```
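A common follow-up is accumulating the chunks into the full reply. The `or ""` guard below is a precaution that assumes some chunks may carry an empty or missing delta, as in OpenAI-style streams:

```python
# Collect streamed chunks into the complete response text
full_text = "".join(
    chunk.choices[0].delta.content or ""
    for chunk in model.chat_complete(messages)
)
print(full_text)
```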
## Structured Output 📊
Request JSON-formatted responses (supported by OpenAI and some OpenRouter models):
```python
model = OpenAILanguageModel(
    api_key="your-api-key",  # or use ENV
    structured={"type": "json"}
)

messages = [
    {"role": "user", "content": "List three European capitals as JSON"}
]
response = model.chat_complete(messages)
# Response will be in JSON format
```
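Since the reply body is a JSON string, it can be parsed with the standard library (assuming the model honored the format request):

```python
import json

# response.content is the shortcut shown earlier for the message text
capitals = json.loads(response.content)
print(capitals)
```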
## LangChain Integration 🔗
Convert any provider to a LangChain chat model:
```python
model = OpenAILanguageModel(api_key="your-api-key")
langchain_model = model.to_langchain()
# Use with LangChain
from langchain.chains import ConversationChain
chain = ConversationChain(llm=langchain_model)
```
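Once converted, the object behaves like any LangChain chat model, so the usual `Runnable` interface applies:

```python
# Invoke the converted model directly
result = langchain_model.invoke("Say hello in French")
print(result.content)
```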
## Documentation 📚
You can find the documentation for Esperanto in the [docs](https://github.com/lfnovo/esperanto/tree/main/docs) directory.
There is also a cool beginner's tutorial in the [tutorial](https://github.com/lfnovo/esperanto/blob/main/docs/tutorial/index.md) directory.
## Contributing 🤝
We welcome contributions! Please see our [Contributing Guidelines](https://github.com/lfnovo/esperanto/blob/main/CONTRIBUTING.md) for details on how to get started.
## License 📄
This project is licensed under the MIT License - see the [LICENSE](https://github.com/lfnovo/esperanto/blob/main/LICENSE) file for details.
## Development 🛠️
1. Clone the repository:
```bash
git clone https://github.com/lfnovo/esperanto.git
cd esperanto
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Run tests:
```bash
pytest
```