# Esperanto 🌐
[PyPI](https://pypi.org/project/esperanto/) · [GitHub](https://github.com/lfnovo/esperanto) · [License: MIT](https://opensource.org/licenses/MIT)
Esperanto is a powerful Python library that provides a unified interface for interacting with various Large Language Model (LLM) providers. It simplifies working with different AI model APIs (LLMs, embedders, transcribers, and TTS engines) by offering a consistent interface while preserving provider-specific optimizations.
## Why Esperanto? 🚀
**🪶 Ultra-Lightweight Architecture**
- **Direct HTTP Communication**: All providers communicate directly via HTTP APIs using `httpx` - no bulky vendor SDKs required
- **Minimal Dependencies**: Unlike LangChain and similar frameworks, Esperanto has a tiny footprint with zero overhead layers
- **Production-Ready Performance**: Direct API calls mean faster response times and lower memory usage
**🔄 True Provider Flexibility**
- **Standardized Responses**: Switch between any provider (OpenAI ↔ Anthropic ↔ Google ↔ etc.) without changing a single line of code
- **Consistent Interface**: Same methods, same response objects, same patterns across all 15+ providers
- **Future-Proof**: Add new providers or change existing ones without refactoring your application
**⚡ Perfect for Production**
- **Prototyping to Production**: Start experimenting and deploy the same code to production
- **No Vendor Lock-in**: Test different providers, optimize costs, and maintain flexibility
- **Enterprise-Ready**: Direct HTTP calls, standardized error handling, and comprehensive async support
Whether you're building a quick prototype or a production application serving millions of requests, Esperanto gives you the performance of direct API calls with the convenience of a unified interface.
## Features ✨
- **Unified Interface**: Work with multiple LLM providers using a consistent API
- **Provider Support**:
- OpenAI (GPT-4o, o1, o3, o4, Whisper, TTS)
- OpenAI-Compatible (LM Studio, Ollama, vLLM, custom endpoints)
- Anthropic (Claude models)
- OpenRouter (Access to multiple models)
- xAI (Grok)
- Perplexity (Sonar models)
- Groq (Mixtral, Llama, Whisper)
- Google GenAI (Gemini LLM, Text-to-Speech, Embedding with native task optimization)
- Vertex AI (Google Cloud, LLM, Embedding, TTS)
- Ollama (local deployment of multiple models)
- Transformers (Universal local models - Qwen, CrossEncoder, BAAI, Jina, Mixedbread)
- ElevenLabs (Text-to-Speech, Speech-to-Text)
- Azure OpenAI (Chat, Embedding)
- Mistral (Mistral Large, Small, Embedding, etc.)
- DeepSeek (deepseek-chat)
- Voyage (Embeddings, Reranking)
- Jina (Advanced embedding models with task optimization, Reranking)
- **Embedding Support**: Multiple embedding providers for vector representations
- **Reranking Support**: Universal reranking interface for improving search relevance
- **Speech-to-Text Support**: Transcribe audio using multiple providers
- **Text-to-Speech Support**: Generate speech using multiple providers
- **Async Support**: Both synchronous and asynchronous API calls
- **Streaming**: Support for streaming responses
- **Structured Output**: JSON output formatting (where supported)
- **LangChain Integration**: Easy conversion to LangChain chat models
For detailed information about our providers, check out:
- [LLM Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/llm.md)
- [Embedding Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/embedding.md)
- [Reranking Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/rerank.md)
- [Speech-to-Text Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/speech_to_text.md)
- [Text-to-Speech Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/text_to_speech.md)
## Installation 🚀
Install Esperanto using pip:
```bash
pip install esperanto
```
### Optional Dependencies
**Transformers Provider**
If you plan to use the transformers provider, install with the transformers extra:
```bash
pip install "esperanto[transformers]"
```
This installs:
- `transformers` - Core Hugging Face library
- `torch` - PyTorch framework
- `tokenizers` - Fast tokenization
- `sentence-transformers` - CrossEncoder support
- `scikit-learn` - Advanced embedding features
- `numpy` - Numerical computations
**LangChain Integration**
If you plan to use any of the `.to_langchain()` methods, you need to install the correct LangChain SDKs manually:
```bash
# Core LangChain dependencies (required)
pip install "langchain>=0.3.8,<0.4.0" "langchain-core>=0.3.29,<0.4.0"
# Provider-specific LangChain packages (install only what you need)
pip install "langchain-openai>=0.2.9"
pip install "langchain-anthropic>=0.3.0"
pip install "langchain-google-genai>=2.1.2"
pip install "langchain-ollama>=0.2.0"
pip install "langchain-groq>=0.2.1"
pip install "langchain_mistralai>=0.2.1"
pip install "langchain_deepseek>=0.1.3"
pip install "langchain-google-vertexai>=2.0.24"
```
## Provider Support Matrix
| Provider | LLM Support | Embedding Support | Reranking Support | Speech-to-Text | Text-to-Speech | JSON Mode |
|--------------|-------------|------------------|-------------------|----------------|----------------|-----------|
| OpenAI | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
| OpenAI-Compatible | ✅ | ❌ | ❌ | ❌ | ❌ | ⚠️* |
| Anthropic | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Groq | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ |
| Google (GenAI) | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
| Vertex AI | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ |
| Ollama | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Perplexity | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Transformers | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ElevenLabs | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |
| Azure OpenAI | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| Mistral | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| DeepSeek | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Voyage | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Jina | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| xAI | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| OpenRouter | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
*⚠️ OpenAI-Compatible: JSON mode support depends on the specific endpoint implementation
## Quick Start 🏃‍♂️
You can use Esperanto in two ways: directly with provider-specific classes or through the AI Factory.
## AIFactory - Smart Model Management 🏭
The `AIFactory` is Esperanto's intelligent model management system that provides significant performance benefits through its **singleton cache architecture**.
### 🚀 **Singleton Cache Benefits**
AIFactory automatically caches model instances based on their configuration. This means:
- **No duplicate model creation** - same provider + model + config = same instance returned
- **Faster subsequent calls** - cached instances are returned immediately
- **Memory efficient** - prevents memory bloat from multiple identical models
- **Connection reuse** - HTTP clients and configurations are preserved
### 💡 **How It Works**
```python
from esperanto.factory import AIFactory
# First call - creates new model instance
model1 = AIFactory.create_language("openai", "gpt-4", temperature=0.7)
# Second call with same config - returns cached instance (instant!)
model2 = AIFactory.create_language("openai", "gpt-4", temperature=0.7)
# They're the exact same object
assert model1 is model2 # True!
# Different config - creates new instance
model3 = AIFactory.create_language("openai", "gpt-4", temperature=0.9)
assert model1 is not model3 # True - different config
```
### 🎯 **Perfect for Production**
This caching is especially powerful in production scenarios:
```python
# In a web application
def handle_chat_request(messages):
    # This model is cached - no recreation overhead!
    model = AIFactory.create_language("anthropic", "claude-3-sonnet-20240229")
    return model.chat_complete(messages)

def handle_embedding_request(texts):
    # This embedding model is also cached
    embedder = AIFactory.create_embedding("openai", "text-embedding-3-small")
    return embedder.embed(texts)

# Multiple calls to these functions reuse the same model instances
# = Better performance + Lower memory usage
```
### 🔍 **Cache Key Strategy**
The cache key includes:
- **Provider name** (e.g., "openai", "anthropic")
- **Model name** (e.g., "gpt-4", "claude-3-sonnet")
- **All configuration parameters** (temperature, max_tokens, etc.)
Only models with **identical configurations** share the same cache entry.
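For intuition, here is a minimal sketch of configuration-based keying (an illustration of the concept, not Esperanto's actual internals):

```python
# Hypothetical illustration of configuration-based cache keying.
# Esperanto's real implementation may differ.
def make_cache_key(provider: str, model: str, **config) -> tuple:
    # Sort config items so keyword order doesn't change the key
    return (provider, model, tuple(sorted(config.items())))

key_a = make_cache_key("openai", "gpt-4", temperature=0.7)
key_b = make_cache_key("openai", "gpt-4", temperature=0.7)
key_c = make_cache_key("openai", "gpt-4", temperature=0.9)

assert key_a == key_b  # identical config -> same cache entry
assert key_a != key_c  # different temperature -> separate instance
```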
### Using AI Factory
The AI Factory provides a convenient way to create model instances and discover available providers:
```python
from esperanto.factory import AIFactory
# Get available providers for each model type
providers = AIFactory.get_available_providers()
print(providers)
# Output:
# {
# 'language': ['openai', 'openai-compatible', 'anthropic', 'google', 'groq', 'ollama', 'openrouter', 'xai', 'perplexity', 'azure', 'mistral', 'deepseek'],
# 'embedding': ['openai', 'google', 'ollama', 'vertex', 'transformers', 'voyage', 'mistral', 'azure', 'jina'],
# 'reranker': ['jina', 'voyage', 'transformers'],
# 'speech_to_text': ['openai', 'groq', 'elevenlabs'],
# 'text_to_speech': ['openai', 'elevenlabs', 'google', 'vertex']
# }
# Create model instances
model = AIFactory.create_language(
    "openai",
    "gpt-3.5-turbo",
    config={"structured": {"type": "json"}}
)  # Language model
embedder = AIFactory.create_embedding("openai", "text-embedding-3-small") # Embedding model
reranker = AIFactory.create_reranker("transformers", "cross-encoder/ms-marco-MiniLM-L-6-v2") # Universal reranker model
transcriber = AIFactory.create_speech_to_text("openai", "whisper-1") # Speech-to-text model
speaker = AIFactory.create_text_to_speech("openai", "tts-1") # Text-to-speech model
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the capital of France?"},
]
response = model.chat_complete(messages)
# Use the embedding instance created above
texts = ["Hello, world!", "Another text"]
# Synchronous usage
embeddings = embedder.embed(texts)
# Async usage (await must run inside an async function)
embeddings = await embedder.aembed(texts)
```
### Using Provider-Specific Classes
Here's a simple example to get you started:
```python
from esperanto.providers.llm.openai import OpenAILanguageModel
from esperanto.providers.llm.anthropic import AnthropicLanguageModel
# Initialize a provider with structured output
model = OpenAILanguageModel(
    api_key="your-api-key",
    model_name="gpt-4",  # Optional, defaults to gpt-4
    structured={"type": "json"}  # Optional, for JSON output
)

# Simple chat completion
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "List three colors in JSON format"}
]
# Synchronous call
response = model.chat_complete(messages)
print(response.choices[0].message.content) # Will be in JSON format
# Async call
async def get_response():
    response = await model.achat_complete(messages)
    print(response.choices[0].message.content)  # Will be in JSON format
```
## Standardized Responses
All providers in Esperanto return standardized response objects, making it easy to work with different models without changing your code.
### LLM Responses
```python
from esperanto.factory import AIFactory
model = AIFactory.create_language(
    "openai",
    "gpt-3.5-turbo",
    config={"structured": {"type": "json"}}
)
messages = [{"role": "user", "content": "Hello!"}]
# All LLM responses follow this structure
response = model.chat_complete(messages)
print(response.choices[0].message.content) # The actual response text
print(response.choices[0].message.role) # 'assistant'
print(response.model) # The model used
print(response.usage.total_tokens) # Token usage information
print(response.content) # Shortcut for response.choices[0].message.content
# For streaming responses (requires a model created with streaming enabled)
for chunk in model.chat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)

# Async streaming (inside an async function)
async for chunk in model.achat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)
```
### Embedding Responses
```python
from esperanto.factory import AIFactory
model = AIFactory.create_embedding("openai", "text-embedding-3-small")
texts = ["Hello, world!", "Another text"]
# All embedding responses follow this structure
response = model.embed(texts)
print(response.data[0].embedding) # Vector for first text
print(response.data[0].index) # Index of the text (0)
print(response.model) # The model used
print(response.usage.total_tokens) # Token usage information
```
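Because every provider returns this same shape, downstream code stays provider-agnostic. For example, a plain-Python cosine similarity over the two vectors returned above:

```python
import math

vec_a = response.data[0].embedding
vec_b = response.data[1].embedding

# Cosine similarity between the two texts' vectors
dot = sum(a * b for a, b in zip(vec_a, vec_b))
norm_a = math.sqrt(sum(a * a for a in vec_a))
norm_b = math.sqrt(sum(b * b for b in vec_b))
print(dot / (norm_a * norm_b))
```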
### Reranking Responses
```python
from esperanto.factory import AIFactory
reranker = AIFactory.create_reranker("transformers", "BAAI/bge-reranker-base")
query = "What is machine learning?"
documents = [
    "Machine learning is a subset of artificial intelligence.",
    "The weather is nice today.",
    "Python is a programming language used in ML."
]
# All reranking responses follow this structure
response = reranker.rerank(query, documents, top_k=2)
print(response.results[0].document) # Highest ranked document
print(response.results[0].relevance_score) # Normalized 0-1 relevance score
print(response.results[0].index) # Original document index
print(response.model) # The model used
```
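A typical follow-up is to keep only the top-ranked documents, for instance as context for an LLM prompt. This sketch uses only the response fields shown above:

```python
# Results come back ordered by relevance, so this preserves ranked order
top_docs = [result.document for result in response.results]
context = "\n\n".join(top_docs)
print(context)
```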
### Task-Aware Embeddings 🎯
Esperanto supports advanced task-aware embeddings that optimize vector representations for specific use cases. This works across **all embedding providers** through a universal interface:
```python
from esperanto.factory import AIFactory
from esperanto.common_types.task_type import EmbeddingTaskType
# Task-optimized embeddings work with ANY provider
model = AIFactory.create_embedding(
    provider="jina",  # Also works with: "openai", "google", "transformers", etc.
    model_name="jina-embeddings-v3",
    config={
        "task_type": EmbeddingTaskType.RETRIEVAL_QUERY,  # Optimize for search queries
        "late_chunking": True,  # Better long-context handling
        "output_dimensions": 512  # Control vector size
    }
)
# Generate optimized embeddings
query = "What is machine learning?"
embeddings = model.embed([query])
```
**Universal Task Types** (see the retrieval pairing sketch after this list):
- `RETRIEVAL_QUERY` - Optimize for search queries
- `RETRIEVAL_DOCUMENT` - Optimize for document storage
- `SIMILARITY` - General text similarity
- `CLASSIFICATION` - Text classification tasks
- `CLUSTERING` - Document clustering
- `CODE_RETRIEVAL` - Code search optimization
- `QUESTION_ANSWERING` - Optimize for Q&A tasks
- `FACT_VERIFICATION` - Optimize for fact checking
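For retrieval use cases the two retrieval task types are typically paired: documents are embedded once with `RETRIEVAL_DOCUMENT`, and queries at search time with `RETRIEVAL_QUERY`. A sketch using the same factory calls and model as above:

```python
from esperanto.factory import AIFactory
from esperanto.common_types.task_type import EmbeddingTaskType

# Index side: optimize vectors for stored documents
doc_embedder = AIFactory.create_embedding(
    provider="jina",
    model_name="jina-embeddings-v3",
    config={"task_type": EmbeddingTaskType.RETRIEVAL_DOCUMENT},
)

# Query side: optimize vectors for the search query itself
query_embedder = AIFactory.create_embedding(
    provider="jina",
    model_name="jina-embeddings-v3",
    config={"task_type": EmbeddingTaskType.RETRIEVAL_QUERY},
)

doc_response = doc_embedder.embed(["Esperanto unifies AI provider APIs."])
query_response = query_embedder.embed(["What does Esperanto do?"])
```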
**Provider Support:**
- **Jina**: Native API support for all features
- **Google**: Native task type translation to Gemini API
- **OpenAI**: Task optimization via intelligent text prefixes
- **Transformers**: Local emulation with task-specific processing
- **Others**: Graceful degradation with consistent interface
The standardized response objects ensure consistency across different providers, making it easy to:
- Switch between providers without changing your application code
- Handle responses in a uniform way
- Access common attributes like token usage and model information
## Provider Configuration 🔧
### OpenAI
```python
from esperanto.providers.llm.openai import OpenAILanguageModel
model = OpenAILanguageModel(
    api_key="your-api-key",  # Or set OPENAI_API_KEY env var
    model_name="gpt-4",  # Optional
    temperature=0.7,  # Optional
    max_tokens=850,  # Optional
    streaming=False,  # Optional
    top_p=0.9,  # Optional
    structured={"type": "json"},  # Optional, for JSON output
    base_url=None,  # Optional, for custom endpoint
    organization=None  # Optional, for org-specific API
)
```
### OpenAI-Compatible Endpoints
Use any OpenAI-compatible endpoint (LM Studio, Ollama, vLLM, custom deployments) with the same interface:
```python
from esperanto.factory import AIFactory
# Using factory config
model = AIFactory.create_language(
    "openai-compatible",
    "your-model-name",  # Use any model name supported by your endpoint
    config={
        "base_url": "http://localhost:1234/v1",  # Your endpoint URL (required)
        "api_key": "your-api-key"  # Your API key (optional)
    }
)
# Or set environment variables
# OPENAI_COMPATIBLE_BASE_URL=http://localhost:1234/v1
# OPENAI_COMPATIBLE_API_KEY=your-api-key # Optional for endpoints that don't require auth
model = AIFactory.create_language("openai-compatible", "your-model-name")
# Works with any OpenAI-compatible endpoint
messages = [{"role": "user", "content": "Hello!"}]
response = model.chat_complete(messages)
print(response.content)
# Streaming support
for chunk in model.chat_complete(messages, stream=True):
    print(chunk.choices[0].delta.content, end="", flush=True)
```
**Common Use Cases:**
- **LM Studio**: Local model serving with GUI
- **Ollama**: `ollama serve` with OpenAI compatibility
- **vLLM**: High-performance inference server
- **Custom Deployments**: Any server implementing OpenAI chat completions API
**Features:**
- ✅ **Streaming**: Real-time response streaming
- ✅ **Pass-through Model Names**: Use any model name your endpoint supports
- ✅ **Graceful Degradation**: Automatically handles varying feature support
- ✅ **Error Handling**: Clear error messages for troubleshooting
- ⚠️ **JSON Mode**: Depends on endpoint implementation
### Perplexity
Perplexity uses an OpenAI-compatible API but includes additional parameters for controlling search behavior.
```python
from esperanto.providers.llm.perplexity import PerplexityLanguageModel
model = PerplexityLanguageModel(
    api_key="your-api-key",  # Or set PERPLEXITY_API_KEY env var
    model_name="llama-3-sonar-large-32k-online",  # Recommended default
    temperature=0.7,  # Optional
    max_tokens=850,  # Optional
    streaming=False,  # Optional
    top_p=0.9,  # Optional
    structured={"type": "json"},  # Optional, for JSON output

    # Perplexity-specific parameters
    search_domain_filter=["example.com", "-excluded.com"],  # Optional, limit search domains
    return_images=False,  # Optional, include images in search results
    return_related_questions=True,  # Optional, return related questions
    search_recency_filter="week",  # Optional, filter search by time ('day', 'week', 'month', 'year')
    web_search_options={"search_context_size": "high"}  # Optional, control search context ('low', 'medium', 'high')
)
```
## Streaming Responses 🌊
Enable streaming to receive responses token by token:
```python
# Enable streaming
model = OpenAILanguageModel(api_key="your-api-key", streaming=True)
# Synchronous streaming
for chunk in model.chat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)

# Async streaming (inside an async function)
async for chunk in model.achat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)
```
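A common follow-up is accumulating the chunks into the full reply. The `or ""` guard below is a precaution that assumes some chunks may carry an empty or missing delta, as in OpenAI-style streams:

```python
# Collect streamed chunks into the complete response text
full_text = "".join(
    chunk.choices[0].delta.content or ""
    for chunk in model.chat_complete(messages)
)
print(full_text)
```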
## Structured Output 📊
Request JSON-formatted responses (supported by OpenAI and some OpenRouter models):
```python
model = OpenAILanguageModel(
    api_key="your-api-key",  # or use ENV
    structured={"type": "json"}
)

messages = [
    {"role": "user", "content": "List three European capitals as JSON"}
]
response = model.chat_complete(messages)
# Response will be in JSON format
```
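Since the reply body is a JSON string, it can be parsed with the standard library (assuming the model honored the format request):

```python
import json

# response.content is the shortcut shown earlier for the message text
capitals = json.loads(response.content)
print(capitals)
```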
## LangChain Integration 🔗
Convert any provider to a LangChain chat model:
```python
model = OpenAILanguageModel(api_key="your-api-key")
langchain_model = model.to_langchain()
# Use with LangChain
from langchain.chains import ConversationChain
chain = ConversationChain(llm=langchain_model)
```
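Once converted, the object behaves like any LangChain chat model, so the usual `Runnable` interface applies:

```python
# Invoke the converted model directly
result = langchain_model.invoke("Say hello in French")
print(result.content)
```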
## Documentation 📚
You can find the documentation for Esperanto in the [docs](https://github.com/lfnovo/esperanto/tree/main/docs) directory.
There is also a cool beginner's tutorial in the [tutorial](https://github.com/lfnovo/esperanto/blob/main/docs/tutorial/index.md) directory.
## Contributing 🤝
We welcome contributions! Please see our [Contributing Guidelines](https://github.com/lfnovo/esperanto/blob/main/CONTRIBUTING.md) for details on how to get started.
## License 📄
This project is licensed under the MIT License - see the [LICENSE](https://github.com/lfnovo/esperanto/blob/main/LICENSE) file for details.
## Development 🛠️
1. Clone the repository:
```bash
git clone https://github.com/lfnovo/esperanto.git
cd esperanto
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Run tests:
```bash
pytest
```