# Esperanto 🌐

[![PyPI version](https://badge.fury.io/py/esperanto.svg)](https://badge.fury.io/py/esperanto)
[![PyPI Downloads](https://img.shields.io/pypi/dm/esperanto)](https://pypi.org/project/esperanto/)
[![Coverage](https://img.shields.io/badge/coverage-87%25-brightgreen)](https://github.com/lfnovo/esperanto)
[![Python Versions](https://img.shields.io/pypi/pyversions/esperanto)](https://pypi.org/project/esperanto/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

Esperanto is a powerful Python library that provides a unified interface for interacting with various Large Language Model (LLM) providers. It simplifies working with different AI model APIs (LLMs, embedders, transcribers, and TTS) by offering a consistent interface while maintaining provider-specific optimizations.

## Why Esperanto? 🚀

**🪶 Ultra-Lightweight Architecture**
- **Direct HTTP Communication**: All providers communicate directly via HTTP APIs using `httpx` - no bulky vendor SDKs required
- **Minimal Dependencies**: Unlike LangChain and similar frameworks, Esperanto has a tiny footprint with zero overhead layers
- **Production-Ready Performance**: Direct API calls mean faster response times and lower memory usage

**🔄 True Provider Flexibility**
- **Standardized Responses**: Switch between any provider (OpenAI ↔ Anthropic ↔ Google ↔ etc.) without changing a single line of code
- **Consistent Interface**: Same methods, same response objects, same patterns across all 15+ providers
- **Future-Proof**: Add new providers or change existing ones without refactoring your application

**⚡ Perfect for Production**
- **Prototyping to Production**: Start experimenting and deploy the same code to production
- **No Vendor Lock-in**: Test different providers, optimize costs, and maintain flexibility
- **Enterprise-Ready**: Direct HTTP calls, standardized error handling, and comprehensive async support

Whether you're building a quick prototype or a production application serving millions of requests, Esperanto gives you the performance of direct API calls with the convenience of a unified interface.

## Features ✨

- **Unified Interface**: Work with multiple LLM providers using a consistent API
- **Provider Support**:
  - OpenAI (GPT-4o, o1, o3, o4, Whisper, TTS)
  - OpenAI-Compatible (LM Studio, Ollama, vLLM, custom endpoints)
  - Anthropic (Claude models)
  - OpenRouter (Access to multiple models)
  - xAI (Grok)
  - Perplexity (Sonar models)
  - Groq (Mixtral, Llama, Whisper)
  - Google GenAI (Gemini LLM, Text To Speech, Embedding with native task optimization)
  - Vertex AI (Google Cloud, LLM, Embedding, TTS)
  - Ollama (Local deployment of multiple models)
  - Transformers (Universal local models - Qwen, CrossEncoder, BAAI, Jina, Mixedbread)
  - ElevenLabs (Text-to-Speech, Speech-to-Text)
  - Azure OpenAI (Chat, Embedding)
  - Mistral (Mistral Large, Small, Embedding, etc.)
  - DeepSeek (deepseek-chat)
  - Voyage (Embeddings, Reranking)
  - Jina (Advanced embedding models with task optimization, Reranking)
- **Embedding Support**: Multiple embedding providers for vector representations
- **Reranking Support**: Universal reranking interface for improving search relevance
- **Speech-to-Text Support**: Transcribe audio using multiple providers
- **Text-to-Speech Support**: Generate speech using multiple providers
- **Async Support**: Both synchronous and asynchronous API calls
- **Streaming**: Support for streaming responses
- **Structured Output**: JSON output formatting (where supported)
- **LangChain Integration**: Easy conversion to LangChain chat models

For detailed information about our providers, check out:
- [LLM Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/llm.md)
- [Embedding Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/embedding.md)
- [Reranking Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/rerank.md)
- [Speech-to-Text Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/speech_to_text.md)
- [Text-to-Speech Providers Documentation](https://github.com/lfnovo/esperanto/blob/main/docs/text_to_speech.md)

## Installation 🚀

Install Esperanto using pip:

```bash
pip install esperanto
```

### Optional Dependencies

**Transformers Provider**

If you plan to use the transformers provider, install with the transformers extra:

```bash
pip install "esperanto[transformers]"
```

This installs:
- `transformers` - Core Hugging Face library
- `torch` - PyTorch framework
- `tokenizers` - Fast tokenization
- `sentence-transformers` - CrossEncoder support
- `scikit-learn` - Advanced embedding features
- `numpy` - Numerical computations

**LangChain Integration**

If you plan to use any of the `.to_langchain()` methods, you need to install the correct LangChain SDKs manually:

```bash
# Core LangChain dependencies (required)
pip install "langchain>=0.3.8,<0.4.0" "langchain-core>=0.3.29,<0.4.0"

# Provider-specific LangChain packages (install only what you need)
pip install "langchain-openai>=0.2.9"
pip install "langchain-anthropic>=0.3.0"
pip install "langchain-google-genai>=2.1.2"
pip install "langchain-ollama>=0.2.0"
pip install "langchain-groq>=0.2.1"
pip install "langchain_mistralai>=0.2.1"
pip install "langchain_deepseek>=0.1.3"
pip install "langchain-google-vertexai>=2.0.24"
```

## Provider Support Matrix

| Provider     | LLM Support | Embedding Support | Reranking Support | Speech-to-Text | Text-to-Speech | JSON Mode |
|--------------|-------------|------------------|-------------------|----------------|----------------|-----------|
| OpenAI       | ✅          | ✅               | ❌                | ✅             | ✅             | ✅        |
| OpenAI-Compatible | ✅          | ❌               | ❌                | ❌             | ❌             | ⚠️*       |
| Anthropic    | ✅          | ❌               | ❌                | ❌             | ❌             | ✅        |
| Groq         | ✅          | ❌               | ❌                | ✅             | ❌             | ✅        |
| Google (GenAI) | ✅          | ✅               | ❌                | ❌             | ✅             | ✅        |
| Vertex AI    | ✅          | ✅               | ❌                | ❌             | ✅             | ❌        |
| Ollama       | ✅          | ✅               | ❌                | ❌             | ❌             | ❌        |
| Perplexity   | ✅          | ❌               | ❌                | ❌             | ❌             | ✅        |
| Transformers | ❌          | ✅               | ✅                | ❌             | ❌             | ❌        |
| ElevenLabs   | ❌          | ❌               | ❌                | ✅             | ✅             | ❌        |
| Azure OpenAI | ✅          | ✅               | ❌                | ❌             | ❌             | ✅        |
| Mistral      | ✅          | ✅               | ❌                | ❌             | ❌             | ✅        |
| DeepSeek     | ✅          | ❌               | ❌                | ❌             | ❌             | ✅        |
| Voyage       | ❌          | ✅               | ✅                | ❌             | ❌             | ❌        |
| Jina         | ❌          | ✅               | ✅                | ❌             | ❌             | ❌        |
| xAI          | ✅          | ❌               | ❌                | ❌             | ❌             | ✅        |
| OpenRouter   | ✅          | ❌               | ❌                | ❌             | ❌             | ✅        |

*⚠ïļ OpenAI-Compatible: JSON mode support depends on the specific endpoint implementation

## Quick Start 🏃‍♂️

You can use Esperanto in two ways: directly with provider-specific classes or through the AI Factory.

## AIFactory - Smart Model Management 🏭

The `AIFactory` is Esperanto's intelligent model management system that provides significant performance benefits through its **singleton cache architecture**.

### 🚀 **Singleton Cache Benefits**

AIFactory automatically caches model instances based on their configuration. This means:
- **No duplicate model creation** - same provider + model + config = same instance returned
- **Faster subsequent calls** - cached instances are returned immediately
- **Memory efficient** - prevents memory bloat from multiple identical models
- **Connection reuse** - HTTP clients and configurations are preserved

### 💡 **How It Works**

```python
from esperanto.factory import AIFactory

# First call - creates new model instance
model1 = AIFactory.create_language("openai", "gpt-4", temperature=0.7)

# Second call with same config - returns cached instance (instant!)
model2 = AIFactory.create_language("openai", "gpt-4", temperature=0.7)

# They're the exact same object
assert model1 is model2  # True!

# Different config - creates new instance
model3 = AIFactory.create_language("openai", "gpt-4", temperature=0.9)
assert model1 is not model3  # True - different config
```

### 🎯 **Perfect for Production**

This caching is especially powerful in production scenarios:

```python
# In a web application
def handle_chat_request(messages):
    # This model is cached - no recreation overhead!
    model = AIFactory.create_language("anthropic", "claude-3-sonnet-20240229")
    return model.chat_complete(messages)

def handle_embedding_request(texts):
    # This embedding model is also cached
    embedder = AIFactory.create_embedding("openai", "text-embedding-3-small")
    return embedder.embed(texts)

# Multiple calls to these functions reuse the same model instances
# = Better performance + Lower memory usage
```

### 🔍 **Cache Key Strategy**

The cache key includes:
- **Provider name** (e.g., "openai", "anthropic")
- **Model name** (e.g., "gpt-4", "claude-3-sonnet")
- **All configuration parameters** (temperature, max_tokens, etc.)

Only models with **identical configurations** share the same cache entry.
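
For intuition, here is a minimal sketch of how such a configuration-based cache key can be built. This is illustrative only, not Esperanto's actual implementation; the `_cache_key` and `get_or_create` helpers are hypothetical names.

```python
import json

_cache: dict = {}

def _cache_key(provider: str, model: str, **config) -> str:
    # Sorting keys makes the key independent of argument order, so
    # (temperature=0.7, max_tokens=100) and (max_tokens=100, temperature=0.7)
    # map to the same cache entry.
    return json.dumps({"provider": provider, "model": model, **config}, sort_keys=True)

def get_or_create(provider: str, model: str, factory, **config):
    key = _cache_key(provider, model, **config)
    if key not in _cache:
        _cache[key] = factory(provider, model, **config)  # create only once
    return _cache[key]
```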

### Using AI Factory

The AI Factory provides a convenient way to create model instances and discover available providers:

```python
from esperanto.factory import AIFactory

# Get available providers for each model type
providers = AIFactory.get_available_providers()
print(providers)
# Output:
# {
#     'language': ['openai', 'openai-compatible', 'anthropic', 'google', 'groq', 'ollama', 'openrouter', 'xai', 'perplexity', 'azure', 'mistral', 'deepseek'],
#     'embedding': ['openai', 'google', 'ollama', 'vertex', 'transformers', 'voyage', 'mistral', 'azure', 'jina'],
#     'reranker': ['jina', 'voyage', 'transformers'],
#     'speech_to_text': ['openai', 'groq', 'elevenlabs'],
#     'text_to_speech': ['openai', 'elevenlabs', 'google', 'vertex']
# }

# Create model instances
model = AIFactory.create_language(
    "openai", 
    "gpt-3.5-turbo",
    config={"structured": {"type": "json"}}
)  # Language model
embedder = AIFactory.create_embedding("openai", "text-embedding-3-small")  # Embedding model
reranker = AIFactory.create_reranker("transformers", "cross-encoder/ms-marco-MiniLM-L-6-v2")  # Universal reranker model
transcriber = AIFactory.create_speech_to_text("openai", "whisper-1")  # Speech-to-text model
speaker = AIFactory.create_text_to_speech("openai", "tts-1")  # Text-to-speech model

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the capital of France?"},
]
response = model.chat_complete(messages)

# Create an embedding instance
texts = ["Hello, world!", "Another text"]
# Synchronous usage
embeddings = embedder.embed(texts)
# Async usage
embeddings = await embedder.aembed(texts)
```

### Using Provider-Specific Classes

Here's a simple example to get you started:

```python
from esperanto.providers.llm.openai import OpenAILanguageModel
from esperanto.providers.llm.anthropic import AnthropicLanguageModel

# Initialize a provider with structured output
model = OpenAILanguageModel(
    api_key="your-api-key",
    model_name="gpt-4",  # Optional, defaults to gpt-4
    structured={"type": "json"}  # Optional, for JSON output
)

# Simple chat completion
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "List three colors in JSON format"}
]

# Synchronous call
response = model.chat_complete(messages)
print(response.choices[0].message.content)  # Will be in JSON format

# Async call
async def get_response():
    response = await model.achat_complete(messages)
    print(response.choices[0].message.content)  # Will be in JSON format
```
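
To actually run the async example, drive the coroutine with an event loop from the standard library:

```python
import asyncio

asyncio.run(get_response())
```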

## Standardized Responses

All providers in Esperanto return standardized response objects, making it easy to work with different models without changing your code.

### LLM Responses

```python
from esperanto.factory import AIFactory

model = AIFactory.create_language(
    "openai", 
    "gpt-3.5-turbo",
    config={"structured": {"type": "json"}}
)
messages = [{"role": "user", "content": "Hello!"}]

# All LLM responses follow this structure
response = model.chat_complete(messages)
print(response.choices[0].message.content)  # The actual response text
print(response.choices[0].message.role)     # 'assistant'
print(response.model)                       # The model used
print(response.usage.total_tokens)          # Token usage information
print(response.content)          # Shortcut for response.choices[0].message.content

# For streaming responses (requires a model created with streaming enabled)
for chunk in model.chat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)

# Async streaming
async for chunk in model.achat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)
```

### Embedding Responses

```python
from esperanto.factory import AIFactory

model = AIFactory.create_embedding("openai", "text-embedding-3-small")
texts = ["Hello, world!", "Another text"]

# All embedding responses follow this structure
response = model.embed(texts)
print(response.data[0].embedding)     # Vector for first text
print(response.data[0].index)         # Index of the text (0)
print(response.model)                 # The model used
print(response.usage.total_tokens)    # Token usage information
```

### Reranking Responses

```python
from esperanto.factory import AIFactory

reranker = AIFactory.create_reranker("transformers", "BAAI/bge-reranker-base")
query = "What is machine learning?"
documents = [
    "Machine learning is a subset of artificial intelligence.",
    "The weather is nice today.",
    "Python is a programming language used in ML."
]

# All reranking responses follow this structure
response = reranker.rerank(query, documents, top_k=2)
print(response.results[0].document)          # Highest ranked document
print(response.results[0].relevance_score)   # Normalized 0-1 relevance score
print(response.results[0].index)             # Original document index
print(response.model)                        # The model used
```
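
A common next step is to feed the top-ranked documents into an LLM prompt. A minimal retrieval-augmented sketch, reusing `query` and `response` from the example above:

```python
from esperanto.factory import AIFactory

# Results are already sorted by relevance, so joining preserves ranking order.
context = "\n".join(result.document for result in response.results)

llm = AIFactory.create_language("openai", "gpt-3.5-turbo")
answer = llm.chat_complete([
    {"role": "user", "content": f"Using this context:\n{context}\n\nAnswer: {query}"}
])
print(answer.content)
```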

### Task-Aware Embeddings 🎯

Esperanto supports advanced task-aware embeddings that optimize vector representations for specific use cases. This works across **all embedding providers** through a universal interface:

```python
from esperanto.factory import AIFactory
from esperanto.common_types.task_type import EmbeddingTaskType

# Task-optimized embeddings work with ANY provider
model = AIFactory.create_embedding(
    provider="jina",  # Also works with: "openai", "google", "transformers", etc.
    model_name="jina-embeddings-v3",
    config={
        "task_type": EmbeddingTaskType.RETRIEVAL_QUERY,  # Optimize for search queries
        "late_chunking": True,                           # Better long-context handling
        "output_dimensions": 512                         # Control vector size
    }
)

# Generate optimized embeddings
query = "What is machine learning?"
embeddings = model.embed([query])
```

**Universal Task Types:**
- `RETRIEVAL_QUERY` - Optimize for search queries
- `RETRIEVAL_DOCUMENT` - Optimize for document storage  
- `SIMILARITY` - General text similarity
- `CLASSIFICATION` - Text classification tasks
- `CLUSTERING` - Document clustering
- `CODE_RETRIEVAL` - Code search optimization
- `QUESTION_ANSWERING` - Optimize for Q&A tasks
- `FACT_VERIFICATION` - Optimize for fact checking

**Provider Support:**
- **Jina**: Native API support for all features
- **Google**: Native task type translation to Gemini API
- **OpenAI**: Task optimization via intelligent text prefixes
- **Transformers**: Local emulation with task-specific processing
- **Others**: Graceful degradation with a consistent interface (see the sketch below)
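
For example, the same task configuration can be reused across providers unchanged; a small sketch (model names taken from the examples in this README):

```python
from esperanto.factory import AIFactory
from esperanto.common_types.task_type import EmbeddingTaskType

config = {"task_type": EmbeddingTaskType.RETRIEVAL_DOCUMENT}

# Jina honors the task type natively; OpenAI emulates it via text prefixes.
jina_embedder = AIFactory.create_embedding("jina", "jina-embeddings-v3", config=config)
openai_embedder = AIFactory.create_embedding("openai", "text-embedding-3-small", config=config)

docs = ["Esperanto unifies AI provider APIs behind one interface."]
print(len(jina_embedder.embed(docs).data[0].embedding))
print(len(openai_embedder.embed(docs).data[0].embedding))
```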

The standardized response objects ensure consistency across different providers, making it easy to:
- Switch between providers without changing your application code (see the sketch below)
- Handle responses in a uniform way
- Access common attributes like token usage and model information
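
Concretely, a helper like the one below (a sketch, not part of the library) works unchanged with any supported provider:

```python
from esperanto.factory import AIFactory

def ask(provider: str, model_name: str, question: str) -> str:
    model = AIFactory.create_language(provider, model_name)
    response = model.chat_complete([{"role": "user", "content": question}])
    return response.content  # same shortcut on every provider

# Only the provider/model strings change; the calling code does not.
print(ask("openai", "gpt-4", "What is the capital of France?"))
print(ask("anthropic", "claude-3-sonnet-20240229", "What is the capital of France?"))
```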

## Provider Configuration 🔧

### OpenAI

```python
from esperanto.providers.llm.openai import OpenAILanguageModel

model = OpenAILanguageModel(
    api_key="your-api-key",       # Or set OPENAI_API_KEY env var
    model_name="gpt-4",           # Optional
    temperature=0.7,              # Optional
    max_tokens=850,               # Optional
    streaming=False,              # Optional
    top_p=0.9,                    # Optional
    structured={"type": "json"},  # Optional, for JSON output
    base_url=None,                # Optional, for custom endpoint
    organization=None,            # Optional, for org-specific API
)
```

### OpenAI-Compatible Endpoints

Use any OpenAI-compatible endpoint (LM Studio, Ollama, vLLM, custom deployments) with the same interface:

```python
from esperanto.factory import AIFactory

# Using factory config
model = AIFactory.create_language(
    "openai-compatible",
    "your-model-name",  # Use any model name supported by your endpoint
    config={
        "base_url": "http://localhost:1234/v1",  # Your endpoint URL (required)
        "api_key": "your-api-key"                # Your API key (optional)
    }
)

# Or set environment variables
# OPENAI_COMPATIBLE_BASE_URL=http://localhost:1234/v1
# OPENAI_COMPATIBLE_API_KEY=your-api-key  # Optional for endpoints that don't require auth
model = AIFactory.create_language("openai-compatible", "your-model-name")

# Works with any OpenAI-compatible endpoint
messages = [{"role": "user", "content": "Hello!"}]
response = model.chat_complete(messages)
print(response.content)

# Streaming support
for chunk in model.chat_complete(messages, stream=True):
    print(chunk.choices[0].delta.content, end="", flush=True)
```

**Common Use Cases:**
- **LM Studio**: Local model serving with GUI
- **Ollama**: `ollama serve` with OpenAI compatibility (see the sketch below)
- **vLLM**: High-performance inference server
- **Custom Deployments**: Any server implementing OpenAI chat completions API
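
For instance, Ollama serves its OpenAI-compatible API under `/v1` on its default port; the model name below is a placeholder for whatever you have pulled locally:

```python
from esperanto.factory import AIFactory

model = AIFactory.create_language(
    "openai-compatible",
    "llama3",  # placeholder; use any model available on your Ollama instance
    config={
        "base_url": "http://localhost:11434/v1",  # Ollama's default OpenAI-compatible URL
        "api_key": "ollama",  # Ollama ignores the key, but a value keeps clients happy
    },
)
print(model.chat_complete([{"role": "user", "content": "Hello!"}]).content)
```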

**Features:**
- ✅ **Streaming**: Real-time response streaming
- ✅ **Pass-through Model Names**: Use any model name your endpoint supports
- ✅ **Graceful Degradation**: Automatically handles varying feature support
- ✅ **Error Handling**: Clear error messages for troubleshooting
- ⚠ïļ **JSON Mode**: Depends on endpoint implementation

### Perplexity

Perplexity uses an OpenAI-compatible API but includes additional parameters for controlling search behavior.

```python
from esperanto.providers.llm.perplexity import PerplexityLanguageModel

model = PerplexityLanguageModel(
    api_key="your-api-key",  # Or set PERPLEXITY_API_KEY env var
    model_name="llama-3-sonar-large-32k-online", # Recommended default
    temperature=0.7,             # Optional
    max_tokens=850,              # Optional
    streaming=False,             # Optional
    top_p=0.9,                   # Optional
    structured={"type": "json"}, # Optional, for JSON output

    # Perplexity-specific parameters
    search_domain_filter=["example.com", "-excluded.com"], # Optional, limit search domains
    return_images=False,             # Optional, include images in search results
    return_related_questions=True,  # Optional, return related questions
    search_recency_filter="week",    # Optional, filter search by time ('day', 'week', 'month', 'year')
    web_search_options={"search_context_size": "high"} # Optional, control search context ('low', 'medium', 'high')
)
```

## Streaming Responses 🌊

Enable streaming to receive responses token by token:

```python
# Enable streaming
model = OpenAILanguageModel(api_key="your-api-key", streaming=True)

# Synchronous streaming
for chunk in model.chat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)

# Async streaming
async for chunk in model.achat_complete(messages):
    print(chunk.choices[0].delta.content, end="", flush=True)
```

## Structured Output 📊

Request JSON-formatted responses (supported by OpenAI and some OpenRouter models):

```python
model = OpenAILanguageModel(
    api_key="your-api-key", # or use ENV
    structured={"type": "json"}
)

messages = [
    {"role": "user", "content": "List three European capitals as JSON"}
]

response = model.chat_complete(messages)
# Response will be in JSON format
```
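
Since the content is a JSON string, it can be parsed with the standard library:

```python
import json

capitals = json.loads(response.content)  # response.content is the message text
print(capitals)
```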

## LangChain Integration 🔗

Convert any provider to a LangChain chat model:

```python
model = OpenAILanguageModel(api_key="your-api-key")
langchain_model = model.to_langchain()

# Use with LangChain
from langchain.chains import ConversationChain
chain = ConversationChain(llm=langchain_model)
```
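
The converted object behaves like any LangChain chat model, so (assuming a recent `langchain-core`) you can also call it directly:

```python
result = langchain_model.invoke("What is the capital of France?")
print(result.content)
```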

## Documentation 📚

You can find the documentation for Esperanto in the [docs](https://github.com/lfnovo/esperanto/tree/main/docs) directory.

There is also a cool beginner's tutorial in the [tutorial](https://github.com/lfnovo/esperanto/blob/main/docs/tutorial/index.md) directory.

## Contributing 🤝

We welcome contributions! Please see our [Contributing Guidelines](https://github.com/lfnovo/esperanto/blob/main/CONTRIBUTING.md) for details on how to get started.

## License 📄

This project is licensed under the MIT License - see the [LICENSE](https://github.com/lfnovo/esperanto/blob/main/LICENSE) file for details.

## Development 🛠️

1. Clone the repository:
```bash
git clone https://github.com/lfnovo/esperanto.git
cd esperanto
```

2. Install dependencies:
```bash
pip install -r requirements.txt
```

3. Run tests:
```bash
pytest
```