# LangGate AI Gateway
<p align="left">
<a href="https://pypi.org/project/langgate" target="_blank"><img src="https://img.shields.io/pypi/pyversions/langgate.svg" alt="Python versions"></a> <a href="https://pypi.org/project/langgate" target="_blank"><img src="https://img.shields.io/pypi/v/langgate" alt="PyPI"></a> <a href="https://github.com/Tanantor/langgate/actions?query=workflow%3A%22CI+Checks%22" target="_blank"><img src="https://github.com/Tanantor/langgate/actions/workflows/ci.yaml/badge.svg?event=push&branch=main" alt="CI Checks"></a>
<a href="https://github.com/Tanantor/langgate/tree/main/tests" target="_blank"><img src="https://img.shields.io/badge/dynamic/xml?url=https://tanantor.github.io/langgate/coverage/test-count.xml&query=//testcount&label=tests&color=blue&style=flat" alt="Tests"></a> <a href="https://github.com/Tanantor/langgate/actions?query=workflow%3ACI" target="_blank"><img src="https://tanantor.github.io/langgate/coverage/coverage-badge.svg" alt="Coverage"></a>
</p>
LangGate is a lightweight, high-performance gateway for AI model inference.
LangGate adapts to your architecture: integrate it as a Python SDK, run it as a standalone registry, or deploy it as a complete proxy server.
LangGate works with any AI provider, without forcing standardization to a specific API format. Apply custom parameter mappings or none at all - you decide.
By default, LangGate avoids unnecessary transformations.
## Core Features
- **Provider-Agnostic**: Works with any AI inference provider (OpenAI, Anthropic, Google, etc.)
- **Flexible Parameter Transformations**: Apply custom parameter mappings or none at all - you decide
- **High-Performance Proxying**: Uses Envoy for efficient request handling with direct response streaming
- **Simple Configuration**: Clean YAML configuration inspired by familiar formats
- **Minimal Architecture**: Direct integration with Envoy, without complex control plane overhead
- **SDK First Approach**: Use the registry as a standalone module without the proxy service
## Architecture
LangGate uses a simplified architecture with three main components:
1. **Envoy Proxy**: Front-facing proxy that receives API requests and handles response streaming
2. **External Processor**: gRPC service implementing Envoy's External Processing filter for request transformation and routing
3. **Registry Service**: Manages model mappings, parameter transformations, and provider configurations
The system works as follows:
1. **Request Flow**: Client sends request → Envoy → External Processor transforms request → Envoy routes to appropriate AI provider
2. **Response Flow**: AI provider response → Envoy streams directly to client
This architecture provides several advantages:
- No control plane overhead or complex deployment requirements
- Direct response streaming from providers through Envoy for optimal performance
- Flexible deployment options, from local development to production environments
## Getting Started
### Using the Registry SDK
The LangGate SDK is designed to be used as a standalone module, allowing you to integrate it into your existing applications without the need for the proxy service.
This is particularly useful for local development or when you want to use LangGate's features without deploying the full stack.
You probably won't need the proxy unless you are scaling your application to a microservices architecture, or you have multiple apps in a Kubernetes cluster that each depend on a shared registry.
You can switch from the SDK's local registry client to the remote registry client + proxy setup with minimal code changes.
#### Installation
We recommend using [uv](https://docs.astral.sh/uv/) to manage Python projects. In a uv project, add `langgate[sdk]` to dependencies by running:
```bash
uv add langgate[sdk]
```
Alternatively, using pip:
```bash
pip install langgate[sdk]
```
For more information on package components and installation options for specific use cases, see the [packages documentation](packages/README.md).
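To confirm the `sdk` extra is installed correctly, importing the client used throughout the examples below is a sufficient smoke test:
```py
# Smoke test: the `sdk` extra provides the LangGateLocal client used in the examples below.
from langgate.sdk import LangGateLocal

print(LangGateLocal.__name__)
```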
#### Example Usage
The package includes a `LangGateLocal` client that can be used directly in your application without needing to run the proxy service. This client provides access to both language and image model registries, plus parameter transformation features.
**List Available Models:**
```py
from pprint import pprint as pp
from langgate.sdk import LangGateLocal
client = LangGateLocal()
# List available LLMs
llms = await client.list_llms()
print(f"Available LLMs: {len(llms)}")
for model in llms[:3]:
    print(f"- {model.id}: {model.name}")
# List available image models
image_models = await client.list_image_models()
print(f"Available Image Models: {len(image_models)}")
for model in image_models[:3]:
    print(f"- {model.id}: {model.name}")
```
```text
Available LLMs: 5
- openai/gpt-4.1: GPT-4.1
- openai/o3: o3
- openai/o3-high: o3-high
- anthropic/claude-sonnet-4: Claude-4 Sonnet
- anthropic/claude-sonnet-4-reasoning: Claude-4 Sonnet R
==================================================
Available Image Models: 4
- openai/gpt-image-1: GPT Image 1
- openai/dall-e-3: DALL-E 3
- black-forest-labs/flux-dev: FLUX.1 [dev]
- stability-ai/sd-3.5-large: SD 3.5 Large
```
**Get Model Information and Transform Parameters:**
```py
# LangGate allows us to register "virtual models" - models with specific parameters.
# `langgate_config.yaml` defines this `claude-sonnet-4-reasoning` model
# which is a wrapper around the `claude-sonnet-4-0` model,
# with specific parameters and metadata.
model_id = "anthropic/claude-sonnet-4-reasoning"
# Get model info
model_info = await client.get_llm_info(model_id)
print(f"Model: {model_info.name}")
print(f"Provider: {model_info.provider.name}")
print(f"Description: {model_info.description}")

# Transform parameters
input_params = {"temperature": 0.7, "stream": True}
api_format, transformed = await client.get_params(model_id, input_params)
print("Transformed parameters:")
pp((api_format, transformed))
```
```text
Model: Claude-4 Sonnet R
Provider: Anthropic
Description: Claude-4 Sonnet with reasoning capabilities.
Transformed parameters:
('anthropic',
{'api_key': SecretStr('**********'),
'base_url': 'https://api.anthropic.com',
'model': 'claude-sonnet-4-0',
'stream': True,
'thinking': {'budget_tokens': 1024, 'type': 'enabled'}})
```
The `temperature` parameter is removed because temperature is not supported by Claude models with reasoning enabled. The `thinking` parameter is added with the `budget_tokens` specified in `langgate_config.yaml`. See the [Configuration](#configuration) section below for more details on how LangGate handles parameter transformations.
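By contrast, a model with no matching pattern and no overrides passes caller parameters through untouched, reflecting LangGate's default of avoiding unnecessary transformation. A small sketch, continuing with the `client` from above and assuming the example configuration shown in the [Configuration](#configuration) section (exact values depend on your config and environment):
```py
# "anthropic/claude-sonnet-4" matches no model_patterns and defines no override_params,
# so caller parameters pass through; LangGate only injects the service credentials
# and rewrites the model id to the service's model_id.
api_format, params = await client.get_params(
    "anthropic/claude-sonnet-4", {"temperature": 0.7, "stream": True}
)
assert api_format == "anthropic"
assert params["model"] == "claude-sonnet-4-0"
assert params["temperature"] == 0.7  # unchanged - no transformation configured for this model
```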
**Working with Image Models:**
Transforming parameters for image models works exactly the same way as for LLMs.
```py
# Transform parameters for an image model
image_model_id = "openai/gpt-image-1"
image_params = {
    "prompt": "A beautiful sunset over the ocean",
    "size": "1024x1024",
    "quality": "medium",
}
api_format, transformed = await client.get_params(image_model_id, image_params)
print(f"API format: {api_format}")
pp(transformed)
```
```text
API format: openai
{'api_key': SecretStr('**********'),
'base_url': 'https://api.openai.com/v1',
'model': 'gpt-image-1',
'prompt': 'A beautiful sunset over the ocean',
'quality': 'medium',
'size': '1024x1024'}
```
#### Example integration with LangChain
The following is an example of how you might define a factory class that creates a LangChain `BaseChatModel` instance configured via the `LangGateLocal` client:
```py
import os
# Ensure you have the required environment variables set
os.environ["OPENAI_API_KEY"] = "<YOUR_API_KEY>"
# The environment variables below are optional.
# The YAML config resolution priority is: args > env > cwd > package default.
# If you don't want to use either the package default (langgate/core/data/default_config.yaml)
# or a config in your cwd, set:
# os.environ["LANGGATE_CONFIG"] = "some_other_path_not_in_your_cwd/langgate_config.yaml"
# The models data resolution priority is: args > env > cwd > package default
# By default, any user-defined `langgate_models.json` files are merged with default models data. See `models_merge_mode` configuration.
# If you don't want to use either the package default (langgate/registry/data/default_models.json)
# or a models data file in your cwd, set:
# os.environ["LANGGATE_MODELS"] = "some_other_path_not_in_your_cwd/langgate_models.json"
# The .env file resolution priority is: args > env > cwd > None
# If you don't want to use either the package default or a .env file in your cwd, set:
# os.environ["LANGGATE_ENV_FILE"] = "some_other_path_not_in_your_cwd/.env"
```
```py
from typing import Any
from pprint import pprint as pp

from langchain.chat_models.base import BaseChatModel
from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI

from langgate.sdk import LangGateLocal, LangGateLocalProtocol
from langgate.core.models import (
    # `ModelProviderId` is a string alias for better type safety
    ModelProviderId,
    # ids for common providers are included for convenience
    MODEL_PROVIDER_OPENAI,
    MODEL_PROVIDER_ANTHROPIC,
)

# Map providers to model classes
MODEL_CLASS_MAP: dict[ModelProviderId, type[BaseChatModel]] = {
    MODEL_PROVIDER_OPENAI: ChatOpenAI,
    MODEL_PROVIDER_ANTHROPIC: ChatAnthropic,
}


class ModelFactory:
    """
    Factory for creating a LangChain `BaseChatModel` instance
    with parameters from LangGate.
    """

    def __init__(self, langgate_client: LangGateLocalProtocol | None = None):
        self.langgate_client = langgate_client or LangGateLocal()

    async def create_model(
        self, model_id: str, input_params: dict[str, Any] | None = None
    ) -> tuple[BaseChatModel, dict[str, Any]]:
        """Create a model instance for the given model ID."""
        params = {"temperature": 0.7, "streaming": True}
        if input_params:
            params.update(input_params)

        # Get model info from the registry cache
        model_info = await self.langgate_client.get_model_info(model_id)

        # Transform parameters using the transformer client.
        # If switching to using the proxy, you would remove this line
        # and let the proxy handle the parameter transformation instead.
        api_format, model_params = await self.langgate_client.get_params(
            model_id, params
        )
        # api_format defaults to the provider id unless specified in the config.
        # e.g. Specify "openai" for OpenAI-compatible APIs, etc.
        print("API format:", api_format)
        pp(model_params)

        # Get the appropriate model class based on the API format
        client_cls_key = ModelProviderId(api_format)
        model_class = MODEL_CLASS_MAP.get(client_cls_key)
        if not model_class:
            raise ValueError(f"No model class for provider {model_info.provider.id}")

        # Create model instance with parameters
        model = model_class(**model_params)

        # Create model info dict
        model_metadata = model_info.model_dump(exclude_none=True)

        return model, model_metadata
```
```py
model_factory = ModelFactory()
model_id = "openai/gpt-4o"
model, model_metadata = await model_factory.create_model(model_id, {"temperature": 0.7})
model
```
```text
API format: openai
{'api_key': SecretStr('**********'),
'base_url': 'https://api.openai.com/v1',
'model': 'gpt-4o',
'streaming': True,
'temperature': 0.7}
ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x121f66210>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x121f72210>, root_client=<openai.OpenAI object at 0x121f56210>, root_async_client=<openai.AsyncOpenAI object at 0x121f66350>, model_name='gpt-4o', temperature=0.7, model_kwargs={}, openai_api_key=SecretStr('**********'), openai_api_base='https://api.openai.com/v1', streaming=True)
```
If you want to use the LangGate Envoy proxy instead of `LangGateLocal`, you can switch to the `HTTPRegistryClient` with minimal code changes.
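Since `ModelFactory` above only depends on the client's `get_model_info` and `get_params` methods, the switch is mostly a matter of injecting a different client. How an `HTTPRegistryClient` is constructed (registry URL, auth, etc.) depends on your deployment and SDK version, so treat the following as a wiring sketch rather than the exact API:
```py
# Sketch only: how to construct HTTPRegistryClient (URL, auth, ...) is not shown here -
# see examples/README.md. When the proxy performs parameter transformation, you would
# also drop the local get_params call inside ModelFactory, as noted in its comments.
remote_client = ...  # an HTTPRegistryClient instance, configured for your deployment

model_factory = ModelFactory(langgate_client=remote_client)
model, metadata = await model_factory.create_model("openai/gpt-4.1")
```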
For more usage patterns and detailed instructions, see [examples](examples/README.md).
### Envoy Proxy Service (Coming Soon)
The LangGate proxy feature is currently in development. When completed, it will provide:
1. Centralized model registry accessible via API
2. Parameter transformation at the proxy level
3. API key management and request routing
4. High-performance response streaming via Envoy
## Configuration
LangGate uses a simple YAML configuration format:
```yaml
# langgate_config.yaml
# Main configuration file for LangGate

# Global default parameters by modality (applied to all models unless overridden)
default_params:
  text:
    temperature: 0.7

# Service provider configurations
services:
  openai:
    api_key: "${OPENAI_API_KEY}"
    base_url: "https://api.openai.com/v1"
    model_patterns:
      # match any o-series model
      openai/o:
        remove_params:
          - temperature

  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    base_url: "https://api.anthropic.com"
    model_patterns:
      # match any model with reasoning in the id
      reasoning:
        override_params:
          thinking:
            type: enabled
        remove_params:
          - temperature

  replicate:
    api_key: "${REPLICATE_API_KEY}"

# Model-specific configurations organized by modality
models:
  text:
    - id: openai/gpt-4.1
      service:
        provider: openai
        model_id: gpt-4.1

    - id: openai/o3
      service:
        provider: openai
        model_id: o3

    # "virtual model" that wraps the o3 model with high-effort reasoning
    - id: openai/o3-high
      service:
        provider: openai
        model_id: o3
      name: o3-high
      description: o3-high applies high-effort reasoning for the o3 model
      override_params:
        reasoning_effort: high

    - id: anthropic/claude-sonnet-4
      service:
        provider: anthropic
        model_id: claude-sonnet-4-0

    # "virtual model" that wraps the claude-sonnet-4-0 model with reasoning
    - id: anthropic/claude-sonnet-4-reasoning
      service:
        provider: anthropic
        model_id: claude-sonnet-4-0
      name: Claude-4 Sonnet R
      description: "Claude-4 Sonnet with reasoning capabilities."
      override_params:
        thinking:
          budget_tokens: 1024

  image:
    - id: openai/gpt-image-1
      service:
        provider: openai
        model_id: gpt-image-1

    - id: openai/dall-e-3
      service:
        provider: openai
        model_id: dall-e-3

    - id: black-forest-labs/flux-dev
      service:
        provider: replicate
        model_id: black-forest-labs/flux-dev
      default_params:
        disable_safety_checker: true

    - id: stability-ai/sd-3.5-large
      service:
        provider: replicate
        model_id: stability-ai/stable-diffusion-3.5-large

# Models merge mode for loading data from JSON files: "merge" (default), "replace", or "extend"
# - merge: User models override defaults, new models are added
# - replace: Only use user models (ignore default models file)
# - extend: Add user models to defaults, error on conflicts
models_merge_mode: merge
```
### Parameter Transformation Precedence
When transforming parameters for model requests, LangGate follows a specific precedence order:
#### Defaults (applied only if key doesn't exist yet):
1. Model-specific defaults (highest precedence for defaults)
2. Pattern defaults (matching patterns applied in config order)
3. Service provider defaults
4. Global defaults (lowest precedence for defaults)
#### Overrides/Removals/Renames (applied in order, later steps overwrite/modify earlier ones):
1. Input parameters (initial state)
2. Service-level API keys and base URLs
3. Service-level overrides, removals, renames
4. Pattern-level overrides, removals, renames (matching patterns applied in config order)
5. Model-specific overrides, removals, renames (highest precedence)
6. Model ID (always overwritten with service_model_id)
7. Environment variable substitution (applied last to all string values)
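As a concrete illustration of this ordering, using the example configuration above: the caller-supplied `temperature` survives the defaults stage but is removed by the pattern-level rule, the pattern-level and model-level `override_params` for `thinking` are merged, and the model id is always overwritten with the service `model_id`:
```py
from langgate.sdk import LangGateLocal

client = LangGateLocal()

# Caller passes temperature, but the "reasoning" pattern removes it and sets thinking.type;
# the model-level override then contributes thinking.budget_tokens.
api_format, params = await client.get_params(
    "anthropic/claude-sonnet-4-reasoning",
    {"temperature": 0.2, "stream": True},
)
assert "temperature" not in params  # pattern-level removal wins over the input parameter
assert params["thinking"] == {"type": "enabled", "budget_tokens": 1024}  # pattern + model overrides merged
assert params["model"] == "claude-sonnet-4-0"  # model id overwritten with the service model_id
```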
## Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| LANGGATE_CONFIG | Path to the main configuration file | ./langgate_config.yaml |
| LANGGATE_MODELS | Path to the models data JSON file | ./langgate_models.json |
| LANGGATE_ENV_FILE | Path to the .env file | ./.env |
| LOG_LEVEL | Logging level | info |
### Models Merge Behavior
You can add additional models to LangGate's model registry by creating a `langgate_models.json` file in your working directory, or by setting the `LANGGATE_MODELS` environment variable to point to a custom JSON file.
LangGate supports three modes for including extra models beyond those we ship with the package (`default_models.json`):
- **merge** (default): Your defined models are merged with default models, with your models taking precedence
- **replace**: Only your models are used
- **extend**: Your models are added to the defaults; conflicts cause errors
Configure this with `models_merge_mode` in your YAML configuration.
Note:
- If there is no `langgate_models.json` in your working directory and no `LANGGATE_MODELS` environment variable is set, the registry package default `langgate/registry/data/default_models.json` is used. This file contains data on most major providers and models.
- If there is no `langgate_config.yaml` in your working directory and no `LANGGATE_CONFIG` environment variable is set, the core package default `langgate/core/data/default_config.yaml` is used. This file contains a default configuration with common LLM providers.
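If you keep a custom models file outside your working directory, point the registry at it before importing and constructing a client, mirroring the environment-variable approach shown in the LangChain example above (the path below is only a placeholder):
```py
import os

# Placeholder path: any user-defined models file is combined with the package defaults
# according to models_merge_mode ("merge" by default).
os.environ["LANGGATE_MODELS"] = "/etc/langgate/langgate_models.json"

from langgate.sdk import LangGateLocal

client = LangGateLocal()
```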
## LangGate vs Alternatives
### LangGate vs Envoy AI Gateway
While both use Envoy for proxying, LangGate takes a more direct approach:
- **Simplified Architecture**: LangGate uses Envoy's ext_proc filter directly without a separate control plane
- **No Kubernetes Dependency**: Runs anywhere Docker runs, without requiring Kubernetes CRDs or custom resources
- **Configuration Simplicity**: Uses a straightforward YAML configuration instead of Kubernetes resources
- **Lightweight Deployment**: Deploy with Docker Compose or any container platform without complex orchestration
### LangGate vs Python-based Gateways
Unlike other Python-based gateways:
- **High-Performance Streaming**: Uses Envoy's native streaming capabilities instead of Python for response handling
- **Focused Functionality**: Handles request transformation in Python while letting Envoy manage the high-throughput parts
- **No Middleman for Responses**: Responses stream directly from providers to clients via Envoy
## Running with Docker
```bash
# Start the full LangGate stack
make compose-up
# Development mode with hot reloading
make compose-dev
# Local development (Python on host, Envoy in Docker)
make run-local
# Stop the stack
make compose-down
# Stop stack and remove volumes
make compose-breakdown
```
## Testing and Development
```bash
# Run all tests
make test
# Run lint checks
make lint
```
## Additional Documentation
- [Contributing Guide](CONTRIBUTING.md) - Development setup and guidelines
- [SDK Examples](examples/README.md) - Sample code for using the LangGate SDK
- [Deployment Guide](deployment/README.md) - Instructions for deploying to Kubernetes and other platforms
## Roadmap
- **Pydantic Schema Validation**: Implement validation of parameters against Pydantic schemas representing the full API of the provider's model
- **TTS and ASR Model Support**: Include leading Text-to-Speech (TTS) and Automatic Speech Recognition (ASR) models in the default model registry, with endpoints for fetching models filtered by modality (for modality-specific return typing) and schemas for these modalities.
- **Video Generation Model Support**: Add video generation models, similar to the aforementioned modalities, with an explicit endpoint and schemas.
- **OpenAI API Standardization Option**: Introduce an option to standardize to the OpenAI API spec. This will involve mapping provider-specific Pydantic schemas to corresponding OpenAI API input schemas, offering a unified interface for diverse models.
## License
[MIT License](LICENSE)