# LLMling-models

[![PyPI License](https://img.shields.io/pypi/l/llmling-models.svg)](https://pypi.org/project/llmling-models/)
[![Package status](https://img.shields.io/pypi/status/llmling-models.svg)](https://pypi.org/project/llmling-models/)
[![Monthly downloads](https://img.shields.io/pypi/dm/llmling-models.svg)](https://pypi.org/project/llmling-models/)
[![Distribution format](https://img.shields.io/pypi/format/llmling-models.svg)](https://pypi.org/project/llmling-models/)
[![Wheel availability](https://img.shields.io/pypi/wheel/llmling-models.svg)](https://pypi.org/project/llmling-models/)
[![Python version](https://img.shields.io/pypi/pyversions/llmling-models.svg)](https://pypi.org/project/llmling-models/)
[![Implementation](https://img.shields.io/pypi/implementation/llmling-models.svg)](https://pypi.org/project/llmling-models/)
[![Releases](https://img.shields.io/github/downloads/phil65/llmling-models/total.svg)](https://github.com/phil65/llmling-models/releases)
[![Github Contributors](https://img.shields.io/github/contributors/phil65/llmling-models)](https://github.com/phil65/llmling-models/graphs/contributors)
[![Github Discussions](https://img.shields.io/github/discussions/phil65/llmling-models)](https://github.com/phil65/llmling-models/discussions)
[![Github Forks](https://img.shields.io/github/forks/phil65/llmling-models)](https://github.com/phil65/llmling-models/forks)
[![Github Issues](https://img.shields.io/github/issues/phil65/llmling-models)](https://github.com/phil65/llmling-models/issues)
[![Github Pull Requests](https://img.shields.io/github/issues-pr/phil65/llmling-models)](https://github.com/phil65/llmling-models/pulls)
[![Github Watchers](https://img.shields.io/github/watchers/phil65/llmling-models)](https://github.com/phil65/llmling-models/watchers)
[![Github Stars](https://img.shields.io/github/stars/phil65/llmling-models)](https://github.com/phil65/llmling-models/stargazers)
[![Github Repository size](https://img.shields.io/github/repo-size/phil65/llmling-models)](https://github.com/phil65/llmling-models)
[![Github last commit](https://img.shields.io/github/last-commit/phil65/llmling-models)](https://github.com/phil65/llmling-models/commits)
[![Github release date](https://img.shields.io/github/release-date/phil65/llmling-models)](https://github.com/phil65/llmling-models/releases)
[![Github language count](https://img.shields.io/github/languages/count/phil65/llmling-models)](https://github.com/phil65/llmling-models)
[![Github commits this month](https://img.shields.io/github/commit-activity/m/phil65/llmling-models)](https://github.com/phil65/llmling-models)
[![Package status](https://codecov.io/gh/phil65/llmling-models/branch/main/graph/badge.svg)](https://codecov.io/gh/phil65/llmling-models/)
[![PyUp](https://pyup.io/repos/github/phil65/llmling-models/shield.svg)](https://pyup.io/repos/github/phil65/llmling-models/)


A collection of model wrappers and adapters built for [LLMling-Agent](https://github.com/phil65/llmling-agent); they should also work directly with the underlying pydantic-ai API.

**WARNING**:

This is just a prototype for now and will likely change in the future.
Also, pydantic-ai's APIs don't seem stable yet, so things might not work across all pydantic-ai versions.
I will try to keep this package up to date as quickly as possible.

## Available Models


### Multi-Models

Multi-models combine or select between several underlying models at runtime; the variants below each implement a different selection strategy.

### Augmented Model

Enhances prompts through pre- and post-processing steps using auxiliary language models:

```python
from pydantic_ai import Agent
from llmling_models import AugmentedModel

model = AugmentedModel(
    main_model="openai:gpt-4",
    pre_prompt={
        "text": "Expand this question: {input}",
        "model": "openai:gpt-3.5-turbo"
    },
    post_prompt={
        "text": "Summarize this response concisely: {output}",
        "model": "openai:gpt-3.5-turbo"
    }
)
agent = Agent(model)

# The question will be expanded before processing
# and the response will be summarized afterward
result = await agent.run("What is AI?")
```

### Input Model

A model that delegates responses to human input, useful for testing, debugging, or creating hybrid human-AI workflows:

```python
from pydantic_ai import Agent
from llmling_models import InputModel

# Basic usage with default console input
model = InputModel(
    prompt_template="🤖 Question: {prompt}",
    show_system=True,
    input_prompt="Your answer: ",
)

# Create agent with system context
agent = Agent(
    model=model,
    system_prompt="You are helping test an input model. Be concise.",
)

# Run interactive conversation
result = await agent.run("What's your favorite color?")
print(f"You responded: {result.output}")

# Supports streaming input
async with agent.run_stream("Tell me a story...") as response:
    async for chunk in response.stream():
        print(chunk, end="", flush=True)
```

Features:
- Interactive console input for testing and debugging
- Support for streaming input (character by character, though not "truly" async with the default handler)
- Configurable message formatting
- Custom input handlers for different input sources (sketched after the lists below)
- System message display control
- Full conversation context support

This model is particularly useful for:
- Testing complex prompt chains
- Creating hybrid human-AI workflows
- Debugging agent behavior
- Collecting human feedback
- Educational scenarios where human input is needed
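
The custom-handler interface isn't documented here, so the following is only a rough sketch of what a non-console input source could look like; the `input_handler` parameter name and the callable signature are assumptions, not the actual API:

```python
from llmling_models import InputModel

# Hypothetical: answer from a scripted queue instead of the console.
# The `input_handler` keyword and its signature are assumptions.
answers = iter(["Blue.", "It reminds me of the sea."])

def scripted_handler(prompt: str) -> str:
    """Return the next scripted answer for the given prompt."""
    return next(answers)

model = InputModel(
    prompt_template="Question: {prompt}",
    input_handler=scripted_handler,  # assumed keyword argument
)
```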


### User Select Model

An interactive model that lets users manually choose which model to use for each prompt:

```python
from pydantic_ai import Agent
from llmling_models import UserSelectModel

# Basic setup with model list
model = UserSelectModel(
    models=["openai:gpt-4o-mini", "openai:gpt-3.5-turbo", "anthropic:claude-3"]
)

agent = Agent(model)

# The user will be shown the prompt and available models,
# and can choose which one to use for the response
result = await agent.run("What is the meaning of life?")
```

#### Model Delegation

Dynamically selects a model based on the given prompt, using a selector model to choose the most appropriate one for each task:

```python
from pydantic_ai import Agent
from llmling_models import DelegationMultiModel

# Basic setup with model list
delegation_model = DelegationMultiModel(
    selector_model="openai:gpt-4-turbo",
    models=["openai:gpt-4", "openai:gpt-3.5-turbo"],
    selection_prompt="Pick gpt-4 for complex tasks, gpt-3.5-turbo for simple queries."
)

# Advanced setup with model descriptions
delegation_model = DelegationMultiModel(
    selector_model="openai:gpt-4-turbo",
    models=["openai:gpt-4", "anthropic:claude-2", "openai:gpt-3.5-turbo"],
    model_descriptions={
        "openai:gpt-4": "Complex reasoning, math problems, and coding tasks",
        "anthropic:claude-2": "Long-form analysis and research synthesis",
        "openai:gpt-3.5-turbo": "Simple queries, chat, and basic information"
    },
    selection_prompt="Select the most appropriate model for the task."
)

agent = Agent(delegation_model)

# The selector model will analyze the prompt and choose the most suitable model
result = await agent.run("Solve this complex mathematical proof...")
```


#### Cost-Optimized Model

Selects models based on input cost limits, automatically choosing the most appropriate model within your budget constraints:

```python
from pydantic_ai import Agent
from llmling_models import CostOptimizedMultiModel

# Use cheapest model that can handle the task
cost_model = CostOptimizedMultiModel(
    models=[
        "openai:gpt-4",           # More expensive
        "openai:gpt-3.5-turbo",   # Less expensive
    ],
    max_input_cost=0.1,          # Maximum cost in USD per request
    strategy="cheapest_possible"  # Use cheapest model that fits
)

# Or use the best model within budget
cost_model = CostOptimizedMultiModel(
    models=[
        "openai:gpt-4-32k",      # Most expensive
        "openai:gpt-4",          # Medium cost
        "openai:gpt-3.5-turbo",  # Cheapest
    ],
    max_input_cost=0.5,              # Higher budget
    strategy="best_within_budget"     # Use best model within budget
)

agent = Agent(cost_model)
result = await agent.run("Your prompt here")
```

#### Token-Optimized Model

Automatically selects models based on input token count and context window requirements:

```python
from pydantic_ai import Agent
from llmling_models import TokenOptimizedMultiModel

# Create model that automatically handles different context lengths
token_model = TokenOptimizedMultiModel(
    models=[
        "openai:gpt-4-32k",        # 32k context
        "openai:gpt-4",            # 8k context
        "openai:gpt-3.5-turbo",    # 4k context
    ],
    strategy="efficient"           # Use smallest sufficient context window
)

# Or maximize context window availability
token_model = TokenOptimizedMultiModel(
    models=[
        "openai:gpt-4-32k",        # 32k context
        "openai:gpt-4",            # 8k context
        "openai:gpt-3.5-turbo",    # 4k context
    ],
    strategy="maximum_context"     # Use largest available context window
)

agent = Agent(token_model)

# Will automatically select appropriate model based on input length
result = await agent.run("Your long prompt here...")

# Long inputs automatically use models with larger context windows
result = await agent.run("Very long document..." * 1000)
```

The cost-optimized model keeps you within budget while still selecting the best model for your needs; the token-optimized model handles varying input lengths by choosing a model with a sufficient context window.


### Remote Input Model

A model that connects to a remote human operator, allowing distributed human-in-the-loop operations:

```python
from pydantic_ai import Agent
from llmling_models import RemoteInputModel

# Basic setup with WebSocket (preferred for streaming)
model = RemoteInputModel(
    url="ws://operator:8000/v1/chat/stream",
    api_key="your-api-key"
)

# Or use REST API
model = RemoteInputModel(
    url="http://operator:8000/v1/chat",
    api_key="your-api-key"
)

agent = Agent(model)

# The request will be forwarded to the remote operator
result = await agent.run("What's the meaning of life?")
print(f"Remote operator responded: {result.output}")

# Streaming also works with WebSocket protocol
async with agent.run_stream("Tell me a story...") as response:
    async for chunk in response.stream():
        print(chunk, end="", flush=True)
```


Features:
- Distributed human-in-the-loop operations
- WebSocket support for real-time streaming
- REST API for simpler setups
- Full conversation context support
- Secure authentication via API keys

#### Setting up a Remote Model Server

Setting up a remote model server is straightforward. All you need is a pydantic-ai model:

```python
from llmling_models.remote_model.server import ModelServer

# Create and start server
server = ModelServer(
    model="openai:gpt-4",
    api_key="your-secret-key",  # Optional authentication
)
server.run(port=8000)
```

That's it! The server now accepts both REST and WebSocket connections and handles all the message protocol details for you.

Features:
- Simple setup - just provide a model
- Optional API key authentication
- Automatic handling of both REST and WebSocket protocols
- Full pydantic-ai message protocol support
- Usage statistics forwarding
- Built-in error handling and logging

For development, you might want to run the server locally:

```python
server = ModelServer(
    model="openai:gpt-4",
    api_key="dev-key"
)
server.run(host="localhost", port=8000)
```

For production, you'll typically want to run it on a public server with proper authentication:

```python
server = ModelServer(
    model="openai:gpt-4",
    api_key="your-secure-key",  # Make sure to use a strong key
    title="Production GPT-4 Server",
    description="Serves GPT-4 model for production use"
)
server.run(
    host="0.0.0.0",  # Accept connections from anywhere
    port=8000,
    workers=4  # Multiple workers for better performance
)
```

Both REST and WebSocket protocols are supported, with WebSocket preferred for its streaming capabilities. Both maintain the full pydantic-ai message protocol, ensuring compatibility with all features of the framework.
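
On the client side, an agent can connect to such a server via the extended `infer_model` notation described later in this README. A minimal sketch, assuming a server on localhost; the host and WebSocket path are placeholders and may differ from the route your server actually exposes:

```python
from pydantic_ai import Agent
from llmling_models import infer_model

# Connect to the ModelServer started above (placeholder URL).
model = infer_model("remote_model:ws://localhost:8000/v1/chat/stream")
agent = Agent(model)

result = await agent.run("Hello from the client!")
print(result.output)
```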



All multi-models are generically typed to follow pydantic best practices. How useful that is remains debatable, though. :P

## Providers

LLMling-models extends the capabilities of pydantic-ai with additional provider implementations that make it easy to connect to various LLM API services.

### Available Providers

The package includes the following provider implementations:


#### GitHub Copilot Provider

Connect to GitHub Copilot's API for code-focused tasks (requires token management):

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from llmling_models.providers.copilot_provider import CopilotProvider

# Requires tokonomics.CopilotTokenManager to handle token management
provider = CopilotProvider()  # Uses tokonomics for authentication
model = OpenAIModel("gpt-4o-mini", provider=provider)
agent = Agent(model=model)
result = await agent.run("Write a function to calculate Fibonacci numbers")
```

#### LM Studio Provider

Connect to a local LM Studio inference server for open-source models:

```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from llmling_models.providers.lm_studio_provider import LMStudioProvider

provider = LMStudioProvider(base_url="http://localhost:1234/v1")  # LM Studio's default server URL
model = OpenAIModel("model_name", provider=provider)  # Use the name of the model loaded in LM Studio
agent = Agent(model=model)
result = await agent.run("Tell me about yourself")
```

### Provider Utility Functions

#### infer_provider

The `infer_provider` function extends pydantic-ai's provider inference to include all LLMling-models providers:

```python
from llmling_models.providers import infer_provider

# Get provider by name
provider = infer_provider("openrouter")  # Returns OpenRouterProvider instance
provider = infer_provider("grok")        # Returns GrokProvider instance
provider = infer_provider("perplexity")  # Returns PerplexityProvider instance
provider = infer_provider("copilot")     # Returns CopilotProvider instance
provider = infer_provider("lm-studio")   # Returns LMStudioProvider instance

# Still works with standard providers too
provider = infer_provider("openai")      # Returns pydantic_ai's OpenAIProvider
```

## Extended infer_model Function

LLMling-models provides an extended `infer_model` function that resolves various model notations to appropriate instances:

```python
from llmling_models import infer_model

# Provider prefixes (requires appropriate API keys as env vars)
model = infer_model("openai:gpt-4o")             # OpenAI models
model = infer_model("openrouter:anthropic/opus") # OpenRouter (requires OPENROUTER_API_KEY)
model = infer_model("grok:grok-2-1212")          # Grok/X.AI (requires X_AI_API_KEY)
model = infer_model("perplexity:sonar-medium")   # Perplexity (requires PERPLEXITY_API_KEY)
model = infer_model("deepseek:deepseek-chat")    # DeepSeek (requires DEEPSEEK_API_KEY)
model = infer_model("copilot:gpt-4o-mini")       # GitHub Copilot (requires token management)
model = infer_model("lm-studio:model-name")      # LM Studio local models

# LLMling's special models
model = infer_model("simple-openai:gpt-4")      # Simple HTTPX-based OpenAI client
model = infer_model("input")                    # Interactive human input model
model = infer_model("remote_model:ws://url")    # Remote model proxy
model = infer_model("remote_input:ws://url")    # Remote human input
model = infer_model("import:module.path:Class") # Import model from Python path

# Testing
model = infer_model("test:Custom response")     # Test model with fixed output
```

The function provides a fallback to a simple HTTPX-based OpenAI client in environments where the full OpenAI library is not available (like Pyodide/WebAssembly contexts).

### Environment Variable Configuration

For convenience, most providers support configuration via environment variables:

| Provider    | Environment Variable    | Purpose                    |
|-------------|-------------------------|----------------------------|
| OpenRouter  | `OPENROUTER_API_KEY`    | API key for authentication |
| Grok (X.AI) | `X_AI_API_KEY` or `GROK_API_KEY` | API key for authentication |
| DeepSeek    | `DEEPSEEK_API_KEY`      | API key for authentication |
| Perplexity  | `PERPLEXITY_API_KEY`    | API key for authentication |
| Copilot     | Uses tokonomics token management | - |
| LM Studio   | `LM_STUDIO_BASE_URL`    | Base URL for local server |
| OpenAI      | `OPENAI_API_KEY`        | API key for authentication |
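As a minimal sketch, configuring a provider through environment variables and then resolving a model might look like this; the key values are placeholders, and the model name is illustrative:

```python
import os
from llmling_models import infer_model

# Placeholder credentials - set real values in your environment instead.
os.environ["OPENROUTER_API_KEY"] = "sk-or-..."
os.environ["LM_STUDIO_BASE_URL"] = "http://localhost:1234/v1"

# infer_model picks up the keys from the environment.
model = infer_model("openrouter:anthropic/opus")
```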


## Installation

```bash
pip install llmling-models
```

## Requirements

- Python 3.12+
- pydantic-ai
- Either `tokenizers` or `transformers` for improved token calculation

## License

MIT

            
