# ScaleXI LLM
A comprehensive multi-provider LLM proxy library that provides a unified interface for interacting with various AI models from different providers.
## 🚀 Features
- **Multi-Provider Support**: OpenAI, Anthropic (Claude), Google (Gemini), Groq, DeepSeek, Qwen, Grok
- **60+ Model Configurations**: Detailed pricing, context lengths, and capabilities for each model
- **Structured Output**: Pydantic schema support with intelligent fallbacks
- **Vision Capabilities**: Image analysis with automatic fallback for non-vision models
- **File Processing**: PDF, DOCX, TXT, and JSON file analysis
- **Web Search Integration**: Multi-provider search via Exa and SERP API (Google)
- **Cost Tracking**: Automatic token usage and cost calculation
- **Robust Fallbacks**: Multi-level fallback mechanisms for reliability
- **Comprehensive Testing**: Built-in benchmarking across all providers
## 📦 Installation
```bash
# Clone the repository
git clone <repository-url>
cd scalexi_llm

# Install dependencies
pip install -r requirements.txt
```
### Dependencies
```
openai
anthropic
google-genai
groq
pymupdf
xai-sdk
python-docx
pydantic
python-dotenv
exa-py
serpapi
```
## ⚙️ Configuration
Create a `.env` file in the project root with your API keys:
```env
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
GEMINI_API_KEY=your_gemini_key
GROQ_API_KEY=your_groq_key
DEEPSEEK_API_KEY=your_deepseek_key
QWEN_API_KEY=your_qwen_key
GROK_API_KEY=your_grok_key
EXA_API_KEY=your_exa_key
SERP_API_KEY=your_serp_key
```
You only need to include API keys for the providers you plan to use. Note that some providers fall back to Gemini in certain cases (see the documentation for details).
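Before constructing the proxy, it can help to check which providers are actually configured. The helper below is not part of the library; it is a stdlib-only sketch that assumes the environment variable names listed above have been loaded (e.g. via `python-dotenv`):

```python
import os

# One environment variable per provider, matching the .env keys above
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "groq": "GROQ_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "qwen": "QWEN_API_KEY",
    "grok": "GROK_API_KEY",
}

def configured_providers(env=None):
    """Return the providers whose API key is present and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var)]
```

Calling `configured_providers()` after loading your `.env` tells you which providers a request can be routed to without hitting a missing-key error.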
## 🔧 Quick Start
```python
from scalexi_llm.scalexi_llm import LLMProxy

# Initialize the proxy
llm = LLMProxy()

# Basic usage
response, execution_time, token_usage, cost = llm.ask_llm(
    model_name="gemini-2.5-flash",
    system_prompt="You are a helpful assistant.",
    user_prompt="Explain quantum computing in simple terms"
)

print(f"Response: {response}")
print(f"Cost: ${cost:.6f}")
print(f"Tokens used: {token_usage['total_tokens']}")
```
## 💡 Usage Examples
### Structured Output
```python
from pydantic import BaseModel, Field
from typing import List

class Recipe(BaseModel):
    name: str = Field(description="Recipe name")
    ingredients: List[str] = Field(description="Required ingredients")
    steps: List[str] = Field(description="Cooking steps")
    cooking_time: int = Field(description="Cooking time in minutes")

response, _, _, _ = llm.ask_llm(
    model_name="gpt-4o-latest",
    user_prompt="Create a recipe for chocolate chip cookies",
    schema=Recipe
)
```
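Depending on the provider path taken, the structured response may arrive as an already-parsed dict or as a JSON string. If you want to re-validate it against your schema, a small normalizing helper (hypothetical, not part of the library) can sit in between:

```python
import json

def ensure_dict(response):
    """Normalize a structured-output response to a plain dict.

    Assumes the response is either a dict or a JSON-encoded string.
    """
    if isinstance(response, dict):
        return response
    return json.loads(response)

# e.g. recipe = Recipe.model_validate(ensure_dict(response))
```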
### Image Analysis
```python
response, _, _, _ = llm.ask_llm(
    model_name="moonshotai/kimi-k2-instruct",
    user_prompt="Analyze this image and describe what you see",
    image_path="photo.jpg"
)
```
### File Processing
```python
response, _, _, _ = llm.ask_llm(
    model_name="claude-3-5-sonnet-latest",
    user_prompt="Summarize this document",
    file_path="document.pdf"
)
```
### Web Search with Different Providers
```python
# Using Exa (default)
response, _, _, _ = llm.ask_llm(
    model_name="claude-3-7-sonnet-latest",
    user_prompt="Latest AI developments in 2024",
    websearch=True
)

# Using SERP API (Google)
response, _, _, _ = llm.ask_llm(
    model_name="chatgpt-4o-latest",
    user_prompt="Current trends in quantum computing",
    websearch=True,
    search_tool="serp",
    max_search_results=10
)

# Using both Exa + SERP for maximum coverage
response, _, _, _ = llm.ask_llm(
    model_name="gemini-2.5-pro",
    user_prompt="Comprehensive market analysis",
    websearch=True,
    search_tool="both",
    max_search_results=15
)

# With query generator (AI optimizes the search query)
response, _, _, _ = llm.ask_llm(
    model_name="claude-sonnet-4-0",
    user_prompt="Python machine learning tutorials",
    websearch=True,
    use_query_generator=True  # Enable AI query optimization
)
```
### Combined Features
```python
# AnalysisSchema is a user-defined Pydantic model (as in the Structured Output example)
response, _, _, _ = llm.ask_llm(
    model_name="grok-4-latest",
    system_prompt="Analyze the provided content comprehensively",
    user_prompt="Analyze this resume and find career recommendations from web.",
    file_path="resume.pdf",
    image_path="certifications.png",
    websearch=True,
    schema=AnalysisSchema
)
```
## 🔄 Fallback Mechanisms
The library implements intelligent fallback systems:
1. **Vision Fallback**: Non-vision models automatically use vision-capable models from the same provider to describe images
2. **Structured Output Fallback**: Falls back to better models from the same provider when schema validation fails
Models from certain providers fall back to Gemini in some cases; see the documentation for more details on fallbacks.
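If you want explicit control rather than relying on the built-in fallback flags, the same idea can be sketched as a manual fallback chain. This is a hypothetical helper, not part of the library, and it assumes `ask_llm` raises an exception on failure:

```python
def ask_with_fallback(llm, models, **kwargs):
    """Try models in order and return the first successful result.

    `models` is an ordered preference list, e.g. a preferred model first
    and a Gemini model as the last resort.
    """
    last_error = None
    for model in models:
        try:
            return llm.ask_llm(model_name=model, **kwargs)
        except Exception as exc:  # keep trying the next model
            last_error = exc
    raise last_error
```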
## 🧪 Testing
Run the comprehensive test suite:
```bash
cd testing
python combined_test.py
```
This will:
- Test all available models
- Benchmark performance across providers
- Generate detailed analysis reports
- Test combined features (file + image + web search + structured output)
Feel free to use your own files and images for testing.
## 📊 Cost Tracking
Every request returns detailed cost and token usage information:
```python
response, execution_time, token_usage, cost = llm.ask_llm(...)

# Token usage breakdown
print(token_usage)
# {
#     "prompt_tokens": 150,
#     "completion_tokens": 200,
#     "total_tokens": 350
# }

# Total cost in USD
print(f"Cost: ${cost:.6f}")
```
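To budget across a whole session, the per-call figures can be accumulated. A minimal tracker (not part of the library; it assumes the `token_usage` dict shape shown above):

```python
class CostTracker:
    """Accumulate token usage and cost across multiple ask_llm calls."""

    def __init__(self):
        self.total_tokens = 0
        self.total_cost = 0.0

    def record(self, token_usage, cost):
        """Add one call's token usage and USD cost to the running totals."""
        self.total_tokens += token_usage["total_tokens"]
        self.total_cost += cost

# Usage:
# tracker = CostTracker()
# response, _, token_usage, cost = llm.ask_llm(...)
# tracker.record(token_usage, cost)
```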
## 🛠️ API Reference
### `ask_llm()` Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model_name` | str | `"gpt-4o-mini"` | Model to use for generation |
| `system_prompt` | str | `""` | System prompt for the model |
| `user_prompt` | str | `""` | User prompt/message |
| `temperature` | float | `None` | Sampling temperature (0.0-2.0) |
| `schema` | Pydantic Model | `None` | Structured output schema |
| `image_path` | str | `None` | Path to image file |
| `file_path` | str | `None` | Path to file for analysis |
| `websearch` | bool | `False` | Enable web search |
| `use_query_generator` | bool | `False` | Use AI to generate optimized search query |
| `max_search_results` | int | `12` | Maximum search results to retrieve |
| `search_tool` | str | `"exa"` | Search provider: "exa", "serp", or "both" |
| `max_tokens` | int | `None` | Maximum tokens to generate |
| `retry_limit` | int | `1` | Number of retry attempts |
| `fallback_to_provider_best_model` | bool | `True` | Enable provider fallback |
| `fallback_to_standard_model` | bool | `True` | Enable cross-provider fallback |
### Return Values
Returns a tuple: `(response, execution_time, token_usage, cost)`
- **response**: The model's response (string or JSON)
- **execution_time**: Time taken in seconds (float)
- **token_usage**: Dictionary with token counts
- **cost**: Total cost in USD (float)
---
**ScaleXI LLM** - Unified AI model access with enterprise-grade reliability and comprehensive feature support.