| Field | Value |
|-------|-------|
| Name | modalkit |
| Version | 0.2.0 |
| Summary | A library to package, ship and deploy your ML app |
| home_page | None |
| Author | None |
| Maintainer | None |
| License | None |
| requires_python | <4.0,>=3.9 |
| Keywords | python |
| upload_time | 2025-07-10 13:54:37 |
| docs_url | None |
| bugtrack_url | None |
| VCS | https://github.com/prassanna-ravishankar/modalkit |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# Modalkit
<p align="center">
<a href="https://img.shields.io/github/v/release/prassanna-ravishankar/modalkit">
<img src="https://img.shields.io/github/v/release/prassanna-ravishankar/modalkit" alt="Release">
</a>
<a href="https://codecov.io/gh/prassanna-ravishankar/modalkit">
<img src="https://codecov.io/gh/prassanna-ravishankar/modalkit/branch/main/graph/badge.svg" alt="codecov">
</a>
<a href="https://img.shields.io/github/commit-activity/m/prassanna-ravishankar/modalkit">
<img src="https://img.shields.io/github/commit-activity/m/prassanna-ravishankar/modalkit" alt="Commit activity">
</a>
<a href="https://img.shields.io/github/license/prassanna-ravishankar/modalkit">
<img src="https://img.shields.io/github/license/prassanna-ravishankar/modalkit" alt="License">
</a>
</p>
<p align="center">
<img src="./docs/modalkit.png" width="400" height="400"/>
</p>
<p align="center">
A powerful Python framework for deploying ML models on Modal with production-ready features
</p>
## 🎯 What Modalkit Offers Over Raw Modal
While Modal provides excellent serverless infrastructure, Modalkit adds a complete ML deployment framework:
### 🏗️ **Standardized ML Architecture**
- **Structured Inference Pipeline**: Enforced `preprocess()` → `predict()` → `postprocess()` pattern
- **Consistent API Endpoints**: `/predict_sync`, `/predict_batch`, `/predict_async` across all deployments
- **Type-Safe Interfaces**: Pydantic models ensure data validation at API boundaries
### ⚙️ **Configuration-Driven Deployments**
- **YAML Configuration**: Version-controlled deployment settings instead of scattered code
- **Environment Management**: Easy dev/staging/prod configs with override capabilities
- **Reproducible Builds**: Declarative infrastructure removes deployment inconsistencies
### 👥 **Team-Friendly Workflows**
- **Shared Standards**: All team members deploy models the same way
- **Code Separation**: Model logic decoupled from Modal deployment boilerplate
- **Collaboration**: Config files in git enable infrastructure review and collaboration
### 🚀 **Production Features Out-of-the-Box**
- **Authentication Middleware**: Built-in API key or Modal proxy auth
- **Queue Integration**: Async processing with multiple backend support
- **Cloud Storage**: Direct S3/GCS/R2 mounting without manual setup
- **Batch Processing**: Intelligent request batching for GPU efficiency
- **Error Handling**: Comprehensive error responses and logging
### 💡 **Developer Experience**
- **Less Boilerplate**: Focus on model code, not FastAPI/Modal setup
- **Modern Tooling**: Pre-configured with ruff, mypy, pre-commit hooks
- **Testing Framework**: Built-in patterns for testing ML deployments
**In short**: Modalkit transforms Modal from infrastructure primitives into a complete ML platform, letting teams deploy models consistently while maintaining Modal's performance and scalability.
## ✨ Key Features
- 🚀 **Native Modal Integration**: Seamless deployment on Modal's serverless infrastructure
- 🔐 **Flexible Authentication**: Modal proxy auth or custom API keys with AWS SSM support
- ☁️ **Cloud Storage Support**: Direct mounting of S3, GCS, and R2 buckets
- 🔄 **Queue Integration**: Built-in support for SQS and Taskiq for async workflows
- 📦 **Batch Inference**: Efficient batch processing with configurable batch sizes
- 🎯 **Type Safety**: Full Pydantic integration for request/response validation
- 🛠️ **Developer Friendly**: Pre-configured with modern Python tooling (ruff, pre-commit)
- 📊 **Production Ready**: Comprehensive error handling and logging
## 🚀 Quick Start
### Installation
```bash
# Using pip
pip install git+https://github.com/prassanna-ravishankar/modalkit.git
# Using uv (recommended)
uv pip install git+https://github.com/prassanna-ravishankar/modalkit.git
```
### 1. Define Your Model
Create an inference class that inherits from `InferencePipeline`:
```python
from modalkit.inference import InferencePipeline
from pydantic import BaseModel
from typing import List
# Define input/output schemas with Pydantic
class TextInput(BaseModel):
    text: str
    language: str = "en"

class TextOutput(BaseModel):
    translated_text: str
    confidence: float

# Implement your model logic
class TranslationModel(InferencePipeline):
    def __init__(self, model_name: str, all_model_data_folder: str, common_settings: dict, *args, **kwargs):
        super().__init__(model_name, all_model_data_folder, common_settings)
        # Load your model here
        # self.model = load_model(...)

    def preprocess(self, input_list: List[TextInput]) -> dict:
        """Prepare inputs for the model"""
        texts = [item.text for item in input_list]
        return {"texts": texts, "languages": [item.language for item in input_list]}

    def predict(self, input_list: List[TextInput], preprocessed_data: dict) -> dict:
        """Run model inference"""
        # Your model prediction logic
        translations = [text.upper() for text in preprocessed_data["texts"]]  # Example
        return {"translations": translations, "scores": [0.95] * len(translations)}

    def postprocess(self, input_list: List[TextInput], raw_output: dict) -> List[TextOutput]:
        """Format model outputs"""
        return [
            TextOutput(translated_text=text, confidence=score)
            for text, score in zip(raw_output["translations"], raw_output["scores"])
        ]
```
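Because the pipeline is plain Python, you can smoke-test it before involving Modal. A minimal sketch, assuming `InferencePipeline.__init__` only records the arguments passed here (the folder path and settings dict are placeholders):

```python
# Hypothetical local check of the pipeline, run without deploying anything.
model = TranslationModel(
    model_name="translation_model",
    all_model_data_folder="./models",      # placeholder path
    common_settings={"device": "cpu"},     # placeholder settings
)
batch = [TextInput(text="Hello world"), TextInput(text="Bonjour", language="fr")]
outputs = model.postprocess(batch, model.predict(batch, model.preprocess(batch)))
print(outputs)  # [TextOutput(translated_text='HELLO WORLD', confidence=0.95), ...]
```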
### 2. Create Your Modal App
```python
import modal
from modalkit.modalapp import ModalService, create_web_endpoints
from modalkit.modalutils import ModalConfig
# Initialize with your config
modal_config = ModalConfig()
app = modal.App(name=modal_config.app_name)
# Define your Modal app class
@app.cls(**modal_config.get_app_cls_settings())
class TranslationApp(ModalService):
    inference_implementation = TranslationModel
    model_name: str = modal.parameter(default="translation_model")
    modal_utils: ModalConfig = modal_config

# Create API endpoints
@app.function(**modal_config.get_handler_settings())
@modal.asgi_app(**modal_config.get_asgi_app_settings())
def web_endpoints():
    return create_web_endpoints(
        app_cls=TranslationApp,
        input_model=TextInput,
        output_model=TextOutput
    )
```
### 3. Configure Your Deployment
Create a `modalkit.yaml` configuration file:
```yaml
# modalkit.yaml
app_settings:
  app_prefix: "translation-service"

  # Authentication configuration
  auth_config:
    # Option 1: Use API key from AWS SSM
    ssm_key: "/translation/api-key"
    auth_header: "x-api-key"
    # Option 2: Use hardcoded API key (not recommended for production)
    # api_key: "your-api-key-here"
    # auth_header: "x-api-key"

  # Container configuration
  build_config:
    image: "python:3.11-slim"  # or your custom image
    tag: "latest"
    workdir: "/app"
    env:
      MODEL_VERSION: "v1.0"

  # Deployment settings
  deployment_config:
    gpu: "T4"  # Options: T4, A10G, A100, or null for CPU
    concurrency_limit: 10
    container_idle_timeout: 300
    secure: false  # Set to true for Modal proxy auth

    # Cloud storage mounts (optional)
    cloud_bucket_mounts:
      - mount_point: "/mnt/models"
        bucket_name: "my-model-bucket"
        secret: "aws-credentials"
        read_only: true
        key_prefix: "models/"

  # Batch processing settings
  batch_config:
    max_batch_size: 32
    wait_ms: 100  # Wait up to 100ms to fill batch

  # Queue configuration (for async endpoints)
  queue_config:
    backend: "taskiq"  # or "sqs" for AWS SQS
    broker_url: "redis://localhost:6379"

# Model configuration
model_settings:
  local_model_repository_folder: "./models"
  common:
    cache_dir: "./cache"
    device: "cuda"  # or "cpu"
  model_entries:
    translation_model:
      model_path: "path/to/model.pt"
      vocab_size: 50000
```
### 4. Deploy to Modal
```bash
# Test locally
modal serve app.py
# Deploy to production
modal deploy app.py
# View logs
modal logs -f
```
### 5. Use Your API
```python
import requests

# For standard API key auth
headers = {"x-api-key": "your-api-key"}

# Synchronous endpoint
response = requests.post(
    "https://your-org--translation-service.modal.run/predict_sync",
    json={"text": "Hello world", "language": "en"},
    headers=headers
)
print(response.json())
# {"translated_text": "HELLO WORLD", "confidence": 0.95}

# Asynchronous endpoint (returns immediately)
response = requests.post(
    "https://your-org--translation-service.modal.run/predict_async",
    json={"text": "Hello world", "language": "en"},
    headers=headers
)
print(response.json())
# {"message_id": "550e8400-e29b-41d4-a716-446655440000"}

# Batch endpoint
response = requests.post(
    "https://your-org--translation-service.modal.run/predict_batch",
    json=[
        {"text": "Hello", "language": "en"},
        {"text": "World", "language": "en"}
    ],
    headers=headers
)
print(response.json())
# [{"translated_text": "HELLO", "confidence": 0.95}, {"translated_text": "WORLD", "confidence": 0.95}]
```
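The same endpoints can be exercised from the command line; here is a curl equivalent of the synchronous call above (URL and API key are the same placeholders):

```bash
curl -X POST "https://your-org--translation-service.modal.run/predict_sync" \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -d '{"text": "Hello world", "language": "en"}'
```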
## 🔐 Authentication
Modalkit provides flexible authentication options:
### Option 1: Custom API Key (Default)
Configure with `secure: false` in your deployment config.
```yaml
# modalkit.yaml
deployment_config:
  secure: false

auth_config:
  # Store in AWS SSM (recommended)
  ssm_key: "/myapp/api-key"
  # OR hardcode (not recommended)
  # api_key: "sk-1234567890"
  auth_header: "x-api-key"
```
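If you go the SSM route, the parameter referenced by `ssm_key` can be created with the AWS CLI before deploying; a sketch assuming your default AWS profile and region:

```bash
# Store the API key as an encrypted SSM parameter referenced by ssm_key above
aws ssm put-parameter \
  --name "/myapp/api-key" \
  --value "your-api-key" \
  --type SecureString
```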
```python
# Client usage
headers = {"x-api-key": "your-api-key"}
response = requests.post(url, json=data, headers=headers)
```
### Option 2: Modal Proxy Authentication
Configure with `secure: true` for Modal's built-in auth:
```yaml
# modalkit.yaml
deployment_config:
  secure: true  # Enables Modal proxy auth
```
```python
# Client usage
headers = {
    "Modal-Key": "your-modal-key",
    "Modal-Secret": "your-modal-secret"
}
response = requests.post(url, json=data, headers=headers)
```
> 💡 **Tip**: Modal proxy auth is recommended for production as it's managed by Modal and requires no additional setup.
## ⚙️ Configuration
### Configuration Structure
Modalkit uses YAML configuration with two main sections:
```yaml
# modalkit.yaml
app_settings: # Application deployment settings
  app_prefix: str # Prefix for your Modal app name
  auth_config: # Authentication configuration
  build_config: # Container build settings
  deployment_config: # Runtime deployment settings
  batch_config: # Batch processing settings
  queue_config: # Async queue settings

model_settings: # Model-specific settings
  local_model_repository_folder: str
  common: dict # Shared settings across models
  model_entries: # Model-specific configurations
    model_name: dict
```
### Environment Variables
Set the configuration file location:
```bash
# Default location
export MODALKIT_CONFIG="modalkit.yaml"
# Multiple configs (later files override earlier ones)
export MODALKIT_CONFIG="base.yaml,prod.yaml"
# Other environment variables
export MODALKIT_APP_POSTFIX="-prod" # Appended to app name
```
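As an illustration of the layered-config pattern, here is a hypothetical `base.yaml`/`prod.yaml` pair, assuming later files override matching keys in earlier ones:

```yaml
# base.yaml -- shared defaults
app_settings:
  app_prefix: "translation-service"
  deployment_config:
    gpu: null
    concurrency_limit: 2

# prod.yaml -- loaded second via MODALKIT_CONFIG="base.yaml,prod.yaml"
app_settings:
  deployment_config:
    gpu: "T4"
    concurrency_limit: 10
```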
### Advanced Configuration Options
```yaml
deployment_config:
  # GPU configuration
  gpu: "T4"  # T4, A10G, A100, H100, or null

  # Resource limits
  concurrency_limit: 10
  container_idle_timeout: 300
  retries: 3

  # Memory/CPU (when gpu is null)
  memory: 8192  # MB
  cpu: 4.0  # cores

  # Volumes and mounts
  volumes:
    "/mnt/cache": "model-cache-vol"
  mounts:
    - local_path: "configs/prod.json"
      remote_path: "/app/config.json"
      type: "file"
```
## ☁️ Cloud Storage Integration
Modalkit seamlessly integrates with cloud storage providers through Modal's CloudBucketMount:
### Supported Providers
| Provider | Configuration |
|----------|--------------|
| AWS S3 | Native support with IAM credentials |
| Google Cloud Storage | Service account authentication |
| Cloudflare R2 | S3-compatible API |
| MinIO/Others | Any S3-compatible endpoint |
### Quick Examples
<details>
<summary><b>AWS S3 Configuration</b></summary>
```yaml
cloud_bucket_mounts:
  - mount_point: "/mnt/models"
    bucket_name: "my-ml-models"
    secret: "aws-credentials"  # Modal secret name
    key_prefix: "production/"  # Only mount this prefix
    read_only: true
```
First, create the Modal secret:
```bash
modal secret create aws-credentials \
  AWS_ACCESS_KEY_ID=xxx \
  AWS_SECRET_ACCESS_KEY=yyy \
  AWS_DEFAULT_REGION=us-east-1
```
</details>
<details>
<summary><b>Google Cloud Storage</b></summary>
```yaml
cloud_bucket_mounts:
  - mount_point: "/mnt/datasets"
    bucket_name: "my-datasets"
    bucket_endpoint_url: "https://storage.googleapis.com"
    secret: "gcp-credentials"
```
Create secret from service account:
```bash
modal secret create gcp-credentials \
  --from-gcp-service-account path/to/key.json
```
</details>
<details>
<summary><b>Cloudflare R2</b></summary>
```yaml
cloud_bucket_mounts:
  - mount_point: "/mnt/artifacts"
    bucket_name: "ml-artifacts"
    bucket_endpoint_url: "https://accountid.r2.cloudflarestorage.com"
    secret: "r2-credentials"
```
</details>
### Using Mounted Storage
```python
import json

import torch
from modalkit.inference import InferencePipeline

class MyInference(InferencePipeline):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        # Load model from mounted bucket
        model_path = "/mnt/models/my_model.pt"
        self.model = torch.load(model_path)

        # Load dataset
        with open("/mnt/datasets/vocab.json") as f:
            self.vocab = json.load(f)
```
### Best Practices
- ✅ Use read-only mounts for model artifacts
- ✅ Mount only required prefixes with `key_prefix`
- ✅ Use separate buckets for models vs. data
- ✅ Cache frequently accessed files locally (see the sketch below)
- ❌ Avoid writing logs to mounted buckets
- ❌ Don't mount entire buckets if you only need specific files
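A sketch of the local-caching practice mentioned above: copy hot files from the bucket mount to container-local disk on first access. The helper and cache directory are hypothetical, not part of Modalkit:

```python
import shutil
from pathlib import Path

def cached_path(mounted_path: str, cache_dir: str = "/tmp/model_cache") -> str:
    """Copy a file from the (remote) bucket mount to local disk on first use."""
    src = Path(mounted_path)
    dst = Path(cache_dir) / src.name
    if not dst.exists():
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
    return str(dst)

# e.g. inside your pipeline's __init__:
# self.model = torch.load(cached_path("/mnt/models/my_model.pt"))
```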
## 🚀 Advanced Features
### Async Queue Processing
Modalkit supports async processing with multiple queue backends:
```yaml
queue_config:
  backend: "taskiq"  # or "sqs"
  broker_url: "redis://redis:6379"
```
```python
# Async endpoint returns immediately
response = requests.post("/predict_async", json=data)
# {"message_id": "uuid", "status": "queued"}
```
### Batch Processing
Configure intelligent batching for better GPU utilization:
```yaml
batch_config:
  max_batch_size: 32
  wait_ms: 100  # Max time to wait for batch to fill
```
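To see the effect from the client side, a hedged sketch that fires concurrent synchronous requests (same placeholder URL and key as in the Quick Start); requests arriving within the `wait_ms` window can be grouped into a single model invocation, subject to `max_batch_size`:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "https://your-org--translation-service.modal.run/predict_sync"
HEADERS = {"x-api-key": "your-api-key"}

def translate(text: str) -> dict:
    # One synchronous request; concurrent calls can share a server-side batch.
    resp = requests.post(URL, json={"text": text, "language": "en"}, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()

with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(translate, [f"sentence {i}" for i in range(64)]))
print(len(results))  # 64
```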
### Volume Reloading
Auto-reload Modal volumes for model updates:
```yaml
deployment_config:
  volumes:
    "/mnt/models": "model-volume"
  volume_reload_interval_seconds: 300  # Reload every 5 minutes
```
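With this in place, publishing a new model is a matter of updating the named volume, for example with Modal's volume CLI (the file paths here are placeholders):

```bash
# Upload a new checkpoint to the named volume; running containers pick it up
# after the next reload (every 300 s with the interval above).
modal volume put model-volume ./my_model.pt /my_model.pt
```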
## 🛠️ Development
### Setup
```bash
# Clone repository
git clone https://github.com/prassanna-ravishankar/modalkit.git
cd modalkit
# Install with uv (recommended)
uv sync
# Install pre-commit hooks
uv run pre-commit install
```
### Testing
```bash
# Run all tests
uv run pytest --cov --cov-config=pyproject.toml --cov-report=xml
# Run specific tests
uv run pytest tests/test_modal_service.py -v
# Run with HTML coverage report
uv run pytest --cov=modalkit --cov-report=html
```
### Code Quality
```bash
# Run all checks
uv run pre-commit run -a
# Run type checking
uv run mypy modalkit/
# Format code
uv run ruff format modalkit/ tests/
# Lint code
uv run ruff check modalkit/ tests/
```
## 📖 API Reference
### Endpoints
| Endpoint | Method | Description | Returns |
|----------|---------|-------------|----------|
| `/predict_sync` | POST | Synchronous inference | Model output |
| `/predict_async` | POST | Async inference (queued) | Message ID |
| `/predict_batch` | POST | Batch inference | List of outputs |
| `/health` | GET | Health check | Status |
### InferencePipeline Methods
Your model class must implement:
```python
def preprocess(self, input_list: List[InputModel]) -> dict
def predict(self, input_list: List[InputModel], preprocessed_data: dict) -> dict
def postprocess(self, input_list: List[InputModel], raw_output: dict) -> List[OutputModel]
```
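Because these hooks are ordinary methods, the contract can be unit-tested without deploying anything. A hedged pytest sketch reusing the Quick Start classes (the import path and constructor arguments are placeholders for your own layout):

```python
# test_translation_model.py
# Assumes the Quick Start classes live in app.py; adjust the import to your project.
from app import TextInput, TextOutput, TranslationModel

def test_pipeline_contract():
    model = TranslationModel("translation_model", "./models", {"device": "cpu"})
    batch = [TextInput(text="hello"), TextInput(text="world")]

    raw = model.predict(batch, model.preprocess(batch))
    outputs = model.postprocess(batch, raw)

    assert len(outputs) == len(batch)
    assert all(isinstance(o, TextOutput) for o in outputs)
```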
## 🤝 Contributing
We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details.
### Development Workflow
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests and linting (`uv run pytest && uv run pre-commit run -a`)
5. Commit your changes (pre-commit hooks will run automatically)
6. Push to your fork and open a Pull Request
## 📝 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
Built with ❤️ using:
- [Modal](https://modal.com) - Serverless infrastructure for ML
- [FastAPI](https://fastapi.tiangolo.com) - Modern web framework
- [Pydantic](https://pydantic-docs.helpmanual.io) - Data validation
- [Taskiq](https://taskiq-python.github.io) - Async task processing
---
<p align="center">
<a href="https://github.com/prassanna-ravishankar/modalkit/issues">Report Bug</a> •
<a href="https://github.com/prassanna-ravishankar/modalkit/issues">Request Feature</a> •
<a href="https://prassanna-ravishankar.github.io/modalkit">Documentation</a>
</p>
Raw data
```json
{
  "_id": null,
  "home_page": null,
  "name": "modalkit",
  "maintainer": null,
  "docs_url": null,
  "requires_python": "<4.0,>=3.9",
  "maintainer_email": null,
  "keywords": "python",
  "author": null,
  "author_email": "Prassanna Ravishankar <me@prassanna.io>",
  "download_url": "https://files.pythonhosted.org/packages/ad/76/27c93fba9f1aac36e13ec10a2f55559727ac429811a06073e8b598b9bdec/modalkit-0.2.0.tar.gz",
  "platform": null,
"description": "# Modalkit\n\n<p align=\"center\">\n <a href=\"https://img.shields.io/github/v/release/prassanna-ravishankar/modalkit\">\n <img src=\"https://img.shields.io/github/v/release/prassanna-ravishankar/modalkit\" alt=\"Release\">\n </a>\n <a href=\"https://codecov.io/gh/prassanna-ravishankar/modalkit\">\n <img src=\"https://codecov.io/gh/prassanna-ravishankar/modalkit/branch/main/graph/badge.svg\" alt=\"codecov\">\n </a>\n <a href=\"https://img.shields.io/github/commit-activity/m/prassanna-ravishankar/modalkit\">\n <img src=\"https://img.shields.io/github/commit-activity/m/prassanna-ravishankar/modalkit\" alt=\"Commit activity\">\n </a>\n <a href=\"https://img.shields.io/github/license/prassanna-ravishankar/modalkit\">\n <img src=\"https://img.shields.io/github/license/prassanna-ravishankar/modalkit\" alt=\"License\">\n </a>\n</p>\n\n<p align=\"center\">\n <img src=\"./docs/modalkit.png\" width=\"400\" height=\"400\"/>\n</p>\n\n<p align=\"center\">\n A powerful Python framework for deploying ML models on Modal with production-ready features\n</p>\n\n## \ud83c\udfaf What Modalkit Offers Over Raw Modal\n\nWhile Modal provides excellent serverless infrastructure, Modalkit adds a complete ML deployment framework:\n\n### \ud83c\udfd7\ufe0f **Standardized ML Architecture**\n- **Structured Inference Pipeline**: Enforced `preprocess()` \u2192 `predict()` \u2192 `postprocess()` pattern\n- **Consistent API Endpoints**: `/predict_sync`, `/predict_batch`, `/predict_async` across all deployments\n- **Type-Safe Interfaces**: Pydantic models ensure data validation at API boundaries\n\n### \u2699\ufe0f **Configuration-Driven Deployments**\n- **YAML Configuration**: Version-controlled deployment settings instead of scattered code\n- **Environment Management**: Easy dev/staging/prod configs with override capabilities\n- **Reproducible Builds**: Declarative infrastructure removes deployment inconsistencies\n\n### \ud83d\udc65 **Team-Friendly Workflows**\n- **Shared Standards**: All team members deploy models the same way\n- **Code Separation**: Model logic decoupled from Modal deployment boilerplate\n- **Collaboration**: Config files in git enable infrastructure review and collaboration\n\n### \ud83d\ude80 **Production Features Out-of-the-Box**\n- **Authentication Middleware**: Built-in API key or Modal proxy auth\n- **Queue Integration**: Async processing with multiple backend support\n- **Cloud Storage**: Direct S3/GCS/R2 mounting without manual setup\n- **Batch Processing**: Intelligent request batching for GPU efficiency\n- **Error Handling**: Comprehensive error responses and logging\n\n### \ud83d\udca1 **Developer Experience**\n- **Less Boilerplate**: Focus on model code, not FastAPI/Modal setup\n- **Modern Tooling**: Pre-configured with ruff, mypy, pre-commit hooks\n- **Testing Framework**: Built-in patterns for testing ML deployments\n\n**In short**: Modalkit transforms Modal from infrastructure primitives into a complete ML platform, letting teams deploy models consistently while maintaining Modal's performance and scalability.\n\n## \u2728 Key Features\n\n- \ud83d\ude80 **Native Modal Integration**: Seamless deployment on Modal's serverless infrastructure\n- \ud83d\udd10 **Flexible Authentication**: Modal proxy auth or custom API keys with AWS SSM support\n- \u2601\ufe0f **Cloud Storage Support**: Direct mounting of S3, GCS, and R2 buckets\n- \ud83d\udd04 **Queue Integration**: Built-in support for SQS and Taskiq for async workflows\n- \ud83d\udce6 **Batch Inference**: Efficient batch 
processing with configurable batch sizes\n- \ud83c\udfaf **Type Safety**: Full Pydantic integration for request/response validation\n- \ud83d\udee0\ufe0f **Developer Friendly**: Pre-configured with modern Python tooling (ruff, pre-commit)\n- \ud83d\udcca **Production Ready**: Comprehensive error handling and logging\n\n## \ud83d\ude80 Quick Start\n\n### Installation\n\n```bash\n# Using pip\npip install git+https://github.com/prassanna-ravishankar/modalkit.git\n\n# Using uv (recommended)\nuv pip install git+https://github.com/prassanna-ravishankar/modalkit.git\n```\n\n### 1. Define Your Model\n\nCreate an inference class that inherits from `InferencePipeline`:\n\n```python\nfrom modalkit.inference import InferencePipeline\nfrom pydantic import BaseModel\nfrom typing import List\n\n# Define input/output schemas with Pydantic\nclass TextInput(BaseModel):\n text: str\n language: str = \"en\"\n\nclass TextOutput(BaseModel):\n translated_text: str\n confidence: float\n\n# Implement your model logic\nclass TranslationModel(InferencePipeline):\n def __init__(self, model_name: str, all_model_data_folder: str, common_settings: dict, *args, **kwargs):\n super().__init__(model_name, all_model_data_folder, common_settings)\n # Load your model here\n # self.model = load_model(...)\n\n def preprocess(self, input_list: List[TextInput]) -> dict:\n \"\"\"Prepare inputs for the model\"\"\"\n texts = [item.text for item in input_list]\n return {\"texts\": texts, \"languages\": [item.language for item in input_list]}\n\n def predict(self, input_list: List[TextInput], preprocessed_data: dict) -> dict:\n \"\"\"Run model inference\"\"\"\n # Your model prediction logic\n translations = [text.upper() for text in preprocessed_data[\"texts\"]] # Example\n return {\"translations\": translations, \"scores\": [0.95] * len(translations)}\n\n def postprocess(self, input_list: List[TextInput], raw_output: dict) -> List[TextOutput]:\n \"\"\"Format model outputs\"\"\"\n return [\n TextOutput(translated_text=text, confidence=score)\n for text, score in zip(raw_output[\"translations\"], raw_output[\"scores\"])\n ]\n```\n\n### 2. Create Your Modal App\n\n```python\nimport modal\nfrom modalkit.modalapp import ModalService, create_web_endpoints\nfrom modalkit.modalutils import ModalConfig\n\n# Initialize with your config\nmodal_config = ModalConfig()\napp = modal.App(name=modal_config.app_name)\n\n# Define your Modal app class\n@app.cls(**modal_config.get_app_cls_settings())\nclass TranslationApp(ModalService):\n inference_implementation = TranslationModel\n model_name: str = modal.parameter(default=\"translation_model\")\n modal_utils: ModalConfig = modal_config\n\n# Create API endpoints\n@app.function(**modal_config.get_handler_settings())\n@modal.asgi_app(**modal_config.get_asgi_app_settings())\ndef web_endpoints():\n return create_web_endpoints(\n app_cls=TranslationApp,\n input_model=TextInput,\n output_model=TextOutput\n )\n```\n\n### 3. 
Configure Your Deployment\n\nCreate a `modalkit.yaml` configuration file:\n\n```yaml\n# modalkit.yaml\napp_settings:\n app_prefix: \"translation-service\"\n\n # Authentication configuration\n auth_config:\n # Option 1: Use API key from AWS SSM\n ssm_key: \"/translation/api-key\"\n auth_header: \"x-api-key\"\n # Option 2: Use hardcoded API key (not recommended for production)\n # api_key: \"your-api-key-here\"\n # auth_header: \"x-api-key\"\n\n # Container configuration\n build_config:\n image: \"python:3.11-slim\" # or your custom image\n tag: \"latest\"\n workdir: \"/app\"\n env:\n MODEL_VERSION: \"v1.0\"\n\n # Deployment settings\n deployment_config:\n gpu: \"T4\" # Options: T4, A10G, A100, or null for CPU\n concurrency_limit: 10\n container_idle_timeout: 300\n secure: false # Set to true for Modal proxy auth\n\n # Cloud storage mounts (optional)\n cloud_bucket_mounts:\n - mount_point: \"/mnt/models\"\n bucket_name: \"my-model-bucket\"\n secret: \"aws-credentials\"\n read_only: true\n key_prefix: \"models/\"\n\n # Batch processing settings\n batch_config:\n max_batch_size: 32\n wait_ms: 100 # Wait up to 100ms to fill batch\n\n # Queue configuration (for async endpoints)\n queue_config:\n backend: \"taskiq\" # or \"sqs\" for AWS SQS\n broker_url: \"redis://localhost:6379\"\n\n# Model configuration\nmodel_settings:\n local_model_repository_folder: \"./models\"\n common:\n cache_dir: \"./cache\"\n device: \"cuda\" # or \"cpu\"\n model_entries:\n translation_model:\n model_path: \"path/to/model.pt\"\n vocab_size: 50000\n```\n\n### 4. Deploy to Modal\n\n```bash\n# Test locally\nmodal serve app.py\n\n# Deploy to production\nmodal deploy app.py\n\n# View logs\nmodal logs -f\n```\n\n### 5. Use Your API\n\n```python\nimport requests\nimport asyncio\n\n# For standard API key auth\nheaders = {\"x-api-key\": \"your-api-key\"}\n\n# Synchronous endpoint\nresponse = requests.post(\n \"https://your-org--translation-service.modal.run/predict_sync\",\n json={\"text\": \"Hello world\", \"language\": \"en\"},\n headers=headers\n)\nprint(response.json())\n# {\"translated_text\": \"HELLO WORLD\", \"confidence\": 0.95}\n\n# Asynchronous endpoint (returns immediately)\nresponse = requests.post(\n \"https://your-org--translation-service.modal.run/predict_async\",\n json={\"text\": \"Hello world\", \"language\": \"en\"},\n headers=headers\n)\nprint(response.json())\n# {\"message_id\": \"550e8400-e29b-41d4-a716-446655440000\"}\n\n# Batch endpoint\nresponse = requests.post(\n \"https://your-org--translation-service.modal.run/predict_batch\",\n json=[\n {\"text\": \"Hello\", \"language\": \"en\"},\n {\"text\": \"World\", \"language\": \"en\"}\n ],\n headers=headers\n)\nprint(response.json())\n# [{\"translated_text\": \"HELLO\", \"confidence\": 0.95}, {\"translated_text\": \"WORLD\", \"confidence\": 0.95}]\n```\n\n## \ud83d\udd10 Authentication\n\nModalkit provides flexible authentication options:\n\n### Option 1: Custom API Key (Default)\nConfigure with `secure: false` in your deployment config.\n\n```yaml\n# modalkit.yaml\ndeployment_config:\n secure: false\n\nauth_config:\n # Store in AWS SSM (recommended)\n ssm_key: \"/myapp/api-key\"\n # OR hardcode (not recommended)\n # api_key: \"sk-1234567890\"\n auth_header: \"x-api-key\"\n```\n\n```python\n# Client usage\nheaders = {\"x-api-key\": \"your-api-key\"}\nresponse = requests.post(url, json=data, headers=headers)\n```\n\n### Option 2: Modal Proxy Authentication\nConfigure with `secure: true` for Modal's built-in auth:\n\n```yaml\n# 
modalkit.yaml\ndeployment_config:\n secure: true # Enables Modal proxy auth\n```\n\n```python\n# Client usage\nheaders = {\n \"Modal-Key\": \"your-modal-key\",\n \"Modal-Secret\": \"your-modal-secret\"\n}\nresponse = requests.post(url, json=data, headers=headers)\n```\n\n> \ud83d\udca1 **Tip**: Modal proxy auth is recommended for production as it's managed by Modal and requires no additional setup.\n\n## \u2699\ufe0f Configuration\n\n### Configuration Structure\n\nModalkit uses YAML configuration with two main sections:\n\n```yaml\n# modalkit.yaml\napp_settings: # Application deployment settings\n app_prefix: str # Prefix for your Modal app name\n auth_config: # Authentication configuration\n build_config: # Container build settings\n deployment_config: # Runtime deployment settings\n batch_config: # Batch processing settings\n queue_config: # Async queue settings\n\nmodel_settings: # Model-specific settings\n local_model_repository_folder: str\n common: dict # Shared settings across models\n model_entries: # Model-specific configurations\n model_name: dict\n```\n\n### Environment Variables\n\nSet configuration file location:\n```bash\n# Default location\nexport MODALKIT_CONFIG=\"modalkit.yaml\"\n\n# Multiple configs (later files override earlier ones)\nexport MODALKIT_CONFIG=\"base.yaml,prod.yaml\"\n\n# Other environment variables\nexport MODALKIT_APP_POSTFIX=\"-prod\" # Appended to app name\n```\n\n### Advanced Configuration Options\n\n```yaml\ndeployment_config:\n # GPU configuration\n gpu: \"T4\" # T4, A10G, A100, H100, or null\n\n # Resource limits\n concurrency_limit: 10\n container_idle_timeout: 300\n retries: 3\n\n # Memory/CPU (when gpu is null)\n memory: 8192 # MB\n cpu: 4.0 # cores\n\n # Volumes and mounts\n volumes:\n \"/mnt/cache\": \"model-cache-vol\"\n mounts:\n - local_path: \"configs/prod.json\"\n remote_path: \"/app/config.json\"\n type: \"file\"\n```\n\n## \u2601\ufe0f Cloud Storage Integration\n\nModalkit seamlessly integrates with cloud storage providers through Modal's CloudBucketMount:\n\n### Supported Providers\n\n| Provider | Configuration |\n|----------|--------------|\n| AWS S3 | Native support with IAM credentials |\n| Google Cloud Storage | Service account authentication |\n| Cloudflare R2 | S3-compatible API |\n| MinIO/Others | Any S3-compatible endpoint |\n\n### Quick Examples\n\n<details>\n<summary><b>AWS S3 Configuration</b></summary>\n\n```yaml\ncloud_bucket_mounts:\n - mount_point: \"/mnt/models\"\n bucket_name: \"my-ml-models\"\n secret: \"aws-credentials\" # Modal secret name\n key_prefix: \"production/\" # Only mount this prefix\n read_only: true\n```\n\nFirst, create the Modal secret:\n```bash\nmodal secret create aws-credentials \\\n AWS_ACCESS_KEY_ID=xxx \\\n AWS_SECRET_ACCESS_KEY=yyy \\\n AWS_DEFAULT_REGION=us-east-1\n```\n</details>\n\n<details>\n<summary><b>Google Cloud Storage</b></summary>\n\n```yaml\ncloud_bucket_mounts:\n - mount_point: \"/mnt/datasets\"\n bucket_name: \"my-datasets\"\n bucket_endpoint_url: \"https://storage.googleapis.com\"\n secret: \"gcp-credentials\"\n```\n\nCreate secret from service account:\n```bash\nmodal secret create gcp-credentials \\\n --from-gcp-service-account path/to/key.json\n```\n</details>\n\n<details>\n<summary><b>Cloudflare R2</b></summary>\n\n```yaml\ncloud_bucket_mounts:\n - mount_point: \"/mnt/artifacts\"\n bucket_name: \"ml-artifacts\"\n bucket_endpoint_url: \"https://accountid.r2.cloudflarestorage.com\"\n secret: \"r2-credentials\"\n```\n</details>\n\n### Using Mounted Storage\n\n```python\nclass 
MyInference(InferencePipeline):\n def __init__(self, *args, **kwargs):\n super().__init__(*args, **kwargs)\n\n # Load model from mounted bucket\n model_path = \"/mnt/models/my_model.pt\"\n self.model = torch.load(model_path)\n\n # Load dataset\n with open(\"/mnt/datasets/vocab.json\") as f:\n self.vocab = json.load(f)\n```\n\n### Best Practices\n\n- \u2705 Use read-only mounts for model artifacts\n- \u2705 Mount only required prefixes with `key_prefix`\n- \u2705 Use separate buckets for models vs. data\n- \u2705 Cache frequently accessed files locally\n- \u274c Avoid writing logs to mounted buckets\n- \u274c Don't mount entire buckets if you only need specific files\n\n## \ud83d\ude80 Advanced Features\n\n### Async Queue Processing\n\nModalkit supports async processing with multiple queue backends:\n\n```yaml\nqueue_config:\n backend: \"taskiq\" # or \"sqs\"\n broker_url: \"redis://redis:6379\"\n```\n\n```python\n# Async endpoint returns immediately\nresponse = requests.post(\"/predict_async\", json=data)\n# {\"message_id\": \"uuid\", \"status\": \"queued\"}\n```\n\n### Batch Processing\n\nConfigure intelligent batching for better GPU utilization:\n\n```yaml\nbatch_config:\n max_batch_size: 32\n wait_ms: 100 # Max time to wait for batch to fill\n```\n\n### Volume Reloading\n\nAuto-reload Modal volumes for model updates:\n\n```yaml\ndeployment_config:\n volumes:\n \"/mnt/models\": \"model-volume\"\n volume_reload_interval_seconds: 300 # Reload every 5 minutes\n```\n\n## \ud83d\udee0\ufe0f Development\n\n### Setup\n\n```bash\n# Clone repository\ngit clone https://github.com/prassanna-ravishankar/modalkit.git\ncd modalkit\n\n# Install with uv (recommended)\nuv sync\n\n# Install pre-commit hooks\nuv run pre-commit install\n```\n\n### Testing\n\n```bash\n# Run all tests\nuv run pytest --cov --cov-config=pyproject.toml --cov-report=xml\n\n# Run specific tests\nuv run pytest tests/test_modal_service.py -v\n\n# Run with HTML coverage report\nuv run pytest --cov=modalkit --cov-report=html\n```\n\n### Code Quality\n\n```bash\n# Run all checks\nuv run pre-commit run -a\n\n# Run type checking\nuv run mypy modalkit/\n\n# Format code\nuv run ruff format modalkit/ tests/\n\n# Lint code\nuv run ruff check modalkit/ tests/\n```\n\n## \ud83d\udcd6 API Reference\n\n### Endpoints\n\n| Endpoint | Method | Description | Returns |\n|----------|---------|-------------|----------|\n| `/predict_sync` | POST | Synchronous inference | Model output |\n| `/predict_async` | POST | Async inference (queued) | Message ID |\n| `/predict_batch` | POST | Batch inference | List of outputs |\n| `/health` | GET | Health check | Status |\n\n### InferencePipeline Methods\n\nYour model class must implement:\n\n```python\ndef preprocess(self, input_list: List[InputModel]) -> dict\ndef predict(self, input_list: List[InputModel], preprocessed_data: dict) -> dict\ndef postprocess(self, input_list: List[InputModel], raw_output: dict) -> List[OutputModel]\n```\n\n## \ud83e\udd1d Contributing\n\nWe welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details.\n\n### Development Workflow\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Make your changes\n4. Run tests and linting (`uv run pytest && uv run pre-commit run -a`)\n5. Commit your changes (pre-commit hooks will run automatically)\n6. 
Push to your fork and open a Pull Request\n\n## \ud83d\udcdd License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## \ud83d\ude4f Acknowledgments\n\nBuilt with \u2764\ufe0f using:\n- [Modal](https://modal.com) - Serverless infrastructure for ML\n- [FastAPI](https://fastapi.tiangolo.com) - Modern web framework\n- [Pydantic](https://pydantic-docs.helpmanual.io) - Data validation\n- [Taskiq](https://taskiq-python.github.io) - Async task processing\n\n---\n\n<p align=\"center\">\n <a href=\"https://github.com/prassanna-ravishankar/modalkit/issues\">Report Bug</a> \u2022\n <a href=\"https://github.com/prassanna-ravishankar/modalkit/issues\">Request Feature</a> \u2022\n <a href=\"https://prassanna-ravishankar.github.io/modalkit\">Documentation</a>\n</p>\n",
"bugtrack_url": null,
"license": null,
"summary": "A library to package, ship and deploy your ML app",
"version": "0.2.0",
"project_urls": {
"Repository": "https://github.com/prassanna-ravishankar/modalkit"
},
"split_keywords": [
"python"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "63b3bbc9e11b096bbe2e910bfe4fdec742001d5323a41ce870aece242b302b9e",
"md5": "57d8f63972a040358e45a6bdb600506e",
"sha256": "b0e52b40802e080eab411f047c6c5d359140e07f3845954eb9e72d4672d4b14e"
},
"downloads": -1,
"filename": "modalkit-0.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "57d8f63972a040358e45a6bdb600506e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.9",
"size": 24881,
"upload_time": "2025-07-10T13:54:36",
"upload_time_iso_8601": "2025-07-10T13:54:36.786078Z",
"url": "https://files.pythonhosted.org/packages/63/b3/bbc9e11b096bbe2e910bfe4fdec742001d5323a41ce870aece242b302b9e/modalkit-0.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "ad7627c93fba9f1aac36e13ec10a2f55559727ac429811a06073e8b598b9bdec",
"md5": "c20e830281cf5c8ab09421b34a86c087",
"sha256": "4760bdf8968df0a811c40ac2c013564cbeaafd2852ff697cc114b480ffe0060a"
},
"downloads": -1,
"filename": "modalkit-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "c20e830281cf5c8ab09421b34a86c087",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.9",
"size": 42973,
"upload_time": "2025-07-10T13:54:37",
"upload_time_iso_8601": "2025-07-10T13:54:37.941941Z",
"url": "https://files.pythonhosted.org/packages/ad/76/27c93fba9f1aac36e13ec10a2f55559727ac429811a06073e8b598b9bdec/modalkit-0.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-10 13:54:37",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "prassanna-ravishankar",
"github_project": "modalkit",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"tox": true,
"lcname": "modalkit"
}