modalkit 0.2.0 (PyPI)

- Summary: A library to package, ship and deploy your ML app
- Author: Prassanna Ravishankar <me@prassanna.io>
- Requires Python: >=3.9, <4.0
- Keywords: python
- Uploaded: 2025-07-10 13:54:37
- Repository: https://github.com/prassanna-ravishankar/modalkit
- Requirements: none recorded

---
# Modalkit

<p align="center">
  <a href="https://img.shields.io/github/v/release/prassanna-ravishankar/modalkit">
    <img src="https://img.shields.io/github/v/release/prassanna-ravishankar/modalkit" alt="Release">
  </a>
  <a href="https://codecov.io/gh/prassanna-ravishankar/modalkit">
    <img src="https://codecov.io/gh/prassanna-ravishankar/modalkit/branch/main/graph/badge.svg" alt="codecov">
  </a>
  <a href="https://img.shields.io/github/commit-activity/m/prassanna-ravishankar/modalkit">
    <img src="https://img.shields.io/github/commit-activity/m/prassanna-ravishankar/modalkit" alt="Commit activity">
  </a>
  <a href="https://img.shields.io/github/license/prassanna-ravishankar/modalkit">
    <img src="https://img.shields.io/github/license/prassanna-ravishankar/modalkit" alt="License">
  </a>
</p>

<p align="center">
  <img src="./docs/modalkit.png" width="400" height="400"/>
</p>

<p align="center">
  A powerful Python framework for deploying ML models on Modal with production-ready features
</p>

## 🎯 What Modalkit Offers Over Raw Modal

While Modal provides excellent serverless infrastructure, Modalkit adds a complete ML deployment framework:

### 🏗️ **Standardized ML Architecture**
- **Structured Inference Pipeline**: Enforced `preprocess()` → `predict()` → `postprocess()` pattern
- **Consistent API Endpoints**: `/predict_sync`, `/predict_batch`, `/predict_async` across all deployments
- **Type-Safe Interfaces**: Pydantic models ensure data validation at API boundaries

### ⚙️ **Configuration-Driven Deployments**
- **YAML Configuration**: Version-controlled deployment settings instead of scattered code
- **Environment Management**: Easy dev/staging/prod configs with override capabilities
- **Reproducible Builds**: Declarative infrastructure removes deployment inconsistencies

### 👥 **Team-Friendly Workflows**
- **Shared Standards**: All team members deploy models the same way
- **Code Separation**: Model logic decoupled from Modal deployment boilerplate
- **Collaboration**: Config files in git enable infrastructure review and collaboration

### 🚀 **Production Features Out-of-the-Box**
- **Authentication Middleware**: Built-in API key or Modal proxy auth
- **Queue Integration**: Async processing with multiple backend support
- **Cloud Storage**: Direct S3/GCS/R2 mounting without manual setup
- **Batch Processing**: Intelligent request batching for GPU efficiency
- **Error Handling**: Comprehensive error responses and logging

### 💡 **Developer Experience**
- **Less Boilerplate**: Focus on model code, not FastAPI/Modal setup
- **Modern Tooling**: Pre-configured with ruff, mypy, pre-commit hooks
- **Testing Framework**: Built-in patterns for testing ML deployments

**In short**: Modalkit transforms Modal from infrastructure primitives into a complete ML platform, letting teams deploy models consistently while maintaining Modal's performance and scalability.

## ✨ Key Features

- 🚀 **Native Modal Integration**: Seamless deployment on Modal's serverless infrastructure
- 🔐 **Flexible Authentication**: Modal proxy auth or custom API keys with AWS SSM support
- ☁️ **Cloud Storage Support**: Direct mounting of S3, GCS, and R2 buckets
- 🔄 **Queue Integration**: Built-in support for SQS and Taskiq for async workflows
- 📦 **Batch Inference**: Efficient batch processing with configurable batch sizes
- 🎯 **Type Safety**: Full Pydantic integration for request/response validation
- 🛠️ **Developer Friendly**: Pre-configured with modern Python tooling (ruff, pre-commit)
- 📊 **Production Ready**: Comprehensive error handling and logging

## 🚀 Quick Start

### Installation

```bash
# From PyPI
pip install modalkit

# Latest development version from GitHub
pip install git+https://github.com/prassanna-ravishankar/modalkit.git

# Using uv (recommended)
uv pip install git+https://github.com/prassanna-ravishankar/modalkit.git
```

### 1. Define Your Model

Create an inference class that inherits from `InferencePipeline`:

```python
from modalkit.inference import InferencePipeline
from pydantic import BaseModel
from typing import List

# Define input/output schemas with Pydantic
class TextInput(BaseModel):
    text: str
    language: str = "en"

class TextOutput(BaseModel):
    translated_text: str
    confidence: float

# Implement your model logic
class TranslationModel(InferencePipeline):
    def __init__(self, model_name: str, all_model_data_folder: str, common_settings: dict, *args, **kwargs):
        super().__init__(model_name, all_model_data_folder, common_settings)
        # Load your model here
        # self.model = load_model(...)

    def preprocess(self, input_list: List[TextInput]) -> dict:
        """Prepare inputs for the model"""
        texts = [item.text for item in input_list]
        return {"texts": texts, "languages": [item.language for item in input_list]}

    def predict(self, input_list: List[TextInput], preprocessed_data: dict) -> dict:
        """Run model inference"""
        # Your model prediction logic
        translations = [text.upper() for text in preprocessed_data["texts"]]  # Example
        return {"translations": translations, "scores": [0.95] * len(translations)}

    def postprocess(self, input_list: List[TextInput], raw_output: dict) -> List[TextOutput]:
        """Format model outputs"""
        return [
            TextOutput(translated_text=text, confidence=score)
            for text, score in zip(raw_output["translations"], raw_output["scores"])
        ]
```
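
Because the pipeline is plain Python, you can exercise it locally before involving Modal. The constructor arguments below mirror the signature above but are placeholder values; substitute whatever your model actually needs:

```python
# Local smoke test of the pipeline (argument values are illustrative placeholders)
model = TranslationModel(
    model_name="translation_model",
    all_model_data_folder="./models",
    common_settings={},
)

inputs = [TextInput(text="Hello world"), TextInput(text="Goodbye")]
preprocessed = model.preprocess(inputs)
raw_output = model.predict(inputs, preprocessed)
outputs = model.postprocess(inputs, raw_output)

for out in outputs:
    print(out.translated_text, out.confidence)
# HELLO WORLD 0.95
# GOODBYE 0.95
```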

### 2. Create Your Modal App

```python
import modal
from modalkit.modalapp import ModalService, create_web_endpoints
from modalkit.modalutils import ModalConfig

# Initialize with your config
modal_config = ModalConfig()
app = modal.App(name=modal_config.app_name)

# Define your Modal app class
@app.cls(**modal_config.get_app_cls_settings())
class TranslationApp(ModalService):
    inference_implementation = TranslationModel
    model_name: str = modal.parameter(default="translation_model")
    modal_utils: ModalConfig = modal_config

# Create API endpoints
@app.function(**modal_config.get_handler_settings())
@modal.asgi_app(**modal_config.get_asgi_app_settings())
def web_endpoints():
    return create_web_endpoints(
        app_cls=TranslationApp,
        input_model=TextInput,
        output_model=TextOutput
    )
```

### 3. Configure Your Deployment

Create a `modalkit.yaml` configuration file:

```yaml
# modalkit.yaml
app_settings:
  app_prefix: "translation-service"

  # Authentication configuration
  auth_config:
    # Option 1: Use API key from AWS SSM
    ssm_key: "/translation/api-key"
    auth_header: "x-api-key"
    # Option 2: Use hardcoded API key (not recommended for production)
    # api_key: "your-api-key-here"
    # auth_header: "x-api-key"

  # Container configuration
  build_config:
    image: "python:3.11-slim"  # or your custom image
    tag: "latest"
    workdir: "/app"
    env:
      MODEL_VERSION: "v1.0"

  # Deployment settings
  deployment_config:
    gpu: "T4"  # Options: T4, A10G, A100, or null for CPU
    concurrency_limit: 10
    container_idle_timeout: 300
    secure: false  # Set to true for Modal proxy auth

    # Cloud storage mounts (optional)
    cloud_bucket_mounts:
      - mount_point: "/mnt/models"
        bucket_name: "my-model-bucket"
        secret: "aws-credentials"
        read_only: true
        key_prefix: "models/"

  # Batch processing settings
  batch_config:
    max_batch_size: 32
    wait_ms: 100  # Wait up to 100ms to fill batch

  # Queue configuration (for async endpoints)
  queue_config:
    backend: "taskiq"  # or "sqs" for AWS SQS
    broker_url: "redis://localhost:6379"

# Model configuration
model_settings:
  local_model_repository_folder: "./models"
  common:
    cache_dir: "./cache"
    device: "cuda"  # or "cpu"
  model_entries:
    translation_model:
      model_path: "path/to/model.pt"
      vocab_size: 50000
```

### 4. Deploy to Modal

```bash
# Test locally
modal serve app.py

# Deploy to production
modal deploy app.py

# View logs
modal logs -f
```

### 5. Use Your API

```python
import requests
import asyncio

# For standard API key auth
headers = {"x-api-key": "your-api-key"}

# Synchronous endpoint
response = requests.post(
    "https://your-org--translation-service.modal.run/predict_sync",
    json={"text": "Hello world", "language": "en"},
    headers=headers
)
print(response.json())
# {"translated_text": "HELLO WORLD", "confidence": 0.95}

# Asynchronous endpoint (returns immediately)
response = requests.post(
    "https://your-org--translation-service.modal.run/predict_async",
    json={"text": "Hello world", "language": "en"},
    headers=headers
)
print(response.json())
# {"message_id": "550e8400-e29b-41d4-a716-446655440000"}

# Batch endpoint
response = requests.post(
    "https://your-org--translation-service.modal.run/predict_batch",
    json=[
        {"text": "Hello", "language": "en"},
        {"text": "World", "language": "en"}
    ],
    headers=headers
)
print(response.json())
# [{"translated_text": "HELLO", "confidence": 0.95}, {"translated_text": "WORLD", "confidence": 0.95}]
```

## 🔐 Authentication

Modalkit provides flexible authentication options:

### Option 1: Custom API Key (Default)
Configure with `secure: false` in your deployment config.

```yaml
# modalkit.yaml
deployment_config:
  secure: false

auth_config:
  # Store in AWS SSM (recommended)
  ssm_key: "/myapp/api-key"
  # OR hardcode (not recommended)
  # api_key: "sk-1234567890"
  auth_header: "x-api-key"
```

```python
# Client usage
headers = {"x-api-key": "your-api-key"}
response = requests.post(url, json=data, headers=headers)
```

### Option 2: Modal Proxy Authentication
Configure with `secure: true` for Modal's built-in auth:

```yaml
# modalkit.yaml
deployment_config:
  secure: true  # Enables Modal proxy auth
```

```python
# Client usage
headers = {
    "Modal-Key": "your-modal-key",
    "Modal-Secret": "your-modal-secret"
}
response = requests.post(url, json=data, headers=headers)
```

> 💡 **Tip**: Modal proxy auth is recommended for production as it's managed by Modal and requires no additional setup.

## ⚙️ Configuration

### Configuration Structure

Modalkit uses YAML configuration with two main sections:

```yaml
# modalkit.yaml
app_settings:        # Application deployment settings
  app_prefix: str    # Prefix for your Modal app name
  auth_config:       # Authentication configuration
  build_config:      # Container build settings
  deployment_config: # Runtime deployment settings
  batch_config:      # Batch processing settings
  queue_config:      # Async queue settings

model_settings:      # Model-specific settings
  local_model_repository_folder: str
  common: dict       # Shared settings across models
  model_entries:     # Model-specific configurations
    model_name: dict
```

### Environment Variables

Set the configuration file location and related options via environment variables:
```bash
# Default location
export MODALKIT_CONFIG="modalkit.yaml"

# Multiple configs (later files override earlier ones)
export MODALKIT_CONFIG="base.yaml,prod.yaml"

# Other environment variables
export MODALKIT_APP_POSTFIX="-prod"  # Appended to app name
```
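
For scripted deployments you can also set these variables from Python. This is a sketch that assumes `MODALKIT_CONFIG` is resolved when `ModalConfig()` is instantiated, so the variables are set before the import:

```python
import os

# Assumption: configuration is loaded when ModalConfig() is constructed,
# so the environment must be prepared before importing/instantiating it.
os.environ["MODALKIT_CONFIG"] = "base.yaml,prod.yaml"  # later files override earlier ones
os.environ["MODALKIT_APP_POSTFIX"] = "-prod"           # appended to the app name

from modalkit.modalutils import ModalConfig

modal_config = ModalConfig()
print(modal_config.app_name)  # e.g. "translation-service-prod"
```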

### Advanced Configuration Options

```yaml
deployment_config:
  # GPU configuration
  gpu: "T4"  # T4, A10G, A100, H100, or null

  # Resource limits
  concurrency_limit: 10
  container_idle_timeout: 300
  retries: 3

  # Memory/CPU (when gpu is null)
  memory: 8192  # MB
  cpu: 4.0      # cores

  # Volumes and mounts
  volumes:
    "/mnt/cache": "model-cache-vol"
  mounts:
    - local_path: "configs/prod.json"
      remote_path: "/app/config.json"
      type: "file"
```

## ☁️ Cloud Storage Integration

Modalkit seamlessly integrates with cloud storage providers through Modal's CloudBucketMount:

### Supported Providers

| Provider | Notes |
|----------|-------|
| AWS S3 | Native support with IAM credentials |
| Google Cloud Storage | Service account authentication |
| Cloudflare R2 | S3-compatible API |
| MinIO/Others | Any S3-compatible endpoint |

### Quick Examples

<details>
<summary><b>AWS S3 Configuration</b></summary>

```yaml
cloud_bucket_mounts:
  - mount_point: "/mnt/models"
    bucket_name: "my-ml-models"
    secret: "aws-credentials"  # Modal secret name
    key_prefix: "production/"  # Only mount this prefix
    read_only: true
```

First, create the Modal secret:
```bash
modal secret create aws-credentials \
  AWS_ACCESS_KEY_ID=xxx \
  AWS_SECRET_ACCESS_KEY=yyy \
  AWS_DEFAULT_REGION=us-east-1
```
</details>

<details>
<summary><b>Google Cloud Storage</b></summary>

```yaml
cloud_bucket_mounts:
  - mount_point: "/mnt/datasets"
    bucket_name: "my-datasets"
    bucket_endpoint_url: "https://storage.googleapis.com"
    secret: "gcp-credentials"
```

Create secret from service account:
```bash
modal secret create gcp-credentials \
  --from-gcp-service-account path/to/key.json
```
</details>

<details>
<summary><b>Cloudflare R2</b></summary>

```yaml
cloud_bucket_mounts:
  - mount_point: "/mnt/artifacts"
    bucket_name: "ml-artifacts"
    bucket_endpoint_url: "https://accountid.r2.cloudflarestorage.com"
    secret: "r2-credentials"
```
</details>

### Using Mounted Storage

```python
import json

import torch

from modalkit.inference import InferencePipeline


class MyInference(InferencePipeline):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        # Load model from the mounted bucket
        model_path = "/mnt/models/my_model.pt"
        self.model = torch.load(model_path)

        # Load supporting data from another mount
        with open("/mnt/datasets/vocab.json") as f:
            self.vocab = json.load(f)
```

### Best Practices

- ✅ Use read-only mounts for model artifacts
- ✅ Mount only required prefixes with `key_prefix`
- ✅ Use separate buckets for models vs. data
- ✅ Cache frequently accessed files locally (see the sketch after this list)
- ❌ Avoid writing logs to mounted buckets
- ❌ Don't mount entire buckets if you only need specific files
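
For the local-caching recommendation above, a minimal sketch (the mount path follows the earlier examples; the cache directory and helper name are illustrative):

```python
import shutil
from pathlib import Path

def cached_path(mounted_path: str, cache_dir: str = "/tmp/model-cache") -> Path:
    """Copy a file from a mounted bucket to local disk once, then reuse the local copy."""
    src = Path(mounted_path)
    dst = Path(cache_dir) / src.name
    if not dst.exists():
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(src, dst)  # first access reads from the bucket; later reads hit local disk
    return dst

# Example: resolve model weights through the cache
weights_file = cached_path("/mnt/models/my_model.pt")
```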

## 🚀 Advanced Features

### Async Queue Processing

Modalkit supports async processing with multiple queue backends:

```yaml
queue_config:
  backend: "taskiq"  # or "sqs"
  broker_url: "redis://redis:6379"
```

```python
# Async endpoint returns immediately
response = requests.post("/predict_async", json=data)
# {"message_id": "uuid", "status": "queued"}
```

### Batch Processing

Configure intelligent batching for better GPU utilization:

```yaml
batch_config:
  max_batch_size: 32
  wait_ms: 100  # Max time to wait for batch to fill
```
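
`wait_ms` implies dynamic batching on the server: requests arriving within the window can be grouped into one model call, up to `max_batch_size` (exactly which endpoints participate depends on the modalkit version). As an illustration only, concurrent client requests like the following give the server something to batch (URL and key are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "https://your-org--translation-service.modal.run/predict_sync"  # placeholder
HEADERS = {"x-api-key": "your-api-key"}                               # placeholder

def translate(text: str) -> dict:
    response = requests.post(URL, json={"text": text, "language": "en"}, headers=HEADERS)
    response.raise_for_status()
    return response.json()

# Fire 32 requests concurrently; with max_batch_size: 32 and wait_ms: 100,
# the server may serve them with a single batched model call.
with ThreadPoolExecutor(max_workers=32) as pool:
    results = list(pool.map(translate, [f"sentence {i}" for i in range(32)]))

print(results[0])
```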

### Volume Reloading

Auto-reload Modal volumes for model updates:

```yaml
deployment_config:
  volumes:
    "/mnt/models": "model-volume"
  volume_reload_interval_seconds: 300  # Reload every 5 minutes
```

## 🛠️ Development

### Setup

```bash
# Clone repository
git clone https://github.com/prassanna-ravishankar/modalkit.git
cd modalkit

# Install with uv (recommended)
uv sync

# Install pre-commit hooks
uv run pre-commit install
```

### Testing

```bash
# Run all tests
uv run pytest --cov --cov-config=pyproject.toml --cov-report=xml

# Run specific tests
uv run pytest tests/test_modal_service.py -v

# Run with HTML coverage report
uv run pytest --cov=modalkit --cov-report=html
```

### Code Quality

```bash
# Run all checks
uv run pre-commit run -a

# Run type checking
uv run mypy modalkit/

# Format code
uv run ruff format modalkit/ tests/

# Lint code
uv run ruff check modalkit/ tests/
```

## 📖 API Reference

### Endpoints

| Endpoint | Method | Description | Returns |
|----------|---------|-------------|----------|
| `/predict_sync` | POST | Synchronous inference | Model output |
| `/predict_async` | POST | Async inference (queued) | Message ID |
| `/predict_batch` | POST | Batch inference | List of outputs |
| `/health` | GET | Health check | Status |
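
A quick smoke test against a deployment might look like this (the URL pattern follows the Quick Start example and, like the key, is a placeholder; whether `/health` requires the auth header depends on your configuration):

```python
import requests

BASE = "https://your-org--translation-service.modal.run"  # placeholder deployment URL
headers = {"x-api-key": "your-api-key"}  # or Modal-Key/Modal-Secret when secure: true

# Health check
print(requests.get(f"{BASE}/health", headers=headers).json())

# Synchronous prediction
print(
    requests.post(
        f"{BASE}/predict_sync",
        json={"text": "ping", "language": "en"},
        headers=headers,
    ).json()
)
```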

### InferencePipeline Methods

Your model class must implement:

```python
def preprocess(self, input_list: List[InputModel]) -> dict
def predict(self, input_list: List[InputModel], preprocessed_data: dict) -> dict
def postprocess(self, input_list: List[InputModel], raw_output: dict) -> List[OutputModel]
```

## 🤝 Contributing

We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details.

### Development Workflow

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests and linting (`uv run pytest && uv run pre-commit run -a`)
5. Commit your changes (pre-commit hooks will run automatically)
6. Push to your fork and open a Pull Request

## 📝 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

Built with ❤️ using:
- [Modal](https://modal.com) - Serverless infrastructure for ML
- [FastAPI](https://fastapi.tiangolo.com) - Modern web framework
- [Pydantic](https://pydantic-docs.helpmanual.io) - Data validation
- [Taskiq](https://taskiq-python.github.io) - Async task processing

---

<p align="center">
  <a href="https://github.com/prassanna-ravishankar/modalkit/issues">Report Bug</a> •
  <a href="https://github.com/prassanna-ravishankar/modalkit/issues">Request Feature</a> •
  <a href="https://prassanna-ravishankar.github.io/modalkit">Documentation</a>
</p>

            
