litserve

Name: litserve
Version: 0.2.5
Summary: Lightweight AI server.
Home page: https://github.com/Lightning-AI/litserve
Author: Lightning-AI et al.
Requires Python: >=3.8
License: Apache-2.0
Keywords: deep learning, pytorch, ai
Upload time: 2024-11-27 12:08:44
            <div align='center'>

# Easily serve AI models Lightning fast ⚡    

<img alt="Lightning" src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/ls_banner2.png" width="800px" style="max-width: 100%;">

&nbsp;

<strong>Lightning-fast serving engine for AI models.</strong>    
Easy. Flexible. Enterprise-scale.    
</div>

----

**LitServe** is an easy-to-use, flexible serving engine for AI models built on FastAPI. It augments FastAPI with features like batching, streaming, and GPU autoscaling, eliminating the need to rebuild a FastAPI server for every model.  

LitServe is at least [2x faster](#performance) than plain FastAPI due to AI-specific multi-worker handling.    

<div align='center'>
  
<pre>
✅ (2x)+ faster serving  ✅ Easy to use          ✅ LLMs, non LLMs and more
✅ Bring your own model  ✅ PyTorch/JAX/TF/...   ✅ Built on FastAPI       
✅ GPU autoscaling       ✅ Batching, Streaming  ✅ Self-host or ⚡️ managed 
✅ Compound AI           ✅ Integrate with vLLM and more                   
</pre>

<div align='center'>

[![Discord](https://img.shields.io/discord/1077906959069626439?label=Get%20help%20on%20Discord)](https://discord.gg/WajDThKAur)
![cpu-tests](https://github.com/Lightning-AI/litserve/actions/workflows/ci-testing.yml/badge.svg)
[![codecov](https://codecov.io/gh/Lightning-AI/litserve/graph/badge.svg?token=SmzX8mnKlA)](https://codecov.io/gh/Lightning-AI/litserve)
[![license](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/Lightning-AI/litserve/blob/main/LICENSE)

</div>
</div>
<div align="center">
  <div style="text-align: center;">
    <a target="_blank" href="#quick-start" style="margin: 0 10px;">Quick start</a> •
    <a target="_blank" href="#featured-examples" style="margin: 0 10px;">Examples</a> •
    <a target="_blank" href="#features" style="margin: 0 10px;">Features</a> •
    <a target="_blank" href="#performance" style="margin: 0 10px;">Performance</a> •
    <a target="_blank" href="#hosting-options" style="margin: 0 10px;">Hosting</a> •
    <a target="_blank" href="https://lightning.ai/docs/litserve" style="margin: 0 10px;">Docs</a>
  </div>
</div>

&nbsp;

<div align="center">
<a target="_blank" href="https://lightning.ai/docs/litserve/home/get-started">
  <img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/get-started-badge.svg" height="36px" alt="Get started"/>
</a>
</div>

&nbsp; 

# Quick start

Install LitServe via pip ([more options](https://lightning.ai/docs/litserve/home/install)):

```bash
pip install litserve
```
    
### Define a server    
This toy example with two models (a compound AI system) shows LitServe's flexibility ([see real examples](#examples)):    

```python
# server.py
import litserve as ls

# (STEP 1) - DEFINE THE API (compound AI system)
class SimpleLitAPI(ls.LitAPI):
    def setup(self, device):
        # setup is called once at startup. Build a compound AI system (1+ models), connect DBs, load data, etc...
        self.model1 = lambda x: x**2
        self.model2 = lambda x: x**3

    def decode_request(self, request):
        # Convert the request payload to model input.
        return request["input"] 

    def predict(self, x):
        # Easily build compound systems. Run inference and return the output.
        squared = self.model1(x)
        cubed = self.model2(x)
        output = squared + cubed
        return {"output": output}

    def encode_response(self, output):
        # Convert the model output to a response payload.
        return {"output": output} 

# (STEP 2) - START THE SERVER
if __name__ == "__main__":
    # scale with advanced features (batching, GPUs, etc...)
    server = ls.LitServer(SimpleLitAPI(), accelerator="auto", max_batch_size=1)
    server.run(port=8000)
```

Now run the server from the command line:

```bash
python server.py
```
    
### Test the server
Run the auto-generated test client:        
```bash
python client.py    
```

Or use this terminal command:
```bash
curl -X POST http://127.0.0.1:8000/predict -H "Content-Type: application/json" -d '{"input": 4.0}'
```
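For a programmatic check, here is a minimal client sketch using only the standard library (this assumes the toy server from `server.py` above is running locally on port 8000; `/predict` is the route used throughout this README):

```python
import json
import urllib.request

def predict(value, url="http://127.0.0.1:8000/predict"):
    """POST {"input": value} to the server and return the parsed JSON response."""
    payload = json.dumps({"input": value}).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# With the toy server running, predict(4.0) returns the response for
# input 4.0, where the two toy models compute 4**2 + 4**3 = 80.0.
```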

### LLM serving
LitServe isn’t *just* for LLMs like vLLM or Ollama; it serves any AI model with full control over internals ([learn more](https://lightning.ai/docs/litserve/features/serve-llms)).    
For easy LLM serving, integrate [vLLM with LitServe](https://lightning.ai/lightning-ai/studios/deploy-a-private-llama-3-2-rag-api), or use [LitGPT](https://github.com/Lightning-AI/litgpt?tab=readme-ov-file#deploy-an-llm) (built on LitServe). 

```bash
litgpt serve microsoft/phi-2
```

### Summary
- LitAPI lets you easily build complex AI systems with one or more models ([docs](https://lightning.ai/docs/litserve/api-reference/litapi)).
- Use the setup method for one-time tasks like connecting models, DBs, and loading data ([docs](https://lightning.ai/docs/litserve/api-reference/litapi#setup)).        
- LitServer handles optimizations like batching, GPU autoscaling, streaming, etc... ([docs](https://lightning.ai/docs/litserve/api-reference/litserver)).
- Self host on your own machines or use Lightning Studios for a fully managed deployment ([learn more](#hosting-options)).         

[Learn how to make this server 200x faster](https://lightning.ai/docs/litserve/home/speed-up-serving-by-200x).    
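To see what `max_batch_size` buys you, here is a framework-free sketch of the batching idea (toy helper names, not LitServe internals): requests that arrive together are grouped, run through the model in one call, then split back into per-request responses.

```python
def batched_predict(model, requests, max_batch_size=4):
    """Group pending requests into batches, run the model once per batch,
    then fan the outputs back out, one per request (toy sketch)."""
    responses = []
    for start in range(0, len(requests), max_batch_size):
        batch = requests[start:start + max_batch_size]  # batch: group inputs together
        outputs = model(batch)                          # one model call serves many requests
        responses.extend(outputs)                       # unbatch: one output per request
    return responses

# Toy "model" that squares a whole batch at once.
square_batch = lambda xs: [x ** 2 for x in xs]
results = batched_predict(square_batch, [1, 2, 3, 4, 5], max_batch_size=2)  # [1, 4, 9, 16, 25]
```

On a GPU, the single `model(batch)` call is what amortizes per-request overhead and drives the throughput gains described above.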

&nbsp;

# Featured examples    
Use LitServe to deploy any model or AI service (compound AI, gen AI, classic ML, embeddings, LLMs, vision, audio, etc.):       

<div align='center'>
  <div width='200px'>
        <video src="https://github.com/user-attachments/assets/5e73549a-bc0f-47a9-9d9c-5b54389be5de" width='200px' controls></video>    
  </div>
</div>

## Examples    
<pre>
<strong>Toy model:</strong>      <a target="_blank" href="#define-a-server">Hello world</a>
<strong>LLMs:</strong>           <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-llama-3-2-vision-with-litserve">Llama 3.2</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/openai-fault-tolerant-proxy-server">LLM Proxy server</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-ai-agent-with-tool-use">Agent with tool use</a>
<strong>RAG:</strong>            <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-a-private-llama-3-2-rag-api">vLLM RAG (Llama 3.2)</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-a-private-llama-3-1-rag-api">RAG API (LlamaIndex)</a>
<strong>NLP:</strong>            <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-any-hugging-face-model-instantly">Hugging face</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-a-hugging-face-bert-model">BERT</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-text-embedding-api-with-litserve">Text embedding API</a>
<strong>Multimodal:</strong>     <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-open-ai-clip-with-litserve">OpenAI Clip</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-a-multi-modal-llm-with-minicpm">MiniCPM</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-phi3-5-vision-api-with-litserve">Phi-3.5 Vision Instruct</a>, <a target="_blank" href="https://lightning.ai/bhimrajyadav/studios/deploy-and-chat-with-qwen2-vl-using-litserve">Qwen2-VL</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-a-multi-modal-llm-with-pixtral">Pixtral</a>
<strong>Audio:</strong>          <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-open-ai-s-whisper-model">Whisper</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-an-music-generation-api-with-meta-s-audio-craft">AudioCraft</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-an-audio-generation-api">StableAudio</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-a-noise-cancellation-api-with-deepfilternet">Noise cancellation (DeepFilterNet)</a>
<strong>Vision:</strong>         <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-a-private-api-for-stable-diffusion-2">Stable diffusion 2</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-an-image-generation-api-with-auraflow">AuraFlow</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-an-image-generation-api-with-flux">Flux</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-a-super-resolution-image-api-with-aura-sr">Image Super Resolution (Aura SR)</a>,
                <a target="_blank" href="https://lightning.ai/bhimrajyadav/studios/deploy-background-removal-api-with-litserve">Background Removal</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-a-controlled-image-generation-api-controlnet">Control Stable Diffusion (ControlNet)</a>
<strong>Speech:</strong>         <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-a-voice-clone-api-coqui-xtts-v2-model">Text-speech (XTTS V2)</a>, <a target="_blank" href="https://lightning.ai/bhimrajyadav/studios/deploy-a-speech-generation-api-using-parler-tts-powered-by-litserve">Parler-TTS</a>
<strong>Classical ML:</strong>   <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-random-forest-with-litserve">Random forest</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-xgboost-with-litserve">XGBoost</a>
<strong>Miscellaneous:</strong>  <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-an-media-conversion-api-with-ffmpeg">Media conversion API (ffmpeg)</a>, <a target="_blank" href="https://lightning.ai/lightning-ai/studios/deploy-both-pytorch-and-tensorflow-in-a-single-api">PyTorch + TensorFlow in one API</a>
</pre>

[Browse 100+ community-built templates](https://lightning.ai/studios?section=serving)

&nbsp;

# Features
State-of-the-art features:

✅ [(2x)+ faster than plain FastAPI](#performance)      
✅ [Bring your own model](https://lightning.ai/docs/litserve/features/full-control)    
✅ [Build compound systems (1+ models)](https://lightning.ai/docs/litserve/home)    
✅ [GPU autoscaling](https://lightning.ai/docs/litserve/features/gpu-inference)    
✅ [Batching](https://lightning.ai/docs/litserve/features/batching)    
✅ [Streaming](https://lightning.ai/docs/litserve/features/streaming)    
✅ [Worker autoscaling](https://lightning.ai/docs/litserve/features/autoscaling)    
✅ [Self-host on your machines](https://lightning.ai/docs/litserve/features/hosting-methods#host-on-your-own)    
✅ [Host fully managed on Lightning AI](https://lightning.ai/docs/litserve/features/hosting-methods#host-on-lightning-studios)  
✅ [Serve all models (LLMs, vision, etc.)](https://lightning.ai/docs/litserve/examples)        
✅ [Scale to zero (serverless)](https://lightning.ai/docs/litserve/features/streaming)    
✅ [Supports PyTorch, JAX, TF, etc...](https://lightning.ai/docs/litserve/features/full-control)        
✅ [OpenAPI compliant](https://www.openapis.org/)          
✅ [OpenAI compatibility](https://lightning.ai/docs/litserve/features/open-ai-spec)    
✅ [Authentication](https://lightning.ai/docs/litserve/features/authentication)    
✅ [Dockerization](https://lightning.ai/docs/litserve/features/dockerization-deployment)



[10+ features...](https://lightning.ai/docs/litserve/features)    
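As a flavor of the streaming feature: a `predict` can be written as a generator so output is sent as it is produced rather than as one final payload. This standalone sketch mimics that pattern without the server (the word-by-word "tokenization" is made up for illustration):

```python
def predict(prompt):
    """Streaming-style predict: yield output tokens one at a time
    instead of returning a single final result."""
    for token in prompt.split():
        yield token.upper()

# A client consuming the stream sees tokens incrementally.
streamed = list(predict("hello streaming world"))  # ["HELLO", "STREAMING", "WORLD"]
```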

**Note:** We prioritize scalable, enterprise-level features over hype.   

&nbsp;

# Performance  
LitServe is designed for AI workloads. Specialized multi-worker handling delivers a minimum **2x speedup over FastAPI**.    

Additional features like batching and GPU autoscaling can drive performance well beyond 2x, scaling efficiently to handle more simultaneous requests than FastAPI and TorchServe.
    
Reproduce the full benchmarks [here](https://lightning.ai/docs/litserve/home/benchmarks) (higher is better).  

<div align="center">
  <img alt="LitServe" src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/ls_charts_v6.png" width="1000px" style="max-width: 100%;">
</div> 

These results are for image and text classification tasks. The performance relationships hold for other ML tasks (embedding, LLM serving, audio, segmentation, object detection, summarization, etc.).   
    
***💡 Note on LLM serving:*** For high-performance LLM serving (like Ollama/vLLM), integrate [vLLM with LitServe](https://lightning.ai/lightning-ai/studios/deploy-a-private-llama-3-2-rag-api), use [LitGPT](https://github.com/Lightning-AI/litgpt?tab=readme-ov-file#deploy-an-llm), or build your own vLLM-like server with LitServe. Optimizations like kv-caching, which you can implement with LitServe, are needed to maximize LLM performance.
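Why kv-caching matters, in one back-of-the-envelope sketch (a cost model for illustration, not an attention implementation): without a cache, every generated token reprocesses the whole prefix; with one, the prefix is processed once and each new token costs a single step.

```python
def steps_without_cache(prompt_len, n_new):
    """Naive generation: token i reprocesses the entire prefix of length prompt_len + i."""
    return sum(prompt_len + i for i in range(1, n_new + 1))

def steps_with_cache(prompt_len, n_new):
    """kv-cached generation: process the prompt once, then one step per new token."""
    return prompt_len + n_new

no_cache = steps_without_cache(100, 10)  # 1055 token-steps
cached = steps_with_cache(100, 10)       # 110 token-steps
```

The gap widens with longer prompts and outputs, which is why kv-caching is table stakes for fast LLM serving.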

&nbsp; 

# Hosting options   
LitServe can be hosted independently on your own machines or fully managed via Lightning Studios.

Self-hosting is ideal for hackers, students, and DIY developers, while fully managed hosting suits enterprise developers who need easy autoscaling, security, release management, observability, and 99.995% uptime.   

&nbsp;

<div align="center">
<a target="_blank" href="https://lightning.ai/lightning-ai/studios/litserve-hello-world">
  <img src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/app-2/host-on-lightning.svg" alt="Host on Lightning"/>
</a>
</div>

&nbsp;

<div align='center'>
  
| Feature                          | Self Managed                      | Fully Managed on Studios            |
|----------------------------------|-----------------------------------|-------------------------------------|
| Deployment                       | ✅ Do it yourself deployment      | ✅ One-button cloud deploy          |
| Load balancing                   | ❌                                | ✅                                  |
| Autoscaling                      | ❌                                | ✅                                  |
| Scale to zero                    | ❌                                | ✅                                  |
| Multi-machine inference          | ❌                                | ✅                                  |
| Authentication                   | ❌                                | ✅                                  |
| Own VPC                          | ❌                                | ✅                                  |
| AWS, GCP                         | ❌                                | ✅                                  |
| Use your own cloud commits       | ❌                                | ✅                                  |

</div>

&nbsp;

# Community
LitServe is a [community project accepting contributions](https://lightning.ai/docs/litserve/community). Let's build the world's most advanced AI inference engine together.

💬 [Get help on Discord](https://discord.com/invite/XncpTy7DSt)    
📋 [License: Apache 2.0](https://github.com/Lightning-AI/litserve/blob/main/LICENSE)    

            
