FastServeAI


NameFastServeAI JSON
Version 0.0.3 PyPI version JSON
download
home_pagehttps://github.com/aniketmaurya/fastserve
Summary'Machine Learning Serving focused on GenAI & LLMs with simplicity as the top priority.'
upload_time2024-02-23 11:47:20
maintainer
docs_urlNone
authorAniket Maurya
requires_python>=3.8
licenseApache License 2.0
keywords opensource python
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            # FastServe

Machine Learning Serving focused on GenAI & LLMs with simplicity as the top priority.

> [![img_tag](https://img.youtube.com/vi/GfcmyfPB9qY/0.jpg)](https://www.youtube.com/watch?v=GfcmyfPB9qY)
>
> YouTube: How to serve your own GPT like LLM in 1 minute with FastServe

## Installation

**Stable:**
```shell
pip install FastServeAI
```

**Latest:**
```shell
pip install git+https://github.com/aniketmaurya/fastserve.git@main
```

## Run locally

```bash
python -m fastserve
```

## Usage/Examples


### Serve LLMs with Llama-cpp

```python
from fastserve.models import ServeLlamaCpp

model_path = "openhermes-2-mistral-7b.Q5_K_M.gguf"
serve = ServeLlamaCpp(model_path=model_path, )
serve.run_server()
```

or, run `python -m fastserve.models --model llama-cpp --model_path openhermes-2-mistral-7b.Q5_K_M.gguf` from terminal.


### Serve vLLM

```python
from fastserve.models import ServeVLLM

app = ServeVLLM("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
app.run_server()
```

You can use the FastServe client that will automatically apply chat template for you -

```python
from fastserve.client import vLLMClient
from rich import print

client = vLLMClient("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
response = client.chat("Write a python function to resize image to 224x224", keep_context=True)
# print(client.context)
print(response["outputs"][0]["text"])
```


### Serve SDXL Turbo

```python
from fastserve.models import ServeSDXLTurbo

serve = ServeSDXLTurbo(device="cuda", batch_size=2, timeout=1)
serve.run_server()
```

or, run `python -m fastserve.models --model sdxl-turbo --batch_size 2 --timeout 1` from terminal.

This application comes with an UI. You can access it at [http://localhost:8000/ui](http://localhost:8000/ui) .


<img src="https://raw.githubusercontent.com/aniketmaurya/fastserve/main/assets/sdxl.jpg" width=400 style="border: 1px solid #F2F3F5;">


### Face  Detection

```python
from fastserve.models import FaceDetection

serve = FaceDetection(batch_size=2, timeout=1)
serve.run_server()
```

or, run `python -m fastserve.models --model face-detection --batch_size 2 --timeout 1` from terminal.

### Image Classification

```python
from fastserve.models import ServeImageClassification

app = ServeImageClassification("resnet18", timeout=1, batch_size=4)
app.run_server()
```

or, run `python -m fastserve.models --model image-classification --model_name resnet18 --batch_size 4 --timeout 1` from
terminal.

### Serve Custom Model

To serve a custom model, you will have to implement `handle` method for `FastServe` that processes a batch of inputs and
returns the response as a list.

```python
from fastserve import FastServe


class MyModelServing(FastServe):
    def __init__(self):
        super().__init__(batch_size=2, timeout=0.1)
        self.model = create_model(...)

    def handle(self, batch: List[BaseRequest]) -> List[float]:
        inputs = [b.request for b in batch]
        response = self.model(inputs)
        return response


app = MyModelServing()
app.run_server()
```

You can run the above script in terminal, and it will launch a FastAPI server for your custom model.

## Deploy

### Lightning AI Studio ⚡️

```shell
python fastserve.deploy.lightning --filename main.py \
    --user LIGHTNING_USERNAME \
    --teamspace LIGHTNING_TEAMSPACE \
    --machine "CPU"  # T4, A10G or A10G_X_4
```

## Contribute

**Install in editable mode:**

```shell
git clone https://github.com/aniketmaurya/fastserve.git
cd fastserve
pip install -e .
```

**Create a new branch**

```shell
git checkout -b <new-branch>
```

**Make your changes, commit and [create a PR](https://github.com/aniketmaurya/fastserve/compare).**


<!-- ## FAQ

#### Question 1

Answer 1

#### Question 2

Answer 2 -->

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/aniketmaurya/fastserve",
    "name": "FastServeAI",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "opensource,python",
    "author": "Aniket Maurya",
    "author_email": "theaniketmaurya@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/58/57/2004d3807d412a71bd95764dffe32ddd64d82aa2782a38a4904006ca7507/FastServeAI-0.0.3.tar.gz",
    "platform": null,
    "description": "# FastServe\n\nMachine Learning Serving focused on GenAI & LLMs with simplicity as the top priority.\n\n> [![img_tag](https://img.youtube.com/vi/GfcmyfPB9qY/0.jpg)](https://www.youtube.com/watch?v=GfcmyfPB9qY)\n>\n> YouTube: How to serve your own GPT like LLM in 1 minute with FastServe\n\n## Installation\n\n**Stable:**\n```shell\npip install FastServeAI\n```\n\n**Latest:**\n```shell\npip install git+https://github.com/aniketmaurya/fastserve.git@main\n```\n\n## Run locally\n\n```bash\npython -m fastserve\n```\n\n## Usage/Examples\n\n\n### Serve LLMs with Llama-cpp\n\n```python\nfrom fastserve.models import ServeLlamaCpp\n\nmodel_path = \"openhermes-2-mistral-7b.Q5_K_M.gguf\"\nserve = ServeLlamaCpp(model_path=model_path, )\nserve.run_server()\n```\n\nor, run `python -m fastserve.models --model llama-cpp --model_path openhermes-2-mistral-7b.Q5_K_M.gguf` from terminal.\n\n\n### Serve vLLM\n\n```python\nfrom fastserve.models import ServeVLLM\n\napp = ServeVLLM(\"TinyLlama/TinyLlama-1.1B-Chat-v1.0\")\napp.run_server()\n```\n\nYou can use the FastServe client that will automatically apply chat template for you -\n\n```python\nfrom fastserve.client import vLLMClient\nfrom rich import print\n\nclient = vLLMClient(\"TinyLlama/TinyLlama-1.1B-Chat-v1.0\")\nresponse = client.chat(\"Write a python function to resize image to 224x224\", keep_context=True)\n# print(client.context)\nprint(response[\"outputs\"][0][\"text\"])\n```\n\n\n### Serve SDXL Turbo\n\n```python\nfrom fastserve.models import ServeSDXLTurbo\n\nserve = ServeSDXLTurbo(device=\"cuda\", batch_size=2, timeout=1)\nserve.run_server()\n```\n\nor, run `python -m fastserve.models --model sdxl-turbo --batch_size 2 --timeout 1` from terminal.\n\nThis application comes with an UI. You can access it at [http://localhost:8000/ui](http://localhost:8000/ui) .\n\n\n<img src=\"https://raw.githubusercontent.com/aniketmaurya/fastserve/main/assets/sdxl.jpg\" width=400 style=\"border: 1px solid #F2F3F5;\">\n\n\n### Face  Detection\n\n```python\nfrom fastserve.models import FaceDetection\n\nserve = FaceDetection(batch_size=2, timeout=1)\nserve.run_server()\n```\n\nor, run `python -m fastserve.models --model face-detection --batch_size 2 --timeout 1` from terminal.\n\n### Image Classification\n\n```python\nfrom fastserve.models import ServeImageClassification\n\napp = ServeImageClassification(\"resnet18\", timeout=1, batch_size=4)\napp.run_server()\n```\n\nor, run `python -m fastserve.models --model image-classification --model_name resnet18 --batch_size 4 --timeout 1` from\nterminal.\n\n### Serve Custom Model\n\nTo serve a custom model, you will have to implement `handle` method for `FastServe` that processes a batch of inputs and\nreturns the response as a list.\n\n```python\nfrom fastserve import FastServe\n\n\nclass MyModelServing(FastServe):\n    def __init__(self):\n        super().__init__(batch_size=2, timeout=0.1)\n        self.model = create_model(...)\n\n    def handle(self, batch: List[BaseRequest]) -> List[float]:\n        inputs = [b.request for b in batch]\n        response = self.model(inputs)\n        return response\n\n\napp = MyModelServing()\napp.run_server()\n```\n\nYou can run the above script in terminal, and it will launch a FastAPI server for your custom model.\n\n## Deploy\n\n### Lightning AI Studio \u26a1\ufe0f\n\n```shell\npython fastserve.deploy.lightning --filename main.py \\\n    --user LIGHTNING_USERNAME \\\n    --teamspace LIGHTNING_TEAMSPACE \\\n    --machine \"CPU\"  # T4, A10G or A10G_X_4\n```\n\n## Contribute\n\n**Install in editable mode:**\n\n```shell\ngit clone https://github.com/aniketmaurya/fastserve.git\ncd fastserve\npip install -e .\n```\n\n**Create a new branch**\n\n```shell\ngit checkout -b \uff1cnew-branch\uff1e\n```\n\n**Make your changes, commit and [create a PR](https://github.com/aniketmaurya/fastserve/compare).**\n\n\n<!-- ## FAQ\n\n#### Question 1\n\nAnswer 1\n\n#### Question 2\n\nAnswer 2 -->\n",
    "bugtrack_url": null,
    "license": "Apache License 2.0",
    "summary": "'Machine Learning Serving focused on GenAI & LLMs with simplicity as the top priority.'",
    "version": "0.0.3",
    "project_urls": {
        "Homepage": "https://github.com/aniketmaurya/fastserve"
    },
    "split_keywords": [
        "opensource",
        "python"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3d2c0e5a2f8b9c65c33a6c8061003c9e848458a183d9eb5207b728c11bd9dc90",
                "md5": "f3244f3f3b819198379019710cd1e93b",
                "sha256": "cf5be4bda7d9a41425140dcd31a97635ff2f37251fb6a8d9b5f45f9f3f41a8ac"
            },
            "downloads": -1,
            "filename": "FastServeAI-0.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f3244f3f3b819198379019710cd1e93b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 246964,
            "upload_time": "2024-02-23T11:47:18",
            "upload_time_iso_8601": "2024-02-23T11:47:18.203405Z",
            "url": "https://files.pythonhosted.org/packages/3d/2c/0e5a2f8b9c65c33a6c8061003c9e848458a183d9eb5207b728c11bd9dc90/FastServeAI-0.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "58572004d3807d412a71bd95764dffe32ddd64d82aa2782a38a4904006ca7507",
                "md5": "347b70cca13ed5aa8a2109fa1487d472",
                "sha256": "ffb237530f4d45b54216d56927d871975e1b4620208cb352b7c39b34a6411516"
            },
            "downloads": -1,
            "filename": "FastServeAI-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "347b70cca13ed5aa8a2109fa1487d472",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 1549228,
            "upload_time": "2024-02-23T11:47:20",
            "upload_time_iso_8601": "2024-02-23T11:47:20.485055Z",
            "url": "https://files.pythonhosted.org/packages/58/57/2004d3807d412a71bd95764dffe32ddd64d82aa2782a38a4904006ca7507/FastServeAI-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-23 11:47:20",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "aniketmaurya",
    "github_project": "fastserve",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "fastserveai"
}
        
Elapsed time: 0.18785s