jina

- **Name**: jina
- **Version**: 3.28.0
- **Homepage**: https://github.com/jina-ai/jina/
- **Summary**: Multimodal AI services & pipelines with cloud-native stack: gRPC, Kubernetes, Docker, OpenTelemetry, Prometheus, Jaeger, etc.
- **Upload time**: 2024-11-08 13:26:22
- **Author**: Jina AI
- **License**: Apache 2.0
- **Keywords**: jina, cloud-native, cross-modal, multimodal, neural-search, query, search, index, elastic, neural-network, encoding, embedding, serving, docker, container, image, video, audio, deep-learning, mlops
# Jina-Serve
<a href="https://pypi.org/project/jina/"><img alt="PyPI" src="https://img.shields.io/pypi/v/jina?label=Release&style=flat-square"></a>
<a href="https://discord.jina.ai"><img src="https://img.shields.io/discord/1106542220112302130?logo=discord&logoColor=white&style=flat-square"></a>
<a href="https://pypistats.org/packages/jina"><img alt="PyPI - Downloads from official pypistats" src="https://img.shields.io/pypi/dm/jina?style=flat-square"></a>
<a href="https://github.com/jina-ai/jina/actions/workflows/cd.yml"><img alt="Github CD status" src="https://github.com/jina-ai/jina/actions/workflows/cd.yml/badge.svg"></a>

Jina-serve is a framework for building and deploying AI services that communicate via gRPC, HTTP and WebSockets. Scale your services from local development to production while focusing on your core logic.

## Key Features

- Native support for all major ML frameworks and data types
- High-performance service design with scaling, streaming, and dynamic batching
- LLM serving with streaming output
- Built-in Docker integration and Executor Hub
- One-click deployment to Jina AI Cloud
- Enterprise-ready with Kubernetes and Docker Compose support

<details>
<summary><strong>Comparison with FastAPI</strong></summary>

Key advantages over FastAPI:

- DocArray-based data handling with native gRPC support
- Built-in containerization and service orchestration
- Seamless scaling of microservices
- One-command cloud deployment
</details>

## Install 

```bash
pip install jina
```

See guides for [Apple Silicon](https://jina.ai/serve/get-started/install/apple-silicon-m1-m2/) and [Windows](https://jina.ai/serve/get-started/install/windows/).

## Core Concepts

Three main layers:
- **Data**: BaseDoc and DocList for input/output
- **Serving**: Executors process Documents, Gateway connects services
- **Orchestration**: Deployments serve Executors, Flows create pipelines
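For example, the data layer is plain [DocArray](https://github.com/docarray/docarray). A minimal sketch of a schema and a batch (the class and field names here are illustrative):

```python
from docarray import BaseDoc, DocList


class MyDoc(BaseDoc):
    text: str


# DocList batches Documents of one schema and exposes fields column-wise
docs = DocList[MyDoc]([MyDoc(text='hello'), MyDoc(text='world')])
print(docs.text)  # ['hello', 'world']
```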

## Build AI Services

Let's create a gRPC-based AI service using StableLM:

```python
from jina import Executor, requests
from docarray import DocList, BaseDoc
from transformers import pipeline


class Prompt(BaseDoc):
    text: str


class Generation(BaseDoc):
    prompt: str
    text: str


class StableLM(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.generator = pipeline(
            'text-generation', model='stabilityai/stablelm-base-alpha-3b'
        )

    @requests
    def generate(self, docs: DocList[Prompt], **kwargs) -> DocList[Generation]:
        generations = DocList[Generation]()
        prompts = docs.text
        # for a list of prompts, the pipeline returns one list of candidates per prompt
        llm_outputs = self.generator(prompts)
        for prompt, output in zip(prompts, llm_outputs):
            generations.append(
                Generation(prompt=prompt, text=output[0]['generated_text'])
            )
        return generations
```

Deploy with Python or YAML:

```python
from jina import Deployment
from executor import StableLM

dep = Deployment(uses=StableLM, timeout_ready=-1, port=12345)

with dep:
    dep.block()
```

```yaml
jtype: Deployment
with:
 uses: StableLM
 py_modules:
   - executor.py
 timeout_ready: -1
 port: 12345
```
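To run the YAML variant, save it (e.g. as `deployment.yml`, a filename of our choosing) and launch it from the CLI; a sketch, assuming the `jina deployment` subcommand available in recent releases:

```bash
jina deployment --uses deployment.yml
```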

Use the client:

```python
from jina import Client
from docarray import DocList
from executor import Prompt, Generation

prompt = Prompt(text='suggest an interesting image generation prompt')
client = Client(port=12345)
response = client.post('/', inputs=[prompt], return_type=DocList[Generation])
print(response[0].text)
```

## Build Pipelines

Chain services into a Flow:

```python
from jina import Flow
from executor import StableLM
from text_to_image import TextToImage  # assumed second Executor module (see the scaling example below)

flow = Flow(port=12345).add(uses=StableLM).add(uses=TextToImage)

with flow:
    flow.block()
```
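A Flow can also be defined in YAML; a minimal sketch mirroring the pipeline above (this is the kind of `flow.yml` the export commands later in this README consume):

```yaml
jtype: Flow
with:
  port: 12345
executors:
  - uses: StableLM
    py_modules:
      - executor.py
  - uses: TextToImage
    py_modules:
      - text_to_image.py
```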

## Scaling and Deployment

### Local Scaling

Boost throughput with built-in features:
- Replicas for parallel processing
- Shards for data partitioning
- Dynamic batching for efficient model inference

Example scaling a Stable Diffusion deployment (`CUDA_VISIBLE_DEVICES: RR` assigns the available GPUs to replicas in round-robin fashion):

```yaml
jtype: Deployment
with:
 uses: TextToImage
 timeout_ready: -1
 py_modules:
   - text_to_image.py
 env:
  CUDA_VISIBLE_DEVICES: RR
 replicas: 2
 uses_dynamic_batching:
   /default:
     preferred_batch_size: 10
     timeout: 200
```
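The same knobs are available as keyword arguments in Python; a minimal sketch, assuming the `TextToImage` Executor lives in `text_to_image.py` as in the YAML above:

```python
from jina import Deployment
from text_to_image import TextToImage

dep = Deployment(
    uses=TextToImage,
    timeout_ready=-1,
    replicas=2,  # two parallel copies of the Executor
    uses_dynamic_batching={'/default': {'preferred_batch_size': 10, 'timeout': 200}},
)

with dep:
    dep.block()
```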

### Cloud Deployment

#### Containerize Services

1. Structure your Executor:
```
TextToImage/
├── executor.py
├── config.yml
└── requirements.txt
```

2. Configure:
```yaml
# config.yml
jtype: TextToImage
py_modules:
 - executor.py
metas:
 name: TextToImage
 description: Text to Image generation Executor
```

3. Push to Hub:
```bash
jina hub push TextToImage
```
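Once pushed, the containerized Executor can be referenced in a Flow instead of a local class; a sketch, assuming a Hub reference of the form `jinaai+docker://<user-id>/TextToImage` (check the Executor Hub for your exact identifier):

```yaml
# flow.yml
jtype: Flow
with:
  port: 12345
executors:
  - uses: jinaai+docker://<user-id>/TextToImage
```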

#### Deploy to Kubernetes
```bash
jina export kubernetes flow.yml ./my-k8s
kubectl apply -R -f my-k8s
```

#### Use Docker Compose
```bash
jina export docker-compose flow.yml docker-compose.yml
docker-compose up
```

#### JCloud Deployment

Deploy with a single command:
```bash
jina cloud deploy jcloud-flow.yml
```

## LLM Streaming

Enable token-by-token streaming for responsive LLM applications:

1. Define schemas:
```python
from docarray import BaseDoc


class PromptDocument(BaseDoc):
    prompt: str
    max_tokens: int


class ModelOutputDocument(BaseDoc):
    token_id: int
    generated_text: str
```

2. Initialize service:
```python
from jina import Executor, requests
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch  # used by the streaming endpoint in step 3

# module-level tokenizer, shared with the streaming endpoint below
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')


class TokenStreamingExecutor(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.model = GPT2LMHeadModel.from_pretrained('gpt2')
```

3. Implement streaming:
```python
# a method of TokenStreamingExecutor: generate one token per step and yield it
@requests(on='/stream')
async def task(self, doc: PromptDocument, **kwargs) -> ModelOutputDocument:
    input = tokenizer(doc.prompt, return_tensors='pt')
    input_len = input['input_ids'].shape[1]
    for _ in range(doc.max_tokens):
        output = self.model.generate(**input, max_new_tokens=1)
        if output[0][-1] == tokenizer.eos_token_id:
            break
        yield ModelOutputDocument(
            token_id=output[0][-1],
            generated_text=tokenizer.decode(
                output[0][input_len:], skip_special_tokens=True
            ),
        )
        input = {
            'input_ids': output,
            'attention_mask': torch.ones(1, len(output[0])),
        }
```

4. Serve and use:
```python
# Server
import asyncio
from jina import Client, Deployment

with Deployment(uses=TokenStreamingExecutor, port=12345, protocol='grpc') as dep:
    dep.block()


# Client (run from a separate process while the server is up)
async def main():
    client = Client(port=12345, protocol='grpc', asyncio=True)
    async for doc in client.stream_doc(
        on='/stream',
        inputs=PromptDocument(prompt='what is the capital of France ?', max_tokens=10),
        return_type=ModelOutputDocument,
    ):
        print(doc.generated_text)


asyncio.run(main())
```

## Support

Jina-serve is backed by [Jina AI](https://jina.ai) and licensed under [Apache-2.0](./LICENSE).

            
