pyotritonclient

- Name: pyotritonclient
- Version: 0.2.6
- Home page: https://github.com/oeway/pyotritonclient
- Summary: A lightweight http client library for communicating with Nvidia Triton Inference Server (with Pyodide support in the browser)
- Upload time: 2023-06-06 08:48:27
- Author: Wei OUYANG
- License: BSD
- Keywords: pyodide, http, triton, tensorrt, inference, server, service, client, nvidia

# Triton HTTP Client for Pyodide

A Pyodide-compatible Python HTTP client library and utilities for communicating with Triton Inference Server (based on tritonclient from NVIDIA).


This is a simplified implementation of the Triton client from NVIDIA. It works both in the browser with Pyodide and in native Python.
It implements only the HTTP client. Most of the API remains similar, but it has been changed to async and extended with utility functions.

## Installation

To use it in native CPython, you can install the package by running:
```
pip install pyotritonclient
```

In a Pyodide-based Python environment, such as [JupyterLite](https://jupyterlite.readthedocs.io/en/latest/_static/lab/index.html) or the [Pyodide console](https://pyodide-cdn2.iodide.io/dev/full/console.html), install the client by running the following Python code:
```python
import micropip
# micropip.install is asynchronous, so it has to be awaited
await micropip.install("pyotritonclient")
```
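
If the same script has to run in both environments, one option is to detect Pyodide at runtime. The sketch below relies on Pyodide reporting `sys.platform` as `emscripten`. The helper names `running_in_pyodide` and `install_hint` are illustrative and not part of pyotritonclient.

```python
import sys


def running_in_pyodide() -> bool:
    """Return True when the interpreter is Pyodide (CPython compiled to WebAssembly)."""
    # Pyodide sets sys.platform to "emscripten"; native CPython reports
    # values such as "linux", "darwin" or "win32".
    return sys.platform == "emscripten"


def install_hint() -> str:
    """Return the install instruction that matches the current environment."""
    if running_in_pyodide():
        return 'await micropip.install("pyotritonclient")'
    return "pip install pyotritonclient"


print(install_hint())
```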
## Usage

### Basic example
To make running a model easier, the package provides the `execute` utility function:
```python
import numpy as np
from pyotritonclient import execute

# create fake input tensors
input0 = np.zeros([2, 349, 467], dtype='float32')
# run inference
results = await execute(inputs=[input0, {"diameter": 30}], server_url='https://ai.imjoy.io/triton', model_name='cellpose-python')
```

The example above assumes you are running the code in a Jupyter notebook or another environment that supports top-level await. If you are running it in a normal Python script, wrap the code in an async function and run it with asyncio as follows:
```python
import asyncio
import numpy as np
from pyotritonclient import execute

async def run():
    results = await execute(inputs=[np.zeros([2, 349, 467], dtype='float32'), {"diameter": 30}], server_url='https://ai.imjoy.io/triton', model_name='cellpose-python')
    print(results)

# asyncio.run creates an event loop, runs the coroutine, and closes the loop
asyncio.run(run())
```

You can also access the lower-level API; see the [test example](./tests/test_client.py).

The official [client examples](https://github.com/triton-inference-server/client/tree/main/src/python/examples) demonstrate how to use the
package to issue requests to [Triton Inference Server](https://github.com/triton-inference-server/server). Note, however, that you need to
change the HTTP client code to async style. For example, instead of calling `client.infer(...)`, call `await client.infer(...)`.
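
The difference between the synchronous and async call styles can be tried without a server. The sketch below uses a hypothetical `StubAsyncClient` that performs no network I/O. Its name and return value are assumptions for illustration only, not the pyotritonclient API.

```python
import asyncio


class StubAsyncClient:
    """Hypothetical stand-in for an async HTTP client; it performs no network I/O."""

    async def infer(self, model_name, inputs):
        # Yield control to the event loop once, as a real request would while waiting.
        await asyncio.sleep(0)
        return {"model_name": model_name, "num_inputs": len(inputs)}


async def main():
    client = StubAsyncClient()
    # Without `await`, calling an async method only creates a coroutine object.
    pending = client.infer("my-model", inputs=[[0.0, 1.0]])
    was_coroutine = asyncio.iscoroutine(pending)
    # Awaiting the coroutine runs it and yields the actual result.
    result = await pending
    return was_coroutine, result


was_coroutine, result = asyncio.run(main())
print(was_coroutine, result)
```

Forgetting the `await` is the most common mistake when porting synchronous examples: the call returns immediately with a coroutine object instead of the inference result.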

The HTTP client code is forked from the [triton client git repo](https://github.com/triton-inference-server/client) at commit [b3005f9db154247a4c792633e54f25f35ccadff0](https://github.com/triton-inference-server/client/tree/b3005f9db154247a4c792633e54f25f35ccadff0).


### Using the sequence executor with stateful models
To simplify working with stateful models that process a sequence of requests, the package also provides `SequenceExcutor` (spelled as in the import below) for running a model step by step.
```python
from pyotritonclient import SequenceExcutor


seq = SequenceExcutor(
  server_url='https://ai.imjoy.io/triton',
  model_name='cellpose-train',
  sequence_id=100
)
# train_samples is assumed to be an iterable of (image, labels, info) tuples
for (image, labels, info) in train_samples:
  # build the inputs for each sample inside the loop
  inputs = [
    image.astype('float32'),
    labels.astype('float32'),
    {"steps": 1, "resume": True}
  ]
  result = await seq.step(inputs)

# end the sequence (see the note below)
result = await seq.end()
```

Note that the example above calls `seq.end()` without arguments, which ends the sequence by sending the last inputs again. To specify the inputs for this final execution, run `result = await seq.end(inputs)` instead.

For a small batch of data, you can also run it like this:
```python
from pyotritonclient import SequenceExcutor

seq = SequenceExcutor(
  server_url='https://ai.imjoy.io/triton',
  model_name='cellpose-train',
  sequence_id=100
)

# a list of inputs
inputs_batch = [[
  image.astype('float32'),
  labels.astype('float32'),
  {"steps": 1, "resume": True}
] for (image, labels, _) in train_samples]

def on_step(i, result):
  """Function called on every step"""
  print(i)

results = await seq(inputs_batch, on_step=on_step)
```
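
To make the call pattern of the batch form concrete without a server, the sketch below uses a hypothetical `StubSequence` class that only mimics the interface shown above: `step`, `end`, and calling the object with a batch and an `on_step` callback. That the last item ends the sequence is an assumption made for illustration; this is not the library's implementation.

```python
import asyncio


class StubSequence:
    """Hypothetical stand-in that records calls instead of contacting a server."""

    def __init__(self):
        self.calls = []

    async def step(self, inputs):
        self.calls.append(("step", inputs))
        return {"ended": False}

    async def end(self, inputs):
        self.calls.append(("end", inputs))
        return {"ended": True}

    async def __call__(self, inputs_batch, on_step=None):
        results = []
        last = len(inputs_batch) - 1
        for i, inputs in enumerate(inputs_batch):
            # Assumed behavior: every item is a step, and the last one ends the sequence.
            result = await (self.end(inputs) if i == last else self.step(inputs))
            if on_step is not None:
                on_step(i, result)
            results.append(result)
        return results


seen = []
seq = StubSequence()
results = asyncio.run(seq(["a", "b", "c"], on_step=lambda i, r: seen.append(i)))
print(seen, [name for name, _ in seq.calls])
```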



## Server setup
Browsers typically impose more security restrictions than native clients, so the server must be configured to allow browser access.

Please make sure the following are configured:
 * The server must provide HTTPS endpoints instead of HTTP
 * The server should send the following headers:
    - `Access-Control-Allow-Headers: Inference-Header-Content-Length,Accept-Encoding,Content-Encoding,Access-Control-Allow-Headers`
    - `Access-Control-Expose-Headers: Inference-Header-Content-Length,Range,Origin,Content-Type`
    - `Access-Control-Allow-Methods: GET,HEAD,OPTIONS,PUT,POST`
    - `Access-Control-Allow-Origin: *` (This is optional depending on whether you want to support CORS)
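
One way to check a deployment against the list above is to compare a response's headers with the required values. The sketch below does this on a plain dictionary of headers. The function name `missing_cors_values` and the example dictionary are illustrative assumptions, and fetching the headers from a real server is left out.

```python
REQUIRED = {
    "access-control-allow-headers": [
        "Inference-Header-Content-Length", "Accept-Encoding",
        "Content-Encoding", "Access-Control-Allow-Headers",
    ],
    "access-control-expose-headers": [
        "Inference-Header-Content-Length", "Range", "Origin", "Content-Type",
    ],
    "access-control-allow-methods": ["GET", "HEAD", "OPTIONS", "PUT", "POST"],
}


def missing_cors_values(response_headers):
    """Return a dict mapping each required header to the values it lacks."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    normalized = {k.lower(): v for k, v in response_headers.items()}
    missing = {}
    for name, required in REQUIRED.items():
        present = {v.strip().lower() for v in normalized.get(name, "").split(",")}
        lacking = [v for v in required if v.lower() not in present]
        if lacking:
            missing[name] = lacking
    return missing


# Example headers in which POST is absent from the allowed methods.
example = {
    "Access-Control-Allow-Headers": "Inference-Header-Content-Length,Accept-Encoding,Content-Encoding,Access-Control-Allow-Headers",
    "Access-Control-Expose-Headers": "Inference-Header-Content-Length,Range,Origin,Content-Type",
    "Access-Control-Allow-Methods": "GET,HEAD,OPTIONS,PUT",
}
print(missing_cors_values(example))
```

An empty result means every listed value is present. `Access-Control-Allow-Origin` is not checked here because the list above treats it as optional.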

            
