mlserver-huggingface-striveworks

Name	mlserver-huggingface-striveworks
Version	1.4.0.dev3
Summary	HuggingFace runtime for MLServer
upload_time	2024-02-08 22:30:40
docs_url	None
author	Seldon Technologies Ltd.
requires_python	>=3.8.1,<3.12
license	Apache-2.0
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# HuggingFace runtime for MLServer

This package provides an MLServer runtime compatible with HuggingFace Transformers.

## Usage

You can install the runtime, alongside `mlserver`, as:

```bash
pip install mlserver mlserver-huggingface
```

For further information on how to use MLServer with HuggingFace, see this
[worked-out example](../../docs/examples/huggingface/README.md).

## Content Types

The HuggingFace runtime will always decode the input request using its own
built-in codec.
Therefore, [content type annotations](../../docs/user-guide/content-type) at
the request level will **be ignored**.
Note that this **doesn't include [input-level content
type](../../docs/user-guide/content-type#Codecs) annotations**, which will be
respected as usual.
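
To make the distinction concrete, here is a minimal sketch of a request body built as a Python dict, showing where each kind of annotation lives. The input name `question`, the helper `build_request`, and the choice of the `str` content type are illustrative assumptions, not taken from the runtime's documentation.

```python
import json


def build_request(text: str) -> dict:
    """Build a request body carrying both kinds of content type annotation."""
    return {
        # Request-level annotation: per the section above, the HuggingFace
        # runtime ignores this and decodes with its own built-in codec.
        "parameters": {"content_type": "str"},
        "inputs": [
            {
                "name": "question",  # hypothetical input name
                "shape": [1],
                "datatype": "BYTES",
                "data": [text],
                # Input-level annotation: respected as usual.
                "parameters": {"content_type": "str"},
            }
        ],
    }


payload = build_request("What is MLServer?")
print(json.dumps(payload, indent=2))
```

Only the annotation nested inside the entry of `inputs` would be expected to influence decoding.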

## Settings

The HuggingFace runtime exposes a couple of extra parameters which can be used to
customise how the runtime behaves.
These settings can be added under the `parameters.extra` section of your
`model-settings.json` file, e.g.

```{code-block} json
---
emphasize-lines: 5-8
---
{
  "name": "qa",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "parameters": {
    "extra": {
      "task": "question-answering",
      "optimum_model": true
    }
  }
}
```

````{note}
These settings can also be injected through environment variables prefixed with `MLSERVER_MODEL_HUGGINGFACE_`, e.g.

```bash
MLSERVER_MODEL_HUGGINGFACE_TASK="question-answering"
MLSERVER_MODEL_HUGGINGFACE_OPTIMUM_MODEL=true
```
````
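
As a rough illustration of that naming convention, the sketch below converts a `parameters.extra` dictionary into the matching environment variable assignments. The helper `to_env` is hypothetical and not part of MLServer; the upper-casing rule and the lower-case rendering of booleans are inferred from the two examples above rather than from a documented specification.

```python
PREFIX = "MLSERVER_MODEL_HUGGINGFACE_"


def to_env(extra: dict) -> dict:
    """Map extra settings to prefixed, upper-cased environment variable names."""
    env = {}
    for key, value in extra.items():
        if isinstance(value, bool):
            # Render booleans the way the example above writes them.
            rendered = "true" if value else "false"
        else:
            rendered = str(value)
        env[PREFIX + key.upper()] = rendered
    return env


extra = {"task": "question-answering", "optimum_model": True}
for name, value in to_env(extra).items():
    print(f"{name}={value}")
```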

### Loading models

#### Local models

It is possible to load a local model into a HuggingFace pipeline by specifying the model artefact folder path in `parameters.uri` in `model-settings.json`.

#### HuggingFace models

Models in the HuggingFace hub can be loaded by specifying their name in `parameters.extra.pretrained_model` in `model-settings.json`.

````{note}
If `parameters.extra.pretrained_model` is specified, it takes precedence over `parameters.uri`.
````
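
The precedence rule can be sketched as a small function over the `parameters` section of `model-settings.json`. This is an illustration of the documented behaviour, not the runtime's actual code: `resolve_model_source` is a hypothetical name, and the folder path and hub model name are example values.

```python
from typing import Optional


def resolve_model_source(parameters: dict) -> Optional[str]:
    """Return the model source that would be expected to win under the rule above."""
    extra = parameters.get("extra") or {}
    pretrained = extra.get("pretrained_model")
    if pretrained:
        # A hub model name takes precedence over a local artefact folder.
        return pretrained
    return parameters.get("uri")


local_only = {"uri": "./models/qa"}  # example path
both = {
    "uri": "./models/qa",
    "extra": {"pretrained_model": "distilbert-base-cased-distilled-squad"},
}

print(resolve_model_source(local_only))
print(resolve_model_source(both))
```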

#### Model Inference

Model inference is performed by a HuggingFace pipeline, which allows users to run inference on a batch of inputs. Extra inference keyword arguments can be passed in the `parameters.extra` field of the inference request, e.g.

```{code-block} json
{
    "inputs": [
        {
            "name": "text_inputs",
            "shape": [2],
            "datatype": "BYTES",
            "data": ["My kitten's name is JoJo,", "Tell me a story:"]
        }
    ],
    "parameters": {
        "extra": {"max_new_tokens": 200, "return_full_text": false}
    }
}
```
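
For readers calling the server from Python, the following sketch builds an equivalent request body using only the standard library. Several details are assumptions rather than facts from this README: the helper names `build_payload` and `infer`, the base URL `http://localhost:8080`, and the `/v2/models/<name>/infer` path, which follows the V2 inference protocol that MLServer implements. The `infer` function is defined but not called here, since it needs a running server.

```python
import json
import urllib.request


def build_payload(prompts, **inference_kwargs):
    """Build a batched request; the shape is derived from the number of prompts."""
    return {
        "inputs": [
            {
                "name": "text_inputs",
                "shape": [len(prompts)],
                "datatype": "BYTES",
                "data": list(prompts),
            }
        ],
        # Extra keyword arguments forwarded to the pipeline call.
        "parameters": {"extra": inference_kwargs},
    }


def infer(payload, model_name, base_url="http://localhost:8080"):
    """POST the payload to an assumed MLServer endpoint and return the parsed reply."""
    request = urllib.request.Request(
        f"{base_url}/v2/models/{model_name}/infer",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


payload = build_payload(
    ["My kitten's name is JoJo,", "Tell me a story:"],
    max_new_tokens=200,
    return_full_text=False,
)
print(json.dumps(payload, indent=2))
```

Serialising through `json.dumps` also guards against hand-written JSON mistakes such as trailing commas.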

### Reference

You can find the full reference of the accepted extra settings for the
HuggingFace runtime below:

```{eval-rst}

.. autopydantic_settings:: mlserver_huggingface.settings.HuggingFaceSettings
```

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "mlserver-huggingface-striveworks",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8.1,<3.12",
    "maintainer_email": "",
    "keywords": "",
    "author": "Seldon Technologies Ltd.",
    "author_email": "hello@seldon.io",
    "download_url": "https://files.pythonhosted.org/packages/e2/0f/ed8b7bbed00fc75147acf680b93dc9c041db940f5ca3dfc27688a0513d4c/mlserver_huggingface_striveworks-1.4.0.dev3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "HuggingFace runtime for MLServer",
    "version": "1.4.0.dev3",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ae4cdca72cc4c921e13741cc5a0906142453af431f408b03e9604dd8397836c5",
                "md5": "ccaf5ea5410d6fe997de4d22e4feaa9f",
                "sha256": "fe8d7488082647c351e0afb49e38db25d2e9a8485028f346545e4a97aec0cbd7"
            },
            "downloads": -1,
            "filename": "mlserver_huggingface_striveworks-1.4.0.dev3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ccaf5ea5410d6fe997de4d22e4feaa9f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8.1,<3.12",
            "size": 22250,
            "upload_time": "2024-02-08T22:30:38",
            "upload_time_iso_8601": "2024-02-08T22:30:38.710274Z",
            "url": "https://files.pythonhosted.org/packages/ae/4c/dca72cc4c921e13741cc5a0906142453af431f408b03e9604dd8397836c5/mlserver_huggingface_striveworks-1.4.0.dev3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e20fed8b7bbed00fc75147acf680b93dc9c041db940f5ca3dfc27688a0513d4c",
                "md5": "116efcc0885759756aea63fe716ad491",
                "sha256": "8376a60201e3f664a0e59ce3eb83e226dbaa94043840a3b64a67ceadb90b49fc"
            },
            "downloads": -1,
            "filename": "mlserver_huggingface_striveworks-1.4.0.dev3.tar.gz",
            "has_sig": false,
            "md5_digest": "116efcc0885759756aea63fe716ad491",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8.1,<3.12",
            "size": 16278,
            "upload_time": "2024-02-08T22:30:40",
            "upload_time_iso_8601": "2024-02-08T22:30:40.059165Z",
            "url": "https://files.pythonhosted.org/packages/e2/0f/ed8b7bbed00fc75147acf680b93dc9c041db940f5ca3dfc27688a0513d4c/mlserver_huggingface_striveworks-1.4.0.dev3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-08 22:30:40",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "mlserver-huggingface-striveworks"
}
