mlserver-huggingface

Name	mlserver-huggingface
Version	1.6.1
home_page	None
Summary	HuggingFace runtime for MLServer
upload_time	2024-09-10 15:10:54
maintainer	None
docs_url	None
author	Seldon Technologies Ltd.
requires_python	<3.12,>=3.9
license	Apache-2.0

# HuggingFace runtime for MLServer

This package provides an MLServer runtime compatible with HuggingFace Transformers.

## Usage

You can install the runtime, alongside `mlserver`, as:

```bash
pip install mlserver mlserver-huggingface
```

For further information on how to use MLServer with HuggingFace, see
this [worked-out example](../../docs/examples/huggingface/README.md).

## Content Types

The HuggingFace runtime will always decode the input request using its own
built-in codec.
Therefore, [content type annotations](../../docs/user-guide/content-type) at
the request level will **be ignored**.
Note that this **doesn't include [input-level content
type](../../docs/user-guide/content-type#Codecs) annotations**, which will be
respected as usual.
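
As a rough illustration of the distinction above, the sketch below builds a V2-style inference request as a plain Python dictionary. The helper `build_request` is hypothetical and not part of MLServer; the input names `question` and `context` and the `str` content type are assumptions chosen for a question-answering task. The content type is attached to each input's own `parameters`, since a request-level annotation would be ignored by the runtime.

```python
import json


def build_request(inputs, content_type="str"):
    """Build a V2-style inference request (hypothetical helper).

    The content type goes on each input, not on the request itself.
    """
    return {
        "inputs": [
            {
                "name": name,
                "shape": [1],
                "datatype": "BYTES",
                "data": [value],
                # Input-level annotation: respected by the runtime.
                "parameters": {"content_type": content_type},
            }
            for name, value in inputs.items()
        ]
    }


# Assumed input names for a question-answering pipeline.
request = build_request(
    {
        "question": "What does the runtime wrap?",
        "context": "The runtime wraps HuggingFace Transformers pipelines.",
    }
)
print(json.dumps(request, indent=2))
```

Note that the resulting dictionary has no top-level `parameters` key, so nothing in it relies on a request-level content type.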

## Settings

The HuggingFace runtime exposes a few extra parameters that customise how the
runtime behaves.
These settings can be added under the `parameters.extra` section of your
`model-settings.json` file, for example:

```{code-block} json
---
emphasize-lines: 5-8
---
{
  "name": "qa",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "parameters": {
    "extra": {
      "task": "question-answering",
      "optimum_model": true
    }
  }
}
```

````{note}
These settings can also be injected through environment variables prefixed with `MLSERVER_MODEL_HUGGINGFACE_`, e.g.

```bash
MLSERVER_MODEL_HUGGINGFACE_TASK="question-answering"
MLSERVER_MODEL_HUGGINGFACE_OPTIMUM_MODEL=true
```
````
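
The naming convention in the note above can be sketched in Python. The helper `to_env_vars` is hypothetical (it is not provided by MLServer); it assumes that each key under `parameters.extra` maps to the prefix followed by the upper-cased key, and that booleans are written as lowercase `true`/`false`, as in the bash example.

```python
ENV_PREFIX = "MLSERVER_MODEL_HUGGINGFACE_"


def to_env_vars(extra):
    """Map `parameters.extra` keys to environment variable names (hypothetical helper)."""
    env = {}
    for key, value in extra.items():
        if isinstance(value, bool):
            # Render booleans as lowercase, matching the bash example.
            value = "true" if value else "false"
        env[ENV_PREFIX + key.upper()] = str(value)
    return env


env = to_env_vars({"task": "question-answering", "optimum_model": True})
for name, value in env.items():
    print(f"{name}={value}")
```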

### Loading models
#### Local models
To load a local model into a HuggingFace pipeline, set `parameters.uri` in `model-settings.json` to the path of the model artefact folder.

#### HuggingFace models
Models on the HuggingFace Hub can be loaded by setting `parameters.extra.pretrained_model` in `model-settings.json` to the model's name.

````{note}
If `parameters.extra.pretrained_model` is specified, it takes precedence over `parameters.uri`.
````
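
To make the two loading options concrete, here is a minimal Python sketch that builds both variants of the settings. The helper names `make_model_settings` and `effective_model_source`, the local path `./my-local-model`, and the Hub model name are illustrative assumptions; `effective_model_source` only restates the precedence rule from the note above and is not the runtime's own code.

```python
import json


def make_model_settings(name, task, uri=None, pretrained_model=None):
    """Build a model-settings dictionary (hypothetical helper)."""
    extra = {"task": task}
    if pretrained_model is not None:
        extra["pretrained_model"] = pretrained_model
    parameters = {"extra": extra}
    if uri is not None:
        parameters["uri"] = uri
    return {
        "name": name,
        "implementation": "mlserver_huggingface.HuggingFaceRuntime",
        "parameters": parameters,
    }


def effective_model_source(settings):
    """Restate the documented rule: `pretrained_model` wins over `uri`."""
    parameters = settings["parameters"]
    return parameters["extra"].get("pretrained_model") or parameters.get("uri")


# Local artefact folder only (path is a placeholder).
local = make_model_settings("qa", "question-answering", uri="./my-local-model")

# Both set: the Hub model name is expected to take precedence.
hub = make_model_settings(
    "qa",
    "question-answering",
    uri="./my-local-model",
    pretrained_model="distilbert-base-cased-distilled-squad",
)
print(json.dumps(hub, indent=2))
```

Writing either dictionary to `model-settings.json` with `json.dump` should give a file in the same shape as the settings example earlier in this section.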

### Reference

You can find the full reference of the accepted extra settings for the
HuggingFace runtime below:

```{eval-rst}

.. autopydantic_settings:: mlserver_huggingface.settings.HuggingFaceSettings
```

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "mlserver-huggingface",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.12,>=3.9",
    "maintainer_email": null,
    "keywords": null,
    "author": "Seldon Technologies Ltd.",
    "author_email": "hello@seldon.io",
    "download_url": "https://files.pythonhosted.org/packages/57/48/c03b0599dff5fc7ada86cc8c461cfc3fd855031b617a3f9b928334c90067/mlserver_huggingface-1.6.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "HuggingFace runtime for MLServer",
    "version": "1.6.1",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d3f6d755c23e9489fc483079df9bf1ed80c9cf84d262f424a19d4b9bc50551af",
                "md5": "e0feaa9c2cee426bf08f43b4745cf53e",
                "sha256": "3b3cf325515a53dc5e35ecf17ee1a16341f2bc1b2875f8360b199a49e1a0468a"
            },
            "downloads": -1,
            "filename": "mlserver_huggingface-1.6.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e0feaa9c2cee426bf08f43b4745cf53e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.12,>=3.9",
            "size": 21379,
            "upload_time": "2024-09-10T15:10:52",
            "upload_time_iso_8601": "2024-09-10T15:10:52.157743Z",
            "url": "https://files.pythonhosted.org/packages/d3/f6/d755c23e9489fc483079df9bf1ed80c9cf84d262f424a19d4b9bc50551af/mlserver_huggingface-1.6.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5748c03b0599dff5fc7ada86cc8c461cfc3fd855031b617a3f9b928334c90067",
                "md5": "27afc4b689d0d9f263e84d2ff0e38f8f",
                "sha256": "52e61a6edce8286ae90e65c62ee9fdb40da3f2ef29e16c9f4a10b7a925cf56e0"
            },
            "downloads": -1,
            "filename": "mlserver_huggingface-1.6.1.tar.gz",
            "has_sig": false,
            "md5_digest": "27afc4b689d0d9f263e84d2ff0e38f8f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.12,>=3.9",
            "size": 15778,
            "upload_time": "2024-09-10T15:10:54",
            "upload_time_iso_8601": "2024-09-10T15:10:54.403191Z",
            "url": "https://files.pythonhosted.org/packages/57/48/c03b0599dff5fc7ada86cc8c461cfc3fd855031b617a3f9b928334c90067/mlserver_huggingface-1.6.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-10 15:10:54",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "mlserver-huggingface"
}
