# HuggingFace runtime for MLServer
This package provides an MLServer runtime compatible with HuggingFace Transformers.
## Usage
You can install the runtime, alongside `mlserver`, as:
```bash
pip install mlserver mlserver-huggingface
```
For further information on how to use MLServer with HuggingFace, see this
[worked-out example](../../docs/examples/huggingface/README.md).
## Content Types
The HuggingFace runtime will always decode the input request using its own
built-in codec.
Therefore, [content type annotations](../../docs/user-guide/content-type) at
the request level will **be ignored**.
Note that this **doesn't include [input-level content
type](../../docs/user-guide/content-type#Codecs) annotations**, which will be
respected as usual.
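
As a minimal sketch of the distinction above, the snippet below builds an inference request body where a content type is set both at the request level (ignored by this runtime) and at the input level (respected). The helper `build_request` is hypothetical, and the input names `question` and `context`, as well as the `str` and `np` content type values, are assumptions chosen for a question-answering task; adapt them to your own pipeline.

```python
import json


def build_request(question: str, context: str) -> dict:
    """Build an inference request body (hypothetical helper, for illustration)."""
    return {
        # Request-level annotation: the HuggingFace runtime ignores this one.
        "parameters": {"content_type": "np"},
        "inputs": [
            {
                "name": "question",  # assumed input name
                "shape": [1],
                "datatype": "BYTES",
                "data": [question],
                # Input-level annotation: this one is respected as usual.
                "parameters": {"content_type": "str"},
            },
            {
                "name": "context",  # assumed input name
                "shape": [1],
                "datatype": "BYTES",
                "data": [context],
                "parameters": {"content_type": "str"},
            },
        ],
    }


payload = build_request("What is MLServer?", "MLServer is an inference server.")
print(json.dumps(payload, indent=2))
```

The resulting JSON could then be sent as the body of an inference request to the deployed model.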
## Settings
The HuggingFace runtime exposes a couple of extra parameters which can be used to
customise how the runtime behaves.
These settings can be added under the `parameters.extra` section of your
`model-settings.json` file, e.g.
```{code-block} json
---
emphasize-lines: 5-8
---
{
  "name": "qa",
  "implementation": "mlserver_huggingface.HuggingFaceRuntime",
  "parameters": {
    "extra": {
      "task": "question-answering",
      "optimum_model": true
    }
  }
}
```
````{note}
These settings can also be injected through environment variables prefixed with `MLSERVER_MODEL_HUGGINGFACE_`, e.g.
```bash
MLSERVER_MODEL_HUGGINGFACE_TASK="question-answering"
MLSERVER_MODEL_HUGGINGFACE_OPTIMUM_MODEL=true
```
````
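
To make the naming convention in the note concrete, here is a small sketch that derives the environment variable names from a `parameters.extra` dictionary. The helper `extra_to_env` is hypothetical and not part of MLServer; it only mirrors the prefix and upper-casing pattern shown in the note, and the way booleans are rendered (`true` / `false`) is an assumption based on that example.

```python
ENV_PREFIX = "MLSERVER_MODEL_HUGGINGFACE_"


def extra_to_env(extra: dict) -> dict:
    """Map `parameters.extra` keys to prefixed environment variables.

    Hypothetical helper: illustrates the naming pattern only.
    """
    env = {}
    for key, value in extra.items():
        if isinstance(value, bool):
            # Assumption: booleans are written in lowercase, as in the note.
            rendered = "true" if value else "false"
        else:
            rendered = str(value)
        env[ENV_PREFIX + key.upper()] = rendered
    return env


settings = {"task": "question-answering", "optimum_model": True}
for name, value in extra_to_env(settings).items():
    print(f"{name}={value}")
```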
### Loading models
#### Local models
It is possible to load a local model into a HuggingFace pipeline by specifying the model artefact folder path in `parameters.uri` in `model-settings.json`.
#### HuggingFace models
Models in the HuggingFace hub can be loaded by specifying their name in `parameters.extra.pretrained_model` in `model-settings.json`.
````{note}
If `parameters.extra.pretrained_model` is specified, it takes precedence over `parameters.uri`.
````
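
The two loading options can be sketched as follows. The snippet generates the contents of a `model-settings.json` file for either a local artefact folder or a model from the HuggingFace hub. The helper `make_model_settings`, the `./artefacts/qa-model` path, and the `distilbert-base-cased-distilled-squad` model name are illustrative assumptions rather than values taken from the runtime itself.

```python
import json
from typing import Optional


def make_model_settings(
    name: str,
    task: str,
    uri: Optional[str] = None,
    pretrained_model: Optional[str] = None,
) -> dict:
    """Build a model-settings dictionary (hypothetical helper)."""
    parameters: dict = {"extra": {"task": task}}
    if uri is not None:
        # Local model: path to the model artefact folder.
        parameters["uri"] = uri
    if pretrained_model is not None:
        # Hub model: takes precedence over `parameters.uri` when both are set.
        parameters["extra"]["pretrained_model"] = pretrained_model
    return {
        "name": name,
        "implementation": "mlserver_huggingface.HuggingFaceRuntime",
        "parameters": parameters,
    }


# Assumed example values, for illustration only.
local = make_model_settings("qa", "question-answering", uri="./artefacts/qa-model")
hub = make_model_settings(
    "qa",
    "question-answering",
    pretrained_model="distilbert-base-cased-distilled-squad",
)

print(json.dumps(local, indent=2))
print(json.dumps(hub, indent=2))
```

Either dictionary can be written to `model-settings.json` with `json.dump`.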
### Reference
You can find the full reference of the accepted extra settings for the
HuggingFace runtime below:
```{eval-rst}
.. autopydantic_settings:: mlserver_huggingface.settings.HuggingFaceSettings
```
Raw data
{
"_id": null,
"home_page": null,
"name": "mlserver-huggingface",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.12,>=3.9",
"maintainer_email": null,
"keywords": null,
"author": "Seldon Technologies Ltd.",
"author_email": "hello@seldon.io",
"download_url": "https://files.pythonhosted.org/packages/57/48/c03b0599dff5fc7ada86cc8c461cfc3fd855031b617a3f9b928334c90067/mlserver_huggingface-1.6.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "HuggingFace runtime for MLServer",
"version": "1.6.1",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "d3f6d755c23e9489fc483079df9bf1ed80c9cf84d262f424a19d4b9bc50551af",
"md5": "e0feaa9c2cee426bf08f43b4745cf53e",
"sha256": "3b3cf325515a53dc5e35ecf17ee1a16341f2bc1b2875f8360b199a49e1a0468a"
},
"downloads": -1,
"filename": "mlserver_huggingface-1.6.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e0feaa9c2cee426bf08f43b4745cf53e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.12,>=3.9",
"size": 21379,
"upload_time": "2024-09-10T15:10:52",
"upload_time_iso_8601": "2024-09-10T15:10:52.157743Z",
"url": "https://files.pythonhosted.org/packages/d3/f6/d755c23e9489fc483079df9bf1ed80c9cf84d262f424a19d4b9bc50551af/mlserver_huggingface-1.6.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5748c03b0599dff5fc7ada86cc8c461cfc3fd855031b617a3f9b928334c90067",
"md5": "27afc4b689d0d9f263e84d2ff0e38f8f",
"sha256": "52e61a6edce8286ae90e65c62ee9fdb40da3f2ef29e16c9f4a10b7a925cf56e0"
},
"downloads": -1,
"filename": "mlserver_huggingface-1.6.1.tar.gz",
"has_sig": false,
"md5_digest": "27afc4b689d0d9f263e84d2ff0e38f8f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.12,>=3.9",
"size": 15778,
"upload_time": "2024-09-10T15:10:54",
"upload_time_iso_8601": "2024-09-10T15:10:54.403191Z",
"url": "https://files.pythonhosted.org/packages/57/48/c03b0599dff5fc7ada86cc8c461cfc3fd855031b617a3f9b928334c90067/mlserver_huggingface-1.6.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-10 15:10:54",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "mlserver-huggingface"
}