![SageMaker](https://github.com/aws/sagemaker-inference-toolkit/raw/master/branding/icon/sagemaker-banner.png)
# SageMaker Inference Toolkit
[![Latest Version](https://img.shields.io/pypi/v/sagemaker-inference.svg)](https://pypi.python.org/pypi/sagemaker-inference) [![Supported Python Versions](https://img.shields.io/pypi/pyversions/sagemaker-inference.svg)](https://pypi.python.org/pypi/sagemaker-inference) [![Code Style: Black](https://img.shields.io/badge/code_style-black-000000.svg)](https://github.com/python/black)
Serve machine learning models within a Docker container using Amazon
SageMaker.
## :books: Background
[Amazon SageMaker](https://aws.amazon.com/sagemaker/) is a fully managed service for data science and machine learning (ML) workflows.
You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models.
Once you have a trained model, you can include it in a [Docker container](https://www.docker.com/resources/what-container) that runs your inference code.
A container provides an effectively isolated environment, ensuring a consistent runtime regardless of where the container is deployed.
Containerizing your model and code enables fast and reliable deployment of your model.
The **SageMaker Inference Toolkit** implements a model serving stack and can be easily added to any Docker container, making it [deployable to SageMaker](https://aws.amazon.com/sagemaker/deploy/).
This library's serving stack is built on [Multi Model Server](https://github.com/awslabs/multi-model-server), and it can serve your own models or those you trained on SageMaker using [machine learning frameworks with native SageMaker support](https://docs.aws.amazon.com/sagemaker/latest/dg/frameworks.html).
If you use a [prebuilt SageMaker Docker image for inference](https://docs.aws.amazon.com/sagemaker/latest/dg/pre-built-containers-frameworks-deep-learning.html), this library may already be included.
For more information, see the Amazon SageMaker Developer Guide sections on [building your own container with Multi Model Server](https://docs.aws.amazon.com/sagemaker/latest/dg/build-multi-model-build-container.html) and [using your own models](https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html).
## :hammer_and_wrench: Installation
To install this library in your Docker image, add the following line to your [Dockerfile](https://docs.docker.com/engine/reference/builder/):
``` dockerfile
RUN pip3 install multi-model-server sagemaker-inference
```
[Here is an example](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/multi_model_bring_your_own/container/Dockerfile) of a Dockerfile that installs SageMaker Inference Toolkit.
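If you are assembling your own image from scratch, a minimal Dockerfile might look like the sketch below. The base image, system packages, file names, and paths are illustrative assumptions, not requirements of the toolkit; the linked example above is the authoritative reference.

``` dockerfile
# Illustrative sketch only: base image, package names, and file paths are
# assumptions, not requirements of the SageMaker Inference Toolkit.
FROM ubuntu:20.04

# Multi Model Server needs a Java runtime and Python; adjust packages to your base image.
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
    apt-get install -y --no-install-recommends python3 python3-pip openjdk-8-jre-headless && \
    rm -rf /var/lib/apt/lists/*

# Install the model server and the SageMaker Inference Toolkit.
RUN pip3 install --no-cache-dir multi-model-server sagemaker-inference

# Copy your handler service and serving entrypoint (see the Usage section below);
# the file names here are placeholders.
COPY handler_service.py entrypoint.py /usr/local/bin/

# Start serving when the container launches.
ENTRYPOINT ["python3", "/usr/local/bin/entrypoint.py"]
```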
## :computer: Usage
### Implementation Steps
To use the SageMaker Inference Toolkit, you need to do the following:
1. Implement an inference handler, which is responsible for loading the model and providing input, predict, and output functions.
([Here is an example](https://github.com/aws/sagemaker-pytorch-serving-container/blob/master/src/sagemaker_pytorch_serving_container/default_pytorch_inference_handler.py) of an inference handler.) The `decoder` and `encoder` utilities used here are illustrated in a standalone sketch after this list.
``` python
import textwrap

from sagemaker_inference import content_types, decoder, default_inference_handler, encoder, errors


class DefaultPytorchInferenceHandler(default_inference_handler.DefaultInferenceHandler):

    def default_model_fn(self, model_dir, context=None):
        """Loads a model. For PyTorch, a default function to load a model cannot be provided.
        Users should provide a customized model_fn() in the script.

        Args:
            model_dir: a directory where the model is saved.
            context (obj): the request context (default: None).

        Returns: A PyTorch model.
        """
        raise NotImplementedError(textwrap.dedent("""
        Please provide a model_fn implementation.
        See documentation for model_fn at https://github.com/aws/sagemaker-python-sdk
        """))

    def default_input_fn(self, input_data, content_type, context=None):
        """A default input_fn that can handle JSON, CSV and NPZ formats.

        Args:
            input_data: the request payload serialized in the content_type format
            content_type: the request content_type
            context (obj): the request context (default: None).

        Returns: input_data deserialized into torch.FloatTensor or torch.cuda.FloatTensor,
            depending on whether CUDA is available.
        """
        return decoder.decode(input_data, content_type)

    def default_predict_fn(self, data, model, context=None):
        """A default predict_fn for PyTorch. Calls a model on data deserialized in input_fn.
        Runs prediction on GPU if CUDA is available.

        Args:
            data: input data (torch.Tensor) for prediction deserialized by input_fn
            model: PyTorch model loaded in memory by model_fn
            context (obj): the request context (default: None).

        Returns: a prediction
        """
        return model(data)

    def default_output_fn(self, prediction, accept, context=None):
        """A default output_fn for PyTorch. Serializes predictions from predict_fn to JSON, CSV or NPY format.

        Args:
            prediction: a prediction result from predict_fn
            accept: the content type to which the output data should be serialized
            context (obj): the request context (default: None).

        Returns: the serialized output data
        """
        return encoder.encode(prediction, accept)
```
Note: passing `context` as an argument to the handler functions is optional. You can omit `context` from the function declarations if it is not needed at runtime. For example, the following handler function declarations will also work:
```
def default_model_fn(self, model_dir)

def default_input_fn(self, input_data, content_type)

def default_predict_fn(self, data, model)

def default_output_fn(self, prediction, accept)
```
2. Implement a handler service that is executed by the model server.
([Here is an example](https://github.com/aws/sagemaker-pytorch-serving-container/blob/master/src/sagemaker_pytorch_serving_container/handler_service.py) of a handler service.)
For more information on how to define your `HANDLER_SERVICE` file, see [the MMS custom service documentation](https://github.com/awslabs/multi-model-server/blob/master/docs/custom_service.md).
``` python
from sagemaker_inference.default_handler_service import DefaultHandlerService
from sagemaker_inference.transformer import Transformer
from sagemaker_pytorch_serving_container.default_inference_handler import DefaultPytorchInferenceHandler


class HandlerService(DefaultHandlerService):
    """Handler service that is executed by the model server.

    Determines the specific default inference handlers to use based on the model being served.

    This class extends ``DefaultHandlerService``, which defines the following:
        - The ``handle`` method is invoked for all incoming inference requests to the model server.
        - The ``initialize`` method is invoked at model server start up.

    Based on: https://github.com/awslabs/multi-model-server/blob/master/docs/custom_service.md
    """

    def __init__(self):
        transformer = Transformer(default_inference_handler=DefaultPytorchInferenceHandler())
        super(HandlerService, self).__init__(transformer=transformer)
```
3. Implement a serving entrypoint, which starts the model server.
([Here is an example](https://github.com/aws/sagemaker-pytorch-serving-container/blob/master/src/sagemaker_pytorch_serving_container/serving.py) of a serving entrypoint; a combined sketch of such an entrypoint appears after this list.)
``` python
from sagemaker_inference import model_server
model_server.start_model_server(handler_service=HANDLER_SERVICE)
```
4. Define the location of the entrypoint in your Dockerfile.
``` dockerfile
ENTRYPOINT ["python", "/usr/local/bin/entrypoint.py"]
```
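Putting steps 3 and 4 together, the serving entrypoint referenced by the Dockerfile's `ENTRYPOINT` might look like the sketch below. The file name `entrypoint.py` and the module path `my_package.handler_service` are assumptions for illustration; substitute the dotted path to your own handler service module from step 2.

``` python
# entrypoint.py -- illustrative sketch; the handler-service module path is a placeholder.
from sagemaker_inference import model_server

# Dotted module path to the HandlerService implementation from step 2.
HANDLER_SERVICE = "my_package.handler_service"

if __name__ == "__main__":
    model_server.start_model_server(handler_service=HANDLER_SERVICE)
```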
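One more note on step 1: `default_input_fn` and `default_output_fn` delegate to the toolkit's `decoder` and `encoder` utilities. The small, self-contained sketch below (with a made-up payload) shows roughly what those calls do:

``` python
from sagemaker_inference import content_types, decoder, encoder

# A made-up CSV request body, as it might arrive with Content-Type "text/csv".
payload = "1.0,2.0,3.0\n4.0,5.0,6.0"

# decoder.decode deserializes the payload into a numpy array based on the content type.
array = decoder.decode(payload, content_types.CSV)
print(array.shape)  # (2, 3)

# encoder.encode serializes a result according to the requested accept type.
print(encoder.encode(array, content_types.JSON))  # roughly: [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```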
### Complete Example
[Here is a complete example](https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/multi_model_bring_your_own) demonstrating usage of the SageMaker Inference Toolkit in your own container for deployment to a multi-model endpoint.
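For orientation only, a hedged sketch of what deploying such a container to a multi-model endpoint can look like with the SageMaker Python SDK follows; it is not taken from the linked example, and the image URI, role ARN, S3 prefix, and names are placeholders.

``` python
# Hypothetical deployment sketch -- all names, ARNs, URIs, and paths below are placeholders.
import boto3
import sagemaker
from sagemaker.multidatamodel import MultiDataModel

session = sagemaker.Session()

mme = MultiDataModel(
    name="my-inference-toolkit-models",
    model_data_prefix="s3://my-bucket/models/",  # S3 prefix containing model.tar.gz artifacts
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-inference-image:latest",
    role="arn:aws:iam::123456789012:role/MySageMakerExecutionRole",
    sagemaker_session=session,
)

# Create a real-time multi-model endpoint backed by the container.
mme.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name="my-mme-endpoint",
)

# Invoke a specific model artifact under the S3 prefix.
runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="my-mme-endpoint",
    ContentType="text/csv",
    TargetModel="model-a.tar.gz",
    Body="1.0,2.0,3.0",
)
print(response["Body"].read())
```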
## :scroll: License
This library is licensed under the [Apache 2.0 License](http://aws.amazon.com/apache2.0/).
For more details, please take a look at the [LICENSE](https://github.com/aws-samples/sagemaker-inference-toolkit/blob/master/LICENSE) file.
## :handshake: Contributing
Contributions are welcome!
Please read our [contributing guidelines](https://github.com/aws/sagemaker-inference-toolkit/blob/master/CONTRIBUTING.md)
if you'd like to open an issue or submit a pull request.