sagemaker-hyperpod

Name: sagemaker-hyperpod
Version: 3.2.1
Summary: Amazon SageMaker HyperPod SDK and CLI
Home page: https://github.com/aws/sagemaker-hyperpod-cli
Author: Amazon Web Services
License: Apache-2.0
Requires Python: >=3.8
Upload time: 2025-08-28 00:16:24

# SageMaker HyperPod command-line interface

The Amazon SageMaker HyperPod command-line interface (HyperPod CLI) is a tool that helps manage training jobs on the SageMaker HyperPod clusters orchestrated by Amazon EKS.

This documentation serves as a reference for the available HyperPod CLI commands. For a comprehensive user guide, see [Orchestrating SageMaker HyperPod clusters with Amazon EKS](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-hyperpod-eks.html) in the *Amazon SageMaker Developer Guide*.

Note: The old `hyperpod` CLI V2 has been moved to the `release_v2` branch. See the [release_v2 branch](https://github.com/aws/sagemaker-hyperpod-cli/tree/release_v2) for usage.

## Table of Contents
- [Overview](#overview)
- [Prerequisites for Training](#prerequisites-for-training)
- [Prerequisites for Inference](#prerequisites-for-inference)
- [Platform Support](#platform-support)
- [ML Framework Support](#ml-framework-support)
- [Installation](#installation)
- [Usage](#usage)
  - [Getting Clusters](#getting-cluster-information)
  - [Connecting to a Cluster](#connecting-to-a-cluster)
  - [Getting Cluster Context](#getting-cluster-context)
  - [Listing Pods](#listing-pods)
  - [Accessing Logs](#accessing-logs)
  - [CLI](#cli-)
    - [Training](#training-)
    - [Inference](#inference-)
  - [SDK](#sdk-)
    - [Training](#training-sdk)
    - [Inference](#inference-sdk)
  

## Overview

The SageMaker HyperPod CLI is a tool that helps you create training jobs and inference endpoint deployments on Amazon SageMaker HyperPod clusters orchestrated by Amazon EKS. It provides a set of commands for managing the full lifecycle of jobs, including create, describe, list, and delete operations, as well as accessing pod and operator logs where applicable. The CLI is designed to abstract away the complexity of working directly with Kubernetes for these core job-management actions.

## Prerequisites for Training

- The HyperPod CLI currently supports starting PyTorchJobs. To start a job, you must first install the Training Operator.
  - Follow the [Training Operator documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-eks-operator-install.html) to install it.

## Prerequisites for Inference 

- The HyperPod CLI supports creating inference endpoints from JumpStart models and from custom endpoint configurations. The inference operator must be installed first.
  - Follow the [inference operator documentation](https://github.com/aws/sagemaker-hyperpod-cli/tree/master/helm_chart/HyperPodHelmChart/charts/inference-operator) to install it.

## Platform Support

The SageMaker HyperPod CLI currently supports Linux and macOS. Windows is not currently supported.

## ML Framework Support

The SageMaker HyperPod CLI currently supports starting training jobs with:
- The PyTorch ML framework. Version requirements: PyTorch >= 1.10.

## Installation

1. Make sure that your local Python version is 3.8, 3.9, 3.10, or 3.11.

2. Install the `sagemaker-hyperpod` package.

    ```
    pip install sagemaker-hyperpod
    ```
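
    To match the release documented on this page (3.2.1, from the package metadata above), you can optionally pin the version:

    ```
    pip install "sagemaker-hyperpod==3.2.1"
    ```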

3. Verify that the installation succeeded by running the following command.

    ```
    hyp --help
    ```

## Usage

The HyperPod CLI provides the following commands:

- [Getting Clusters](#getting-cluster-information)
- [Connecting to a Cluster](#connecting-to-a-cluster)
- [Getting Cluster Context](#getting-cluster-context)
- [Listing Pods](#listing-pods)
- [Accessing Logs](#accessing-logs)
- [CLI](#cli-)
  - [Training](#training-)
  - [Inference](#inference-)
- [SDK](#sdk-)
  - [Training](#training-sdk)
  - [Inference](#inference-sdk)


### Getting Cluster Information

This command lists the available SageMaker HyperPod clusters and their capacity information.

```
hyp list-cluster [--region <region>]  [--namespace <namespace>] [--output <json|table>]
```

* `region` (string) - Optional. The region where the SageMaker HyperPod and EKS clusters are located. If not specified, the region from your current AWS credentials is used.
* `namespace` (string) - Optional. The namespace to check quota against. Only SageMaker managed namespaces are supported.
* `output` (enum) - Optional. The output format. Available values are `table` and `json`; the default is `json`. See the example below.
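
For example, to list clusters in a specific region in table format (`us-west-2` is a placeholder; use your own region):

```
hyp list-cluster --region us-west-2 --output table
```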

### Connecting to a Cluster

This command configures the local kubectl environment to interact with the specified SageMaker HyperPod cluster and namespace.

```
hyp set-cluster-context --cluster-name <cluster-name> [--namespace <namespace>]
```

* `cluster-name` (string) - Required. The name of the SageMaker HyperPod cluster to connect to.
* `namespace` (string) - Optional. The namespace to connect to. If not specified, HyperPod CLI commands auto-discover an accessible namespace, as in the example below.
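
For example, connecting to a cluster and letting the CLI discover the namespace (the cluster name is a placeholder):

```
hyp set-cluster-context --cluster-name my-hyperpod-cluster
```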

### Getting Cluster Context

This command gets all the context related to the currently set cluster.

```
hyp get-cluster-context
```
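
Because `set-cluster-context` configures your local kubectl environment, you can also inspect the active context with standard kubectl (assuming kubectl is installed):

```
kubectl config current-context
```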

### Listing Pods

This command lists all the pods associated with a specific training job.

```
hyp list-pods hyp-pytorch-job --job-name <job-name>
```

* `job-name` (string) - Required. The name of the job to list pods for.
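
For example, using the job name from the training example later in this guide:

```
hyp list-pods hyp-pytorch-job --job-name test-pytorch-job
```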

### Accessing Logs

This command retrieves the logs for a specific pod within a training job.

```
hyp get-logs hyp-pytorch-job --pod-name <pod-name> --job-name <job-name>
```

* `job-name` (string) - Required. The name of the job to get the log for.
* `pod-name` (string) - Required. The name of the pod to get the log from.
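
For example (the pod name is a placeholder; take real pod names from the `hyp list-pods` output):

```
hyp get-logs hyp-pytorch-job \
    --pod-name test-pytorch-job-pod-0 \
    --job-name test-pytorch-job
```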


### CLI 

### Training 

#### Creating a Training Job 

```
hyp create hyp-pytorch-job \
    --version 1.0 \
    --job-name test-pytorch-job \
    --image pytorch/pytorch:latest \
    --command '[python, train.py]' \
    --args '[--epochs=10, --batch-size=32]' \
    --environment '{"PYTORCH_CUDA_ALLOC_CONF": "max_split_size_mb:32"}' \
    --pull-policy "IfNotPresent" \
    --instance-type ml.p4d.24xlarge \
    --tasks-per-node 8 \
    --label-selector '{"accelerator": "nvidia", "network": "efa"}' \
    --deep-health-check-passed-nodes-only true \
    --scheduler-type "kueue" \
    --queue-name "training-queue" \
    --priority "high" \
    --max-retry 3 \
    --accelerators 8 \
    --vcpu 96.0 \
    --memory 1152.0 \
    --accelerators-limit 8 \
    --vcpu-limit 96.0 \
    --memory-limit 1152.0 \
    --preferred-topology "topology.kubernetes.io/zone=us-west-2a" \
    --volume name=model-data,type=hostPath,mount_path=/data,path=/data \
    --volume name=training-output,type=pvc,mount_path=/data2,claim_name=my-pvc,read_only=false
```

Key required parameters explained:

- `--job-name`: Unique identifier for your training job
- `--image`: Docker image containing your training environment
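
A minimal sketch using only these required parameters (this assumes the remaining flags in the full example above are optional, and keeps `--version` as shown there):

```
hyp create hyp-pytorch-job \
    --version 1.0 \
    --job-name test-pytorch-job \
    --image pytorch/pytorch:latest
```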

### Inference 

#### Creating a JumpstartModel Endpoint

Pre-trained JumpStart models are listed at https://sagemaker.readthedocs.io/en/v2.82.0/doc_utils/jumpstart.html; pass a model ID from that list to the endpoint creation call.

```
hyp create hyp-jumpstart-endpoint \
    --version 1.0 \
    --model-id jumpstart-model-id \
    --instance-type ml.g5.8xlarge \
    --endpoint-name endpoint-jumpstart
```
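
For example, with the JumpStart model ID used in the SDK section later in this guide:

```
hyp create hyp-jumpstart-endpoint \
    --version 1.0 \
    --model-id deepseek-llm-r1-distill-qwen-1-5b \
    --instance-type ml.g5.8xlarge \
    --endpoint-name endpoint-jumpstart
```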


#### Invoke a JumpstartModel Endpoint

```
hyp invoke hyp-jumpstart-endpoint \
    --endpoint-name endpoint-jumpstart \
    --body '{"inputs":"What is the capital of USA?"}'
```

#### Managing an Endpoint 

```
hyp list hyp-jumpstart-endpoint
hyp describe hyp-jumpstart-endpoint --name endpoint-jumpstart
```

#### Creating a Custom Inference Endpoint 

```
hyp create hyp-custom-endpoint \
    --version 1.0 \
    --endpoint-name my-custom-endpoint \
    --model-name my-pytorch-model \
    --model-source-type s3 \
    --model-location my-pytorch-training \
    --model-volume-mount-name test-volume \
    --s3-bucket-name your-bucket \
    --s3-region us-east-1 \
    --instance-type ml.g5.8xlarge \
    --image-uri 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:latest \
    --container-port 8080
```

#### Invoke a Custom Inference Endpoint 

```
hyp invoke hyp-custom-endpoint \
    --endpoint-name endpoint-custom-pytorch \
    --body '{"inputs":"What is the capital of USA?"}'
```

#### Deleting an Endpoint

```
hyp delete hyp-jumpstart-endpoint --name endpoint-jumpstart
```
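
A custom endpoint can presumably be deleted the same way; this sketch assumes `hyp delete` accepts `hyp-custom-endpoint` just as it accepts `hyp-jumpstart-endpoint`:

```
hyp delete hyp-custom-endpoint --name endpoint-custom-pytorch
```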


## SDK 

Along with the CLI, an SDK is available that provides the same training and inference functionality as the CLI.

### Training SDK

#### Creating a Training Job 

```
from sagemaker.hyperpod.training import (
    HyperPodPytorchJob,
    ReplicaSpec,
    Template,
    Spec,
    Containers,
    Resources,
    RunPolicy,
)
from sagemaker.hyperpod.common.config import Metadata

# Define job specifications
nproc_per_node = "1"  # Number of processes per node

replica_specs = [
    ReplicaSpec(
        name="pod",  # Replica name
        template=Template(
            spec=Spec(
                containers=[
                    Containers(
                        # Container name
                        name="container-name",
                        # Training image
                        image="123456789012.dkr.ecr.us-west-2.amazonaws.com/my-training-image:latest",
                        # Always pull the image
                        image_pull_policy="Always",
                        resources=Resources(
                            # No GPUs requested
                            requests={"nvidia.com/gpu": "0"},
                            # No GPU limit
                            limits={"nvidia.com/gpu": "0"},
                        ),
                        # Command to run
                        command=["python", "train.py"],
                        # Script arguments
                        args=["--epochs", "10", "--batch-size", "32"],
                    )
                ]
            )
        ),
    )
]

# Keep pods after completion
run_policy = RunPolicy(clean_pod_policy="None")

# Create and start the PyTorch job
pytorch_job = HyperPodPytorchJob(
    metadata=Metadata(name="demo"),  # Job name
    nproc_per_node=nproc_per_node,   # Processes per node
    replica_specs=replica_specs,     # Replica specifications
    run_policy=run_policy,           # Run policy
)

# Launch the job
pytorch_job.create()
```



### Inference SDK

#### Creating a JumpstartModel Endpoint

Pre-trained JumpStart models are listed at https://sagemaker.readthedocs.io/en/v2.82.0/doc_utils/jumpstart.html; pass a model ID from that list to the endpoint creation call.

```
from sagemaker.hyperpod.inference.config.hp_jumpstart_endpoint_config import Model, Server, SageMakerEndpoint
from sagemaker.hyperpod.inference.hp_jumpstart_endpoint import HPJumpStartEndpoint

model = Model(
    model_id='deepseek-llm-r1-distill-qwen-1-5b'
)
server = Server(
    instance_type='ml.g5.8xlarge',
)
endpoint_name = SageMakerEndpoint(name='<my-endpoint-name>')

js_endpoint = HPJumpStartEndpoint(
    model=model,
    server=server,
    sage_maker_endpoint=endpoint_name
)

js_endpoint.create()
```


#### Invoke a JumpstartModel Endpoint

```
data = '{"inputs":"What is the capital of USA?"}'
response = js_endpoint.invoke(body=data).body.read()
print(response)
```


#### Creating a Custom Inference Endpoint (with S3)

```
from sagemaker.hyperpod.inference.config.hp_endpoint_config import S3Storage, ModelSourceConfig, TlsConfig, EnvironmentVariables, ModelInvocationPort, ModelVolumeMount, Resources, Worker
from sagemaker.hyperpod.inference.hp_endpoint import HPEndpoint

model_source_config = ModelSourceConfig(
    model_source_type='s3',
    model_location="<my-model-folder-in-s3>",
    s3_storage=S3Storage(
        bucket_name='<my-model-artifacts-bucket>',
        region='us-east-2',
    ),
)

environment_variables = [
    EnvironmentVariables(name="HF_MODEL_ID", value="/opt/ml/model"),
    EnvironmentVariables(name="SAGEMAKER_PROGRAM", value="inference.py"),
    EnvironmentVariables(name="SAGEMAKER_SUBMIT_DIRECTORY", value="/opt/ml/model/code"),
    EnvironmentVariables(name="MODEL_CACHE_ROOT", value="/opt/ml/model"),
    EnvironmentVariables(name="SAGEMAKER_ENV", value="1"),
]

worker = Worker(
    image='763104351884.dkr.ecr.us-east-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.4.0-tgi2.3.1-gpu-py311-cu124-ubuntu22.04-v2.0',
    model_volume_mount=ModelVolumeMount(
        name='model-weights',
    ),
    model_invocation_port=ModelInvocationPort(container_port=8080),
    resources=Resources(
        requests={"cpu": "30000m", "nvidia.com/gpu": 1, "memory": "100Gi"},
        limits={"nvidia.com/gpu": 1},
    ),
    environment_variables=environment_variables,
)

tls_config=TlsConfig(tls_certificate_output_s3_uri='s3://<my-tls-bucket-name>')

custom_endpoint = HPEndpoint(
    endpoint_name='<my-endpoint-name>',
    instance_type='ml.g5.8xlarge',
    model_name='deepseek15b-test-model-name',  
    tls_config=tls_config,
    model_source_config=model_source_config,
    worker=worker,
)

custom_endpoint.create()
```

#### Invoke a Custom Inference Endpoint 

```
data = '{"inputs":"What is the capital of USA?"}'
response = custom_endpoint.invoke(body=data).body.read()
print(response)
```

#### Managing an Endpoint 

```
endpoint_list = HPEndpoint.list()
print(endpoint_list[0])

print(custom_endpoint.get_operator_logs(since_hours=0.5))

```

#### Deleting an Endpoint 

```
custom_endpoint.delete()

```

#### Observability - Getting Monitoring Information
```
from sagemaker.hyperpod.utils import get_monitoring_config

monitor_config = get_monitoring_config()
print(monitor_config.grafanaURL)
print(monitor_config.prometheusURL)
```

## Disclaimer 

* The CLI and SDK require access to your file system to set and get context and to function properly. They need to read configuration files, such as kubeconfig, to establish the necessary environment settings.


## Working behind a proxy server?
* Follow the steps [here](https://docs.aws.amazon.com/cli/v1/userguide/cli-configure-proxy.html) to set up HTTP proxy connections. A typical setup is sketched below.
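
A minimal sketch of the usual environment-variable setup described in that guide (the proxy address is a placeholder):

```
# Route HTTP/HTTPS traffic through your proxy (placeholder address)
export HTTP_PROXY=http://10.15.20.25:1234
export HTTPS_PROXY=http://10.15.20.25:1234

# On EC2, bypass the proxy for the instance metadata service used for credentials
export NO_PROXY=169.254.169.254
```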


            
