# SageMaker Studio Analytics Extension
This notebook extension, provided by the AWS SageMaker Studio team, integrates Studio notebooks with analytics resources. Currently, it supports connecting a SageMaker Studio notebook to a Spark (EMR) cluster through the SparkMagic library.
## Usage
Before you use the magic command to connect a Studio notebook to EMR, ensure that SageMaker Studio has connectivity to the Spark cluster (the Livy service). See [this AWS blog](https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-studio-notebooks-backed-by-spark-in-amazon-emr/) for how to set up SageMaker Studio and an EMR cluster.
### Register the magic command:
```buildoutcfg
%load_ext sagemaker_studio_analytics_extension.magics
```
### Show help content:
```buildoutcfg
Docstring:
::

  %sm_analytics [--auth-type AUTH_TYPE] [--cluster-id CLUSTER_ID]
                    [--language LANGUAGE]
                    [--assumable-role-arn ASSUMABLE_ROLE_ARN]
                    [--emr-execution-role-arn EMR_EXECUTION_ROLE_ARN]
                    [--secret SECRET]
                    [--verify-certificate VERIFY_CERTIFICATE]
                    [command [command ...]]

positional arguments:
  command               Command to execute. The command consists of a service
                        name followed by a ' ' followed by an operation.
                        Supported services are ['emr'] and supported
                        operations are ['connect']. For example a valid
                        command is 'emr connect'.

optional arguments:
  --auth-type AUTH_TYPE
                        The authentication type to be used. Supported
                        authentication types are {'Basic_Access', 'Kerberos',
                        'None'}.
  --cluster-id CLUSTER_ID
                        The cluster id to connect to.
  --language LANGUAGE   Language to use. The supported languages for IPython
                        kernel(s) are {'python', 'scala'}. This is a required
                        argument for IPython kernels, but not for magic
                        kernels such as PySpark or SparkScala.
  --assumable-role-arn ASSUMABLE_ROLE_ARN
                        The IAM role to assume when connecting to a cluster in
                        a different AWS account. This argument is not required
                        when connecting to a cluster in the same AWS account.
  --emr-execution-role-arn EMR_EXECUTION_ROLE_ARN
                        The IAM role passed to EMR to set up EMR job security
                        context. This argument is optional and used when IAM
                        Passthrough feature is enabled for EMR.
  --secret SECRET       The AWS Secrets Manager SecretID.
  --verify-certificate VERIFY_CERTIFICATE
                        Determine if SSL certificate should be verified when
                        using HTTPS to connect to EMR. Supported values are
                        ['True', 'False', 'PathToCert']. If a path-to-cert-
                        file is provided, the certificate verification will be
                        done with the certificate in the provided file
                        path.Note that the default
```
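
The options listed above can be assembled into an argument string in Python, with unsupported values rejected before the magic runs. The following is a minimal sketch and not part of the extension: `build_connect_line` and its parameter names are hypothetical, and the allowed values are copied from the help text above.

```python
# Hypothetical helper (not part of the extension): builds the argument
# string that follows `%sm_analytics`, using only flags from the help text.
AUTH_TYPES = {"Basic_Access", "Kerberos", "None"}
LANGUAGES = {"python", "scala"}


def build_connect_line(cluster_id, auth_type, language=None,
                       assumable_role_arn=None, verify_certificate=None):
    """Return the text after `%sm_analytics` for an `emr connect` call."""
    if auth_type not in AUTH_TYPES:
        raise ValueError(f"unsupported auth type: {auth_type!r}")
    if language is not None and language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")

    parts = ["emr", "connect", "--cluster-id", cluster_id,
             "--auth-type", auth_type]
    # --language is required for IPython kernels only, so it stays optional.
    if language is not None:
        parts += ["--language", language]
    # Only needed when the cluster lives in a different AWS account.
    if assumable_role_arn is not None:
        parts += ["--assumable-role-arn", assumable_role_arn]
    # 'True', 'False', or a path to a certificate file.
    if verify_certificate is not None:
        parts += ["--verify-certificate", str(verify_certificate)]
    return " ".join(parts)


print(build_connect_line("j-1JIIZS02SEVCS", "Kerberos", language="python"))
```

The printed string can be pasted after `%sm_analytics` in a notebook cell. Validating in Python first is a design choice made for this sketch; the magic presumably performs its own argument checks as well.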
### Examples
1. Connect a Studio notebook using the IPython kernel to an EMR cluster protected by Kerberos.
```buildoutcfg
%sm_analytics emr connect --cluster-id j-1JIIZS02SEVCS --auth-type Kerberos --language python
```
2. Connect a Studio notebook using the IPython kernel to an EMR cluster protected by HTTP Basic Auth, and create a Scala-based session.
```buildoutcfg
%sm_analytics emr connect --cluster-id j-1KHIOQZAQUF5P --auth-type Basic_Access --language scala
```
3. Connect a Studio notebook using the IPython kernel to an EMR cluster directly, without Livy authentication.
```buildoutcfg
%sm_analytics emr connect --cluster-id j-1KHIOQZAQUF5P --auth-type None --language python
```
4. Connect a Studio notebook using the PySpark or Spark (Scala) kernel to an EMR cluster protected by HTTP Basic Auth.
```buildoutcfg
%sm_analytics emr connect --cluster-id j-1KHIOQZAQUF5P --auth-type Basic_Access
```
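
The connections in the examples above can also be opened from Python code instead of being typed as a line magic, which may help when the cluster ID is computed at run time. The sketch below relies on IPython's `run_line_magic`; the `connect_to_emr` wrapper and its `shell` parameter are hypothetical names introduced for illustration and are not part of the extension.

```python
def connect_to_emr(line, shell=None):
    """Load the extension and run `%sm_analytics <line>` in an IPython shell.

    `shell` defaults to the active IPython instance. Accepting one explicitly
    is an assumption made here so the function can be exercised in isolation.
    """
    if shell is None:
        # Requires the IPython package; returns None outside IPython/Jupyter.
        from IPython import get_ipython
        shell = get_ipython()
    if shell is None:
        raise RuntimeError("no active IPython shell; run this inside a notebook")

    # Equivalent to: %load_ext sagemaker_studio_analytics_extension.magics
    shell.run_line_magic("load_ext", "sagemaker_studio_analytics_extension.magics")
    # Equivalent to: %sm_analytics <line>
    return shell.run_line_magic("sm_analytics", line)
```

In a notebook cell, calling `connect_to_emr("emr connect --cluster-id j-1JIIZS02SEVCS --auth-type Kerberos --language python")` should behave like example 1, assuming the extension is installed in the kernel's environment.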
## License
This library is licensed under the Apache 2.0 License. See the LICENSE file.
Raw data
{
"_id": null,
"home_page": "https://aws.amazon.com/sagemaker",
"name": "sagemaker-studio-analytics-extension",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Amazon Web Services",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/fa/96/e939d9b220fc07b32456600fe6f48879279d57263de6845c6672d7ec5547/sagemaker_studio_analytics_extension-0.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache 2.0",
"summary": "SageMaker Studio Analytics Extension",
"version": "0.1.2",
"project_urls": {
"Homepage": "https://aws.amazon.com/sagemaker"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "fa96e939d9b220fc07b32456600fe6f48879279d57263de6845c6672d7ec5547",
"md5": "78ef2ef7879382a36bb36ea874c3c0d4",
"sha256": "a7bbc3b8f3d950f5396761ebb19de76a07a9a1b8999b0f652191e38a5de23b4e"
},
"downloads": -1,
"filename": "sagemaker_studio_analytics_extension-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "78ef2ef7879382a36bb36ea874c3c0d4",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 55549,
"upload_time": "2024-07-31T21:58:40",
"upload_time_iso_8601": "2024-07-31T21:58:40.254383Z",
"url": "https://files.pythonhosted.org/packages/fa/96/e939d9b220fc07b32456600fe6f48879279d57263de6845c6672d7ec5547/sagemaker_studio_analytics_extension-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-07-31 21:58:40",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "sagemaker-studio-analytics-extension"
}