tdapiclient

Name: tdapiclient
Version: 1.4.0.1
Home page: http://www.teradata.com/
Summary: Teradata API Client Python package
Upload time: 2023-11-09 11:58:23
Author: Teradata Corporation
Requires Python: >=3.0
License: Teradata License Agreement
Keywords: teradata

## tdapiclient - Teradata Third Party Analytics Integration Python Library

The tdapiclient Python library integrates the Python libraries from AWS SageMaker, Azure ML, and Google Vertex AI with Teradata. Users can train and score their models using teradataml DataFrames. tdapiclient transparently converts a teradataml DataFrame to an S3 address, Azure ML Dataset or Blob, or Vertex AI dataset to be used for training. The user can then provide another teradataml DataFrame as input for inference.

 Users of tdapiclient can also deploy models trained in Azure ML, AWS SageMaker, or Vertex AI to a Teradata Vantage system for in-database scoring using BYOM functionality.

This library also provides `API_Request`, a method that calls the API_Request UDF and can be used to obtain OpenAI and Azure OpenAI text embeddings from large language models. The same method can also be used for scoring models hosted in AWS, Azure, or Google Cloud Platform, equivalent to predicting in UDF mode through the tdapiclient `predict` method.

For community support, please visit the [Teradata Community](https://support.teradata.com/community).
For Teradata customer support, please visit [Teradata Support](https://support.teradata.com/csm).

Copyright 2022, Teradata. All Rights Reserved.

### Table of Contents
- [tdapiclient - Teradata Third Party Analytics Integration Python Library](#tdapiclient---teradata-third-party-analytics-integration-python-library)
- [Release Notes](#release-notes)
- [Installation and Requirements](#installation-and-requirements)
- [Using the tdapiclient Python Package with SageMaker](#using-the-tdapiclient-python-package-with-sagemaker)
- [Using the tdapiclient Python Package with Azure ML](#using-the-tdapiclient-python-package-with-azure-ml)
- [Using the tdapiclient Python Package with Vertex AI](#using-the-tdapiclient-python-package-with-vertex-ai)
- [Documentation](#documentation)
- [License](#license) | [See Agreement](https://downloads.teradata.com/download/license?destination=download/files/202392/202391/0/&message=License%2520Agreement&key=0)

## Release Notes
#### tdapiclient 1.4.0.1
This release fixes an issue in the SageMaker fit method related to the WriteNOS function when called on CSV data. When exporting CSV data through WriteNOS, Teradata converts floats into a string representation that popular data manipulation libraries, such as pandas, cannot parse. The SageMaker fit method now exports CSV data in a suitable format.

#### tdapiclient 1.4.0.0
* `tdapiclient 1.4.0.00` is the fourth release version. This release adds support for Google Vertex AI integration with Teradata Vantage. The static method `TDApiClient.API_Request` now supports OpenAI and Azure OpenAI for obtaining text embeddings from large language models. Please refer to the _API Integration Guide for Cloud Machine Learning_ for a list of Limitations and Usage Considerations.

#### tdapiclient 1.2.1.0
* `tdapiclient 1.2.1.00` is the third release version. This release adds BYOM deployment support for SageMaker and optimizes the fit method for the CSV data format. Please refer to the _API Integration Guide for Cloud Machine Learning_ for a list of Limitations and Usage Considerations.

#### tdapiclient 1.1.1.0
* `tdapiclient 1.1.1.00` is the second release version. This release adds support for Azure ML integration with Teradata Vantage. Please refer to the _API Integration Guide for Cloud Machine Learning_ for a list of Limitations and Usage Considerations.

#### tdapiclient 1.0.0.0
* `tdapiclient 1.00.00.00` is the first release version. Please refer to the _API Integration Guide for Cloud Machine Learning_ for a list of Limitations and Usage Considerations.

## Installation and Requirements

### Package Requirements
* Python 3.6 or later

Note: 32-bit Python is not supported.

### Minimum System Requirements
* Windows 7 (64Bit) or later
* macOS 10.9 (64Bit) or later
* Red Hat 7 or later versions
* Ubuntu 16.04 or later versions
* CentOS 7 or later versions
* SLES 12 or later versions
* Teradata Vantage Advanced SQL Engine:
    * Advanced SQL Engine 17.05 Feature Update 1 or later

### Installation

Use pip to install tdapiclient, the Teradata Third Party Analytics Integration Python Library:

Platform       | Command
-------------- | ---
macOS/Linux    | `pip install tdapiclient`
Windows        | `py -3 -m pip install tdapiclient`
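
To verify the installation, import the package in a Python session. The `__version__` attribute used below is an assumption (most packages expose one); if it is not present, `pip show tdapiclient` reports the installed version.

```
>>> import tdapiclient
>>> print(tdapiclient.__version__)   # assumed attribute; expect 1.4.0.1 for this release
```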

## Using the tdapiclient Python Package with SageMaker

Your Python script must import the `tdapiclient` package in order to use the tdapiclient Python library.
```
>>> import os, getpass
>>> import pandas as pd
>>> from teradatasqlalchemy.types import FLOAT
>>> from tdapiclient import create_tdapi_context, TDApiClient
>>> from teradataml import create_context, DataFrame, copy_to_sql

>>> # Create connection to Teradata Vantage System
>>> host = input("Host: ")
>>> username = input("Username: ")
>>> password = getpass.getpass("Password: ")
>>> td_context = create_context(host=host, username=username, password=password)

# Create AWS Context to be used in TDApiClient
>>> s3_bucket = input("S3 Bucket (name only): ")
>>> access_id = input("Access ID: ")
>>> access_key = getpass.getpass("Access Key: ")
>>> region = input("AWS Region: ")

>>> os.environ["AWS_ACCESS_KEY_ID"] = access_id
>>> os.environ["AWS_SECRET_ACCESS_KEY"] = access_key
>>> os.environ["AWS_REGION"] = region

>>> aws_context = create_tdapi_context("aws", bucket_name=s3_bucket)
# Create TDApiClient Instance
>>> td_apiclient = TDApiClient(aws_context)

# Load data in teradata tables
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.datasets import fetch_california_housing

>>> data = fetch_california_housing()
>>> X_train, X_test, y_train, y_test = train_test_split(
     data.data, data.target, test_size=0.25, random_state=42)

>>> trainX = pd.DataFrame(X_train, columns=data.feature_names)
>>> trainX["target"] = y_train

>>> testX = pd.DataFrame(X_test, columns=data.feature_names)
>>> testX["target"] = y_test

>>> train_table = "housing_data_train"
>>> test_table = "housing_data_test"

>>> column_types = {"MedInc": FLOAT, "HouseAge": FLOAT,
                "AveRooms": FLOAT, "AveBedrms": FLOAT, "Population": FLOAT,
                "AveOccup": FLOAT, "Latitude": FLOAT, "Longitude": FLOAT,
                "target" : FLOAT}

>>> copy_to_sql(df=trainX, table_name=train_table, if_exists="replace", types=column_types)
>>> copy_to_sql(df=testX, table_name=test_table, if_exists="replace", types=column_types)

# Create teradataml DataFrame for input tables

>>> test_df = DataFrame(table_name=test_table)
>>> train_df = DataFrame(table_name=train_table)

>>> exec_role_arn = "arn:aws:iam::XX:role/service-role/AmazonSageMaker-ExecutionRole-20210112T215668"
>>> FRAMEWORK_VERSION = "0.23-1"
# Create an estimator object based on sklearn sagemaker class
>>> sklearn_estimator = td_apiclient.SKLearn(
    entry_point="sklearn-script.py",
    role=exec_role_arn,
    instance_count=1,
    instance_type="ml.m5.large",
    framework_version=FRAMEWORK_VERSION,
    base_job_name="rf-scikit",
    metric_definitions=[{"Name": "median-AE", "Regex": "AE-at-50th-percentile: ([0-9.]+).*$"}],
    hyperparameters={
        "n-estimators": 100,
        "min-samples-leaf": 3,
        "features": "MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude Longitude",
        "target": "target",
    },
)
>>> # Start training using DataFrame objects
>>> sklearn_estimator.fit({"train": train_df, "test": test_df}, content_type="csv", wait=True)

>>> from sagemaker.serializers import CSVSerializer
>>> from sagemaker.deserializers import CSVDeserializer
>>> csv_ser = CSVSerializer()
>>> csv_dser = CSVDeserializer()
>>> sg_kw = {
        "instance_type": "ml.m5.large",
        "initial_instance_count": 1,
        "serializer": csv_ser,
        "deserializer": csv_dser
    }
>>> predictor = sklearn_estimator.deploy("aws-endpoint", sagemaker_kw_args=sg_kw)

>>> # Now let's try prediction with UDF and Client options.
>>> input_df = DataFrame(table_name='housing_data_test')
>>> column_list = ["MedInc","HouseAge","AveRooms","AveBedrms","Population","AveOccup","Latitude","Longitude"]
>>> input_df = input_df.sample(n=5).select(column_list)

>>> output = predictor.predict(input_df, mode="UDF", content_type='csv')

```
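
The estimator above references an `entry_point` script, `sklearn-script.py`, that is not reproduced in this README. Below is a minimal sketch of what such a SageMaker script-mode training script could look like; the hyperparameter names mirror those passed to the estimator, while the channel file layout and metric formatting are assumptions rather than the actual script from the Teradata examples.

```
# sklearn-script.py -- hypothetical SageMaker script-mode training script (sketch only).
import argparse
import os

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # Hyperparameters passed through the estimator above.
    parser.add_argument("--n-estimators", type=int, default=100)
    parser.add_argument("--min-samples-leaf", type=int, default=3)
    parser.add_argument("--features", type=str)   # space-separated feature column names
    parser.add_argument("--target", type=str)
    # Locations provided by the SageMaker training environment.
    parser.add_argument("--model-dir", type=str, default=os.environ.get("SM_MODEL_DIR"))
    parser.add_argument("--train", type=str, default=os.environ.get("SM_CHANNEL_TRAIN"))
    parser.add_argument("--test", type=str, default=os.environ.get("SM_CHANNEL_TEST"))
    args = parser.parse_args()

    features = args.features.split()
    # Assumes each channel contains a single CSV file with a header row.
    train_df = pd.read_csv(os.path.join(args.train, os.listdir(args.train)[0]))
    test_df = pd.read_csv(os.path.join(args.test, os.listdir(args.test)[0]))

    model = RandomForestRegressor(
        n_estimators=args.n_estimators,
        min_samples_leaf=args.min_samples_leaf,
        n_jobs=-1,
    )
    model.fit(train_df[features], train_df[args.target])

    # Emit the metric matched by metric_definitions in the estimator.
    abs_err = np.abs(model.predict(test_df[features]) - test_df[args.target].values)
    print(f"AE-at-50th-percentile: {np.percentile(abs_err, 50):.4f}")

    joblib.dump(model, os.path.join(args.model_dir, "model.joblib"))
```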

## Using the tdapiclient Python Package with Azure ML
Your Python script must import the `tdapiclient` package in order to use the tdapiclient Python library.
```
>>> import os
>>> import getpass
>>> from teradataml import create_context, DataFrame, read_csv
>>> import pandas as pd
>>> from teradatasqlalchemy.types import INTEGER, FLOAT, DATE, CHAR
>>> from tdapiclient import create_tdapi_context,TDApiClient, remove_tdapi_context
>>> # Create the connection.
>>> host = input("Host: ")
>>> username = input("Username: ")
>>> password = getpass.getpass("Password: ")
>>> # Create Azure Context and TDApiClient object.

>>> datastore_path = input(
>>>    "DataStore path : Please give path within data store of azure-ml workspace.")
>>> tenant_id = input("Azure Tenant ID:")
>>> client_id = input("Azure Client ID: ")
>>> client_secret = getpass.getpass("Azure Client Secret: ")
>>>
>>> azure_sub = input("Azure Subscription id: ")
>>> azure_rg = input("Azure resource group: ")
>>> azureml_ws = input("Azure-ML workspace: ")
>>> azure_region = input("Azure region: ")

>>> os.environ["AZURE_TENANT_ID"] = tenant_id
>>> os.environ["AZURE_CLIENT_ID"] = client_id
>>> os.environ["AZURE_CLIENT_SECRET"] = client_secret
>>>
>>> os.environ["AZURE_SUB_ID"] = azure_sub
>>> os.environ["AZURE_RG"] = azure_rg
>>> os.environ["AZURE_WS"] = azureml_ws
>>> os.environ["AZURE_REGION"] = azure_region

>>> tdapi_context = create_tdapi_context("azure", datastore_path="td-tables")
>>> td_apiclient = TDApiClient(tdapi_context)
>>> from collections import OrderedDict
>>>
>>> types = OrderedDict(bustout=INTEGER, rec_id=INTEGER, acct_no=INTEGER, as_of_dt_day=DATE, avg_pmt_05_mth=FLOAT,
>>>     days_since_lstcash=INTEGER, max_utilization_05_mth=INTEGER, maxamt_epmt_v7day=INTEGER, times_nsf=INTEGER,
>>>     totcash_to_line_v7day=INTEGER, totpmt_to_line_v7day=INTEGER, totpur_to_line_v7day=INTEGER, totpurcash_to_line_v7day=INTEGER,
>>>     credit_util_cur_mth=FLOAT, credit_util_prior_5_mth=FLOAT, credit_util_cur_to_prior_ratio=FLOAT,
>>>     days_since_lst_pymnt=INTEGER, num_pymnt_lst_7_days=INTEGER, num_pymnt_lst_60_days=INTEGER,
>>>     pct_line_paid_lst_7_days=INTEGER, pct_line_paid_lst_30_days=INTEGER, num_pur_lst_7_days=INTEGER, num_pur_lst_60_days=INTEGER,
>>>     pct_line_pur_lst_7_days=INTEGER, pct_line_pur_lst_30_days=INTEGER, tot_pymnt_chnl=INTEGER, tot_pymnt=INTEGER, tot_pymnt_am=INTEGER,
>>>     pay_by_phone=CHAR, elec_pymnt=CHAR, pay_in_bank=CHAR, pay_by_check=CHAR, pay_by_othr=CHAR,
>>>     last_12m_trans_ct=INTEGER, Sample_ID=INTEGER)

>>> # This CSV file is available on the Teradata Vantage documentation site under azureml-usercases.zip.
>>> df: DataFrame = read_csv(r'financial_data.csv', table_name="financial_data", types=types, use_fastload=False)
>>> # Build the training DataFrame.
>>> selected_df = df.select(["bustout", "rec_id", "avg_pmt_05_mth", "max_utilization_05_mth", "times_nsf",
>>>     "credit_util_cur_mth", "credit_util_prior_5_mth", "num_pur_lst_7_days", "num_pur_lst_60_days",
>>>     "tot_pymnt_chnl", "last_12m_trans_ct"])
>>> # Setup compute target for Azure ML.
>>> from azureml.core.compute import AmlCompute, ComputeTarget
>>> from azureml.core.authentication import ServicePrincipalAuthentication
>>> from azureml.core import Workspace, Environment

>>> credential = ServicePrincipalAuthentication(
>>>         tenant_id=tenant_id,
>>>         service_principal_id=client_id, service_principal_password=client_secret)

>>> ws = Workspace(subscription_id=azure_sub, resource_group=azure_rg, workspace_name=azureml_ws, auth=credential)

>>> vm_size = "Standard_DS3_v2"
>>> min_node = 1
>>> max_node = 1
>>> cluster_name = "test-td-cluster-new"
>>> provisioning_config = AmlCompute.provisioning_configuration(
>>>         vm_size=vm_size, min_nodes=min_node,
>>>         max_nodes=max_node)

>>> # Creating Compute cluster in Azure ML.
>>> compute_target = ComputeTarget.create(
>>>         ws, cluster_name, provisioning_config)
>>> compute_target.wait_for_completion(show_output=True)

>>> compute_target = ws.compute_targets["test-td-cluster-new"]
>>> from azureml.automl.core.featurization import FeaturizationConfig
>>> import logging

>>> # Selecting the target column.
>>> target_column_name = "bustout"

>>> forecast_horizon=14

>>> featurization_config = FeaturizationConfig()
>>> # Force the target column to be integer type.
>>> featurization_config.add_prediction_transform_type("Integer")

>>> automl_config = td_apiclient.AutoMLConfig(
>>>     task="classification",
>>>     primary_metric="accuracy",
>>>     featurization=featurization_config,
>>>     blocked_models=["ExtremeRandomTrees"],
>>>     experiment_timeout_hours=0.3,
>>>     training_data=selected_df,
>>>     label_column_name=target_column_name,
>>>     compute_target=compute_target,
>>>     enable_early_stopping=True,
>>>     n_cross_validations=3,
>>>     max_concurrent_iterations=4,
>>>     max_cores_per_iteration=-1,
>>>     verbosity=logging.INFO
>>> )

>>> # Execute Azure ML training API with teradataml DataFrame as input which returns Azure ML Run Object.
>>> run = automl_config.fit(mount=False)
>>> # Get the best run after Auto ML job has completed.
>>> run_best = run.get_best_child()
>>> from azureml.core.environment import Environment
>>>
>>> # Creating an Azure ML Environment from a Dockerfile and requirements.txt.
>>> # myenv = Environment.from_dockerfile(name="new_project_env_7", dockerfile="./Dockerfile", pip_requirements="./requirements.txt")
>>> myenv = Environment.from_dockerfile(name="new_project_env_18",
>>>     dockerfile=r'C:\Projects\AzureML-jupyter-notebooks\test-ignite-azureml-api-demo\Dockerfile',
>>>     pip_requirements=r'C:\Projects\AzureML-jupyter-notebooks\test-ignite-azureml-api-demo\requirements.txt')
>>> myenv_b = myenv.build(workspace=ws)
>>> myenv_b.wait_for_completion(show_output=True)
>>> # curated_env_name = "AzureML-sklearn-0.24.1-ubuntu18.04-py37-cpu-inference"
>>> # myenv = Environment.get(workspace=ws, name=curated_env_name)
>>> myenv = Environment.get(workspace=ws, name="new_project_env_18")
>>> # Register an Azure ML model from the best run.
>>> from azureml.core import Model
>>> model: Model = run_best.register_model(model_name='voting_ensemble_model_1', model_path='outputs/model.pkl',
>>>     model_framework=Model.Framework.SCIKITLEARN)
>>> model = Model(workspace=ws, name="voting_ensemble_model_1", version=1)

>>> from azureml.core.model import InferenceConfig, Model
>>> from azureml.core.webservice import AciWebservice, Webservice
>>> from azureml.core.environment import Environment
>>> print(myenv)
>>> # Combine scoring script & environment in Inference configuration
>>> # inference_config = InferenceConfig(entry_script="scoring.py",
>>> #                                    environment=myenv)
>>> myenv.inferencing_stack_version = 'latest'
>>> inference_config = InferenceConfig(entry_script=r'C:\Projects\test-tdapiclient\tdapiclient\notebooks\azureml-az-webservice\scoring.py',
                                   environment=myenv)
>>> # Set deployment configuration
>>> deployment_config = AciWebservice.deploy_configuration(cpu_cores = 2,
>>>                                                        memory_gb = 4, auth_enabled=True)

>>> # Creating azmodel_deploy_kwargs dictionary to pass as a keyword argument for deploy method.
>>> azmodel_deploy_kwargs = {}
>>> azmodel_deploy_kwargs["name"] = "tdapiclient-endpoint-29"
>>> azmodel_deploy_kwargs["models"] = [model]
>>> azmodel_deploy_kwargs["workspace"] = ws
>>> azmodel_deploy_kwargs["inference_config"] = inference_config
>>> azmodel_deploy_kwargs["deployment_config"] = deployment_config
>>> azmodel_deploy_kwargs["overwrite"] = True

>>> # Deploy the model as an Azure webservice (platform="az-webservice").
>>> webservice = automl_config.deploy(platform="az-webservice", model=model, model_type="",
>>>                         model_deploy_kwargs=azmodel_deploy_kwargs)
>>> webservice.wait_for_deployment(show_output=True)
>>> # Creating an options dictionary to pass the content_format for scoring.
>>> options = {}
>>> content_format = {}
>>> content_format["Inputs"] = [["%row"]]
>>> options["content_format"] = content_format
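>>> # Note: test_df below is a teradataml DataFrame of rows to score; creating it is not shown in this walkthrough.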
>>> print(webservice.predict(test_df, **options, mode="udf", content_type='json'))
......
......
```
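
The `InferenceConfig` above points to an `entry_script` (`scoring.py`) that is likewise not included here. An Azure ML scoring script must define `init()` and `run()`; the sketch below is an illustrative guess that loads the registered model with joblib and accepts payloads in the `{"Inputs": [[...]]}` shape produced by the `content_format` option, not the exact script used with this example.

```
# scoring.py -- hypothetical Azure ML entry script (sketch only).
import json
import os

import joblib
import pandas as pd

model = None

def init():
    # AZUREML_MODEL_DIR points at the folder holding the registered model inside the container.
    global model
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.pkl")
    model = joblib.load(model_path)

def run(raw_data):
    try:
        # The UDF sends JSON payloads shaped like {"Inputs": [[<row values>], ...]}.
        rows = json.loads(raw_data)["Inputs"]
        predictions = model.predict(pd.DataFrame(rows))
        return {"Results": predictions.tolist()}
    except Exception as exc:
        return {"error": str(exc)}
```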

## Using the tdapiclient Python Package with Vertex AI

Your Python script must import the `tdapiclient` package in order to use the tdapiclient Python library.
```
>>> import os, getpass
>>> from teradataml import create_context, DataFrame, remove_context, load_example_data
>>> from tdapiclient import create_tdapi_context, TDApiClient, remove_tdapi_context

# Create connection to Teradata Vantage system
>>> host = input("Host: ")
>>> username = input("Username: ")
>>> password = getpass.getpass("Password: ")
>>> td_context = create_context(host=host, username=username, password=password)

# Create Google Cloud Platform (GCP) context to be used in TDApiClient
>>> bucket_name = input("GCS bucket name: ")
>>> bucket_path = input("GCS bucket path (without bucket name): ")
>>> td_auth_obj = getpass.getpass("GCP Teradata auth object name: ")
>>> project_id = input("GCP project ID: ")
>>> region = input("GCP Region: ")
>>> google_app_cred = input("Local path to Google credentials JSON file: ")

>>> os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = google_app_cred
>>> os.environ["GCP_REGION"] = region
>>> os.environ["GCP_PROJECT_ID"] = project_id
>>> os.environ["GCP_TD_AUTH_OBJ"] = td_auth_obj

>>> gcp_context = create_tdapi_context("gcp", gcp_bucket_name=bucket_name, gcp_bucket_path=bucket_path)
# Create TDApiClient instance
>>> td_apiclient = TDApiClient(gcp_context)

# Load data in Teradata tables
# (training data is the same as test data for the purposes of this demo)
>>> load_example_data("naivebayes", "nb_iris_input_train")
>>> df = DataFrame("nb_iris_input_train")

# Create Vertex AI training job
>>> TRAINING_IMAGE = "us-docker.pkg.dev/vertex-ai/training/scikit-learn-cpu.0-23:latest"
>>> PREDICTION_IMAGE = "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"
>>> job = td_apiclient.CustomTrainingJob(
        display_name="tdapiclient-custom-demo",
        script_path="train.py",
        container_uri=TRAINING_IMAGE,
        requirements=["gcsfs", "nyoka"],
        model_serving_container_image_uri=PREDICTION_IMAGE
        )

# Obtain trained model
>>> model = job.fit(
        df,
        replica_count=1,
        model_display_name="tdapiclient-custom-demo"
        )

# Deploy model to a Vertex AI online endpoint
>>> predictor = job.deploy(
        model,
        "vx-endpoint",
        vertex_kwargs={"machine_type": "n1-standard-4"}
        )

# Predict with UDF and client options
>>> df_test = df.drop(["id", "species"], axis=1)
>>> vertex_prediction_obj = predictor.predict(df_test, mode="client")
>>> td_output = predictor.predict(df_test, mode="udf", content_type="json")

```
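
The `CustomTrainingJob` above runs a user-supplied `script_path` (`train.py`) that is also not shown. The sketch below assumes the exported training data arrives as one or more CSV files referenced by the standard `AIP_TRAINING_DATA_URI` environment variable and that the sklearn serving image expects a `model.joblib` in `AIP_MODEL_DIR`; the actual script may read and write the data differently.

```
# train.py -- hypothetical Vertex AI custom training script (sketch only).
import os

import gcsfs
import joblib
import pandas as pd
from sklearn.naive_bayes import GaussianNB

# Environment variables set by Vertex AI for custom training jobs.
data_uri = os.environ["AIP_TRAINING_DATA_URI"]   # gs:// path or wildcard pattern for the exported CSV data
model_dir = os.environ["AIP_MODEL_DIR"]          # gs:// directory the serving container loads the model from

fs = gcsfs.GCSFileSystem()
paths = fs.glob(data_uri) if "*" in data_uri else [data_uri]
df = pd.concat(pd.read_csv(fs.open(path)) for path in paths)

# Train on the iris example table loaded above.
X = df.drop(columns=["id", "species"])
y = df["species"]
model = GaussianNB().fit(X, y)

# The sklearn prediction image looks for model.joblib in AIP_MODEL_DIR.
with fs.open(os.path.join(model_dir, "model.joblib"), "wb") as f:
    joblib.dump(model, f)
```

When the walkthrough is finished, the contexts created above can be released, for example with `remove_tdapi_context(gcp_context)` and teradataml's `remove_context()`, both of which are imported at the top of the example.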

## Documentation

General product information, including installation instructions, is available on the [Teradata Documentation website](https://docs.teradata.com/).

## License

Use of the Teradata Python Package is governed by the [TERADATA API LICENSE AGREEMENT](https://downloads.teradata.com/download/license?destination=download/files/202392/202391/0/&message=License%2520Agreement&key=0).
After installation, the `LICENSE` and `LICENSE-3RD-PARTY` files will be located in the `tdapiclient` directory of the Python installation directory.


            
