# mlflow-integration
Provides a means of exporting a model from the MLflow Model Registry and pushing it to the DataRobot Model Registry.
Key-values are created from training parameters, metrics, tags,
and artifacts in the MLflow model.
## Setup
* Python 3.7 or later
* DataRobot 9.0 or later
* `pip install datarobot-mlflow`
  * if using Azure: `pip install "datarobot-mlflow[azure]"`
## Considerations
This integration library uses an API endpoint under Public Preview.
The DataRobot user owning the API token used below must have:
* `Enable Extended Compliance Documentation` set
* `Owner` or `User` permission for the DataRobot model package
## DataRobot information needed
* URL of the DataRobot instance; example: `https://app.datarobot.com`
* ID of the model package to receive key-values; example: `64227b4bf82db411c90c3209`
* API token for DataRobot: `export MLOPS_API_TOKEN=<API token from DataRobot Developer Tools>`
## Local MLflow information needed
* MLflow tracking URI; example `"file:///Users/me/mlflow/examples/mlruns"`
* Model name; example `"cost-model"`
* Model version; example `"2"`
## Azure Databricks MLflow with Service Principal information needed
* MLflow tracking URI; example `"azureml://region.api.azureml.ms/mlflow/v1.0/subscriptions/subscription-id/resourceGroups/resource-group-name/providers/Microsoft.MachineLearningServices/workspaces/azure-ml-workspace-name"`
* Model name; example `"cost-model"`
* Model version; example `"2"`
* Provide service principal details in the environment:
  * `export AZURE_TENANT_ID="<tenant-id>"`
  * `export AZURE_CLIENT_ID="<client-id>"`
  * `export AZURE_CLIENT_SECRET="<secret>"`
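The `validate-auth` action (shown below) reports any of these variables that are missing before attempting authentication. As an illustration only (not the library's actual implementation), this kind of pre-flight check can be sketched in plain Python:

```python
import os

# The three variables the Azure Service Principal flow requires.
REQUIRED_VARS = ["AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET"]

def missing_azure_vars(environ=os.environ):
    """Return the names of required Azure variables that are not defined."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]

# With an empty environment, every variable is reported missing:
for name in missing_azure_vars({}):
    print(f"Required environment variable is not defined: {name}")
```

The printed messages mirror the CLI output shown in the validation example below; the helper `missing_azure_vars` itself is hypothetical.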
## Example: Import from MLflow
```sh
DR_MODEL_ID="<MODEL_PACKAGE_ID>"

env PYTHONPATH=./ \
python datarobot_mlflow/drflow_cli.py \
  --mlflow-url http://localhost:8080 \
  --mlflow-model cost-model \
  --mlflow-model-version 2 \
  --dr-model $DR_MODEL_ID \
  --dr-url https://app.datarobot.com \
  --with-artifacts \
  --verbose \
  --action sync
```
## Example: validate Azure credentials
```sh
export MLOPS_API_TOKEN="n/a" # not used for Azure auth check, but must be present
env PYTHONPATH=./ \
python datarobot_mlflow/drflow_cli.py \
  --verbose \
  --auth-type azure-service-principal \
  --service-provider-type azure-databricks \
  --action validate-auth
# example output for missing environment variables:
Required environment variable is not defined: AZURE_TENANT_ID
Required environment variable is not defined: AZURE_CLIENT_ID
Required environment variable is not defined: AZURE_CLIENT_SECRET
Azure AD Service Principal credentials are not valid; check environment variables
# example output for successful authentication:
Azure AD Service Principal credentials are valid for obtaining access token
```
## Actions
The following operations are available for `--action`:
* `sync`: import parameters, tags, metrics, and artifacts from the MLflow model into the DataRobot model package.
* `list-mlflow-keys`: list parameters, tags, metrics, and artifacts in an MLflow model. Requires `--mlflow-url`, `--mlflow-model`, and `--mlflow-model-version`.
* `validate-auth`: see "validate Azure credentials" example above.
## Options
The following options can be added to the `drflow_cli` command line:
* `--mlflow-url`: MLflow Tracking URI
* `--mlflow-model`: MLflow model name
* `--mlflow-model-version`: MLflow model version
* `--dr-url`: Main URL of the DataRobot instance
* `--dr-model`: DataRobot Model Package ID. Registered Model Versions are also supported.
* `--prefix`: a string to prepend to the names of all key-values posted to DataRobot. Default is empty.
* `--debug`: set Python logging level to `logging.DEBUG`. Default level is `logging.WARNING`.
* `--verbose`: prints to stdout information about the following:
  * retrieving the model from MLflow: prints model information
  * setting model data in DataRobot: prints each key-value posted
* `--with-artifacts`: download MLflow model artifacts to `/tmp/model`
* `--service-provider-type`: service provider to use for `validate-auth`. Supported values are:
  * `azure-databricks`: for Databricks MLflow within Azure
* `--auth-type`: authentication type for `validate-auth`. Supported values are:
  * `azure-service-principal`: for Azure Service Principal
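To illustrate how `--prefix` affects the names posted to DataRobot, here is a hypothetical sketch (the library's internals may differ): key-values gathered from MLflow parameters, metrics, and tags have the prefix prepended before posting.

```python
def build_key_values(params, metrics, tags, prefix=""):
    """Flatten MLflow params, metrics, and tags into one name -> value
    mapping, prepending an optional prefix to every key.
    The default empty prefix matches the CLI's default for --prefix."""
    merged = {}
    for group in (params, metrics, tags):
        for key, value in group.items():
            merged[f"{prefix}{key}"] = value
    return merged

kv = build_key_values(
    params={"alpha": "0.5"},
    metrics={"rmse": 0.42},
    tags={"stage": "staging"},
    prefix="mlflow.",
)
# resulting keys: mlflow.alpha, mlflow.rmse, mlflow.stage
```

The function name `build_key_values` is illustrative and not part of the `datarobot-mlflow` API.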