azureml-ai-monitoring

Name: azureml-ai-monitoring
Version: 1.0.0
Summary: Microsoft Azure Machine Learning Python SDK v2 for collecting model data during operationalization
Upload time: 2024-04-25 08:52:07
Author: Microsoft Corporation
License: MIT License
Keywords: AzureMachineLearning, ModelMonitoring
# Microsoft Azure Machine Learning Data Collection SDK v2 for model monitoring

The `azureml-ai-monitoring` package provides an SDK that enables the Model Data Collector (MDC) for custom logging, allowing customers to collect data at arbitrary points in their data pre-processing pipeline. Customers can use the SDK in `score.py` to log data to the desired sink before, during, and after any data transformations.

## Quickstart

Start by importing the `azureml-ai-monitoring` package in `score.py`:

```
import pandas as pd
import json
from azureml.ai.monitoring import Collector

def init():
  global inputs_collector, outputs_collector

  # instantiate collectors with appropriate names; make sure the names align with the deployment spec
  inputs_collector = Collector(name='model_inputs')
  outputs_collector = Collector(name='model_outputs')

def run(data): 
  # json data: { "data" : {  "col1": [1,2,3], "col2": [2,3,4] } }
  pdf_data = preprocess(json.loads(data))
  
  # tabular data: {  "col1": [1,2,3], "col2": [2,3,4] }
  input_df = pd.DataFrame(pdf_data)

  # collect inputs data, store correlation_context
  context = inputs_collector.collect(input_df)

  # perform scoring with a pandas DataFrame; the return value is also a pandas DataFrame
  output_df = predict(input_df) 

  # collect outputs data, pass in correlation_context so inputs and outputs data can be correlated later
  outputs_collector.collect(output_df, context)
  
  return output_df.to_dict()
  
def preprocess(json_data):
  # preprocess the payload to ensure it can be converted to pandas DataFrame
  return json_data["data"]

def predict(input_df):
  # process the input and produce the output DataFrame
  ...
  
  return output_df
```
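The payload handling in `run()` can be sanity-checked in isolation, without the scoring stack. This minimal sketch (stdlib only; `preprocess` is reproduced from the script above) shows how the request JSON maps to the column-oriented dict that `pd.DataFrame` accepts:

```python
import json

def preprocess(json_data):
    # same helper as in score.py: unwrap the "data" envelope
    return json_data["data"]

# the raw request body, as a JSON string
raw = '{ "data": { "col1": [1, 2, 3], "col2": [2, 3, 4] } }'
tabular = preprocess(json.loads(raw))

# tabular is now the column-oriented dict pd.DataFrame(tabular) expects
print(tabular)  # {'col1': [1, 2, 3], 'col2': [2, 3, 4]}
```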

Create an environment with the base image `mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04` and the conda dependencies below, then build the environment.

```
channels:
  - conda-forge
dependencies:
  - python=3.8
  - pip=22.3.1
  - pip:
      - azureml-defaults==1.38.0
      - azureml-ai-monitoring
name: model-env
```
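The conda file above does not reference the base image itself; that pairing is typically expressed in an environment asset spec. A minimal sketch, assuming the conda file is saved as `conda.yml` (the asset name and file path here are placeholders, not from the original):

```
# hypothetical environment asset spec; name and conda_file are placeholders
$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: custom-logging-env
image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04
conda_file: conda.yml
```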

Create a deployment with custom logging enabled (both the `model_inputs` and `model_outputs` collections are enabled) and the environment you just built. Update the YAML to match your scenario.

```
#source ../configs/model-data-collector/data-storage-basic-OnlineDeployment.YAML
$schema: http://azureml/sdk-2-0/OnlineDeployment.json

endpoint_name: my_endpoint #unchanged
name: blue #unchanged
model: azureml:my-model-m1:1 #azureml:models/<name>:<version> #unchanged
environment: azureml:custom-logging-env@latest #unchanged
data_collector:
  collections:
    model_inputs:
      enabled: 'True'
    model_outputs:
      enabled: 'True'
```

## Configurable error handler

By default, an exception is raised on unexpected behavior (for example, custom logging is not enabled, a collection is not enabled, or the data type is not supported). If you want to handle errors yourself instead, pass an `on_error` callback:

```
import logging

collector = Collector(name="inputs", on_error=lambda e: logging.info("ex:{}".format(e)))
```
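To illustrate the callback contract without a deployment, the stand-in class below mimics the behavior described above: with no `on_error`, the exception propagates; with one, the exception is handed to the callback and `collect` returns nothing. The `FakeCollector` class is hypothetical, written only to demonstrate the pattern — the real `Collector` lives in `azureml.ai.monitoring`.

```python
# Hypothetical stand-in for azureml.ai.monitoring.Collector, showing
# only the on_error semantics described in this section.
class FakeCollector:
    def __init__(self, name, on_error=None):
        self.name = name
        self.on_error = on_error

    def collect(self, data):
        try:
            # pretend only dicts are a supported data type
            if not isinstance(data, dict):
                raise TypeError(f"unsupported payload type: {type(data).__name__}")
            return {"collected": self.name}
        except Exception as e:
            if self.on_error is None:
                raise            # default: surface the error to the caller
            self.on_error(e)     # configured: hand the error to the callback
            return None

errors = []
collector = FakeCollector(name="inputs", on_error=errors.append)
collector.collect([1, 2, 3])  # unsupported payload: captured, not raised
print(len(errors))  # 1
```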

# Change Log

## [v1.0.0](https://pypi.org/project/azureml-ai-monitoring) (2024.4.25)

**Announcement**

- Publish official version v1.0.0.

## [v0.1.0b4](https://pypi.org/project/azureml-ai-monitoring) (2023.8.21)

**Improvements**

- Improve the error message when the queue is full.
- Increase the message queue size to handle more requests.

## [v0.1.0b3](https://pypi.org/project/azureml-ai-monitoring) (2023.5.15)

**Improvements**

- Fix `install_requires`
- Fix classifiers
- Fix README

## [v0.1.0b2](https://pypi.org/project/azureml-ai-monitoring) (2023.5.9)

**New Features**

- Support local capture

## [v0.1.0b1](https://pypi.org/project/azureml-ai-monitoring) (2023.4.25)

**New Features**

- Support model data collection for pandas DataFrame.

            
