mlflavors

Name	mlflavors
Version	0.1.0
home_page	https://github.com/ml-toolkits/mlflavors
Summary	MLflavors: A collection of custom MLflow flavors.
upload_time	2023-05-10 13:31:52
author	Benjamin Bluhm
requires_python	>=3.7
license	BSD-3-Clause
keywords	machine-learning ai mlflow

            
MLflavors
=========

The MLflavors package adds MLflow support for some popular machine learning frameworks currently
not considered for inclusion as MLflow built-in flavors. Just like built-in flavors, you can use
this package to save your model as an MLflow artifact, load your model from MLflow for batch
inference, and deploy your model to a serving endpoint using MLflow deployment tools.

The following open-source libraries are currently supported:

    .. list-table::
      :widths: 15 10 15
      :header-rows: 1

      * - Framework
        - Tutorials
        - Category
      * - `Orbit <https://github.com/uber/orbit>`_
        - `MLflow-Orbit <https://mlflavors.readthedocs.io/en/latest/examples.html#orbit>`_
        - Time Series Forecasting
      * - `Sktime <https://github.com/sktime/sktime>`_
        - `MLflow-Sktime <https://mlflavors.readthedocs.io/en/latest/examples.html#sktime>`_
        - Time Series Forecasting
      * - `StatsForecast <https://github.com/Nixtla/statsforecast>`_
        - `MLflow-StatsForecast <https://mlflavors.readthedocs.io/en/latest/examples.html#statsforecast>`_
        - Time Series Forecasting
      * - `PyOD <https://github.com/yzhao062/pyod>`_
        - `MLflow-PyOD <https://mlflavors.readthedocs.io/en/latest/examples.html#pyod>`_
        - Anomaly Detection
      * - `SDV <https://github.com/sdv-dev/SDV>`_
        - `MLflow-SDV <https://mlflavors.readthedocs.io/en/latest/examples.html#sdv>`_
        - Synthetic Data Generation

The MLflow interface for the supported frameworks closely follows the design of built-in flavors.
In particular, when a custom model is loaded as a ``pyfunc`` flavor, predictions are generated by
passing a single-row Pandas DataFrame as a configuration argument that exposes the parameters of
the flavor's inference API.
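
As a rough illustration of this convention, the sketch below builds a single-row
configuration DataFrame and hands it to a small dispatcher. ``ToyModel`` and
``predict_from_conf`` are hypothetical stand-ins written for this example and are not part
of MLflavors; the column names ``X`` and ``predict_method`` follow the PyOD quickstart
further down.

.. code-block:: python

    import numpy as np
    import pandas as pd


    class ToyModel:
        """Hypothetical stand-in for a fitted model (not an MLflavors class)."""

        def decision_function(self, X):
            # Score each row by its Euclidean distance from the origin.
            return np.linalg.norm(np.asarray(X), axis=1)


    def predict_from_conf(model, conf):
        """Illustrative dispatcher: read one configuration row, call the named method."""
        if len(conf) != 1:
            raise ValueError("expected a single-row configuration DataFrame")
        row = conf.iloc[0]
        method = getattr(model, row["predict_method"])
        return method(row["X"])


    X = np.array([[0.0, 0.0], [3.0, 4.0]])

    # Each column holds one inference parameter; the whole array sits in a single cell.
    conf = pd.DataFrame([{"X": X, "predict_method": "decision_function"}])

    scores = predict_from_conf(ToyModel(), conf)
    print(scores)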

|tests| |coverage| |docs| |pypi| |license|

.. |tests| image:: https://img.shields.io/github/actions/workflow/status/ml-toolkits/mlflavors/ci.yml?style=for-the-badge&logo=github
    :target: https://github.com/ml-toolkits/mlflavors/actions/workflows/ci.yml/

.. |coverage| image:: https://img.shields.io/codecov/c/github/ml-toolkits/mlflavors?style=for-the-badge&label=codecov&logo=codecov
    :target: https://codecov.io/gh/ml-toolkits/mlflavors

.. |docs| image:: https://img.shields.io/readthedocs/mlflavors/latest.svg?style=for-the-badge&logoColor=white
    :target: https://mlflavors.readthedocs.io/en/latest/index.html
    :alt: Latest Docs

.. |pypi| image:: https://img.shields.io/pypi/v/mlflavors.svg?style=for-the-badge&logo=pypi&logoColor=white
    :target: https://pypi.org/project/mlflavors/
    :alt: Latest Python Release

.. |license| image:: https://img.shields.io/badge/License-BSD--3--Clause-blue?style=for-the-badge
    :target: https://opensource.org/license/bsd-3-clause/
    :alt: BSD-3-Clause License

Documentation
-------------

Usage examples for all flavors and the API reference can be found in the package
`documentation <https://mlflavors.readthedocs.io/en/latest/index.html>`_.


Installation
------------

Installing from PyPI:

.. code-block:: bash

   $ pip install mlflavors

Quickstart
----------

This example trains a `PyOD <https://github.com/yzhao062/pyod>`_ KNN model
on a synthetic dataset. Normal data is generated by a multivariate
Gaussian distribution and outliers are generated by a uniform distribution.
A new MLflow run is created to log the evaluation metrics and the trained
model as an artifact. Anomaly scores are then computed by loading the
trained model in native flavor and in ``pyfunc`` flavor:

.. code-block:: python

    import json

    import mlflow
    import pandas as pd
    from pyod.models.knn import KNN
    from pyod.utils.data import generate_data
    from sklearn.metrics import roc_auc_score

    import mlflavors

    ARTIFACT_PATH = "model"

    with mlflow.start_run() as run:
        contamination = 0.1  # proportion of outliers
        n_train = 200  # number of training points
        n_test = 100  # number of testing points

        X_train, X_test, _, y_test = generate_data(
            n_train=n_train, n_test=n_test, contamination=contamination
        )

        # Train kNN detector
        clf = KNN()
        clf.fit(X_train)

        # Evaluate model
        y_test_scores = clf.decision_function(X_test)

        metrics = {
            "roc": roc_auc_score(y_test, y_test_scores),
        }

        print(f"Metrics: \n{json.dumps(metrics, indent=2)}")

        # Log metrics
        mlflow.log_metrics(metrics)

        # Log model using pickle serialization (default).
        mlflavors.pyod.log_model(
            pyod_model=clf,
            artifact_path=ARTIFACT_PATH,
            serialization_format="pickle",
        )
        model_uri = mlflow.get_artifact_uri(ARTIFACT_PATH)

    # Print the run id, which is used below for serving the model to a local REST API endpoint
    print(f"\nMLflow run id:\n{run.info.run_id}")
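
For intuition about the synthetic dataset, the sketch below draws inliers from a
multivariate Gaussian and outliers from a uniform distribution using NumPy. It is an
approximation for illustration only and does not reproduce the internals of
``pyod.utils.data.generate_data``; the function name ``make_toy_data``, the value ranges,
and the seed are assumptions.

.. code-block:: python

    import numpy as np


    def make_toy_data(n_samples=200, contamination=0.1, n_features=2, seed=0):
        """Illustrative generator: Gaussian inliers plus uniform outliers."""
        rng = np.random.default_rng(seed)
        n_outliers = int(n_samples * contamination)
        n_inliers = n_samples - n_outliers

        # Inliers: multivariate Gaussian centred at the origin, identity covariance.
        inliers = rng.multivariate_normal(
            mean=np.zeros(n_features), cov=np.eye(n_features), size=n_inliers
        )
        # Outliers: uniform over a wider box (assumed range).
        outliers = rng.uniform(low=-6.0, high=6.0, size=(n_outliers, n_features))

        X = np.vstack([inliers, outliers])
        # Label 1 marks an outlier, 0 an inlier.
        y = np.concatenate([np.zeros(n_inliers), np.ones(n_outliers)])
        return X, y


    X, y = make_toy_data()
    print(X.shape, int(y.sum()))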

Make a prediction by loading the model from MLflow in native format:

.. code-block:: python

    loaded_model = mlflavors.pyod.load_model(model_uri=model_uri)
    print(loaded_model.decision_function(X_test))

Make a prediction by loading the model from MLflow in ``pyfunc`` format:

.. code-block:: python

    loaded_pyfunc = mlflavors.pyod.pyfunc.load_model(model_uri=model_uri)

    # Create configuration DataFrame
    predict_conf = pd.DataFrame(
        [
            {
                "X": X_test,
                "predict_method": "decision_function",
            }
        ]
    )

    print(loaded_pyfunc.predict(predict_conf)[0])

To serve the model to a local REST API endpoint, run the command below, substituting
the run id printed above:

.. code-block:: bash

    mlflow models serve -m runs:/<run_id>/model --env-manager local --host 127.0.0.1
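
Before sending requests, it can help to confirm that the server is accepting connections.
The sketch below polls a URL with the standard library until it answers with HTTP 200.
The helper name ``wait_until_ready`` is hypothetical, and using the ``/ping`` path as a
health check is an assumption about the MLflow scoring server; verify it against your
MLflow version.

.. code-block:: python

    import time
    import urllib.error
    import urllib.request


    def wait_until_ready(url, timeout=30.0, interval=1.0):
        """Poll ``url`` until it returns HTTP 200 or ``timeout`` seconds elapse."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            try:
                with urllib.request.urlopen(url, timeout=interval) as response:
                    if response.status == 200:
                        return True
            except (urllib.error.URLError, OSError):
                pass  # Server not reachable yet; retry after a short pause.
            time.sleep(interval)
        return False

For example, ``wait_until_ready("http://127.0.0.1:5000/ping")`` is expected to return
``True`` once the served model responds, and ``False`` if the timeout elapses first.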

Open a new terminal and run the model scoring script below to request a prediction from
the served model:

.. code-block:: python

    import pandas as pd
    import requests
    from pyod.utils.data import generate_data

    contamination = 0.1  # proportion of outliers
    n_train = 200  # number of training points
    n_test = 100  # number of testing points

    _, X_test, _, _ = generate_data(
        n_train=n_train, n_test=n_test, contamination=contamination
    )

    # Define local host and endpoint url
    host = "127.0.0.1"
    url = f"http://{host}:5000/invocations"

    # Convert to list for JSON serialization
    X_test_list = X_test.tolist()

    # Create configuration DataFrame
    predict_conf = pd.DataFrame(
        [
            {
                "X": X_test_list,
                "predict_method": "decision_function",
            }
        ]
    )

    # Create dictionary with pandas DataFrame in the split orientation
    json_data = {"dataframe_split": predict_conf.to_dict(orient="split")}

    # Score model
    response = requests.post(url, json=json_data)
    print(response.json())
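
To make the request body concrete, the sketch below hand-builds the same kind of payload
with plain Python and the standard library. The ``index``, ``columns`` and ``data`` keys
mirror what ``DataFrame.to_dict(orient="split")`` returns for a single-row DataFrame; the
two feature rows in ``X_rows`` are made up for illustration.

.. code-block:: python

    import json

    # Made-up feature rows standing in for X_test.tolist().
    X_rows = [[0.1, 0.2], [5.0, -4.0]]

    # One configuration row: a single index entry and one value per column.
    payload = {
        "dataframe_split": {
            "index": [0],
            "columns": ["X", "predict_method"],
            "data": [[X_rows, "decision_function"]],
        }
    }

    # Serialize to the JSON text that would be sent as the request body.
    body = json.dumps(payload)
    print(body)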

Contributing
------------

Contributions from the community are welcome. I will be happy to support the inclusion
and development of new features and flavors. To report a bug or request a new feature, please
open a GitHub issue.

Versioning
----------

Versions and changes are documented in the
`changelog <https://github.com/ml-toolkits/mlflavors/tree/main/CHANGELOG.rst>`_.

Development
-----------

To set up your local development environment, create a virtual environment, for example:

.. code-block:: bash

    $ conda create -n mlflavors-dev python=3.9
    $ conda activate mlflavors-dev

Install project locally:

.. code-block:: bash

    $ python -m pip install --upgrade pip
    $ pip install -e ".[dev]"

Install pre-commit hooks:

.. code-block:: bash

    $ pre-commit install

Run tests:

.. code-block:: bash

    $ pytest

Build Sphinx docs:

.. code-block:: bash

    $ cd docs
    $ make html
