gate-drift


| Field | Value |
| --- | --- |
| Name | gate-drift |
| Version | 0.1.5 |
| Summary | Data drift detection tool for machine learning pipelines. |
| Upload time | 2023-04-28 17:33:46 |
| Author | Shreya Shankar |
| Requires Python | >=3.8,<4.0 |
| License | MIT |
| Docs URL | None |
| Requirements | No requirements were recorded. |

# GATE: Data Drift Detection for Machine Learning Pipelines

[![GATE](https://github.com/dm4ml/gate/workflows/gate/badge.svg)](https://github.com/dm4ml/gate/actions?query=workflow:"gate")
[![lint (via ruff)](https://github.com/dm4ml/gate/workflows/lint/badge.svg)](https://github.com/dm4ml/gate/actions?query=workflow:"lint")
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

GATE is a Python module that detects drift in partitions of data. GATE computes a summary of each partition and feeds these summaries into an anomaly detection algorithm, which decides whether a new partition is anomalous. Working from partition summaries reduces false-positive alerts when detecting drift in machine learning (ML) pipelines, which may have many feature and prediction columns.
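
To make the idea concrete, here is a minimal sketch of "summarize each partition, then run anomaly detection on the summaries". It is not GATE's implementation or API: the function names (`summarize`, `drift_score`, `is_anomalous`), the choice of mean and standard deviation as summary statistics, the z-score test, and the threshold of 3 are all assumptions made for illustration, and the data is a toy example.

```python
from statistics import mean, pstdev


def summarize(partition):
    """Summarize one partition, given as a dict of column name -> list of floats.

    Hypothetical summary: per-column mean and (population) standard deviation.
    """
    summary = {}
    for column, values in partition.items():
        summary[f"{column}_mean"] = mean(values)
        summary[f"{column}_std"] = pstdev(values)
    return summary


def drift_score(history, new_partition):
    """Largest absolute z-score of the new partition's summary statistics,
    measured against the same statistics of the historical partitions."""
    past = [summarize(p) for p in history]
    current = summarize(new_partition)
    worst = 0.0
    for key, value in current.items():
        series = [s[key] for s in past]
        center, spread = mean(series), pstdev(series)
        if spread == 0:
            # No historical variation: any deviation at all is treated as extreme.
            z = 0.0 if value == center else float("inf")
        else:
            z = abs(value - center) / spread
        worst = max(worst, z)
    return worst


def is_anomalous(history, new_partition, threshold=3.0):
    """Flag the new partition when its worst z-score exceeds the threshold
    (the default of 3.0 is an arbitrary choice for this sketch)."""
    return drift_score(history, new_partition) > threshold


# Toy partitions: three historical days and two candidate new days.
history = [
    {"price": [10.0, 11.0, 9.0, 10.5]},
    {"price": [10.2, 10.8, 9.5, 10.1]},
    {"price": [9.8, 10.4, 10.9, 9.9]},
]
similar = {"price": [10.1, 10.6, 9.7, 10.3]}
shifted = {"price": [20.0, 21.0, 19.5, 20.5]}

print(is_anomalous(history, similar))
print(is_anomalous(history, shifted))
```

In this sketch a decision is made once per partition rather than once per row or per column, which is the property the paragraph above relies on to keep the number of alerts manageable.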

### Support for Embeddings

We now support drift detection on embeddings, in addition to structured data. GATE considers _both_ the structured data and the embeddings when computing partition summaries and detecting drift. Check out the [embeddings page](./embedding) for a walkthrough of how to use GATE with embeddings.
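
As a rough illustration of how an embedding column could be folded into a partition summary, the sketch below reduces each partition's embeddings to a centroid and measures how far a new partition's centroid has moved from the historical ones. This is an assumption for illustration only, not the method GATE uses; the helper names (`centroid`, `cosine_distance`, `embedding_shift`) and the two-dimensional toy vectors are hypothetical.

```python
import math


def centroid(vectors):
    """Element-wise mean of a list of equal-length embedding vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine_distance(a, b):
    """1 - cosine similarity; raises ZeroDivisionError for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def embedding_shift(history, new_partition):
    """Cosine distance between the new partition's centroid and the
    centroid of the historical partitions' centroids."""
    reference = centroid([centroid(p) for p in history])
    return cosine_distance(reference, centroid(new_partition))


# Toy data: each partition is a list of 2-dimensional embeddings.
history = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[1.0, 0.1], [0.8, 0.0]],
]
similar = [[0.95, 0.05], [0.9, 0.0]]
rotated = [[0.0, 1.0], [0.1, 0.9]]

print(embedding_shift(history, similar))
print(embedding_shift(history, rotated))
```

A scalar like this could sit alongside the structured-data statistics in one partition summary, so that a single anomaly detector sees both kinds of signal.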

## Installation

GATE is available on PyPI and can be installed with pip:

```bash
pip install gate-drift
```

GATE requires Python 3.8 or higher (and below 4.0).
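
If you want to check an interpreter before installing, a small guard along these lines mirrors the `requires_python` constraint (`>=3.8,<4.0`) from the package metadata. The function name `supports_gate` is hypothetical and is not part of GATE.

```python
import sys


def supports_gate(version_info=sys.version_info):
    """Return True if the (major, minor) version satisfies >=3.8,<4.0."""
    return (3, 8) <= tuple(version_info[:2]) < (4, 0)


if not supports_gate():
    print("This interpreter is outside the supported range for gate-drift.")
```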

## Usage

GATE is designed to be used with [Pandas](https://pandas.pydata.org/) dataframes. Check out the [documentation](https://dm4ml.github.io/gate/) for a walkthrough of how to use GATE.

## Research Contributions

GATE was developed and is maintained by researchers at the UC Berkeley [EPIC Lab](https://epic.berkeley.edu/).

An initial version of GATE was developed in collaboration with Meta; the accompanying research paper, "Moving Fast With Broken Data" by Shankar et al., is available on [arXiv](https://arxiv.org/abs/2303.06094). This module differs slightly from the original implementation, but the core ideas around partition summaries and anomaly detection are the same.

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "gate-drift",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8,<4.0",
    "maintainer_email": "",
    "keywords": "",
    "author": "Shreya Shankar",
    "author_email": "shreyashankar@berkeley.edu",
    "download_url": "https://files.pythonhosted.org/packages/ce/ec/2c012fc939e673fd7e59d4411af73a8a5af4df99788aa81609a310105ee9/gate_drift-0.1.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Data drift detection tool for machine learning pipelines.",
    "version": "0.1.5",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dc9e68b30c7a7518b02f0c090ce2dc1b363b780c43fa2cb308e84779e97f8f57",
                "md5": "741d7ba011c894c97cbef47e367146eb",
                "sha256": "55d0feb84dc4486a663331b84ad0faa6a5ece861329f34cb2d5d94334aa93ff1"
            },
            "downloads": -1,
            "filename": "gate_drift-0.1.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "741d7ba011c894c97cbef47e367146eb",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<4.0",
            "size": 14539,
            "upload_time": "2023-04-28T17:33:45",
            "upload_time_iso_8601": "2023-04-28T17:33:45.251466Z",
            "url": "https://files.pythonhosted.org/packages/dc/9e/68b30c7a7518b02f0c090ce2dc1b363b780c43fa2cb308e84779e97f8f57/gate_drift-0.1.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ceec2c012fc939e673fd7e59d4411af73a8a5af4df99788aa81609a310105ee9",
                "md5": "182809fce1f7feec75c0e3194147e6fb",
                "sha256": "f2a68f720e5a161b007823d18a64fcc599f0b62a286715b8224107cf2f8f9c99"
            },
            "downloads": -1,
            "filename": "gate_drift-0.1.5.tar.gz",
            "has_sig": false,
            "md5_digest": "182809fce1f7feec75c0e3194147e6fb",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<4.0",
            "size": 13570,
            "upload_time": "2023-04-28T17:33:46",
            "upload_time_iso_8601": "2023-04-28T17:33:46.666954Z",
            "url": "https://files.pythonhosted.org/packages/ce/ec/2c012fc939e673fd7e59d4411af73a8a5af4df99788aa81609a310105ee9/gate_drift-0.1.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-28 17:33:46",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "gate-drift"
}
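
The `digests` entries above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library; the function name `sha256_matches` and the chunk size are choices made for this example.

```python
import hashlib


def sha256_matches(path, expected_hex, chunk_size=65536):
    """Return True if the SHA-256 digest of the file at `path`
    equals `expected_hex` (compared case-insensitively)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

To use it, pass the local path of the downloaded wheel or sdist together with the matching `sha256` value from the `urls` list.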
        