faster-eTaPR
============

|pypi| |python| |docs| |pre-commit| |mypy| |codecov|

.. |pypi| image:: https://badge.fury.io/py/faster-eTaPR.svg
    :target: https://pypi.org/project/faster-etapr/
    :alt: Latest Version

.. |python| image:: https://img.shields.io/pypi/pyversions/faster-eTaPR
    :target: https://www.python.org/
    :alt: Supported Python Versions

.. |docs| image:: https://readthedocs.org/projects/faster-etapr/badge/?version=latest
    :target: https://faster-etapr.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

.. |pre-commit| image:: https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white
    :target: https://github.com/pre-commit/pre-commit
    :alt: Pre-Commit enabled

.. |mypy| image:: http://www.mypy-lang.org/static/mypy_badge.svg
    :target: http://mypy-lang.org/
    :alt: MyPy checked

.. |codecov| image:: https://codecov.io/gh/GPla/faster-eTaPR/graph/badge.svg?token=FVA4W2KHR4
    :target: https://codecov.io/gh/GPla/faster-eTaPR
    :alt: Code Coverage

A faster implementation (`~200x <#benchmark>`_) of the enhanced time-aware precision and recall (eTaPR) metrics from `Hwang et al. <https://dl.acm.org/doi/10.1145/3477314.3507024>`_.
The original implementation is `saurf4ng/eTaPR <https://github.com/saurf4ng/eTaPR>`_ and this implementation is fully tested against it.

Motivation
----------

The motivation behind `eTaPR <https://dl.acm.org/doi/10.1145/3477314.3507024>`_ is that it is enough for a detection method to partially detect an anomaly segment, as long as a human expert can find the anomaly around this prediction.
The following illustration (a recreation from the `paper <https://dl.acm.org/doi/10.1145/3477314.3507024>`_) highlights the four cases which are considered by eTaPR:

.. image:: /img/motivation.png
    :width: 80%
    :align: center
    :alt: Motivation behind eTaPR

1. A *successful* detection: A human expert can likely find the anomaly :math:`A_1` based on the prediction :math:`P_1`.
2. A *failed* detection: Only a small portion of the prediction :math:`P_2` overlaps with the anomaly :math:`A_2`.
3. A *failed* detection: Most of the prediction :math:`P_3` lies in the range of non-anomalous behavior (prediction starts too early). A human expert will likely regard the prediction :math:`P_3` as incorrect or a false alarm. The prediction :math:`P_3` is *too imprecise* and the anomaly :math:`A_3` is likely to be missed.
4. A *failed* prediction: The prediction :math:`P_4` mostly overlaps with the anomaly :math:`A_4`, but covers only a small portion of the actual anomaly segment. Thus, a human expert is likely to dismiss the prediction :math:`P_4` as incorrect because the full extent of the anomaly remains hidden. The prediction :math:`P_4` contains *insufficient* information about the anomaly.

Note that in case 4, we could still mark the anomaly as detected if there were more predictions that overlap with the anomaly :math:`A_4`.
In particular, the handling of cases 3 and 4 is what sets eTaPR apart from other scoring methods.
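
To make cases 3 and 4 more tangible, here is a deliberately simplified sketch that looks at a single prediction and a single anomaly, each given as a half-open index range. It is not the eTaPR computation itself (which, as noted above, takes all overlapping predictions into account). The helpers ``overlap`` and ``classify`` are hypothetical, and the reading of ``theta_p`` and ``theta_r`` as minimum overlap shares is an assumption made for illustration only:

.. code:: python

    def overlap(a, b):
        """Length of the overlap of two half-open index ranges (start, end)."""
        return max(0, min(a[1], b[1]) - max(a[0], b[0]))


    def classify(prediction, anomaly, theta_p=0.5, theta_r=0.1):
        """Illustrative only: label one prediction/anomaly pair.

        Assumption: theta_p is the minimum share of the prediction that must
        lie inside the anomaly, theta_r the minimum share of the anomaly that
        must be covered by the prediction.
        """
        shared = overlap(prediction, anomaly)
        if shared == 0:
            return "no overlap"
        share_of_prediction = shared / (prediction[1] - prediction[0])
        share_of_anomaly = shared / (anomaly[1] - anomaly[0])
        if share_of_prediction < theta_p:
            return "too imprecise"   # resembles case 3
        if share_of_anomaly < theta_r:
            return "insufficient"    # resembles case 4
        return "detected"            # resembles case 1


    print(classify((0, 10), (8, 20)))   # prediction starts far too early
    print(classify((10, 11), (0, 20)))  # prediction covers a tiny part of the anomaly
    print(classify((5, 15), (8, 20)))   # substantial overlap in both directions

Using two separate thresholds keeps the two failure modes apart: a prediction can fail because it is mostly outside the anomaly, or because it reveals too little of it.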

If you want an in-depth explanation of the calculation, check out the `documentation <https://faster-etapr.readthedocs.io/>`_.

Getting Started
---------------

Install this package from PyPI using `pip <https://github.com/pypa/pip>`_ or `uv <https://github.com/astral-sh/uv>`_:

.. code::

    pip install faster-etapr

.. code::

    uv pip install faster-etapr

Now you can run your evaluation in Python:

.. code::

    >>> import faster_etapr
    >>> faster_etapr.evaluate_from_preds(
    ...     y_hat=[0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0],
    ...     y=    [0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1],
    ...     theta_p=0.5,
    ...     theta_r=0.1,
    ... )
    {
        'eta/recall': 0.3875,
        'eta/recall_detection': 0.5,
        'eta/recall_portion': 0.275,
        'eta/detected_anomalies': 2.0,
        'eta/precision': 0.46476766302377037,
        'eta/precision_detection': 0.46476766302377037,
        'eta/precision_portion': 0.46476766302377037,
        'eta/correct_predictions': 2.0,
        'eta/f1': 0.4226312395393011,
        'eta/TP': 4,
        'eta/FP': 5,
        'eta/FN': 7,
        'eta/wrong_predictions': 2,
        'eta/missed_anomalies': 2,
        'eta/anomalies': 4,
        'eta/segments': 0.499999999999875,
        'point/recall': 0.45454545454541323,
        'point/precision': 0.5555555555554939,
        'point/f1': 0.49999999999945494,
        'point/TP': 5,
        'point/FP': 4,
        'point/FN': 6,
        'point/anomalies': 4,
        'point/detected_anomalies': 3.0,
        'point/segments': 0.75,
        'point_adjust/recall': 0.9090909090909091,
        'point_adjust/precision': 0.7142857142857143,
        'point_adjust/f1': 0.7999999999995071
    }
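
The result is a flat dictionary whose keys carry a prefix followed by ``/``. If a nested layout is more convenient, a small helper along the following lines can regroup it. ``group_by_prefix`` is a hypothetical helper, not part of the package, and ``sample`` only copies three of the values shown above:

.. code:: python

    def group_by_prefix(metrics):
        """Split keys such as 'eta/f1' into {'eta': {'f1': ...}}."""
        grouped = {}
        for key, value in metrics.items():
            family, _, name = key.partition("/")
            grouped.setdefault(family, {})[name] = value
        return grouped


    # Three entries copied from the example output above.
    sample = {
        "eta/f1": 0.4226312395393011,
        "point/f1": 0.49999999999945494,
        "point_adjust/f1": 0.7999999999995071,
    }
    grouped = group_by_prefix(sample)
    print(sorted(grouped))
    print(grouped["eta"]["f1"])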

We calculate three types of metrics:

- the `enhanced time-aware (eTa)
  <https://dl.acm.org/doi/10.1145/3477314.3507024>`_ metrics under
  ``eta/``
- the (traditional) point-wise metrics under ``point/``
- the `point-adjusted <https://arxiv.org/abs/1802.03903>`_ metrics under
  ``point_adjust/``
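
For readers unfamiliar with the latter two families, the following pure-Python sketch shows how point-wise and point-adjusted scores can be computed. It is not the package's implementation, and the helper names ``segments``, ``point_metrics`` and ``point_adjust`` are hypothetical. Point-adjustment marks every point of an anomaly segment as predicted as soon as at least one point in that segment was predicted. On the example inputs above, the sketch should agree with the ``point/`` and ``point_adjust/`` values up to small numerical differences:

.. code:: python

    def segments(labels):
        """Return (start, end) pairs, end exclusive, for each run of 1s."""
        runs, start = [], None
        for i, value in enumerate(labels):
            if value and start is None:
                start = i
            elif not value and start is not None:
                runs.append((start, i))
                start = None
        if start is not None:
            runs.append((start, len(labels)))
        return runs


    def point_metrics(y_hat, y):
        """Plain point-wise precision, recall and F1."""
        tp = sum(1 for p, t in zip(y_hat, y) if p and t)
        fp = sum(1 for p, t in zip(y_hat, y) if p and not t)
        fn = sum(1 for p, t in zip(y_hat, y) if not p and t)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        return {"precision": precision, "recall": recall, "f1": f1,
                "TP": tp, "FP": fp, "FN": fn}


    def point_adjust(y_hat, y):
        """Fill in every anomaly segment that contains at least one prediction."""
        adjusted = list(y_hat)
        for start, end in segments(y):
            if any(y_hat[start:end]):
                adjusted[start:end] = [1] * (end - start)
        return adjusted


    y_hat = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0]
    y = [0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1]

    print(point_metrics(y_hat, y))
    print(point_metrics(point_adjust(y_hat, y), y))

Because a single hit is enough to credit a whole segment, the point-adjusted scores are never lower than the point-wise ones for recall, which is one reason the eTa metrics are reported alongside them.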


.. _benchmark:

Benchmark
---------

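A small benchmark with randomly generated inputs (:code:`np.random.randint(0, 2, size=size)`), comparing the runtime of the original ``eTaPR_pkg`` with ``faster_etapr``; ``factor`` is the ratio of the two: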

+---------+-----------+--------------+--------+
| size    | eTaPR_pkg | faster_etapr | factor |
+=========+===========+==============+========+
| 1 000   | 0.4090    | 0.0032       | ~125x  |
+---------+-----------+--------------+--------+
| 10 000  | 35.8264   | 0.1810       | ~198x  |
+---------+-----------+--------------+--------+
| 20 000  | 148.2670  | 0.6547       | ~226x  |
+---------+-----------+--------------+--------+
| 100 000 | too long  | 55.04712     |        |
+---------+-----------+--------------+--------+
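
A comparable measurement can be set up with a small timing harness such as the one below. This is a sketch, not the script used to produce the table: ``time_evaluation`` is a hypothetical helper, it draws the random labels with the standard library instead of NumPy, and it reports the best of a few repeated runs.

.. code:: python

    import random
    import time


    def time_evaluation(evaluate, size, repeats=3, seed=0):
        """Return the best wall-clock time in seconds of evaluate(y_hat, y)."""
        rng = random.Random(seed)
        # Random binary labels, analogous to np.random.randint(0, 2, size=size).
        y_hat = [rng.randint(0, 1) for _ in range(size)]
        y = [rng.randint(0, 1) for _ in range(size)]
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            evaluate(y_hat, y)
            best = min(best, time.perf_counter() - start)
        return best


    # Stand-in workload so the sketch runs without any third-party package.
    def count_hits(y_hat, y):
        return sum(1 for p, t in zip(y_hat, y) if p and t)


    elapsed = time_evaluation(count_hits, size=1_000)
    print(f"{elapsed:.6f} s")

To time this package instead of the stand-in, pass a callable that forwards to ``faster_etapr.evaluate_from_preds(y_hat=..., y=..., theta_p=..., theta_r=...)`` as shown in the example above.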

Citation
--------

If you use eTaPR, please cite the original paper:

.. code::

    @inproceedings{10.1145/3477314.3507024,
    author = {Hwang, Won-Seok and Yun, Jeong-Han and Kim, Jonguk and Min, Byung Gil},
    title = {"Do You Know Existing Accuracy Metrics Overrate Time-Series Anomaly Detections?"},
    year = {2022},
    isbn = {9781450387132},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3477314.3507024},
    doi = {10.1145/3477314.3507024},
    booktitle = {Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing},
    pages = {403–412},
    numpages = {10},
    keywords = {accuracy metric, anomaly detection, precision, recall, time-series},
    location = {Virtual Event},
    series = {SAC '22}
    }

            
