faster-eTaPR
============

|pypi| |python| |docs| |pre-commit| |mypy| |codecov|

.. |pypi| image:: https://badge.fury.io/py/faster-eTaPR.svg
    :target: https://pypi.org/project/faster-etapr/
    :alt: Latest Version

.. |python| image:: https://img.shields.io/pypi/pyversions/faster-eTaPR
    :target: https://www.python.org/
    :alt: Supported Python Versions

.. |docs| image:: https://readthedocs.org/projects/faster-etapr/badge/?version=latest
    :target: https://faster-etapr.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

.. |pre-commit| image:: https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white
    :target: https://github.com/pre-commit/pre-commit
    :alt: Pre-Commit enabled

.. |mypy| image:: http://www.mypy-lang.org/static/mypy_badge.svg
    :target: http://mypy-lang.org/
    :alt: MyPy checked

.. |codecov| image:: https://codecov.io/gh/GPla/faster-eTaPR/graph/badge.svg?token=FVA4W2KHR4
    :target: https://codecov.io/gh/GPla/faster-eTaPR
    :alt: Code Coverage

A faster implementation (`~200x <#benchmark>`_) of the enhanced time-aware precision and recall (eTaPR) from `Hwang et al. <https://dl.acm.org/doi/10.1145/3477314.3507024>`_.
The original implementation is `saurf4ng/eTaPR <https://github.com/saurf4ng/eTaPR>`_ and this implementation is fully tested against it.
Motivation
----------
The motivation behind `eTaPR <https://dl.acm.org/doi/10.1145/3477314.3507024>`_ is that it is enough for a detection method to partially detect an anomaly segment, as long as a human expert can find the anomaly around this prediction.
The following illustration (a recreation from the `paper <https://dl.acm.org/doi/10.1145/3477314.3507024>`_) highlights the four cases which are considered by eTaPR:

.. image:: /img/motivation.png
    :width: 80%
    :align: center
    :alt: Motivation behind eTaPR

1. A *successful* detection: A human expert can likely find the anomaly :math:`A_1` based on the prediction :math:`P_1`.
2. A *failed* detection: Only a small portion of the prediction :math:`P_2` overlaps with the anomaly :math:`A_2`.
3. A *failed* detection: Most of the prediction :math:`P_3` lies in the range of non-anomalous behavior (prediction starts too early). A human expert will likely regard the prediction :math:`P_3` as incorrect or a false alarm. The prediction :math:`P_3` is *too imprecise* and the anomaly :math:`A_3` is likely to be missed.
4. A *failed* detection: The prediction :math:`P_4` mostly overlaps with the anomaly :math:`A_4`, but covers only a small portion of the actual anomaly segment. Thus, a human expert is likely to dismiss the prediction :math:`P_4` as incorrect because the full extent of the anomaly remains hidden. The prediction :math:`P_4` contains *insufficient* information about the anomaly.
Note that in case 4 the anomaly could still be marked as detected if more predictions overlapped with the anomaly :math:`A_4`.
In particular, the handling of cases 3 and 4 is what sets eTaPR apart from other scoring methods.
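
The cases above can be made concrete with a toy overlap check. The sketch below is a deliberate simplification and not the eTaPR computation: for a single anomaly range and a single prediction range it only compares the share of the prediction that lies inside the anomaly, and the share of the anomaly that is covered, against two thresholds. The names ``classify_case``, ``min_precise`` and ``min_covered`` are hypothetical; they are only loosely analogous to ``theta_p`` and ``theta_r``, whose exact definitions are given in the documentation. Cases 2 and 3 both end up as "too imprecise" in this simplified view.

```python
def overlap(a, b):
    """Length of the overlap of two half-open index ranges (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def classify_case(anomaly, prediction, min_precise=0.5, min_covered=0.1):
    """Toy classification of one prediction against one anomaly.

    Assumption: both thresholds are illustrative stand-ins, not the
    library's parameters.
    """
    shared = overlap(anomaly, prediction)
    # Share of the prediction that lies inside the anomaly.
    precise = shared / (prediction[1] - prediction[0])
    # Share of the anomaly that the prediction covers.
    covered = shared / (anomaly[1] - anomaly[0])
    if precise < min_precise:
        return "too imprecise"   # cases 2 and 3
    if covered < min_covered:
        return "insufficient"    # case 4
    return "successful"          # case 1


anomaly = (10, 30)
print(classify_case(anomaly, (12, 28)))  # successful
print(classify_case(anomaly, (0, 12)))   # too imprecise
print(classify_case(anomaly, (20, 21)))  # insufficient
```

Unlike this sketch, eTaPR considers all predictions that overlap an anomaly together, which is why several small predictions can still lead to a detection in case 4.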
If you want an in-depth explanation of the calculation, check out the `documentation <https://faster-etapr.readthedocs.io/>`_.
Getting Started
---------------
Install this package from PyPI using `pip <https://github.com/pypa/pip>`_ or `uv <https://github.com/astral-sh/uv>`_:

.. code::

    pip install faster-etapr

.. code::

    uv pip install faster-etapr

Now you can run your evaluation in Python:

.. code::

    import faster_etapr
    faster_etapr.evaluate_from_preds(
        y_hat=[0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0],
        y=    [0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1],
        theta_p=0.5,
        theta_r=0.1,
    )
    {
        'eta/recall': 0.3875,
        'eta/recall_detection': 0.5,
        'eta/recall_portion': 0.275,
        'eta/detected_anomalies': 2.0,
        'eta/precision': 0.46476766302377037,
        'eta/precision_detection': 0.46476766302377037,
        'eta/precision_portion': 0.46476766302377037,
        'eta/correct_predictions': 2.0,
        'eta/f1': 0.4226312395393011,
        'eta/TP': 4,
        'eta/FP': 5,
        'eta/FN': 7,
        'eta/wrong_predictions': 2,
        'eta/missed_anomalies': 2,
        'eta/anomalies': 4,
        'eta/segments': 0.499999999999875,
        'point/recall': 0.45454545454541323,
        'point/precision': 0.5555555555554939,
        'point/f1': 0.49999999999945494,
        'point/TP': 5,
        'point/FP': 4,
        'point/FN': 6,
        'point/anomalies': 4,
        'point/detected_anomalies': 3.0,
        'point/segments': 0.75,
        'point_adjust/recall': 0.9090909090909091,
        'point_adjust/precision': 0.7142857142857143,
        'point_adjust/f1': 0.7999999999995071
    }

We calculate three types of metrics:

- the `enhanced time-aware (eTa) <https://dl.acm.org/doi/10.1145/3477314.3507024>`_ metrics under ``eta/``
- the (traditional) point-wise metrics under ``point/``
- the `point-adjusted <https://arxiv.org/abs/1802.03903>`_ metrics under ``point_adjust/``
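
To show what the ``point/`` and ``point_adjust/`` families measure, the following sketch recomputes their counts for the example input in plain Python. It is an independent illustration, not the package's implementation, and the helper names ``segments``, ``point_counts`` and ``point_adjust`` are hypothetical. Point adjustment is taken here in its usual sense: if any point of an anomaly segment is predicted, the whole segment counts as predicted.

```python
def segments(labels):
    """Return half-open (start, end) ranges of consecutive 1s."""
    ranges, start = [], None
    for i, value in enumerate(labels):
        if value and start is None:
            start = i
        elif not value and start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, len(labels)))
    return ranges


def point_counts(y_hat, y):
    """Point-wise true positives, false positives and false negatives."""
    tp = sum(1 for p, t in zip(y_hat, y) if p and t)
    fp = sum(1 for p, t in zip(y_hat, y) if p and not t)
    fn = sum(1 for p, t in zip(y_hat, y) if not p and t)
    return tp, fp, fn


def point_adjust(y_hat, y):
    """Mark every anomaly segment with at least one hit as fully predicted."""
    adjusted = list(y_hat)
    for start, end in segments(y):
        if any(y_hat[start:end]):
            adjusted[start:end] = [1] * (end - start)
    return adjusted


y_hat = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0]
y     = [0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1]

print(point_counts(y_hat, y))                   # (5, 4, 6)
print(point_counts(point_adjust(y_hat, y), y))  # (10, 4, 1)
```

The first tuple matches ``point/TP``, ``point/FP`` and ``point/FN`` in the output above. The second gives a precision of 10/14 and a recall of 10/11, which agrees with the ``point_adjust/`` values and shows how strongly point adjustment raises the scores compared to the ``eta/`` metrics.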
.. _benchmark:
Benchmark
---------
A small benchmark with randomly generated inputs (:code:`np.random.randint(0, 2, size=size)`):
+---------+-----------+--------------+--------+
| size | eTaPR_pkg | faster_etapr | factor |
+=========+===========+==============+========+
| 1 000 | 0.4090 | 0.0032 | ~125x |
+---------+-----------+--------------+--------+
| 10 000 | 35.8264 | 0.1810 | ~198x |
+---------+-----------+--------------+--------+
| 20 000 | 148.2670 | 0.6547 | ~226x |
+---------+-----------+--------------+--------+
| 100 000 | too long | 55.04712 | |
+---------+-----------+--------------+--------+
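
A comparable measurement can be set up with a small timing harness. The sketch below is not the script used for the table above; ``time_scorer`` and the stand-in scorer ``pointwise_f1`` are hypothetical helpers, and the standard library's ``random`` module replaces ``np.random.randint`` so that the example has no dependencies.

```python
import random
import time


def time_scorer(scorer, size, repeats=3, seed=0):
    """Return the best wall-clock time in seconds of scorer(y_hat, y)."""
    rng = random.Random(seed)
    # Random binary predictions and labels of the requested length.
    y_hat = [rng.randint(0, 1) for _ in range(size)]
    y = [rng.randint(0, 1) for _ in range(size)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        scorer(y_hat, y)
        best = min(best, time.perf_counter() - start)
    return best


def pointwise_f1(y_hat, y):
    """Stand-in scorer so that the harness runs without extra packages."""
    tp = sum(1 for p, t in zip(y_hat, y) if p and t)
    fp = sum(1 for p, t in zip(y_hat, y) if p and not t)
    fn = sum(1 for p, t in zip(y_hat, y) if not p and t)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# To time this package instead, pass a wrapper such as
#   lambda y_hat, y: faster_etapr.evaluate_from_preds(
#       y_hat=y_hat, y=y, theta_p=0.5, theta_r=0.1)
for size in (1_000, 10_000):
    print(size, time_scorer(pointwise_f1, size))
```

Taking the minimum over a few repeats reduces the influence of background load; absolute timings depend on the machine and will differ from the table.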
Citation
--------
If you use eTaPR, please cite the original author/paper:

.. code::

    @inproceedings{10.1145/3477314.3507024,
        author = {Hwang, Won-Seok and Yun, Jeong-Han and Kim, Jonguk and Min, Byung Gil},
        title = {"Do You Know Existing Accuracy Metrics Overrate Time-Series Anomaly Detections?"},
        year = {2022},
        isbn = {9781450387132},
        publisher = {Association for Computing Machinery},
        address = {New York, NY, USA},
        url = {https://doi.org/10.1145/3477314.3507024},
        doi = {10.1145/3477314.3507024},
        booktitle = {Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing},
        pages = {403–412},
        numpages = {10},
        keywords = {accuracy metric, anomaly detection, precision, recall, time-series},
        location = {Virtual Event},
        series = {SAC '22}
    }