pycatcher


- Name: pycatcher
- Version: 0.0.67
- Home page: https://github.com/aseemanand/pycatcher/
- Summary: This package identifies outlier(s) for a given time-series dataset in simple steps. It supports day, week, month and quarter level time-series data.
- Upload time: 2025-01-20 21:43:58
- Maintainer: Jagadish Pamarthi
- Author: Aseem Anand
- Requires Python: `<4.0,>=3.9`
- License: MIT
- Keywords: outlier-detection, python, timeseries
- Requirements: No requirements were recorded.

## PyCatcher
[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/aseemanand/pycatcher/blob/main/LICENSE) [![PyPI Downloads](https://static.pepy.tech/badge/pycatcher)](https://pepy.tech/projects/pycatcher)  [![PyPI Downloads](https://static.pepy.tech/badge/pycatcher/month)](https://pepy.tech/projects/pycatcher)  [![PyPI Downloads](https://static.pepy.tech/badge/pycatcher/week)](https://pepy.tech/projects/pycatcher)  ![PYPI version](https://img.shields.io/pypi/v/pycatcher.svg) ![PYPI - Python Version](https://img.shields.io/pypi/pyversions/pycatcher.svg)

## Outlier Detection for Time-series Data
This package identifies outliers in a time-series dataset in a few simple steps. It supports day-, week-, month-, and 
quarter-level time-series data.

- [Highlights](https://aseemanand.github.io/pycatcher/highlights/)
- [Outlier Detection Functions](https://aseemanand.github.io/pycatcher/outlier_detection_functions/)
- [Diagnostic Functions](https://aseemanand.github.io/pycatcher/diagnostic_functions/)

### Installation

```bash
pip install pycatcher
```

### Basic Requirements
* PyCatcher expects a pandas DataFrame as input for its outlier detection methods. It can convert a Spark DataFrame 
to a pandas DataFrame at the data processing stage. 
* For the Seasonal Decomposition algorithms, the first column of the DataFrame must be a time period column (date in 
'YYYY-MM-DD'; month in 'YYYY-MM'; year in 'YYYY' format) and the last column must be a numeric column (a sum or total 
count for the time period).
* For the Interquartile Range (IQR) and Moving Average algorithms, the last column must be a numeric column. 
* At present, PyCatcher does not depend on labeled observations (ground truth). Outliers are detected solely through 
underlying algorithms (for example, seasonal-trend decomposition and dispersion methods like MAD or Z-Score).   
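
As a rough illustration of the input layout described above, the sketch below builds a small pandas DataFrame with the time period in the first column and the numeric value in the last. The column names and values are made up for the example; only the column positions follow the requirements.

```python
import pandas as pd

# Hypothetical daily counts. Column names are placeholders: only the positions
# matter (time period first, numeric value last).
df = pd.DataFrame(
    {
        "date": ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"],
        "order_count": [120, 135, 128, 410],
    }
)

# Sanity checks that mirror the requirements listed above.
first_col, last_col = df.columns[0], df.columns[-1]
dates = pd.to_datetime(df[first_col], format="%Y-%m-%d")  # raises on malformed dates
assert pd.api.types.is_numeric_dtype(df[last_col])
print(df)
```

If the data starts out as a Spark DataFrame, `toPandas()` is the standard PySpark call for an explicit conversion before these checks.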

<hr style="border:1.25px solid gray">

### Summary of features 
PyCatcher provides an efficient solution for detecting anomalies in time-series data using various statistical methods.
Below are the available techniques for anomaly detection, each optimized for different data characteristics.

### **1. Seasonal-Decomposition Based Anomaly Detection**

Seasonal decomposition algorithms (Classical, STL, MSTL) require at least two years of data. For shorter series, 
use the simpler methods (Interquartile Range (IQR) or Moving Average) to detect outliers.

#### **Detect Outliers Using Classical Seasonal Decomposition**
For datasets with at least two years of data, PyCatcher automatically determines whether the data follows 
an additive or multiplicative model to detect anomalies.

- **Method**: `detect_outliers_classic(df)`
- **Output**: DataFrame of detected anomalies or a message indicating no anomalies.
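
To make the idea concrete, here is a minimal sketch of classical additive decomposition followed by a MAD-based check on the residuals. It is not PyCatcher's implementation: the synthetic data, the additive-only model, the 0.6745 scaling constant, and the 3.5 cutoff are all assumptions chosen for illustration. A multiplicative model would divide by the trend and seasonal components instead of subtracting them.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (8 years): linear trend + yearly seasonality + noise.
rng = np.random.default_rng(0)
n_months, period = 96, 12
idx = pd.date_range("2017-01-01", periods=n_months, freq="MS")
t = np.arange(n_months)
values = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / period) + rng.normal(0, 3, n_months)
spike_pos = 44
values[spike_pos] += 40  # injected anomaly (hypothetical)
y = pd.Series(values, index=idx)

# Trend: centered 2x12 moving average (the classical choice for an even period).
trend = y.rolling(period).mean().rolling(2).mean().shift(-(period // 2))

# Seasonal component: mean detrended value per calendar month, centered on zero.
detrended = y - trend
monthly_means = detrended.groupby(detrended.index.month).mean()
monthly_means = monthly_means - monthly_means.mean()
seasonal = pd.Series(monthly_means.loc[y.index.month].to_numpy(), index=y.index)

# Residual: what is left after removing trend and seasonality.
resid = y - trend - seasonal

# Robust z-score of the residuals based on the median absolute deviation (MAD).
median = resid.median()
mad = (resid - median).abs().median()
robust_z = 0.6745 * (resid - median) / mad
outliers = y[robust_z.abs() > 3.5]  # 3.5 is a common rule-of-thumb cutoff
print(outliers)
```

The centered moving average leaves the first and last six months without a trend estimate, so those points are never flagged. A large spike also biases the seasonal mean of its own calendar month, which is one reason longer histories and robust methods such as STL tend to behave better.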

#### **Detect Today's Outliers**
Quickly identify if there are any anomalies specifically for the current date.

- **Method**: `detect_outliers_today_classic(df)`
- **Output**: Anomaly details for today or a message indicating no outliers.

#### **Detect the Latest Anomalies**
Retrieve the most recent anomalies identified in your time-series data.

- **Method**: `detect_outliers_latest_classic(df)`
- **Output**: Details of the latest detected anomalies.

#### **Visualize Outliers with Seasonal Decomposition**
Show outliers in your data through classical seasonal decomposition.

- **Method**: `build_outliers_plot_classic(df)`
- **Output**: Outlier plot generated using classical seasonal decomposition.

#### **Visualize Seasonal Decomposition**
Understand seasonality in your data by visualizing classical seasonal decomposition.

- **Method**: `build_seasonal_plot_classic(df)`
- **Output**: Seasonal plots displaying additive or multiplicative trends.

#### **Visualize Monthly Patterns**
Show month-wise box plots of the data.

- **Method**: `build_monthwise_plot(df)`
- **Output**: Month-wise box plots showing spread and skewness of data.


#### **Detect Outliers Using Seasonal-Trend Decomposition using LOESS (STL)**
Use the Seasonal-Trend Decomposition method (STL) to detect anomalies.

- **Method**: `detect_outliers_stl(df)`
- **Output**: Rows flagged as outliers using STL.

#### **Detect Today's Outliers**
Quickly identify if there are any anomalies specifically for the current date.

- **Method**: `detect_outliers_today_stl(df)`
- **Output**: Anomaly details for today or a message indicating no outliers.

#### **Detect the Latest Anomalies**
Retrieve the most recent anomalies identified in your time-series data.

- **Method**: `detect_outliers_latest_stl(df)`
- **Output**: Details of the latest detected anomalies.

#### **Visualize STL Outliers**
Show outliers using the Seasonal-Trend Decomposition using LOESS (STL).

- **Method**: `build_outliers_plot_stl(df)`
- **Output**: Outlier plot generated using STL.

#### **Visualize Seasonal Decomposition using STL**
Understand seasonality in your data by visualizing Seasonal-Trend Decomposition using LOESS (STL).

- **Method**: `build_seasonal_plot_stl(df)`
- **Output**: Seasonal plot that decomposes the time series into trend, seasonal, and residual components.

#### **Detect Outliers Using Multiple Seasonal-Trend decomposition using LOESS (MSTL)**
Use the Multiple Seasonal-Trend Decomposition method (MSTL) to detect anomalies. 

- **Method**: `detect_outliers_mstl(df)`
- **Output**: Rows flagged as outliers using MSTL.

#### **Detect Today's Outliers**
Quickly identify if there are any anomalies specifically for the current date.

- **Method**: `detect_outliers_today_mstl(df)`
- **Output**: Anomaly details for today or a message indicating no outliers.

#### **Detect the Latest Anomalies**
Retrieve the most recent anomalies identified in your time-series data.

- **Method**: `detect_outliers_latest_mstl(df)`
- **Output**: Details of the latest detected anomalies.

#### **Visualize MSTL Outliers**
Show outliers using the Multiple Seasonal-Trend Decomposition using LOESS (MSTL).

- **Method**: `build_outliers_plot_mstl(df)`
- **Output**: Outlier plot generated using MSTL.

#### **Visualize Multiple Seasonal Decomposition**
Understand seasonality in your data by visualizing Multiple Seasonal-Trend Decomposition using LOESS (MSTL).

- **Method**: `build_seasonal_plot_mstl(df)`
- **Output**: Seasonal plot that decomposes the time series into a trend component, multiple seasonal components, 
and a residual component.

***

### **2. Detect Outliers Using ESD (Extreme Studentized Deviate)**
Detect anomalies in time-series data using the ESD algorithm.

- **Method**: `detect_outliers_esd(df)`
- **Output**: Rows flagged as outliers using the Generalized ESD or Seasonal ESD algorithm.

#### **Visualize ESD Outliers**
Show outliers using the Generalized ESD or Seasonal ESD algorithm.

- **Method**: `build_outliers_plot_esd(df)`
- **Output**: Outlier plot generated using Generalized ESD or Seasonal ESD algorithm.
  
---

### **3. Detect Outliers Using Moving Average**
Detect anomalies in time-series data using the Moving Average method.

- **Method**: `detect_outliers_moving_average(df)`
- **Output**: Rows flagged as outliers using the Moving Average and Z-score algorithms.
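
One common way to combine a moving average with a Z-score is sketched below. The window length, the cutoff, and the choice to compare each point against the preceding window only are assumptions for this example, not a description of PyCatcher's internals.

```python
import pandas as pd

# Hypothetical daily data in the expected layout (date first, value last).
df = pd.DataFrame(
    {
        "date": pd.date_range("2024-01-01", periods=12, freq="D").strftime("%Y-%m-%d"),
        "value": [10, 11, 10, 12, 11, 10, 11, 30, 10, 11, 12, 10],
    }
)

window = 5       # hypothetical window length
threshold = 3.0  # hypothetical z-score cutoff

values = df.iloc[:, -1]
# Baseline statistics come from the previous `window` points only, so a spike
# cannot inflate the mean and standard deviation it is compared against.
baseline = values.shift(1).rolling(window)
z_score = (values - baseline.mean()) / baseline.std()

outliers = df[z_score.abs() > threshold]
print(outliers)
```

With this data only the row for 2024-01-08 (value 30) is flagged. The first five rows have no complete baseline window and are therefore never flagged.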

#### **Visualize Moving Average Outliers**
Show outliers using the Moving Average and Z-score algorithm.

- **Method**: `build_outliers_plot_moving_average(df)`
- **Output**: Outlier plot generated using Moving Average method.
  
---

### **4. IQR-Based Anomaly Detection**

#### **Detect Outliers Using Interquartile Range (IQR)**
For datasets spanning less than two years, the IQR method is employed.

- **Method**: `find_outliers_iqr(df)`
- **Output**: Rows flagged as outliers based on IQR.
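
The IQR rule itself can be sketched in a few lines of pandas. The 1.5 multiplier is Tukey's conventional fence and is assumed here; the data and column names are invented, and the multiplier PyCatcher uses internally is not stated in this document.

```python
import pandas as pd

# Hypothetical monthly totals covering less than two years.
df = pd.DataFrame(
    {
        "month": ["2024-01", "2024-02", "2024-03", "2024-04",
                  "2024-05", "2024-06", "2024-07", "2024-08"],
        "total": [52, 48, 50, 51, 49, 95, 53, 47],
    }
)

values = df.iloc[:, -1]  # the last column holds the numeric measure
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1

# Tukey fences: anything beyond 1.5 * IQR from the quartiles is flagged.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = df[(values < lower) | (values > upper)]
print(outliers)
```

Here the quartiles are 48.75 and 52.25, so the fences sit at 43.5 and 57.5 and only the 2024-06 row (95) falls outside them.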

#### **Visualize IQR Plot**
Build an IQR plot for a given dataframe (for less than 2 years of data).

- **Method**: `build_iqr_plot(df)`
- **Output**: IQR plot for the time-series data.

<hr style="border:1.25px solid gray">

### Example Usage

To see an example of how to use the `pycatcher` package for outlier detection in time-series data, check out the [Example Notebook](https://github.com/aseemanand/pycatcher/blob/main/notebooks/Example%20Notebook.ipynb).

The notebook provides step-by-step guidance and demonstrates the key features of the library.



            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/aseemanand/pycatcher/",
    "name": "pycatcher",
    "maintainer": "Jagadish Pamarthi",
    "docs_url": null,
    "requires_python": "<4.0,>=3.9",
    "maintainer_email": "jagadish.vrsec@gmail.com",
    "keywords": "outlier-detection, python, timeseries",
    "author": "Aseem Anand",
    "author_email": "aseemanand@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/eb/c5/a88f29a2a0d2813bff52ccf7f07c106b39e873cb8cbdb1b20a9a35b519e0/pycatcher-0.0.67.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "This package identifies outlier(s) for a given time-series dataset in simple steps. It supports day, week, month and quarter level time-series data.",
    "version": "0.0.67",
    "project_urls": {
        "Homepage": "https://github.com/aseemanand/pycatcher/",
        "Repository": "https://github.com/aseemanand/pycatcher/"
    },
    "split_keywords": [
        "outlier-detection",
        " python",
        " timeseries"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "64a9204ba9f5bafe6a225a4e64865219f8f9e70ce1c19a2be06d0ad0f41090f2",
                "md5": "a79de4a39d8d23670763f518e2104217",
                "sha256": "f27a9cb18543bf489e8eff76bc24e0126f719a8dfc195d1e82983e791a68df4c"
            },
            "downloads": -1,
            "filename": "pycatcher-0.0.67-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a79de4a39d8d23670763f518e2104217",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.9",
            "size": 32436,
            "upload_time": "2025-01-20T21:43:56",
            "upload_time_iso_8601": "2025-01-20T21:43:56.355325Z",
            "url": "https://files.pythonhosted.org/packages/64/a9/204ba9f5bafe6a225a4e64865219f8f9e70ce1c19a2be06d0ad0f41090f2/pycatcher-0.0.67-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ebc5a88f29a2a0d2813bff52ccf7f07c106b39e873cb8cbdb1b20a9a35b519e0",
                "md5": "db26e6302ea67dc30be347e7fdc88ca8",
                "sha256": "7435974c00bdd9e648de7ea3468bd0c64234c71cd7010f9e09918dc377a01cb7"
            },
            "downloads": -1,
            "filename": "pycatcher-0.0.67.tar.gz",
            "has_sig": false,
            "md5_digest": "db26e6302ea67dc30be347e7fdc88ca8",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.9",
            "size": 30435,
            "upload_time": "2025-01-20T21:43:58",
            "upload_time_iso_8601": "2025-01-20T21:43:58.297604Z",
            "url": "https://files.pythonhosted.org/packages/eb/c5/a88f29a2a0d2813bff52ccf7f07c106b39e873cb8cbdb1b20a9a35b519e0/pycatcher-0.0.67.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-20 21:43:58",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "aseemanand",
    "github_project": "pycatcher",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "lcname": "pycatcher"
}
        