au


| Field | Value |
| --- | --- |
| Name | au |
| Version | 0.0.7 |
| home_page | https://github.com/thorwhalen/uu/tree/master/au |
| Summary | Filtering outliers |
| upload_time | 2024-01-19 20:07:08 |
| maintainer | |
| docs_url | None |
| author | Thor Whalen |
| requires_python | |
| license | apache-2.0 |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
            
# au: Outlier Detection Toolkit

Filtering outliers to find the golden nuggets that stand out from the rest.

To install: `pip install au`

Outlier detection is a fundamental step in data analysis, particularly relevant in statistics, data mining, and machine learning. This toolkit provides a set of functions and classes in Python for identifying outliers - observations in data that are significantly different from the majority. The toolkit is designed to accommodate various methodologies, ranging from statistical methods to machine learning-based approaches.

## Features

1. **Z-Score Based Outlier Detection**
   - Detects outliers by measuring how many standard deviations a value lies from the mean.
   - Suitable for datasets whose distribution is expected to be approximately Gaussian.

2. **Interquartile Range (IQR) Based Outlier Detection**
   - Uses the IQR, which is the difference between the 75th and 25th percentiles of the data.
   - Effective for skewed distributions, because quartiles are less sensitive to extreme values than the mean and standard deviation.

3. **Isolation Forest Based Outlier Detection**
   - Implements the Isolation Forest algorithm, a machine learning method for anomaly detection.
   - Ideal for high-dimensional datasets.
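
The two statistical rules above can be sketched in a few lines. This is a minimal illustration using only the standard library, not the toolkit's own implementation: the function names `zscore_outliers` and `iqr_outliers`, the default `threshold=2.0`, and the multiplier `k=1.5` (Tukey's fences) are assumptions chosen for the example.

```python
import statistics


def zscore_outliers(values, threshold=2.0):
    """Return values whose absolute z-score exceeds `threshold` (hypothetical helper)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical: nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    # "inclusive" interpolates between the sample's own min and max.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


data = [10, 12, 12, 13, 12, 11, 40]
print(zscore_outliers(data))  # [40]
print(iqr_outliers(data))     # [40]
```

Note that in a sample this small the outlier inflates the standard deviation so much that its own z-score is only about 2.4, so the commonly used threshold of 3 would flag nothing here. The IQR rule does not have this problem, since the quartiles barely move when one extreme value is added.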

## Installation

Ensure that you have Python installed on your system. This toolkit requires `numpy` and `scikit-learn`, which can be installed via pip:

```
pip install numpy scikit-learn
```

## Usage

1. **Z-Score Based Outlier Detection**

   ```python
   from outlier_detection import detect_outliers_zscore

   outliers = detect_outliers_zscore([10, 12, 12, 13, 12, 11, 40])
   ```

2. **Interquartile Range (IQR) Based Outlier Detection**

   ```python
   from outlier_detection import detect_outliers_iqr

   outliers = detect_outliers_iqr([10, 12, 12, 13, 12, 11, 40])
   ```

3. **Isolation Forest Based Outlier Detection**

   ```python
   from outlier_detection import IsolationForestOutlierDetector

   detector = IsolationForestOutlierDetector()
   outliers = detector.detect_outliers([10, 12, 12, 13, 12, 11, 40])
   ```
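
To give a feel for what an isolation forest computes, here is a simplified from-scratch sketch for one-dimensional data. It is not the toolkit's `IsolationForestOutlierDetector` and not scikit-learn's implementation: the names `isolation_scores`, `_build_tree`, `_path_length`, and `_avg_path_length` are hypothetical, and unlike the original algorithm each tree is built on the full dataset rather than a random subsample. The idea it illustrates is that points separated from the rest by few random splits receive scores closer to 1.

```python
import math
import random


def _avg_path_length(n):
    """Normalizing term c(n): average path length of an unsuccessful BST search."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def _build_tree(sample, rng, depth, max_depth):
    """Recursively split `sample` at random points; a leaf stores its size."""
    if len(sample) <= 1 or depth >= max_depth or min(sample) == max(sample):
        return len(sample)
    split = rng.uniform(min(sample), max(sample))
    left = [v for v in sample if v < split]
    right = [v for v in sample if v >= split]
    return (split,
            _build_tree(left, rng, depth + 1, max_depth),
            _build_tree(right, rng, depth + 1, max_depth))


def _path_length(x, tree, depth=0):
    """Number of splits needed to reach x's leaf, adjusted for leaf size."""
    if isinstance(tree, int):  # leaf
        return depth + _avg_path_length(tree)
    split, left, right = tree
    return _path_length(x, left if x < split else right, depth + 1)


def isolation_scores(values, n_trees=100, seed=0):
    """Return one anomaly score in (0, 1] per value; higher means more isolated."""
    n = len(values)
    if n < 2:
        return [0.0] * n  # nothing to compare against
    rng = random.Random(seed)
    max_depth = math.ceil(math.log2(n))
    trees = [_build_tree(list(values), rng, 0, max_depth) for _ in range(n_trees)]
    norm = _avg_path_length(n)
    scores = []
    for x in values:
        mean_depth = sum(_path_length(x, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores


data = [10, 12, 12, 13, 12, 11, 40]
scores = isolation_scores(data)
print(max(zip(scores, data))[1])  # value with the highest anomaly score
```

On this data most random splits fall between 13 and 40, so 40 is usually isolated after a single split and ends up with the highest score. For real work, a maintained implementation such as scikit-learn's `IsolationForest` is the safer choice; it expects a 2-D array, so a flat list of values has to be reshaped into a single column first.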

## Documentation

Each function and class in this toolkit has a detailed docstring that describes its purpose, parameters, and return values, and includes examples.


## Contributing

Contributions to this project are welcome! Please fork the repository and submit a pull request with your changes.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/thorwhalen/uu/tree/master/au",
    "name": "au",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "",
    "author": "Thor Whalen",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/77/38/c518dbafb6f9736a1337cf030e142b0cc00af02f1f6939a6e1a6f77975bd/au-0.0.7.tar.gz",
    "platform": "any",
    "bugtrack_url": null,
    "license": "apache-2.0",
    "summary": "Filtering outliers",
    "version": "0.0.7",
    "project_urls": {
        "Homepage": "https://github.com/thorwhalen/uu/tree/master/au"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "635b9669bf754d17e6545946c11eee12b9c6eb1d614d2c8eefa27f96749df161",
                "md5": "9c7da6231df76d6be8e530eb66e11a42",
                "sha256": "9c6a14e8702206c16add0ca72465c66922d2bed6cb041ef4c6640237b6cf8fb1"
            },
            "downloads": -1,
            "filename": "au-0.0.7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9c7da6231df76d6be8e530eb66e11a42",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 7985,
            "upload_time": "2024-01-19T20:07:07",
            "upload_time_iso_8601": "2024-01-19T20:07:07.019126Z",
            "url": "https://files.pythonhosted.org/packages/63/5b/9669bf754d17e6545946c11eee12b9c6eb1d614d2c8eefa27f96749df161/au-0.0.7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7738c518dbafb6f9736a1337cf030e142b0cc00af02f1f6939a6e1a6f77975bd",
                "md5": "df94e476e6dfa91426276125a9fc8482",
                "sha256": "bca38d5ca7bdb687fb5d97646bf7d3c5504ac1b3e960c2b73a84c6b3b960a3af"
            },
            "downloads": -1,
            "filename": "au-0.0.7.tar.gz",
            "has_sig": false,
            "md5_digest": "df94e476e6dfa91426276125a9fc8482",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 7598,
            "upload_time": "2024-01-19T20:07:08",
            "upload_time_iso_8601": "2024-01-19T20:07:08.531131Z",
            "url": "https://files.pythonhosted.org/packages/77/38/c518dbafb6f9736a1337cf030e142b0cc00af02f1f6939a6e1a6f77975bd/au-0.0.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-19 20:07:08",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "thorwhalen",
    "github_project": "uu",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "au"
}
        