=============
outlier-utils
=============

Utility library for detecting and removing outliers from normally distributed datasets using the Smirnov-Grubbs_ test.

Requirements
------------

- Python_ (version 3.8 or later)
- SciPy_
- NumPy_

Overview
--------

Both the two-sided and the one-sided versions of the test are supported. The two-sided test detects outliers at both ends of the dataset, whereas a one-sided test considers only the minimum or only the maximum value. The test is applied repeatedly: each detected outlier is removed and the test is rerun until no further outlier is found. By default, the outlier-free data is returned, but the test can also return the outliers themselves or their indices in the original dataset.

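
The repeated removal described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: ``grubbs_statistic``, ``remove_outliers`` and ``CRITICAL_TWO_SIDED_05`` are hypothetical names, and the table hard-codes the two-sided critical values for alpha = 0.05 and sample sizes 3 to 5 instead of deriving them from the t-distribution.

```python
import statistics

# Two-sided Grubbs critical values at alpha = 0.05, keyed by sample size n.
# Assumption: hard-coded for illustration. They follow from
#   G_crit = (n - 1) / sqrt(n) * sqrt(t**2 / (n - 2 + t**2)),
# where t is the upper alpha / (2 * n) quantile of Student's t with n - 2
# degrees of freedom.
CRITICAL_TWO_SIDED_05 = {3: 1.1543, 4: 1.4812, 5: 1.7150}


def grubbs_statistic(values):
    """Return (G, index) for the point farthest from the sample mean."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    index = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[index] - mean) / std, index


def remove_outliers(values, critical_values):
    """Drop the most extreme point while G exceeds the critical value."""
    remaining = list(values)
    # Stop as soon as the sample size is not covered by the table.
    while len(remaining) in critical_values:
        if statistics.stdev(remaining) == 0:
            break  # all values identical, nothing can be an outlier
        g, index = grubbs_statistic(remaining)
        if g <= critical_values[len(remaining)]:
            break  # most extreme point is not significant
        del remaining[index]
    return remaining


print(remove_outliers([1, 8, 9, 10, 9], CRITICAL_TWO_SIDED_05))
# prints [8, 9, 10, 9]
```

On the first pass the value 1 gives G of about 1.755, which exceeds 1.7150 for n = 5, so it is removed. On the second pass G is about 1.225, below 1.4812 for n = 4, so the loop stops.
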
Examples
--------

- Two-sided Grubbs test with a Pandas series input

::

    >>> from outliers import smirnov_grubbs as grubbs
    >>> import pandas as pd
    >>> data = pd.Series([1, 8, 9, 10, 9])
    >>> grubbs.test(data, alpha=0.05)
    1     8
    2     9
    3    10
    4     9
    dtype: int64

- Two-sided Grubbs test with a NumPy array input

::

    >>> import numpy as np
    >>> data = np.array([1, 8, 9, 10, 9])
    >>> grubbs.test(data, alpha=0.05)
    array([ 8,  9, 10,  9])

- One-sided (min) test returning outlier indices

::

    >>> grubbs.min_test_indices([8, 9, 10, 1, 9], alpha=0.05)
    [3]

- One-sided (max) tests returning outliers

::

    >>> grubbs.max_test_outliers([8, 9, 10, 1, 9], alpha=0.05)
    []
    >>> grubbs.max_test_outliers([8, 9, 10, 50, 9], alpha=0.05)
    [50]
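
To show how a one-sided test can report positions in the original dataset, here is a small standalone sketch. It is an assumption-laden illustration and not the library's code: ``one_sided_outlier_indices`` and ``CRITICAL_ONE_SIDED_05`` are hypothetical names, and the table hard-codes one-sided critical values for alpha = 0.05 and sample sizes 3 to 5.

```python
import statistics

# One-sided Grubbs critical values at alpha = 0.05, keyed by sample size n.
# Assumption: hard-coded for illustration; same formula as the two-sided
# case but with the upper alpha / n quantile of Student's t.
CRITICAL_ONE_SIDED_05 = {3: 1.1531, 4: 1.4625, 5: 1.6714}


def one_sided_outlier_indices(values, critical_values, side="max"):
    """Return the original indices of points flagged by a one-sided test."""
    remaining = list(enumerate(values))  # (original index, value) pairs
    flagged = []
    pick = max if side == "max" else min
    while len(remaining) in critical_values:
        data = [value for _, value in remaining]
        std = statistics.stdev(data)
        if std == 0:
            break  # all values identical, nothing can be an outlier
        position = data.index(pick(data))  # only one end is examined
        g = abs(data[position] - statistics.mean(data)) / std
        if g <= critical_values[len(data)]:
            break
        # Record where the point sat in the input, then remove it.
        flagged.append(remaining.pop(position)[0])
    return flagged


print(one_sided_outlier_indices([8, 9, 10, 1, 9], CRITICAL_ONE_SIDED_05, side="min"))
# prints [3]
print(one_sided_outlier_indices([8, 9, 10, 50, 9], CRITICAL_ONE_SIDED_05, side="max"))
# prints [3]
```

Carrying the original index alongside each value keeps the reported positions valid even after earlier outliers have been removed.
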

.. _Smirnov-Grubbs: https://en.wikipedia.org/wiki/Grubbs%27_test_for_outliers
.. _SciPy: https://www.scipy.org/
.. _NumPy: http://www.numpy.org/
.. _Python: https://www.python.org/

License
-------

This software is licensed under the MIT License.

Raw data
{
"_id": null,
"home_page": "",
"name": "outlier-utils",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "",
"author_email": "Masashi Shibata <contact@c-bata.link>",
"download_url": "https://files.pythonhosted.org/packages/29/3a/73493f0d4ee662798b27b0287d4372d99d3339ba5c3801caa14d5bf4d26d/outlier_utils-0.0.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "Utility library for detecting and removing outliers from normally distributed datasets",
"version": "0.0.5",
"project_urls": {
"Bug Tracker": "https://github.com/c-bata/outlier-utils/issues",
"Homepage": "https://github.com/c-bata/outlier-utils",
"Sources": "https://github.com/c-bata/outlier-utils"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5702281e0d898e50138b4275d8f2840d5b6bd41b276cb697dd56fd56ac91262c",
"md5": "fc28198aec5a8d9fd722bbd896e3c725",
"sha256": "2e16148a3fa7b2e16ad0a3b75d8c8920828b5cc11568795782d597d4cfb0b194"
},
"downloads": -1,
"filename": "outlier_utils-0.0.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "fc28198aec5a8d9fd722bbd896e3c725",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 5143,
"upload_time": "2023-08-31T16:10:25",
"upload_time_iso_8601": "2023-08-31T16:10:25.767920Z",
"url": "https://files.pythonhosted.org/packages/57/02/281e0d898e50138b4275d8f2840d5b6bd41b276cb697dd56fd56ac91262c/outlier_utils-0.0.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "293a73493f0d4ee662798b27b0287d4372d99d3339ba5c3801caa14d5bf4d26d",
"md5": "7018000d4a64e8ea0b96a0a0d45e130b",
"sha256": "16e46fa6f7b01fe5518ea73fc15d3de0e30091750c428760bbe7dde2c9590579"
},
"downloads": -1,
"filename": "outlier_utils-0.0.5.tar.gz",
"has_sig": false,
"md5_digest": "7018000d4a64e8ea0b96a0a0d45e130b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 6201,
"upload_time": "2023-08-31T16:10:26",
"upload_time_iso_8601": "2023-08-31T16:10:26.679516Z",
"url": "https://files.pythonhosted.org/packages/29/3a/73493f0d4ee662798b27b0287d4372d99d3339ba5c3801caa14d5bf4d26d/outlier_utils-0.0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-08-31 16:10:26",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "c-bata",
"github_project": "outlier-utils",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "numpy",
"specs": []
},
{
"name": "scipy",
"specs": []
},
{
"name": "flake8",
"specs": []
},
{
"name": "black",
"specs": []
},
{
"name": "isort",
"specs": []
},
{
"name": "pytest",
"specs": []
},
{
"name": "pandas",
"specs": []
}
],
"lcname": "outlier-utils"
}