# OutlierIdentifiers

- Version: 0.1.1 (PyPI)
- Summary: Outlier identifiers functions package.
- Author: Anton Antonov
- Home page: https://github.com/antononcube/Python-packages/tree/main/OutlierIdentifiers
- Upload time: 2024-08-21 03:44:29
- Requires Python: >=3.7
- Keywords: outlier identifiers, outlier detection, outlier, outliers, detection, ml, machine learning

## In brief

This is a Python package for 1D outlier identifier functions.
It closely follows the Wolfram Language (WL) paclet [AAp1], the R package [AAp2], and the Raku package [AAp3].

------

## Installation 

From PyPI.org:

```shell
python3 -m pip install OutlierIdentifiers
```

From GitHub:

```shell
python3 -m pip install "git+https://github.com/antononcube/Python-packages.git#egg=OutlierIdentifiers&subdirectory=OutlierIdentifiers"
```

------

## Usage examples

Load packages:


```python
import numpy as np
import plotly.graph_objects as go

from OutlierIdentifiers import *
```

Generate a vector with random numbers:


```python
np.random.seed(14)
vec = np.random.normal(loc=10, scale=20, size=30)
print(vec)
```

    [ 41.02678223  11.58372049  13.47953057   8.55326868 -30.086588
      12.89355626 -20.02337245  14.22218902  -1.16410111  31.6905813
       6.27421752  10.2932275  -11.51138939  22.84504148   6.39326577
      22.40600507  26.21948669  25.55871733   5.25020644 -27.83824691
     -13.44243588  26.72413943  30.18546801  35.86198722  -0.98662331
      -9.6342573   28.29345516  27.46140757  10.44222283   9.91712833]


Plot the vector:


```python
# Create a scatter plot with markers
fig = go.Figure(data=go.Scatter(y=vec, mode='markers'))

# Add labels and title
fig.update_layout(
    title='Vector of Numbers',
    xaxis_title='Index',
    yaxis_title='Value',
    template='plotly_dark',
)

# Display the plot
fig.show()
```



Find outlier positions:


```python
outlier_identifier(vec, identifier=hampel_identifier_parameters)
```




    array([ True, False, False, False,  True, False,  True, False, False,
            True, False, False,  True, False, False, False, False, False,
           False,  True,  True, False, False,  True, False,  True, False,
           False, False, False])



Find outlier values:


```python
outlier_identifier(vec, identifier=hampel_identifier_parameters, value=True)
```




    array([ 41.02678223, -30.086588  , -20.02337245,  31.6905813 ,
           -11.51138939, -27.83824691, -13.44243588,  35.86198722,
            -9.6342573 ])
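
For intuition, the Hampel rule is commonly stated as: flag a point when it lies farther from the median than a scaled median absolute deviation (MAD). The sketch below illustrates that rule with the standard library only; it is not the package's source. The helper names `hampel_thresholds` and `outside` are hypothetical, and the scale constant 1.4826 is an assumption, although with it the rule reproduces the Hampel thresholds printed for `vec` in the last example of this section.

```python
from statistics import median

def hampel_thresholds(values, scale=1.4826):
    """Return (lower, upper) = median -/+ scale * MAD (assumed constant)."""
    center = median(values)
    # Median absolute deviation from the median
    mad = median(abs(x - center) for x in values)
    return center - scale * mad, center + scale * mad

def outside(values, thresholds):
    """Boolean mask: True where a value falls outside the thresholds."""
    lower, upper = thresholds
    return [x < lower or x > upper for x in values]

data = [1, 2, 3, 4, 100]
thresholds = hampel_thresholds(data)
mask = outside(data, thresholds)
flagged = [x for x, is_outlier in zip(data, mask) if is_outlier]

print(thresholds)
print(mask)
print(flagged)
```

Because both the center and the spread are medians, a single extreme value such as `100` barely moves the thresholds, which is why this rule can be fairly strict about points near the edges of the bulk of the data.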



Find *top* outlier positions and values:


```python
outlier_identifier(vec, identifier=lambda v: top_outliers(hampel_identifier_parameters(v)))
```




    array([ True, False, False, False, False, False, False, False, False,
            True, False, False, False, False, False, False, False, False,
           False, False, False, False, False,  True, False, False, False,
           False, False, False])




```python
outlier_identifier(vec, identifier=lambda v: top_outliers(hampel_identifier_parameters(v)), value=True)
```




    array([41.02678223, 31.6905813 , 35.86198722])
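
The composition in the calls above suggests that `top_outliers` takes a `(lower, upper)` threshold pair and returns a pair that flags only the high side. The sketch below shows one way such a transformation could work, assuming the unused side is disabled by replacing it with an infinite threshold. The names `keep_top`, `keep_bottom`, and `flag` are hypothetical and not part of the package, and the threshold values are made up for illustration.

```python
import math

def keep_top(thresholds):
    """Disable the lower threshold so only large values are flagged."""
    _, upper = thresholds
    return (-math.inf, upper)

def keep_bottom(thresholds):
    """Disable the upper threshold so only small values are flagged."""
    lower, _ = thresholds
    return (lower, math.inf)

def flag(values, thresholds):
    """Boolean mask: True where a value falls outside the thresholds."""
    lower, upper = thresholds
    return [x < lower or x > upper for x in values]

data = [-30.0, 5.0, 12.0, 41.0]
both_sides = (-8.8, 30.8)  # illustrative two-sided thresholds

top_mask = flag(data, keep_top(both_sides))
bottom_mask = flag(data, keep_bottom(both_sides))

print(top_mask)
print(bottom_mask)
```

Keeping the one-sided variants as transformations of the threshold pair means any parameter function can be combined with either side without writing a separate identifier for each combination.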



Find *bottom* outlier positions and values (using the quartile-based identifier):


```python
outlier_identifier(vec, identifier=lambda v: bottom_outliers(quartile_identifier_parameters(v)))
```




    array([False, False, False, False,  True, False,  True, False, False,
           False, False, False, False, False, False, False, False, False,
           False,  True, False, False, False, False, False, False, False,
           False, False, False])




```python
outlier_identifier(vec, identifier=lambda v: bottom_outliers(quartile_identifier_parameters(v)), value=True)
```




    array([-30.086588  , -20.02337245, -27.83824691])


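Another way to get the same outlier values is to compute the Boolean mask first and then use it to index the vector:


```python
pred = outlier_identifier(vec, identifier=lambda v: bottom_outliers(quartile_identifier_parameters(v)))
vec[pred]
```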



    array([-30.086588  , -20.02337245, -27.83824691])



The available outlier parameter functions are listed below. Each takes a vector and returns a pair of (lower, upper) thresholds:

- `hampel_identifier_parameters`
- `splus_quartile_identifier_parameters`
- `quartile_identifier_parameters`


```python
[f(vec) for f in (hampel_identifier_parameters, splus_quartile_identifier_parameters, quartile_identifier_parameters)]
```




    [(-8.796653643076334, 30.822596969354976),
     (-37.649981209714, 64.27685968784428),
     (-14.46873856125025, 36.49468188752889)]
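
The last two pairs above are consistent with two familiar quartile rules: the S-Plus style fences `Q1 - 1.5 * IQR` and `Q3 + 1.5 * IQR`, and a band of one interquartile range around the median. The sketch below is inferred from the printed thresholds rather than taken from the package source, so treat the formulas, the linear-interpolation quantile method, and the function names `quartile_thresholds` and `splus_quartile_thresholds` as assumptions. It uses `statistics.quantiles`, which requires Python 3.8 or later.

```python
from statistics import quantiles

def quartile_thresholds(values):
    """Assumed rule: median -/+ one interquartile range."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q2 - iqr, q2 + iqr

def splus_quartile_thresholds(values, k=1.5):
    """Assumed rule: fences k * IQR below Q1 and above Q3."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
quartile_pair = quartile_thresholds(data)
splus_pair = splus_quartile_thresholds(data)

print(quartile_pair)
print(splus_pair)
```

The S-Plus style fences are the widest of the rules shown here, so they flag the fewest points; the median-centered band is narrower and flags more.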



------

## References 

[AA1] Anton Antonov,
["Outlier detection in a list of numbers"](https://mathematicaforprediction.wordpress.com/2013/10/16/outlier-detection-in-a-list-of-numbers/),
(2013),
[MathematicaForPrediction at WordPress](https://mathematicaforprediction.wordpress.com).

[AAp1] Anton Antonov,
[OutlierIdentifiers WL paclet](https://resources.wolframcloud.com/PacletRepository/resources/AntonAntonov/OutlierIdentifiers/),
(2023),
[Wolfram Language Paclet Repository](https://resources.wolframcloud.com/PacletRepository/).

[AAp2] Anton Antonov,
[OutlierIdentifiers R package](https://github.com/antononcube/R-packages/tree/master/OutlierIdentifiers),
(2019),
[R-packages at GitHub/antononcube](https://github.com/antononcube/R-packages).

[AAp3] Anton Antonov,
[OutlierIdentifiers Raku package](https://github.com/antononcube/Raku-Statistics-OutlierIdentifiers),
(2022),
[GitHub/antononcube](https://github.com/antononcube/).

            
