pyloras


- Name: pyloras
- Version: 0.1.0b6
- Summary: Experimental implementations of several (over/under)-sampling techniques not yet available in the imbalanced-learn library.
- Upload time: 2023-04-29 21:49:14
- Requires Python: >=3.8
- License: BSD 3-Clause License
- Keywords: loras, imbalanced datasets, oversampling, machine learning, localized affine random shadowsampling

# LoRAS

[![CI][3]](https://github.com/zoj613/pyloras/actions/workflows/build-and-test.yml)
[![Codecov][4]](https://codecov.io/gh/zoj613/pyloras/)
[![PyPI][5]](https://pypi.org/project/pyloras/#history)

Localized Random Affine Shadowsampling

This repo provides a Python implementation of an imbalanced dataset oversampling
technique known as Localized Random Affine Shadowsampling (LoRAS). It also provides
implementations of several other over/under-sampling algorithms not yet available in
the ``imbalanced-learn`` package. These implementations build on ``imbalanced-learn``
and thus aim to be as compatible with it as possible.
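
To give a feel for what "random affine shadowsampling" means, here is a minimal NumPy sketch of the core idea as described in the LoRAS paper listed under References: noisy copies ("shadowsamples") of the points in a minority-class neighborhood are drawn, and new points are formed as random convex combinations of a few of them. This is not the `pyloras` implementation. The function name `loras_like_samples`, its defaults, and the single noise scale `std` are assumptions made for illustration, and the neighborhood search on a 2D manifold embedding is skipped entirely.

```python
import numpy as np


def loras_like_samples(neighborhood, n_shadow=40, n_affine=3, n_new=5,
                       std=0.005, seed=0):
    """Draw synthetic points from one minority-class neighborhood.

    Illustrative only: the name, the defaults and the use of one noise
    scale for every feature are assumptions of this sketch.
    """
    rng = np.random.default_rng(seed)
    neighborhood = np.asarray(neighborhood, dtype=float)
    n_points, n_features = neighborhood.shape

    # 1. Shadowsamples: noisy copies of each point in the neighborhood.
    noise = rng.normal(0.0, std, size=(n_points, n_shadow, n_features))
    shadows = (neighborhood[:, None, :] + noise).reshape(-1, n_features)

    # 2. Random convex combinations of a few randomly chosen shadowsamples.
    new_points = np.empty((n_new, n_features))
    for i in range(n_new):
        idx = rng.choice(len(shadows), size=n_affine, replace=False)
        weights = rng.dirichlet(np.ones(n_affine))  # non-negative, sums to 1
        new_points[i] = weights @ shadows[idx]
    return new_points, shadows


# Three hypothetical minority points that are assumed to be mutual neighbors.
neighborhood = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
samples, shadows = loras_like_samples(neighborhood)
print(samples.shape)  # (5, 2)
```

Because the weights are non-negative and sum to one, every synthetic point lies inside the convex hull of the shadowsamples it was built from.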


## Dependencies
- `Python >= 3.8`
- `numpy >= 1.17.3`
- `imbalanced-learn < 1.0.0`


## Installation

Using `pip`:
```shell
$ pip install -U pyloras
```

Alternatively, one can install from source with the following shell commands:
```shell
$ git clone https://github.com/zoj613/pyloras.git
$ cd pyloras/
$ pip install .
```

## Usage

```python
from collections import Counter
from pyloras import LORAS
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=20000, n_features=5, n_informative=5,
                           n_redundant=0, n_repeated=0, n_classes=3,
                           n_clusters_per_class=1,
                           weights=[0.01, 0.05, 0.94],
                           class_sep=0.8, random_state=0)

lrs = LORAS(random_state=0, manifold_learner_params={'perplexity': 35, 'n_iter': 250})
print(sorted(Counter(y).items()))
# [(0, 270), (1, 1056), (2, 18674)]
X_resampled, y_resampled = lrs.fit_resample(X, y)
print(sorted(Counter(y_resampled.astype(int)).items()))
# [(0, 18674), (1, 18674), (2, 18674)]

# one can also use any custom 2d manifold learner via the ``manifold_learner`` parameter
from umap import UMAP  # provided by the optional `umap-learn` package
LORAS(manifold_learner=UMAP()).fit_resample(X, y)

```
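
The printed counts in the usage example show every minority class being brought up to the size of the majority class (18674). As a small worked check, the snippet below computes how many synthetic samples that implies per class. The helper `samples_needed` is hypothetical and only does the arithmetic. It assumes a balance-to-majority strategy, which is what the output above suggests, and it does not call `pyloras`.

```python
from collections import Counter


def samples_needed(y):
    """Map each minority label to the number of samples it lacks
    relative to the largest class."""
    counts = Counter(y)
    target = max(counts.values())
    return {label: target - n
            for label, n in sorted(counts.items()) if n < target}


# Class counts copied from the usage example above.
y = [0] * 270 + [1] * 1056 + [2] * 18674
print(samples_needed(y))
# {0: 18404, 1: 17618}
```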

## Visualization

Below is a comparison of `imbalanced-learn`'s `SMOTE` implementation with `LORAS`
on the dummy data used in [this doc page][2], using the default parameters.

![](./scripts/img/resampled_data.svg)
![](./scripts/img/decision_fn.svg)
![](./scripts/img/particularities.svg)

The plots can be reproduced by running:
```shell
$ python scripts/compare_oversamplers.py --n_neighbors=<optional> --n_shadow=<optional> --n_affine=<optional>
```

## References
- Bej, S., Davtyan, N., Wolfien, M. et al. LoRAS: an oversampling approach for imbalanced datasets. Mach Learn 110, 279–301 (2021). https://doi.org/10.1007/s10994-020-05913-4
- Bej, S., Schultz, K., Srivastava, P., Wolfien, M., & Wolkenhauer, O. (2021). A multi-schematic classifier-independent oversampling approach for imbalanced datasets. ArXiv, abs/2107.07349.
- A. Tripathi, R. Chakraborty and S. K. Kopparapu, "A Novel Adaptive Minority Oversampling Technique for Improved Classification in Data Imbalanced Scenarios," 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 10650-10657, doi: 10.1109/ICPR48806.2021.9413002.


[1]: https://python-poetry.org/docs/pyproject/
[2]: https://imbalanced-learn.org/stable/auto_examples/over-sampling/plot_comparison_over_sampling.html#more-advanced-over-sampling-using-adasyn-and-smote
[3]: https://img.shields.io/github/workflow/status/zoj613/pyloras/CI/main?style=flat-square
[4]: https://img.shields.io/codecov/c/github/zoj613/pyloras?style=flat-square
[5]: https://img.shields.io/github/v/release/zoj613/pyloras?include_prereleases&style=flat-square

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "pyloras",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "loras,imbalanced datasets,oversampling,machine learning,localized affine random shadowsampling",
    "author": "",
    "author_email": "Zolisa Bleki <zolisa.bleki@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/de/8b/2b832673f21a1cb873eb3825dc135df52ada12b6d0fd1057599653b0188b/pyloras-0.1.0b6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "BSD 3-Clause License",
    "summary": "Experimental implementations of several (over/under)-sampling techniques not yet available in the imbalanced-learn library.",
    "version": "0.1.0b6",
    "project_urls": {
        "source": "https://github.com/zoj613/pyloras",
        "tracker": "https://github.com/zoj613/pyloras/issues"
    },
    "split_keywords": [
        "loras",
        "imbalanced datasets",
        "oversampling",
        "machine learning",
        "localized affine random shadowsampling"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ff7f90020e140ddb5b5d29e6c42d0b8810c53f8a0450ed03ddd8db0345a1a6c9",
                "md5": "853a99809e6ea9fc864af167eea4e201",
                "sha256": "c13ed504adab476617aff876cec82c36f62ff3801391512abca2b48c21599355"
            },
            "downloads": -1,
            "filename": "pyloras-0.1.0b6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "853a99809e6ea9fc864af167eea4e201",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 13953,
            "upload_time": "2023-04-29T21:49:13",
            "upload_time_iso_8601": "2023-04-29T21:49:13.745147Z",
            "url": "https://files.pythonhosted.org/packages/ff/7f/90020e140ddb5b5d29e6c42d0b8810c53f8a0450ed03ddd8db0345a1a6c9/pyloras-0.1.0b6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "de8b2b832673f21a1cb873eb3825dc135df52ada12b6d0fd1057599653b0188b",
                "md5": "c1b90002e2654ae0554be04484f78e23",
                "sha256": "1c7ee116de9abbf36310fb766ea4420e4fa3615c1c8f44e76c73b4a50b170bcf"
            },
            "downloads": -1,
            "filename": "pyloras-0.1.0b6.tar.gz",
            "has_sig": false,
            "md5_digest": "c1b90002e2654ae0554be04484f78e23",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 13244,
            "upload_time": "2023-04-29T21:49:14",
            "upload_time_iso_8601": "2023-04-29T21:49:14.970374Z",
            "url": "https://files.pythonhosted.org/packages/de/8b/2b832673f21a1cb873eb3825dc135df52ada12b6d0fd1057599653b0188b/pyloras-0.1.0b6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-29 21:49:14",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "zoj613",
    "github_project": "pyloras",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "pyloras"
}
        