[scikit-learn]: <http://scikit-learn.org/stable/>
[imbalanced-learn]: <http://imbalanced-learn.org/stable/>
[SOMO]: <https://www.sciencedirect.com/science/article/abs/pii/S0957417417302324>
[KMeans-SMOTE]: <https://www.sciencedirect.com/science/article/abs/pii/S0020025518304997>
[G-SOMO]: <https://www.sciencedirect.com/science/article/abs/pii/S095741742100662X>
[black badge]: <https://img.shields.io/badge/%20style-black-000000.svg>
[black]: <https://github.com/psf/black>
[docformatter badge]: <https://img.shields.io/badge/%20formatter-docformatter-fedcba.svg>
[docformatter]: <https://github.com/PyCQA/docformatter>
[ruff badge]: <https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v1.json>
[ruff]: <https://github.com/charliermarsh/ruff>
[mypy badge]: <http://www.mypy-lang.org/static/mypy_badge.svg>
[mypy]: <http://mypy-lang.org>
[mkdocs badge]: <https://img.shields.io/badge/docs-mkdocs%20material-blue.svg?style=flat>
[mkdocs]: <https://squidfunk.github.io/mkdocs-material>
[version badge]: <https://img.shields.io/pypi/v/imbalanced-learn-extra.svg>
[pythonversion badge]: <https://img.shields.io/pypi/pyversions/imbalanced-learn-extra.svg>
[downloads badge]: <https://img.shields.io/pypi/dd/imbalanced-learn-extra>
[gitter]: <https://gitter.im/imbalanced-learn-extra/community>
[gitter badge]: <https://badges.gitter.im/join%20chat.svg>
[discussions]: <https://github.com/georgedouzas/imbalanced-learn-extra/discussions>
[discussions badge]: <https://img.shields.io/github/discussions/georgedouzas/imbalanced-learn-extra>
[ci]: <https://github.com/georgedouzas/imbalanced-learn-extra/actions?query=workflow>
[ci badge]: <https://github.com/georgedouzas/imbalanced-learn-extra/actions/workflows/ci.yml/badge.svg?branch=main>
[doc]: <https://github.com/georgedouzas/imbalanced-learn-extra/actions?query=workflow>
[doc badge]: <https://github.com/georgedouzas/imbalanced-learn-extra/actions/workflows/doc.yml/badge.svg?branch=main>
# imbalanced-learn-extra
[![ci][ci badge]][ci] [![doc][doc badge]][doc]
| Category | Tools |
| ------------------| -------- |
| **Development** | [![black][black badge]][black] [![ruff][ruff badge]][ruff] [![mypy][mypy badge]][mypy] [![docformatter][docformatter badge]][docformatter] |
| **Package** | ![version][version badge] ![pythonversion][pythonversion badge] ![downloads][downloads badge] |
| **Documentation** | [![mkdocs][mkdocs badge]][mkdocs]|
| **Communication** | [![gitter][gitter badge]][gitter] [![discussions][discussions badge]][discussions] |
## Introduction
`imbalanced-learn-extra` is a Python package that extends [imbalanced-learn]. It implements algorithms that are not included in
[imbalanced-learn] because they are too recent or have too few citations. The current version includes the following:
- A general interface for clustering-based oversampling algorithms.
- The Geometric SMOTE algorithm, a geometrically enhanced drop-in replacement for SMOTE that handles numerical as well as
categorical features.
## Installation
For user installation, `imbalanced-learn-extra` is available on PyPI, and you can
install it via `pip`:
```bash
pip install imbalanced-learn-extra
```
Development installation requires cloning the repository and then using [PDM](https://github.com/pdm-project/pdm) to install the
project as well as the main and development dependencies:
```bash
git clone https://github.com/georgedouzas/imbalanced-learn-extra.git
cd imbalanced-learn-extra
pdm install
```
The SOM clusterer requires optional dependencies, which are installed via the `som` extra:
```bash
# Quoting keeps shells such as zsh from interpreting the square brackets.
pip install "imbalanced-learn-extra[som]"
```
## Usage
All the classes included in `imbalanced-learn-extra` follow the [imbalanced-learn] API and build on the functionality of the base
oversampler. Following the [scikit-learn] convention, the data are represented as follows:
- Input data `X`: 2D array-like or sparse matrices.
- Targets `y`: 1D array-like.
The oversamplers implement a `fit` method to learn from `X` and `y`:
```python
oversampler.fit(X, y)
```
They also implement a `fit_resample` method to resample `X` and `y`:
```python
X_resampled, y_resampled = oversampler.fit_resample(X, y)
```
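To make the `fit` / `fit_resample` contract concrete, here is a minimal sketch that uses only the Python standard library. `ToyRandomOverSampler` is a hypothetical stand-in written for illustration; it is not a class from `imbalanced-learn-extra` or [imbalanced-learn], and it simply duplicates minority samples rather than generating synthetic ones. The attribute name `sampling_strategy_` is likewise an assumption of this sketch.

```python
import random
from collections import Counter


class ToyRandomOverSampler:
    """Hypothetical stand-in, NOT a class from imbalanced-learn-extra.

    It only mirrors the fit / fit_resample contract by duplicating
    randomly chosen minority samples until all classes are balanced.
    """

    def __init__(self, random_state=0):
        self.random_state = random_state

    def fit(self, X, y):
        # Learn how many extra samples each class needs to match the largest class.
        counts = Counter(y)
        n_max = max(counts.values())
        self.sampling_strategy_ = {label: n_max - n for label, n in counts.items()}
        return self

    def fit_resample(self, X, y):
        self.fit(X, y)
        rng = random.Random(self.random_state)
        # The original samples are kept; new ones are appended after them.
        X_resampled, y_resampled = list(X), list(y)
        for label, n_missing in self.sampling_strategy_.items():
            # Candidate rows belonging to the class being oversampled.
            rows = [row for row, target in zip(X, y) if target == label]
            for _ in range(n_missing):
                X_resampled.append(list(rng.choice(rows)))
                y_resampled.append(label)
        return X_resampled, y_resampled


# X is 2D (samples x features), y is 1D (one target per sample).
X = [[0.0, 1.0], [0.2, 0.9], [0.1, 1.1], [0.3, 1.2], [5.0, 5.1], [5.2, 4.9]]
y = [0, 0, 0, 0, 1, 1]

oversampler = ToyRandomOverSampler(random_state=42)
X_resampled, y_resampled = oversampler.fit_resample(X, y)
print(Counter(y), Counter(y_resampled))
```

An oversampler from the package would be used the same way: construct it, then call `fit_resample(X, y)` to obtain the original samples plus the newly generated ones.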
## Citing `imbalanced-learn-extra`
Publications using clustering-based oversampling:
- [G. Douzas, F. Bacao, "Self-Organizing Map Oversampling (SOMO) for imbalanced data set learning", Expert Systems with
Applications, vol. 82, pp. 40-52, 2017.][SOMO]
- [G. Douzas, F. Bacao, F. Last, "Improving imbalanced learning through a heuristic oversampling method based on k-means and
SMOTE", Information Sciences, vol. 465, pp. 1-20, 2018.][KMeans-SMOTE]
- [G. Douzas, F. Bacao, F. Last, "G-SOMO: An oversampling approach based on self-organized maps and geometric SMOTE", Expert
Systems with Applications, vol. 183, 115230, 2021.][G-SOMO]
Publications using Geometric-SMOTE:
- Douzas, G., Bacao, F. (2019). Geometric SMOTE: a geometrically enhanced
drop-in replacement for SMOTE. Information Sciences, 501, 118-135.
<https://doi.org/10.1016/j.ins.2019.06.007>
- Fonseca, J., Douzas, G., Bacao, F. (2021). Increasing the Effectiveness of
Active Learning: Introducing Artificial Data Generation in Active Learning
for Land Use/Land Cover Classification. Remote Sensing, 13(13), 2619.
<https://doi.org/10.3390/rs13132619>
- Douzas, G., Bacao, F., Fonseca, J., Khudinyan, M. (2019). Imbalanced
Learning in Land Cover Classification: Improving Minority Classes’
Prediction Accuracy Using the Geometric SMOTE Algorithm. Remote Sensing,
11(24), 3040. <https://doi.org/10.3390/rs11243040>
Raw data
{
"_id": null,
"home_page": null,
"name": "imbalanced-learn-extra",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.13,>=3.10",
"maintainer_email": null,
"keywords": "machine learning, imbalanced learning, oversampling",
"author": null,
"author_email": "Georgios Douzas <gdouzas@icloud.com>",
"download_url": "https://files.pythonhosted.org/packages/27/cf/1838bdd28003239a5dbdc1b8580de7a5e7a75cc0ae92552358bc6bfbcc28/imbalanced-learn-extra-0.2.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "An implementation of novel oversampling algorithms.",
"version": "0.2.5",
"project_urls": {
"Changelog": "https://georgedouzas.github.io/imbalanced-learn-extra/changelog",
"Discussions": "https://github.com/georgedouzas/imbalanced-learn-extra/discussions",
"Documentation": "https://georgedouzas.github.io/imbalanced-learn-extra",
"Funding": "https://github.com/sponsors/georgedouzas",
"Gitter": "https://gitter.im/imbalanced-learn-extra/community",
"Homepage": "https://georgedouzas.github.io/imbalanced-learn-extra",
"Issues": "https://github.com/georgedouzas/imbalanced-learn-extra/issues",
"Repository": "https://github.com/georgedouzas/imbalanced-learn-extra"
},
"split_keywords": [
"machine learning",
" imbalanced learning",
" oversampling"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "0d2262ed3dc211dddc3cc8fab2bb7e965ed83cf4cb21d8449a1a930d5f102404",
"md5": "01149ac03d840bd458bcff56bdb83a76",
"sha256": "c21ecbfc724908348fa9545d50327216624952e2365295d4968eecad338285e5"
},
"downloads": -1,
"filename": "imbalanced_learn_extra-0.2.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "01149ac03d840bd458bcff56bdb83a76",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.10",
"size": 35006,
"upload_time": "2024-11-07T08:00:43",
"upload_time_iso_8601": "2024-11-07T08:00:43.145116Z",
"url": "https://files.pythonhosted.org/packages/0d/22/62ed3dc211dddc3cc8fab2bb7e965ed83cf4cb21d8449a1a930d5f102404/imbalanced_learn_extra-0.2.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "27cf1838bdd28003239a5dbdc1b8580de7a5e7a75cc0ae92552358bc6bfbcc28",
"md5": "43c09dbb6b65924a579ab2982abfc051",
"sha256": "6c1b6ce8f238e67567686efd4e1412e809882ce61de014abd041a2d45c14e4aa"
},
"downloads": -1,
"filename": "imbalanced-learn-extra-0.2.5.tar.gz",
"has_sig": false,
"md5_digest": "43c09dbb6b65924a579ab2982abfc051",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.13,>=3.10",
"size": 36212,
"upload_time": "2024-11-07T08:00:44",
"upload_time_iso_8601": "2024-11-07T08:00:44.567119Z",
"url": "https://files.pythonhosted.org/packages/27/cf/1838bdd28003239a5dbdc1b8580de7a5e7a75cc0ae92552358bc6bfbcc28/imbalanced-learn-extra-0.2.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-07 08:00:44",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "georgedouzas",
"github_project": "imbalanced-learn-extra",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "imbalanced-learn-extra"
}