# 🧰 Outlier Toolkit 🛠️
A **standalone Python library** for detecting, handling, and transforming outliers in numeric and categorical data.
No external dependencies required.
---
## 📜 License
This project is licensed under the **Apache License 2.0**. See the [LICENSE](./LICENSE) file for more details.
[Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)
---
## 📊 Features
### 1. Outlier Detection
- **Z-score Detection**: Flag values that lie an unusually large number of standard deviations from the mean.
- **IQR Detection**: Flag values that fall outside fences derived from the interquartile range (IQR = Q3 - Q1).
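
The two detection rules above can be sketched with the standard library alone. This is an illustrative sketch, not the library's implementation: the function names, the z-score threshold of 3.0, the IQR multiplier of 1.5, and the "inclusive" quantile method are all assumptions chosen because they are common conventions. It relies on `statistics.quantiles`, which needs Python 3.8 or newer.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds `threshold` (hypothetical helper)."""
    if len(values) < 2:
        return []
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, nothing can be extreme
    return [v for v in values if abs(v - mean) / stdev > threshold]


def iqr_fences(values, k=1.5):
    """Return (low, high) fences: Q1 - k*IQR and Q3 + k*IQR (hypothetical helper)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_outliers(values, k=1.5):
    """Return values that fall outside the IQR fences."""
    low, high = iqr_fences(values, k)
    return [v for v in values if v < low or v > high]


data = [1, 2, 85, 95, 65, 75, 53, 67, 87, 89, 93, 1001, 1027, 3018]
print(zscore_outliers(data))  # [3018]
print(iqr_fences(data))       # (22.0, 138.0)
print(iqr_outliers(data))     # [1, 2, 1001, 1027, 3018]
```

On this sample the z-score rule flags only the largest value, because the extreme values inflate the standard deviation, while the IQR rule also flags the two small values and the two values near 1000.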
### 2. Outlier Handling Techniques
- **Remove Outliers**: Drop outlier values from datasets.
- **Replace Outliers**: Replace outliers with mean, median, or most frequent values.
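
As a rough illustration of the two handling strategies above, the sketch below removes or replaces numeric outliers found with IQR fences and replaces rare categories with the most frequent one. Everything here is an assumption made for the example: the function names, the 1.5 multiplier, computing the replacement value from the inliers only, and treating categories seen fewer than `min_count` times as outliers. The library itself may define these differently.

```python
import statistics
from collections import Counter


def _iqr_fences(values, k=1.5):
    # Hypothetical helper: fences at Q1 - k*IQR and Q3 + k*IQR.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Return a new list without the values outside the IQR fences."""
    low, high = _iqr_fences(values, k)
    return [v for v in values if low <= v <= high]


def replace_numeric_outliers(values, strategy="median", k=1.5):
    """Replace outliers with the median (or mean) of the inliers."""
    low, high = _iqr_fences(values, k)
    inliers = [v for v in values if low <= v <= high]
    if not inliers:
        return list(values)
    fill = statistics.median(inliers) if strategy == "median" else statistics.mean(inliers)
    return [v if low <= v <= high else fill for v in values]


def replace_rare_categories(labels, min_count=2):
    """Replace labels seen fewer than `min_count` times with the most frequent label."""
    if not labels:
        return []
    counts = Counter(labels)
    most_frequent = counts.most_common(1)[0][0]
    return [x if counts[x] >= min_count else most_frequent for x in labels]


numeric = [1, 2, 85, 95, 65, 75, 53, 67, 87, 89, 93, 1001, 1027, 3018]
categories = ["Male", "Female", "Male", "Male", "Unknown", "Unknown", "Other"]
print(drop_outliers(numeric))
print(replace_numeric_outliers(numeric))
print(replace_rare_categories(categories))
```

Replacing rather than removing keeps the list length unchanged, which matters when the values are aligned with other columns.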
### 3. Winsorization
- **Standard Winsorization**: Cap extreme values at a fixed percentile.
- **Adaptive Quartiles**: Replace low/high outliers using Q1 and Q3.
- **Adaptive Inliers**: Replace low/high outliers using nearest inlier values (custom method).
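
The three winsorization variants above differ only in where the caps come from. The sketch below shows one plausible reading of each, using hypothetical function names and assumed defaults (a 10% tail fraction and IQR fences with a 1.5 multiplier); it is not taken from the library's source.

```python
import statistics


def _iqr_bounds(values, k=1.5):
    # Hypothetical helper: returns (q1, q3, low_fence, high_fence).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1, q3, q1 - k * iqr, q3 + k * iqr


def winsorize_by_fraction(values, fraction=0.1):
    """Cap the lowest and highest `fraction` of values at the nearest kept value."""
    if not values:
        return []
    ordered = sorted(values)
    cut = int(fraction * len(ordered))  # number of values capped on each side
    floor, ceiling = ordered[cut], ordered[-cut - 1]
    return [min(max(v, floor), ceiling) for v in values]


def winsorize_to_quartiles(values, k=1.5):
    """Replace low outliers with Q1 and high outliers with Q3."""
    q1, q3, low, high = _iqr_bounds(values, k)
    return [q1 if v < low else q3 if v > high else v for v in values]


def winsorize_to_nearest_inlier(values, k=1.5):
    """Replace low/high outliers with the smallest/largest inlier."""
    _, _, low, high = _iqr_bounds(values, k)
    inliers = [v for v in values if low <= v <= high]
    if not inliers:
        return list(values)
    floor, ceiling = min(inliers), max(inliers)
    return [floor if v < low else ceiling if v > high else v for v in values]


data = [1, 2, 85, 95, 65, 75, 53, 67, 87, 89, 93, 1001, 1027, 3018]
print(winsorize_by_fraction(data))
print(winsorize_to_quartiles(data))
print(winsorize_to_nearest_inlier(data))
```

The nearest-inlier variant only ever substitutes values that already occur in the data, whereas the quartile variant can introduce interpolated values such as 65.5 or 94.5.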
### 4. Binning
- **Equal Width Binning**: Divide numeric range into equal-width intervals.
- **Equal Frequency Binning**: Divide data so each bin has approximately the same number of values.
- **Auto Binning (Outlier-based)**: Automatically separate low/high outliers and inliers using IQR.
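
To make the three binning schemes above concrete, here is a small standard-library sketch. The function names, the default of three bins, the integer bin labels, and the `"low"`/`"inlier"`/`"high"` labels are assumptions for illustration; the library's own return format is not documented here.

```python
import statistics


def equal_width_bins(values, bins=3):
    """Assign each value a bin index over equal-width intervals of [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        return [0] * len(values)  # all values identical
    # The maximum value would land in bin `bins`, so clamp it into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def equal_frequency_bins(values, bins=3):
    """Assign bin indices by rank so each bin holds roughly the same count."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * bins // len(values)
    return labels


def outlier_bins(values, k=1.5):
    """Label each value 'low', 'inlier', or 'high' using IQR fences."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return ["low" if v < low else "high" if v > high else "inlier" for v in values]


data = [1, 2, 85, 95, 65, 75, 53, 67, 87, 89, 93, 1001, 1027, 3018]
print(equal_width_bins(data))      # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2]
print(equal_frequency_bins(data))  # [0, 0, 1, 2, 0, 1, 0, 0, 1, 1, 1, 2, 2, 2]
print(outlier_bins(data))
```

The equal-width output shows why outliers matter for binning: a single extreme value stretches the range so far that almost every other value collapses into the first bin, while rank-based and outlier-based bins are unaffected by the magnitude of the extremes.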
---
## 🔧 Installation
```bash
pip install outlier_library
```
No external libraries required. Compatible with Python 3.7+.
---
## 🧮 Usage
```python
# Detection, handling, winsorization, and binning helpers
from outlier.i_outlier.Zscore import detect_outliers_zscore
from outlier.i_outlier.IQR import detect_outliers_iqr
from outlier.outlierTech.remove import remove_outliers
from outlier.outlierTech.replace import replace_outliers
from outlier.outlierTech.winsorization.standard import winsorize_standard
from outlier.outlierTech.winsorization.adaptive import winsorize_quartiles, winsorize_inliers
from outlier.outlierGroup.binning import eq_width_bin, eq_freq_bin, custom_binning

# Sample test data
numeric_data = [1, 2, 85, 95, 65, 75, 53, 67, 87, 89, 93, 1001, 1027, 3018]
categorical_data = ["Male", "Female", "Male", "Male", "Unknown", "Unknown", "Other"]

# Detection
print(detect_outliers_zscore(numeric_data))
print(detect_outliers_iqr(numeric_data))

# Handling
print(remove_outliers(numeric_data, method="IQR"))
print(replace_outliers(numeric_data, method="IQR"))
print(replace_outliers(categorical_data, method="IQR"))

# Winsorization (each call receives a copy of the list)
print(winsorize_standard(numeric_data[:]))
print(winsorize_quartiles(numeric_data[:]))
print(winsorize_inliers(numeric_data[:]))

# Binning (each call receives a copy of the list)
print(eq_width_bin(numeric_data[:]))
print(eq_freq_bin(numeric_data[:]))
print(custom_binning(numeric_data[:]))
```
---
## 📝 Notes
- Works for numeric and categorical data.
- All functions are standalone and do not require external libraries.
- The adaptive inlier winsorization maps each outlier to the nearest inlier value, so replacements always come from values already present in the data.
---
## 👩‍💻 Author
**Irene Betsy D**
---
Raw data
{
"_id": null,
"home_page": "https://github.com/irenebetsy/outlier_library",
"name": "outlier-library",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "outlier IQR ZScore Winsorization binning data preprocessing analytics",
"author": "Irene Betsy D",
"author_email": "betsydnicholraja@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/fb/1e/9df3eb3f8806baa532df36a9cb42f76f05ae64d86768d50ad6e43e34dd50/outlier_library-0.1.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache License 2.0",
"summary": "A Python library for identifying and handling outliers",
"version": "0.1.4",
"project_urls": {
"Bug Tracker": "https://github.com/irenebetsy/outlier_library/issues",
"Documentation": "https://github.com/irenebetsy/outlier_library#readme",
"Homepage": "https://github.com/irenebetsy/outlier_library",
"Source Code": "https://github.com/irenebetsy/outlier_library"
},
"split_keywords": [
"outlier",
"iqr",
"zscore",
"winsorization",
"binning",
"data",
"preprocessing",
"analytics"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "66a3bbcbd35a0ef0e708284f4730977e20531bfcaf9bf568f1da47408bc65315",
"md5": "1c476662ce27419077a7222d8333f9bc",
"sha256": "75d3c7859caecb9b870fd9b087ad9df1e1a5feac5b3177e5d2465860cfb3af79"
},
"downloads": -1,
"filename": "outlier_library-0.1.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "1c476662ce27419077a7222d8333f9bc",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 8995,
"upload_time": "2025-08-30T10:54:00",
"upload_time_iso_8601": "2025-08-30T10:54:00.947836Z",
"url": "https://files.pythonhosted.org/packages/66/a3/bbcbd35a0ef0e708284f4730977e20531bfcaf9bf568f1da47408bc65315/outlier_library-0.1.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "fb1e9df3eb3f8806baa532df36a9cb42f76f05ae64d86768d50ad6e43e34dd50",
"md5": "17324e56f096c6cebb36a83868c2a1ac",
"sha256": "3208ac048d277982a21f23d02beb1ae4d46157df06e66279601c66ed41572a34"
},
"downloads": -1,
"filename": "outlier_library-0.1.4.tar.gz",
"has_sig": false,
"md5_digest": "17324e56f096c6cebb36a83868c2a1ac",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 6014,
"upload_time": "2025-08-30T10:54:02",
"upload_time_iso_8601": "2025-08-30T10:54:02.381370Z",
"url": "https://files.pythonhosted.org/packages/fb/1e/9df3eb3f8806baa532df36a9cb42f76f05ae64d86768d50ad6e43e34dd50/outlier_library-0.1.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-30 10:54:02",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "irenebetsy",
"github_project": "outlier_library",
"github_not_found": true,
"lcname": "outlier-library"
}