# 🧰 Outlier Toolkit 🛠️
A **standalone Python library** for detecting, handling, and transforming outliers in numeric and categorical data.
No external dependencies required.

---
## 📜 License
This project is licensed under the **Apache License 2.0**. See the [LICENSE](./LICENSE) file for more details.
[Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)

---
## 📊 Features
### 1. Outlier Detection
- **Z-score Detection**: Identify extreme values based on standard deviation.
- **IQR Detection**: Detect outliers using the interquartile range (Q1, Q3).
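
The two detection rules above can be sketched with the standard library alone. This is a minimal illustration and not the toolkit's own implementation: the names `zscore_outliers` and `iqr_outliers`, the threshold of 3, the 1.5 × IQR fences, and the inclusive quartile method are all assumptions. `statistics.quantiles` needs Python 3.8 or later.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds `threshold`.

    Uses the population standard deviation (an assumption).
    """
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, so nothing can be extreme
    return [v for v in values if abs(v - mean) / std > threshold]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


data = [1, 2, 85, 95, 65, 75, 53, 67, 87, 89, 93, 1001, 1027, 3018]
print(zscore_outliers(data))  # [3018]
print(iqr_outliers(data))     # [1, 2, 1001, 1027, 3018]
```

On this sample the z-score rule flags only the largest value, because the extreme values inflate the standard deviation, while the IQR fences (22 and 138 here) catch both tails.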
### 2. Outlier Handling Techniques
- **Remove Outliers**: Drop outlier values from datasets.
- **Replace Outliers**: Replace outliers with mean, median, or most frequent values.
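
As a rough sketch of the two handling techniques, the snippet below removes or replaces IQR outliers in a numeric list and folds rare labels into the most frequent one for categorical data. Everything here is hypothetical: the helper names, the choice to compute the fill value from the inliers only, and the `min_count` rule for categorical "outliers" are assumptions, since the README does not say how the toolkit defines them. `statistics.quantiles` needs Python 3.8 or later.

```python
import statistics
from collections import Counter


def iqr_bounds(values, k=1.5):
    """Lower and upper fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_numeric_outliers(values, k=1.5):
    """Drop every value outside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


def replace_numeric_outliers(values, strategy="median", k=1.5):
    """Replace outliers with the mean or median of the inliers."""
    low, high = iqr_bounds(values, k)
    inliers = [v for v in values if low <= v <= high]
    if strategy == "median":
        fill = statistics.median(inliers)
    else:
        fill = statistics.mean(inliers)
    return [v if low <= v <= high else fill for v in values]


def replace_rare_categories(values, min_count=2):
    """Treat labels seen fewer than `min_count` times as outliers
    and replace them with the most frequent label."""
    counts = Counter(values)
    most_frequent = counts.most_common(1)[0][0]
    return [v if counts[v] >= min_count else most_frequent for v in values]


numeric = [1, 2, 85, 95, 65, 75, 53, 67, 87, 89, 93, 1001, 1027, 3018]
labels = ["Male", "Female", "Male", "Male", "Unknown", "Unknown", "Other"]
print(remove_numeric_outliers(numeric))
print(replace_numeric_outliers(numeric))
print(replace_rare_categories(labels))
```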
### 3. Winsorization
- **Standard Winsorization**: Cap extreme values at a fixed percentile.
- **Adaptive Quartiles**: Replace low/high outliers using Q1 and Q3.
- **Adaptive Inliers**: Replace low/high outliers using nearest inlier values (custom method).
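
The three winsorization variants differ only in what an extreme value is replaced with. The sketch below is one possible reading of the descriptions above, not the toolkit's code: the function names, the rank-based capping in the standard variant, and the 1.5 × IQR fences used to decide what counts as an outlier are assumptions. `statistics.quantiles` needs Python 3.8 or later.

```python
import statistics


def _quartiles_and_fences(values, k=1.5):
    """Return (Q1, Q3, lower fence, upper fence) for the IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1, q3, q1 - k * iqr, q3 + k * iqr


def winsorize_fixed(values, limit=0.05):
    """Cap the lowest and highest `limit` fraction of values
    at the nearest value that is kept."""
    if not 0 <= limit < 0.5:
        raise ValueError("limit must be in [0, 0.5)")
    ordered = sorted(values)
    cut = int(limit * len(ordered))  # number of values capped per tail
    low, high = ordered[cut], ordered[-1 - cut]
    return [min(max(v, low), high) for v in values]


def winsorize_to_quartiles(values, k=1.5):
    """Replace low outliers with Q1 and high outliers with Q3."""
    q1, q3, low, high = _quartiles_and_fences(values, k)
    return [q1 if v < low else q3 if v > high else v for v in values]


def winsorize_to_inliers(values, k=1.5):
    """Replace low/high outliers with the smallest/largest inlier."""
    _, _, low, high = _quartiles_and_fences(values, k)
    inliers = [v for v in values if low <= v <= high]
    lo_in, hi_in = min(inliers), max(inliers)
    return [lo_in if v < low else hi_in if v > high else v for v in values]


data = [1, 2, 85, 95, 65, 75, 53, 67, 87, 89, 93, 1001, 1027, 3018]
print(winsorize_fixed(data, limit=0.1))
print(winsorize_to_quartiles(data))
print(winsorize_to_inliers(data))
```

With only 14 values, a 5% limit caps nothing in this sketch (`int(0.05 * 14)` is 0), which is why the usage line passes `limit=0.1`. The inlier variant keeps every replacement equal to a value that actually occurs in the data (53 and 95 here), whereas the quartile variant can introduce interpolated values such as 65.5.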
### 4. Binning
- **Equal Width Binning**: Divide numeric range into equal-width intervals.
- **Equal Frequency Binning**: Divide data so each bin has approximately the same number of values.
- **Auto Binning (Outlier-based)**: Automatically separate low/high outliers and inliers using IQR.
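
To make the three binning strategies concrete, here is a small standard-library sketch. It is illustrative only: the function names, the default of three bins, the integer bin indices, and the string labels in the outlier-based variant are assumptions and may not match what the toolkit returns. `statistics.quantiles` needs Python 3.8 or later.

```python
import statistics


def equal_width_bins(values, bins=3):
    """Assign each value a bin index over equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        return [0] * len(values)  # all values identical
    # The maximum value would land in bin `bins`, so clamp it.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def equal_frequency_bins(values, bins=3):
    """Assign bin indices by rank so bins hold roughly equal counts.

    Tied values may fall into different bins in this simple version.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    result = [0] * n
    for rank, idx in enumerate(order):
        result[idx] = rank * bins // n
    return result


def outlier_bins(values, k=1.5):
    """Label each value as a low outlier, inlier, or high outlier."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [
        "low_outlier" if v < low else "high_outlier" if v > high else "inlier"
        for v in values
    ]


data = [1, 2, 85, 95, 65, 75, 53, 67, 87, 89, 93, 1001, 1027, 3018]
print(equal_width_bins(data))      # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2]
print(equal_frequency_bins(data))  # [0, 0, 1, 2, 0, 1, 0, 0, 1, 1, 1, 2, 2, 2]
print(outlier_bins(data))
```

The equal-width output shows why outliers matter for binning: a single very large value stretches the range so far that twelve of the fourteen values share the first bin.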
---
## 🔧 Installation
```bash
pip install outlier-toolkit
```
No external libraries required. Compatible with Python 3.7+.

---
## 🧮 Usage
```python
from outlier.i_outlier.Zscore import detect_outliers_zscore
from outlier.i_outlier.IQR import detect_outliers_iqr
from outlier.outlierTech.remove import remove_outliers
from outlier.outlierTech.replace import replace_outliers
from outlier.outlierTech.winsorization.standard import winsorize_standard
from outlier.outlierTech.winsorization.adaptive import winsorize_quartiles
from outlier.outlierTech.winsorization.adaptive import winsorize_inliers
from outlier.outlierGroup.binning import eq_width_bin
from outlier.outlierGroup.binning import eq_freq_bin
from outlier.outlierGroup.binning import custom_binning

# Sample test data
numeric_data = [1, 2, 85, 95, 65, 75, 53, 67, 87, 89, 93, 1001, 1027, 3018]
categorical_data = ["Male", "Female", "Male", "Male", "Unknown", "Unknown", "Other"]

# Detection
print("=== Z-score Detection ===")
print(detect_outliers_zscore(numeric_data))
print("\n=== IQR Detection ===")
print(detect_outliers_iqr(numeric_data))

# Handling
print("\n=== Remove Outliers ===")
print(remove_outliers(numeric_data, method="IQR"))
print("\n=== Replace Outliers (auto-detect) ===")
print(replace_outliers(numeric_data, method="IQR"))
print(replace_outliers(categorical_data, method="IQR"))

# Winsorization (each call receives a copy of the list)
print("\n=== Winsorization (Standard 5%) ===")
print(winsorize_standard(numeric_data[:]))
print("\n=== Winsorization (Adaptive Quartiles) ===")
print(winsorize_quartiles(numeric_data[:]))
print("\n=== Winsorization (Adaptive Inliers) ===")
print(winsorize_inliers(numeric_data[:]))

# Binning
print("\n=== Binning (Equal Width) ===")
print(eq_width_bin(numeric_data[:]))
print("\n=== Binning (Equal Frequency) ===")
print(eq_freq_bin(numeric_data[:]))
print("\n=== Binning (Custom, Outlier-based) ===")
print(custom_binning(numeric_data[:]))
```
---
## 📝 Notes
- Works for numeric and categorical data.
- All functions are standalone and do not require external libraries.
- Custom winsorization allows mapping outliers to nearest inliers for more controlled transformations.
---
## 👩‍💻 Author
**Irene Betsy D**

---
Raw data
{
"_id": null,
"home_page": "https://github.com/irenebetsy/outlier_library",
"name": "outlier-toolkit",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "outlier IQR ZScore Winsorization binning data preprocessing analytics",
"author": "Irene Betsy D",
"author_email": "betsydnicholraja@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/0c/6f/b366ddbc280e2114b80c61b15539950ce7ae7b7f5ac71bb0520484f3fcb8/outlier_toolkit-0.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache License 2.0",
"summary": "A Python library for identifying and handling outliers",
"version": "0.1.3",
"project_urls": {
"Bug Tracker": "https://github.com/irenebetsy/outlier_library/issues",
"Documentation": "https://github.com/irenebetsy/outlier_library#readme",
"Homepage": "https://github.com/irenebetsy/outlier_library",
"Source Code": "https://github.com/irenebetsy/outlier_library"
},
"split_keywords": [
"outlier",
"iqr",
"zscore",
"winsorization",
"binning",
"data",
"preprocessing",
"analytics"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f71519b9aba6948dc886f574ce35dbe0415d635863cac12725ebb9f834821f03",
"md5": "8cd85992ba1cd4ba8a1e42d1ce7e9171",
"sha256": "16be88334b1e0f535295451443ba5c385d679edc68b3855132b5694d797fc8f1"
},
"downloads": -1,
"filename": "outlier_toolkit-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "8cd85992ba1cd4ba8a1e42d1ce7e9171",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 9079,
"upload_time": "2025-08-30T11:38:27",
"upload_time_iso_8601": "2025-08-30T11:38:27.070190Z",
"url": "https://files.pythonhosted.org/packages/f7/15/19b9aba6948dc886f574ce35dbe0415d635863cac12725ebb9f834821f03/outlier_toolkit-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "0c6fb366ddbc280e2114b80c61b15539950ce7ae7b7f5ac71bb0520484f3fcb8",
"md5": "a8313e7cf314f1eee8203e3b7311dc2c",
"sha256": "fc3a8c1a3c5a37024c07b4671f44d24a548bcdeb610c0f8de436b9b04b1ef992"
},
"downloads": -1,
"filename": "outlier_toolkit-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "a8313e7cf314f1eee8203e3b7311dc2c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 6111,
"upload_time": "2025-08-30T11:38:28",
"upload_time_iso_8601": "2025-08-30T11:38:28.021303Z",
"url": "https://files.pythonhosted.org/packages/0c/6f/b366ddbc280e2114b80c61b15539950ce7ae7b7f5ac71bb0520484f3fcb8/outlier_toolkit-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-30 11:38:28",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "irenebetsy",
"github_project": "outlier_library",
"github_not_found": true,
"lcname": "outlier-toolkit"
}