| Field | Value |
|---|---|
| Name | skewnormlib |
| Version | 0.1.3 |
| home_page | https://github.com/mohdnihal03/skewnormlib |
| Summary | A Python library for skew-weighted normalization |
| upload_time | 2025-01-25 15:04:34 |
| maintainer | None |
| docs_url | None |
| author | Mohammed Nihal |
| requires_python | >=3.6 |
| license | MIT License (full text below) |
| keywords | skewness, normalization, data preprocessing, machine learning |
| VCS | |
| bugtrack_url | |
| requirements | numpy, scipy, scikit-learn, pytest |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

MIT License

Copyright (c) 2025 Mohammed Nihal

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

# skewnormlib
`skewnormlib` is an open-source Python library for preprocessing data with a novel method called **Skew Weighted Normalization**. The technique applies a normalization that is adjusted for the skewness of the data and can add a non-linear (`tanh`) term on top of it. It is intended for preparing data for machine learning models that require normalized inputs.
---
## What is SkewWeightedNormalization?
The `SkewWeightedNormalization` class is a custom data transformer that normalizes data while considering its skewness. It achieves this by combining a skew-adjusted normalization term and a non-linear transformation term, making it more robust for skewed datasets.
### Mathematical Formula:
The transformation is defined as:
\[
\text{scaled\_data} = \frac{X - \mu}{\sigma (1 + \alpha \cdot |\gamma|)} + \beta \cdot \tanh\left(\frac{X - \mu}{k \cdot \sigma}\right)
\]
Where:
- \(X\): Input data.
- \(\mu\): Mean of the data.
- \(\sigma\): Standard deviation of the data.
- \(\gamma\): Skewness of the data.
- \(\alpha\): Skewness weighting factor (default: 1.0).
- \(\beta\): Weighting factor for the non-linear transformation (default: 0.5).
- \(k\): Scaling factor for the non-linear term (default: 1.0).
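
The formula above can be sketched directly with NumPy and SciPy. This is a minimal illustration, not the library's implementation: the function name `skew_weighted_normalize` is hypothetical, and computing the statistics per column with the population standard deviation and the default `scipy.stats.skew` estimator are assumptions.

```python
import numpy as np
from scipy.stats import skew


def skew_weighted_normalize(X, alpha=1.0, beta=0.5, k=1.0):
    """Apply the skew-weighted formula column by column (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)      # per-feature mean
    sigma = X.std(axis=0)    # per-feature standard deviation (ddof=0, assumed)
    gamma = skew(X, axis=0)  # per-feature sample skewness

    z = (X - mu) / sigma                         # ordinary z-score
    linear = z / (1.0 + alpha * np.abs(gamma))   # skew-adjusted term
    nonlinear = beta * np.tanh(z / k)            # bounded non-linear term
    return linear + nonlinear


rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=(200, 3))    # right-skewed sample data
scaled = skew_weighted_normalize(X)
print(scaled.shape)
```

With `alpha=0` and `beta=0` this sketch reduces to the ordinary z-score, which is a quick way to sanity-check it.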
---
## How to Use
### Installation
Install the library from PyPI:
```bash
pip install skewnormlib
```
### Usage Example
```python
import numpy as np
from skewnorm.normalization import SkewWeightedNormalization
from sklearn.model_selection import train_test_split

# Sample data
data = np.random.rand(100, 5)  # 100 samples with 5 features
X_train, X_test = train_test_split(data, test_size=0.2)

# Initialize the transformer
swn = SkewWeightedNormalization(alpha=1.0, beta=0.5, k=1.0)

# Fit and transform the training data
X_train_transformed = swn.fit_transform(X_train)

# Transform the test data
X_test_transformed = swn.transform(X_test)

# Reverse the transformation (if needed)
X_test_original = swn.inverse_transform(X_test_transformed)

print("X_train_transformed shape:", X_train_transformed.shape)
print("X_test_transformed shape:", X_test_transformed.shape)
print("Original test data recovered shape:", X_test_original.shape)
```
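
The forward formula adds a linear term and a `tanh` term, so it has no closed-form inverse. It is, however, strictly increasing in \(X\) whenever \(\beta \ge 0\) and \(k > 0\), which means it can be inverted numerically. The sketch below shows one possible approach using `scipy.optimize.brentq` for a single feature with known statistics. How `inverse_transform` is actually implemented in the library is not stated here, and the helper names `forward` and `invert` are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq


def forward(x, mu, sigma, gamma, alpha=1.0, beta=0.5, k=1.0):
    """Skew-weighted formula for one feature with known statistics."""
    z = (x - mu) / sigma
    return z / (1.0 + alpha * abs(gamma)) + beta * np.tanh(z / k)


def invert(y, mu, sigma, gamma, alpha=1.0, beta=0.5, k=1.0):
    """Recover x from y by root finding (assumes beta >= 0 and k > 0)."""
    scale = 1.0 + alpha * abs(gamma)

    def residual(z):
        return z / scale + beta * np.tanh(z / k) - y

    # The tanh term lies in [-beta, beta], which brackets the z-score.
    z_lo = (y - beta) * scale
    z_hi = (y + beta) * scale
    z = brentq(residual, z_lo, z_hi)
    return mu + sigma * z


mu, sigma, gamma = 2.0, 1.5, 0.8  # example statistics, chosen for illustration
x = 4.2
y = forward(x, mu, sigma, gamma)
x_back = invert(y, mu, sigma, gamma)
print(abs(x - x_back) < 1e-8)
```

Root finding is used here because the function is monotone and easy to bracket; a vectorized Newton iteration would be an alternative for large arrays.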
---
## Advantages
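- **Handles Skewness**: Unlike standard z-score or min-max normalization, this method includes the skewness \(\gamma\) in the scaling factor, which is intended to make it suitable for heavily skewed datasets.
- **Non-linear Transformation**: The additional \(\tanh\) term is bounded between \(-\beta\) and \(\beta\), so it saturates for values far from the mean, which helps reduce the influence of extreme outliers.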
- **Scikit-learn Compatible**: The class adheres to Scikit-learn's API, allowing seamless integration into pipelines.
- **Open Source**: The library is open for contributions, making it a community-driven project.
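
To illustrate what scikit-learn compatibility means in practice, the sketch below defines a stand-in transformer and places it in a `Pipeline`. The class `SkewWeightedScaler` is hypothetical and written only for this example. It is not the library's `SkewWeightedNormalization`, whose internals may differ.

```python
import numpy as np
from scipy.stats import skew
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


class SkewWeightedScaler(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in that applies the skew-weighted formula."""

    def __init__(self, alpha=1.0, beta=0.5, k=1.0):
        self.alpha = alpha
        self.beta = beta
        self.k = k

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Statistics are learned from the data passed to fit only.
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0)
        self.skew_ = skew(X, axis=0)
        return self

    def transform(self, X):
        z = (np.asarray(X, dtype=float) - self.mean_) / self.std_
        linear = z / (1.0 + self.alpha * np.abs(self.skew_))
        return linear + self.beta * np.tanh(z / self.k)


rng = np.random.default_rng(0)
X = rng.exponential(size=(200, 4))           # right-skewed features
y = (X[:, 0] + X[:, 1] > 2.0).astype(int)    # synthetic binary labels

pipe = Pipeline([("scale", SkewWeightedScaler()), ("clf", LogisticRegression())])
pipe.fit(X, y)
print(pipe.score(X, y))
```

Because the transformer exposes `fit` and `transform` and stores its parameters in `__init__`, it can also be used with tools such as `GridSearchCV` to tune `alpha`, `beta`, and `k`.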
---
## Disadvantages
- **Complexity**: The additional parameters (\(\alpha, \beta, k\)) require tuning for optimal results.
- **Performance**: Slightly slower than standard normalization techniques due to the computation of skewness and the non-linear term.
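
To see what each parameter controls, it can help to rewrite the formula in terms of the z-score \(z = (X - \mu)/\sigma\) and look at its slope. The derivation below follows directly from the formula given earlier. The symbols \(z\) and \(f\) are introduced here only for illustration.

\[
f(z) = \frac{z}{1 + \alpha \cdot |\gamma|} + \beta \cdot \tanh\left(\frac{z}{k}\right),
\qquad
f'(z) = \frac{1}{1 + \alpha \cdot |\gamma|} + \frac{\beta}{k} \cdot \operatorname{sech}^2\left(\frac{z}{k}\right)
\]

Near the mean (\(z \approx 0\)) the slope is \(\frac{1}{1 + \alpha |\gamma|} + \frac{\beta}{k}\). Far from the mean the \(\operatorname{sech}^2\) term vanishes and the slope approaches \(\frac{1}{1 + \alpha |\gamma|}\). A larger \(\alpha\) therefore compresses skewed features more strongly, a larger \(\beta\) adds more emphasis around the mean, and \(k\) sets how quickly the non-linear term saturates. Setting \(\alpha = 0\) and \(\beta = 0\) recovers the ordinary z-score.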
---
## When to Use It
- **Skewed Data**: Use this method when the dataset has significant skewness, as it normalizes while accounting for the skew.
- **Outlier Sensitivity**: When datasets contain extreme outliers that might adversely affect models, this normalization technique can help mitigate their influence.
- **Preprocessing for ML**: Use this as a preprocessing step for machine learning models that perform better on normalized data (e.g., SVM, Neural Networks).
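
One way to judge whether a dataset has significant skewness is to compute the per-feature sample skewness before choosing a scaler. The sketch below uses `scipy.stats.skew`. The cutoff of 1.0 is an illustrative assumption, not a threshold defined by the library.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(42)
X = np.column_stack([
    rng.normal(size=500),       # roughly symmetric feature
    rng.exponential(size=500),  # right-skewed feature
    rng.lognormal(size=500),    # heavily right-skewed feature
])

gamma = skew(X, axis=0)         # one skewness value per column
THRESHOLD = 1.0                 # assumed cutoff for "significant" skew
skewed_columns = np.flatnonzero(np.abs(gamma) > THRESHOLD)

print("skewness per feature:", np.round(gamma, 2))
print("columns flagged as skewed:", skewed_columns)
```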
---
## Contributing
This library is open source, and contributions are welcome! Feel free to:
1. Submit bug reports.
2. Suggest new features.
3. Improve documentation.
4. Optimize performance.
Visit the GitHub repository to contribute:
[GitHub Repository](https://github.com/mohdnihal03/skewnorm)
---
## License
This project is licensed under the MIT License, ensuring it remains free and open for everyone.
---
## Contact
For questions or suggestions, contact:
- **Author**: Mohammed Nihal
- **Email**: mohdnihal03@gmail.com
---
## Raw data
{
"_id": null,
"home_page": "https://github.com/mohdnihal03/skewnormlib",
"name": "skewnormlib",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "skewness, normalization, data preprocessing, machine learning",
"author": "Mohammed Nihal",
"author_email": "Mohd Nihal <mohdnihalll03@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/4b/1b/c061250fa9c3564d5c4d6967a36b1f00e01d84b80c7c436c51203b716e8d/skewnormlib-0.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License\n \n Copyright (c) 2025 Mohammed Nihal\n \n Permission is hereby granted, free of charge, to any person obtaining a copy\n of this software and associated documentation files (the \"Software\"), to deal\n in the Software without restriction, including without limitation the rights\n to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n copies of the Software, and to permit persons to whom the Software is\n furnished to do so, subject to the following conditions:\n \n The above copyright notice and this permission notice shall be included in all\n copies or substantial portions of the Software.\n \n THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER\n LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\n SOFTWARE.\n ",
"summary": "A Python library for skew-weighted normalization",
"version": "0.1.3",
"project_urls": {
"Homepage": "https://github.com/mohdnihal03/skewnormlib"
},
"split_keywords": [
"skewness",
" normalization",
" data preprocessing",
" machine learning"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "37aedb4ac9bb120c16822a1f59c815fa7faa17cdff31b383e67002c912637bf8",
"md5": "6d23d42050afbc942f7563e89fbb4e32",
"sha256": "d5603cc971ecf4cc218f1b910199aa9266c4549d3fb14ff05ec44f13abf92aca"
},
"downloads": -1,
"filename": "skewnormlib-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "6d23d42050afbc942f7563e89fbb4e32",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 6518,
"upload_time": "2025-01-25T15:04:32",
"upload_time_iso_8601": "2025-01-25T15:04:32.901219Z",
"url": "https://files.pythonhosted.org/packages/37/ae/db4ac9bb120c16822a1f59c815fa7faa17cdff31b383e67002c912637bf8/skewnormlib-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "4b1bc061250fa9c3564d5c4d6967a36b1f00e01d84b80c7c436c51203b716e8d",
"md5": "4dcb70bac769f51181c194e085f7259e",
"sha256": "cd3fd49a0eaa9b630a5f7aefc53a49902b7e4694cb453aaf9fdea04e0a65008d"
},
"downloads": -1,
"filename": "skewnormlib-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "4dcb70bac769f51181c194e085f7259e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 5403,
"upload_time": "2025-01-25T15:04:34",
"upload_time_iso_8601": "2025-01-25T15:04:34.696078Z",
"url": "https://files.pythonhosted.org/packages/4b/1b/c061250fa9c3564d5c4d6967a36b1f00e01d84b80c7c436c51203b716e8d/skewnormlib-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-25 15:04:34",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "mohdnihal03",
"github_project": "skewnormlib",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "numpy",
"specs": []
},
{
"name": "scipy",
"specs": []
},
{
"name": "scikit-learn",
"specs": []
},
{
"name": "pytest",
"specs": []
}
],
"lcname": "skewnormlib"
}