# Enhanced Automatic Shifted Log Transformer
**Automatically transform skewed data into more normal distributions using Monte Carlo optimized shifted log transformation.**
---
## 📌 Overview
This transformer improves data normality by applying an **automatically tuned shifted log transformation**. It uses **Monte Carlo optimization** with Dirichlet sampling to find the best shift parameters for each feature.
- ✅ **Reduces skewness**
- ✅ **Stabilizes variance**
- ✅ **Scikit-learn compatible**
- ✅ **Fast** (Numba-accelerated)
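
As a rough illustration of the idea, a shifted log transformation maps each value `x` to `log(x + shift)`, with the shift chosen so that the argument stays positive. The sketch below is not the package's implementation: the helper names `shifted_log` and `sample_skewness` and the fixed shift of 1.0 are assumptions made for this example, whereas the transformer tunes the shift automatically.

```python
import numpy as np


def sample_skewness(x):
    """Return the (biased) sample skewness: the third standardized moment."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float(np.mean(centered**3) / np.mean(centered**2) ** 1.5)


def shifted_log(x, shift):
    """Apply log(x + shift); the shift must make every argument positive."""
    x = np.asarray(x, dtype=float)
    if np.any(x + shift <= 0):
        raise ValueError("shift is too small: x + shift must be positive")
    return np.log(x + shift)


rng = np.random.default_rng(42)
x = rng.exponential(2.0, size=5000)  # right-skewed sample
y = shifted_log(x, shift=1.0)        # hand-picked shift, for illustration only

print(f"skewness before: {sample_skewness(x):.2f}")
print(f"skewness after:  {sample_skewness(y):.2f}")
```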
---
## 🔧 Installation
```bash
pip install enhanced-automatic-shifted-log
```
The package requires Python 3.7 or later. To install from source instead:
```bash
git clone https://github.com/AkmalHusain2003/enhanced-automatic-shifted-log.git
cd enhanced-automatic-shifted-log
pip install -e .
```
---
## 🚀 Quick Start
```python
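import numpy as np
from enhanced_aslt import AutomaticShiftedLogTransformer
# Example input: a 2-D array of shape (n_samples, n_features), as in Advanced Usage
your_data = np.random.default_rng(42).exponential(2.0, size=(500, 2))
# Fit & transform
transformer = AutomaticShiftedLogTransformer(mc_iterations=1000, random_state=42)
transformed_data = transformer.fit_transform(your_data)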
```
---
## 📊 Example: Before vs After
```python
import matplotlib.pyplot as plt
import seaborn as sns
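# original_data and transformed_data are assumed to be NumPy arrays of shape
# (n_samples, n_features): the input and output of fit_transform
# Create side-by-side comparison plots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
# Plot the first feature of the original data
sns.histplot(original_data[:, 0], ax=ax1, color='red')
ax1.set_title('Before Transformation')
# Plot the first feature of the transformed data
sns.histplot(transformed_data[:, 0], ax=ax2, color='green')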
ax2.set_title('After Transformation')
plt.tight_layout()
plt.show()
```
---
## ⚙️ Key Parameters
| Parameter | Description | Default |
|--------------------------|-----------------------------------------------|---------|
| `mc_iterations` | Monte Carlo iterations | 1000 |
| `random_state` | Random seed | None |
| `min_improvement_skewed` | Minimum skewness improvement for skewed data | 0.02 |
| `normality_threshold` | Shapiro-Wilk threshold to skip transformation | 0.9 |
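
One way to read the two thresholds, sketched under assumptions: a feature whose normality score already meets `normality_threshold` is left alone, and a candidate transformation is kept only if it reduces absolute skewness by at least `min_improvement_skewed`. The helper `should_transform` and its arguments are hypothetical, and the package's actual decision rule may differ.

```python
def should_transform(
    normality_before,
    skew_before,
    skew_after,
    normality_threshold=0.9,
    min_improvement_skewed=0.02,
):
    """Hypothetical gate: decide whether to keep a candidate transformation."""
    # Already close enough to normal: skip the transformation.
    if normality_before >= normality_threshold:
        return False
    # Otherwise require a minimum reduction in absolute skewness.
    improvement = abs(skew_before) - abs(skew_after)
    return improvement >= min_improvement_skewed


print(should_transform(0.97, 0.30, 0.10))  # skipped: already near-normal
print(should_transform(0.80, 1.90, 0.25))  # kept: large skewness reduction
print(should_transform(0.80, 1.90, 1.89))  # dropped: reduction below 0.02
```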
---
## 🔬 How It Works
The Enhanced Automatic Shifted Log Transformer uses Monte Carlo optimization to automatically find the optimal shift parameter for each feature:
1. **Skewness Detection**: Identifies features that would benefit from transformation
2. **Monte Carlo Optimization**: Uses Dirichlet sampling to explore shift parameter space
3. **Normality Assessment**: Applies Shapiro-Wilk test to evaluate transformation quality
4. **Adaptive Processing**: Only transforms features that show significant improvement
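
The four steps can be sketched for a single feature as follows. This is a simplified stand-in, not the package's algorithm: the function `monte_carlo_shift`, the three anchor offsets, and the use of Dirichlet weights to mix them into candidate shifts are all assumptions chosen for illustration. Only the general ingredients (Monte Carlo search, Dirichlet sampling, Shapiro-Wilk scoring) come from the description above.

```python
import numpy as np
from scipy import stats


def monte_carlo_shift(x, mc_iterations=200, random_state=None):
    """Search for a shift that maximizes the Shapiro-Wilk W of log(x + shift).

    Illustrative only: candidate shifts are convex combinations of three
    hand-picked anchor offsets, weighted by Dirichlet samples.
    """
    x = np.asarray(x, dtype=float)
    spread = x.max() - x.min()
    if spread == 0:
        raise ValueError("constant feature: nothing to optimize")
    rng = np.random.default_rng(random_state)
    # Anchor offsets above -min(x), so x + shift stays strictly positive.
    anchors = spread * np.array([1e-3, 1e-1, 1.0])
    best_shift, best_w = None, -np.inf
    for _ in range(mc_iterations):
        weights = rng.dirichlet(np.ones(len(anchors)))  # non-negative, sums to 1
        shift = -x.min() + weights @ anchors
        w_stat, _ = stats.shapiro(np.log(x + shift))
        if w_stat > best_w:
            best_shift, best_w = shift, w_stat
    return best_shift, best_w


rng = np.random.default_rng(0)
x = rng.exponential(2.0, size=500)   # a skewed feature

w_before, _ = stats.shapiro(x)       # normality of the raw feature
shift, w_after = monte_carlo_shift(x, mc_iterations=200, random_state=42)

# Adaptive step: keep the transformation only if normality improved.
use_transform = w_after > w_before
print(f"W before={w_before:.3f}, W after={w_after:.3f}, shift={shift:.3f}")
```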
---
## 📈 Performance Benefits
- **Automatic Parameter Tuning**: No manual hyperparameter selection required
- **Feature-Wise Optimization**: Each column gets individually optimized parameters
- **Computational Efficiency**: Numba acceleration for fast processing
- **Robust Statistics**: Uses multiple normality metrics for reliable results
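
To make the feature-wise idea concrete, the sketch below computes two shape metrics independently for each column. The README does not say which metrics the package combines, so skewness and excess kurtosis are assumed here, and `column_shape_metrics` is a hypothetical helper.

```python
import numpy as np


def column_shape_metrics(X):
    """Return per-column skewness and excess kurtosis of a 2-D array."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    variance = np.mean(centered**2, axis=0)
    skewness = np.mean(centered**3, axis=0) / variance**1.5
    excess_kurtosis = np.mean(centered**4, axis=0) / variance**2 - 3.0
    return skewness, excess_kurtosis


rng = np.random.default_rng(42)
X = np.column_stack([
    rng.exponential(2.0, size=2000),     # strongly right-skewed
    rng.normal(0.0, 1.0, size=2000),     # roughly symmetric
    rng.lognormal(0.0, 0.5, size=2000),  # moderately right-skewed
])

skewness, excess_kurtosis = column_shape_metrics(X)
for j, (s, k) in enumerate(zip(skewness, excess_kurtosis)):
    print(f"feature {j}: skewness={s:.2f}, excess kurtosis={k:.2f}")
```

Because each column gets its own metrics, a near-symmetric feature can be left untouched while a skewed neighbor is transformed.
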
---
## 🧪 Advanced Usage
```python
from enhanced_aslt import AutomaticShiftedLogTransformer
import numpy as np
# Generate sample skewed data
np.random.seed(42)
skewed_data = np.random.exponential(2, (1000, 3))
# Initialize transformer with custom parameters
transformer = AutomaticShiftedLogTransformer(
    mc_iterations=2000,
    random_state=42,
    min_improvement_skewed=0.05,
    normality_threshold=0.95,
)
# Fit and transform
X_transformed = transformer.fit_transform(skewed_data)
# Access transformation parameters
print("Optimal shift parameters:", transformer.shift_params_)
print("Features transformed:", transformer.features_transformed_)
# Inverse transform (if needed)
X_original = transformer.inverse_transform(X_transformed)
```
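
The inverse of a plain shifted log is `exp(y) - shift`, which is why a round trip can recover the original values. The sketch below checks this with NumPy alone; the per-feature `shifts` are made-up values rather than output from the transformer, and the package's own inverse may include further steps.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.exponential(2.0, size=(1000, 3))

# Hypothetical per-feature shifts; a fitted transformer would supply its own.
shifts = np.array([0.5, 1.0, 2.0])

X_log = np.log(X + shifts)       # forward: one shift broadcast per column
X_back = np.exp(X_log) - shifts  # inverse: undo the log, then the shift

print("max round-trip error:", np.max(np.abs(X_back - X)))
```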
---
## 📚 References
- Feng, Q., Hannig, J., & Marron, J. S. (2016). *A Note on Automatic Data Transformation*. arXiv:1601.01986 [stat.ME]. https://arxiv.org/abs/1601.01986
- Tukey, J. W. (1977). *Exploratory Data Analysis*. Addison-Wesley.
- Box, G. E. P., & Cox, D. R. (1964). *An analysis of transformations*. Journal of the Royal Statistical Society: Series B (Methodological), 26(2), 211-252.
---
## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
---
## 🐛 Issues
If you encounter any issues or have suggestions for improvements, please open an issue on the [GitHub repository](https://github.com/AkmalHusain2003/enhanced-automatic-shifted-log/issues).
---
## 📜 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
---
## 👨‍💻 Author
**Muhammad Akmal Husain**
- GitHub: [@AkmalHusain2003](https://github.com/AkmalHusain2003)
- Email: [akmalhusain2003@gmail.com](mailto:akmalhusain2003@gmail.com)
---
## 🌟 Show Your Support
If this project helped you, please consider giving it a ⭐️ on GitHub!
---
*Built with ❤️ for the data science community*