enhanced-automatic-shifted-log


Name: enhanced-automatic-shifted-log
Version: 0.1.0
Home page: https://github.com/AkmalHusain2003/enhanced-automatic-shifted-log
Summary: Enhanced Automatic Shifted Log Transformer with Monte Carlo optimization
Upload time: 2025-08-12 04:39:39
Author: Muhammad Akmal Husain
Requires Python: >=3.7
License: MIT
Keywords: data-transformation, log-transformation, normality, monte-carlo-optimization, statistical-preprocessing, data-science, machine-learning, numba-acceleration, feng-transformation, adaptive-algorithms
Requirements: No requirements were recorded.

# Enhanced Automatic Shifted Log Transformer

**Automatically transform skewed data into more normal distributions using Monte Carlo optimized shifted log transformation.**

---

## 📌 Overview

This transformer improves data normality by applying an **automatically tuned shifted log transformation**. It uses **Monte Carlo optimization** with Dirichlet sampling to find the best shift parameters for each feature.

✅ **Reduces skewness**  
✅ **Stabilizes variance**  
✅ **Scikit-learn compatible**  
✅ **Fast** (Numba-accelerated)
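
To make the idea concrete, here is a minimal NumPy-only sketch of a shifted log transform, `y = log(x + shift)`, and its effect on skewness. The helper names `shifted_log` and `sample_skewness` are hypothetical and are not part of this package's API; the shift is chosen by hand here, whereas the transformer tunes it automatically.

```python
import numpy as np

def shifted_log(x, shift):
    """Apply y = log(x + shift); every x + shift must be strictly positive."""
    x = np.asarray(x, dtype=float)
    if np.any(x + shift <= 0):
        raise ValueError("shift must make every value strictly positive")
    return np.log(x + shift)

def sample_skewness(x):
    """Biased sample skewness (third standardized moment)."""
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

rng = np.random.default_rng(0)
# Right-skewed data that also contains negative values, so a plain log fails.
x = rng.lognormal(mean=0.0, sigma=1.0, size=5000) - 0.5
y = shifted_log(x, shift=0.5)  # hand-picked shift that undoes the offset

print("skewness before:", sample_skewness(x))
print("skewness after: ", sample_skewness(y))
```

The shift matters because it decides how strongly the log compresses the right tail, which is why it is worth tuning per feature.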

---

## 🔧 Installation

```bash
pip install enhanced-automatic-shifted-log
```

Or from source:

```bash
git clone https://github.com/AkmalHusain2003/enhanced-automatic-shifted-log.git
cd enhanced-automatic-shifted-log
pip install -e .
```

---

## 🚀 Quick Start

```python
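import numpy as np
from enhanced_aslt import AutomaticShiftedLogTransformer

# Example input: a right-skewed array of shape (n_samples, n_features)
your_data = np.random.default_rng(42).exponential(2.0, size=(1000, 3))

# Fit & transform
transformer = AutomaticShiftedLogTransformer(mc_iterations=1000, random_state=42)
transformed_data = transformer.fit_transform(your_data)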
```

---

## 📊 Example: Before vs After

```python
import matplotlib.pyplot as plt
import seaborn as sns

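# `original_data` and `transformed_data` hold the same feature before and
# after fit_transform (for example, one column of each array).
# Create side-by-side comparison plots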
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Plot original data
sns.histplot(original_data, ax=ax1, color='red')
ax1.set_title('Before Transformation')

# Plot transformed data
sns.histplot(transformed_data, ax=ax2, color='green')
ax2.set_title('After Transformation')

plt.tight_layout()
plt.show()
```

---

## ⚙️ Key Parameters

| Parameter                | Description                                   | Default |
|--------------------------|-----------------------------------------------|---------|
| `mc_iterations`          | Monte Carlo iterations                        | 1000    |
| `random_state`           | Random seed                                   | None    |
| `min_improvement_skewed` | Minimum skewness improvement for skewed data  | 0.02    |
| `normality_threshold`    | Shapiro-Wilk threshold to skip transformation | 0.9     |

---

## 🔬 How It Works

The Enhanced Automatic Shifted Log Transformer uses Monte Carlo optimization to automatically find the optimal shift parameter for each feature:

1. **Skewness Detection**: Identifies features that would benefit from transformation
2. **Monte Carlo Optimization**: Uses Dirichlet sampling to explore shift parameter space
3. **Normality Assessment**: Applies Shapiro-Wilk test to evaluate transformation quality
4. **Adaptive Processing**: Only transforms features that show significant improvement
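
The four steps above can be sketched as a small random search. This is an illustration under stated assumptions, not the package's implementation: `search_shift` is a hypothetical helper, absolute skewness stands in for the Shapiro-Wilk score to keep the example NumPy-only, and using Dirichlet draws as convex weights over three anchor offsets is just one plausible reading of "Dirichlet sampling".

```python
import numpy as np

def sample_skewness(x):
    """Biased sample skewness (third standardized moment)."""
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5

def search_shift(x, n_iter=1000, min_improvement=0.02, seed=42):
    """Random search for a shift s that minimizes |skew(log(x + s))|.

    Assumption: candidate shifts are convex combinations of three anchor
    offsets, weighted by Dirichlet draws. Returns (None, baseline) when the
    best candidate does not improve skewness by at least min_improvement.
    """
    rng = np.random.default_rng(seed)
    spread = x.max() - x.min()
    # Anchors sit above -min(x), so x + s stays strictly positive.
    anchors = -x.min() + spread * np.array([1e-3, 1e-1, 1e1])
    weights = rng.dirichlet(np.ones(3), size=n_iter)  # shape (n_iter, 3)
    candidates = weights @ anchors                     # shape (n_iter,)

    baseline = abs(sample_skewness(x))
    scores = np.array(
        [abs(sample_skewness(np.log(x + s))) for s in candidates]
    )
    best = scores.argmin()
    if baseline - scores[best] < min_improvement:
        return None, baseline  # adaptive step: leave the feature untouched
    return candidates[best], scores[best]

rng = np.random.default_rng(0)
x = rng.exponential(2.0, size=1000)
shift, score = search_shift(x)
print("chosen shift:", shift, "| |skewness| after:", score)
```

Running the search once per column gives each feature its own shift, and the early return mirrors the adaptive step that skips features with no meaningful gain.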

---

## 📈 Performance Benefits

- **Automatic Parameter Tuning**: No manual hyperparameter selection required
- **Feature-Wise Optimization**: Each column gets individually optimized parameters
- **Computational Efficiency**: Numba acceleration for fast processing
- **Robust Statistics**: Uses multiple normality metrics for reliable results

---

## 🧪 Advanced Usage

```python
from enhanced_aslt import AutomaticShiftedLogTransformer
import numpy as np

# Generate sample skewed data
np.random.seed(42)
skewed_data = np.random.exponential(2, (1000, 3))

# Initialize transformer with custom parameters
transformer = AutomaticShiftedLogTransformer(
    mc_iterations=2000,
    random_state=42,
    min_improvement_skewed=0.05,
    normality_threshold=0.95
)

# Fit and transform
X_transformed = transformer.fit_transform(skewed_data)

# Access transformation parameters
print("Optimal shift parameters:", transformer.shift_params_)
print("Features transformed:", transformer.features_transformed_)

# Inverse transform (if needed)
X_original = transformer.inverse_transform(X_transformed)
```
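
For intuition about why the inverse is exact, a shifted log `y = log(x + s)` is undone by `x = exp(y) - s`. The sketch below shows this round trip with hypothetical per-column shifts; it assumes nothing about how the package stores its parameters or implements `inverse_transform`.

```python
import numpy as np

def shifted_log(x, shift):
    """Forward transform: y = log(x + shift)."""
    return np.log(x + shift)

def inverse_shifted_log(y, shift):
    """Inverse transform: x = exp(y) - shift."""
    return np.exp(y) - shift

rng = np.random.default_rng(42)
x = rng.exponential(2.0, size=(1000, 3))
shifts = np.array([0.5, 1.0, 2.0])  # hypothetical shifts, one per column

y = shifted_log(x, shifts)          # broadcasting applies one shift per column
x_back = inverse_shifted_log(y, shifts)

print("max round-trip error:", np.max(np.abs(x_back - x)))
```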

---

## 📚 References

- Feng, Q., Hannig, J., & Marron, J. S. (2016). *A Note on Automatic Data Transformation*. arXiv:1601.01986 [stat.ME]. https://arxiv.org/abs/1601.01986
- Tukey, J. W. (1977). *Exploratory Data Analysis*. Addison-Wesley.
- Box, G. E. P., & Cox, D. R. (1964). *An analysis of transformations*. Journal of the Royal Statistical Society: Series B (Methodological), 26(2), 211-252.

---

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

---

## 🐛 Issues

If you encounter any issues or have suggestions for improvements, please open an issue on the [GitHub repository](https://github.com/AkmalHusain2003/enhanced-automatic-shifted-log/issues).

---

## 📜 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## 👨‍💻 Author

**Muhammad Akmal Husain**

- GitHub: [@AkmalHusain2003](https://github.com/AkmalHusain2003)
- Email: [akmalhusain2003@gmail.com](mailto:akmalhusain2003@gmail.com)

---

## 🌟 Show Your Support

If this project helped you, please consider giving it a ⭐️ on GitHub!

---

*Built with ❤️ for the data science community*

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/AkmalHusain2003/enhanced-automatic-shifted-log",
    "name": "enhanced-automatic-shifted-log",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "data-transformation, log-transformation, normality, monte-carlo-optimization, statistical-preprocessing, data-science, machine-learning, numba-acceleration, feng-transformation, adaptive-algorithms",
    "author": "Muhammad Akmal Husain",
    "author_email": "Muhammad Akmal Husain <akmalhusain2003@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/7e/17/c929dfc679d701abfb62fe559f9f84ba08bb4655ad108685ad94b5774f0e/enhanced_automatic_shifted_log-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Enhanced Automatic Shifted Log Transformer with Monte Carlo optimization",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://github.com/AkmalHusain2003/enhanced-automatic-shifted-log",
        "Issues": "https://github.com/AkmalHusain2003/enhanced-automatic-shifted-log/issues",
        "Repository": "https://github.com/AkmalHusain2003/enhanced-automatic-shifted-log.git"
    },
    "split_keywords": [
        "data-transformation",
        " log-transformation",
        " normality",
        " monte-carlo-optimization",
        " statistical-preprocessing",
        " data-science",
        " machine-learning",
        " numba-acceleration",
        " feng-transformation",
        " adaptive-algorithms"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "230dd39a116dcc69cb4f758480f60d2ae76b6e6ae1cff89a21b99f0283b8f1e9",
                "md5": "26da579b1f7e869d049b5f07117e3978",
                "sha256": "2abf77de5d10bcd053c7c022c06bb6cb486adc848eee05ba94725863619eef5e"
            },
            "downloads": -1,
            "filename": "enhanced_automatic_shifted_log-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "26da579b1f7e869d049b5f07117e3978",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 17057,
            "upload_time": "2025-08-12T04:39:38",
            "upload_time_iso_8601": "2025-08-12T04:39:38.092304Z",
            "url": "https://files.pythonhosted.org/packages/23/0d/d39a116dcc69cb4f758480f60d2ae76b6e6ae1cff89a21b99f0283b8f1e9/enhanced_automatic_shifted_log-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "7e17c929dfc679d701abfb62fe559f9f84ba08bb4655ad108685ad94b5774f0e",
                "md5": "9369c63f0d6d6c869207dafbf823b350",
                "sha256": "9fecad1a8a8b9a6f20e9d23d7ab4e403aee28e047f6e8fa5051c54af5fa75228"
            },
            "downloads": -1,
            "filename": "enhanced_automatic_shifted_log-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "9369c63f0d6d6c869207dafbf823b350",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 20031,
            "upload_time": "2025-08-12T04:39:39",
            "upload_time_iso_8601": "2025-08-12T04:39:39.983481Z",
            "url": "https://files.pythonhosted.org/packages/7e/17/c929dfc679d701abfb62fe559f9f84ba08bb4655ad108685ad94b5774f0e/enhanced_automatic_shifted_log-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-12 04:39:39",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "AkmalHusain2003",
    "github_project": "enhanced-automatic-shifted-log",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "enhanced-automatic-shifted-log"
}
        