dataframe-statistical-analyzer


- Name: dataframe-statistical-analyzer
- Version: 1.0.2
- Home page: https://github.com/DurgeshRathod/dataframe-statistical-analyzer
- Summary: The `DataFrame Statistical Analyzer` package provides a utility for statistical analysis of a Pandas DataFrame.
- Upload time: 2024-07-05 15:18:00
- Author: Durgesh Rathod
- Requires Python: <4.0,>=3.11
- Keywords: time-series, data analysis, machine learning, statistics

# DataFrame Statistical Analyzer Utility 📊

The `DataFrameAnalyzer` project provides a robust and extensible tool for analyzing and visualizing data stored in a Pandas DataFrame. The tool encapsulates various data analysis functionalities, including summary statistics, percentage change computation, outlier detection, trend analysis, moving average calculation, correlation analysis, and seasonal pattern interpretation. The project is designed following the SOLID principles and incorporates design patterns to ensure maintainability and ease of use. 🚀

## Features 🌟

- **Summary Statistics**: Statistical summary of the DataFrame. 📈
- **Month-to-Month Percentage Changes**: Percentage changes between consecutive months. 🔄
- **Outlier Detection (Z-score > 3)**: DataFrame segments identified as outliers based on Z-score. 🚨
- **Outlier Detection (MAD)**: DataFrame segments identified as outliers based on Median Absolute Deviation. 📉
- **Trend Analysis (Linear Regression)**: Slope and intercept of linear trends for numeric columns. 📈
- **Moving Average (3 months window)**: Moving average values for numeric columns over a 3-month window. 📊
- **Calculating Dips**: DataFrame segments identified as dips below certain thresholds. 📉
- **Calculating Increases**: DataFrame segments identified as increases above certain thresholds. 📈
- **Seasonal Patterns**: Monthly seasonal patterns identified using Holt-Winters exponential smoothing. 🌿
- **Correlation Analysis**: Correlation matrix between numeric columns. 🔗
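
The package's source is not reproduced here, so the following is only a minimal sketch of how the two outlier rules and the linear trend listed above are commonly computed with pandas and NumPy. The function names (`zscore_outliers`, `mad_outliers`, `linear_trend`), the MAD cutoff of 3.5, and the 0.6745 scaling constant are assumptions made for illustration; they are not the package's API.

```python
import numpy as np
import pandas as pd


def zscore_outliers(series: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Return the values whose absolute Z-score exceeds `threshold`."""
    std = series.std(ddof=0)  # population standard deviation (assumed choice)
    if std == 0:
        return series.iloc[0:0]  # constant series: nothing can be an outlier
    z = (series - series.mean()) / std
    return series[z.abs() > threshold]


def mad_outliers(series: pd.Series, threshold: float = 3.5) -> pd.Series:
    """Return the values whose MAD-based modified Z-score exceeds `threshold`."""
    median = series.median()
    mad = (series - median).abs().median()  # median absolute deviation
    if mad == 0:
        return series.iloc[0:0]
    modified_z = 0.6745 * (series - median) / mad
    return series[modified_z.abs() > threshold]


def linear_trend(series: pd.Series) -> tuple[float, float]:
    """Fit y = slope * x + intercept against the row position 0..n-1."""
    x = np.arange(len(series))
    slope, intercept = np.polyfit(x, series.to_numpy(dtype=float), 1)
    return float(slope), float(intercept)


# Hypothetical sample: eleven ordinary prices followed by one extreme value.
prices = pd.Series(
    [50.0, 51.5, 49.8, 52.0, 53.2, 54.0, 55.0, 56.0, 57.5, 59.0, 60.0, 150.0]
)
print(zscore_outliers(prices))
print(mad_outliers(prices))
print(linear_trend(prices))
```

On this sample both rules flag only the final value, 150.0. The MAD rule depends on the median rather than the mean, so it is less affected by the very outlier it is trying to detect, which is one reason to report both.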

## Installation 🛠️

1. Install the package:
   ```bash
   pip install dataframe-statistical-analyzer
   ```

## Usage 🖥️

1. Import the necessary modules:
   ```python
   import pandas as pd
   from dataframe_statistical_analyzer import DataFrameAnalyzer
   ```

2. Prepare your DataFrame:
   ```python
   data = {
       "month": ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'],
       "stock_price": [50.0, 51.5, 49.8, 52.0, 53.2, 54.0, 55.0, 56.0, 57.5, 59.0, 60.0, 61.0]
   }

   df = pd.DataFrame(data)
   ```

3. Initialize the `DataFrameAnalyzer` with the DataFrame:
   ```python
   analyzer = DataFrameAnalyzer(df)
   ```

4. Perform the analysis:
   ```python
   analyzer.analyze()
   ```

5. **Expected Outputs**: When you run the `analyze()` method of `DataFrameAnalyzer`, you can expect to see the following outputs:
   - **Summary Statistics**: Statistical summary of the DataFrame.
   - **Month-to-Month Percentage Changes**: Percentage changes between consecutive months.
   - **Outlier Detection (Z-score > 3)**: DataFrame segments identified as outliers based on Z-score.
   - **Outlier Detection (MAD)**: DataFrame segments identified as outliers based on Median Absolute Deviation.
   - **Trend Analysis (Linear Regression)**: Slope and intercept of linear trends for numeric columns.
   - **Moving Average (3 months window)**: Moving average values for numeric columns over a 3-month window.
   - **Calculating Dips**: DataFrame segments identified as dips below certain thresholds.
   - **Calculating Increases**: DataFrame segments identified as increases above certain thresholds.
   - **Seasonal Patterns**: Monthly seasonal patterns identified using Holt-Winters exponential smoothing.
   - **Correlation Analysis**: Correlation matrix between numeric columns.
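
As a rough guide to what several of these outputs contain, the sketch below reproduces them for the example data with plain pandas calls. It does not call the package and makes no claim about how `analyze()` computes or formats its results. The dip and increase thresholds of -2% and +2% are assumed values chosen for illustration.

```python
import pandas as pd

# Same example data as in step 2 above.
df = pd.DataFrame({
    "month": ["January", "February", "March", "April", "May", "June",
              "July", "August", "September", "October", "November", "December"],
    "stock_price": [50.0, 51.5, 49.8, 52.0, 53.2, 54.0,
                    55.0, 56.0, 57.5, 59.0, 60.0, 61.0],
})
prices = df.set_index("month")["stock_price"]

# Summary statistics: count, mean, std, min, quartiles, max.
summary = prices.describe()

# Month-to-month percentage change; January has no predecessor, so it is NaN.
pct_change = prices.pct_change() * 100

# 3-month moving average; the first two months lack a full window, so they are NaN.
moving_avg = prices.rolling(window=3).mean()

# Dips and increases relative to assumed thresholds of -2% and +2%.
dips = pct_change[pct_change < -2.0]
increases = pct_change[pct_change > 2.0]

print(summary)
print(pct_change.round(2))
print(moving_avg.round(2))
print("Dips:", list(dips.index))
print("Increases:", list(increases.index))
```

With these assumed thresholds, March is the only dip, and February, April, May, September, and October count as increases.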

## Contributing 🤝

We welcome contributions to the `DataFrameAnalyzer` project. Please fork the repository and submit a pull request with your changes. Ensure your code adheres to the existing style and includes appropriate tests.

## License 📜

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.

## Acknowledgments 🙏

This project utilizes several open-source libraries, including Pandas, Matplotlib, Scipy, Scikit-learn, and Statsmodels. We thank the developers and maintainers of these libraries for their invaluable contributions to the open-source community.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/DurgeshRathod/dataframe-statistical-analyzer",
    "name": "dataframe-statistical-analyzer",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.11",
    "maintainer_email": null,
    "keywords": "time-series, data analysis, machine learning, statistics",
    "author": "Durgesh Rathod",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/1d/3c/93ca5a3e468fb4abbb9835971992ce8e51953113b73720692fec47b75f11/dataframe_statistical_analyzer-1.0.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "The `DataFrame Statistical Analyzer` package provides a utility tool for statistical analyzing in a Pandas DataFrame.",
    "version": "1.0.2",
    "project_urls": {
        "Homepage": "https://github.com/DurgeshRathod/dataframe-statistical-analyzer",
        "Repository": "https://github.com/DurgeshRathod/dataframe-statistical-analyzer"
    },
    "split_keywords": [
        "time-series",
        " data analysis",
        " machine learning",
        " statistics"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c1f73749dbb1281ca534b7268c95a5938f121928392de799ca044bb35430f547",
                "md5": "c3bd494eeb5977609742dd7fbc0db52b",
                "sha256": "2973b98a0490fee10c5f5428e689f494b542a47dead8254ac1004db599ee8983"
            },
            "downloads": -1,
            "filename": "dataframe_statistical_analyzer-1.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "c3bd494eeb5977609742dd7fbc0db52b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.11",
            "size": 4492,
            "upload_time": "2024-07-05T15:17:59",
            "upload_time_iso_8601": "2024-07-05T15:17:59.695261Z",
            "url": "https://files.pythonhosted.org/packages/c1/f7/3749dbb1281ca534b7268c95a5938f121928392de799ca044bb35430f547/dataframe_statistical_analyzer-1.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1d3c93ca5a3e468fb4abbb9835971992ce8e51953113b73720692fec47b75f11",
                "md5": "c0cbe4c5c490b6c78f2e966c62d7f8fd",
                "sha256": "b09dedd7bdc7fd44e46bbe53fd4d07cc2a35a5317092dc67aebe85087590729a"
            },
            "downloads": -1,
            "filename": "dataframe_statistical_analyzer-1.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "c0cbe4c5c490b6c78f2e966c62d7f8fd",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.11",
            "size": 3860,
            "upload_time": "2024-07-05T15:18:00",
            "upload_time_iso_8601": "2024-07-05T15:18:00.934469Z",
            "url": "https://files.pythonhosted.org/packages/1d/3c/93ca5a3e468fb4abbb9835971992ce8e51953113b73720692fec47b75f11/dataframe_statistical_analyzer-1.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-05 15:18:00",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "DurgeshRathod",
    "github_project": "dataframe-statistical-analyzer",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "dataframe-statistical-analyzer"
}
        