:Name: analytics-huytln
:Version: 0.3.2
:Home page: https://github.com/trinhlenhathuy/analytics_huytln
:Summary: A simple library to plot insightful charts
:Upload time: 2024-08-24 08:12:21
:Author: Huy Trịnh Lê Nhật
:Requires Python: >=3.6
:License: MIT
:Keywords: data visualization, charts, pareto, heatmap

Analytics by Huytln - Unlock powerful, customizable, and insightful visualizations with just one line of code.
==============================================================================================================

.. image:: https://img.shields.io/pypi/pyversions/analytics-huytln
   :alt: Python Versions

.. image:: https://img.shields.io/pypi/l/analytics-huytln
   :alt: License

.. image:: https://img.shields.io/pypi/v/analytics-huytln
   :alt: PyPI Version

`analytics-huytln` is a Python library designed to assist in data analysis through various charts and data visualizations. 
The goal of this library is to make it easier for analysts and data scientists to explore, analyze, and present their data.

Contents Overview
-----------------
What's new in the version
-------------------------

.. image:: https://img.shields.io/pypi/v/analytics-huytln
   :alt: PyPI Version

- Added "Trend Analysis Chart & Forecast", based on linear regression:

  + Normal forecast
  + Seasonality

.. image:: https://img.shields.io/badge/version-0.2.7-blue
   :alt: Version 0.2.7
   :target: https://pypi.org/project/analytics-huytln/

-  Plotting functions can now be applied to a DataFrame that has more columns than the function requires.
-  Chart data can now be exported as raw data.
-  The heatmap-by-timing function has been generalized into `plot_heatmap_by_2_dimensions`, which accepts any two dimensions as input.

.. contents::
   :depth: 3
   :local:

Features
--------

- **Easy Data Analysis**: Each chart is produced by a single function call on a pandas DataFrame.
- **Chart Creation**: Supports several common charts for data visualization:

  - Pareto chart
  - Heatmap by 2 dimensions
  - Trend Analysis & Forecast

- **High Compatibility**: Functions take a pandas DataFrame, so the data can come from any format pandas can read, such as CSV, Excel, or JSON.
- **User-Friendly**: Functions only need a DataFrame and column names, which makes them easy to integrate into existing projects.
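
As a minimal sketch of the compatibility point above, the snippet below loads the same kind of table from CSV and JSON text with pandas. The sample values are made up for illustration; the ``SKU`` and ``Sales`` column names follow the examples used later in this README.

.. code-block:: python

    import io

    import pandas as pd

    # Hypothetical sample data, not shipped with the library.
    csv_text = "SKU,Sales\nSKU 1,120\nSKU 2,80\nSKU 3,40\n"
    json_text = '[{"SKU": "SKU 1", "Sales": 120}, {"SKU": "SKU 2", "Sales": 80}]'

    # Either source yields a DataFrame with SKU and Sales columns.
    df_from_csv = pd.read_csv(io.StringIO(csv_text))
    df_from_json = pd.read_json(io.StringIO(json_text))

    print(df_from_csv)
    print(df_from_json)

Either DataFrame can then be passed as the ``df`` argument of the plotting functions described below.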

Installation
------------

Install `analytics-huytln` via pip:

.. code-block:: bash

    pip install --upgrade analytics-huytln

Modules Usage
-------------

- `Plot Pareto Chart`_

  - `Parameters pareto`_
  - `Usage pareto`_
  - `Output pareto`_
  - `Analyse pareto`_

- `Heatmap by 2 Dimensions`_

  - `Parameters heatmap`_
  - `Usage heatmap`_
  - `Output heatmap`_
  - `Analyse heatmap`_

- `Trend Analysis and Forecast`_

  - `Parameters Trend analysis`_
  - `Usage Trend analysis`_
  - `Output Trend analysis`_
  - `Analyse Trend analysis`_

- `Seasonality Trend Analysis and Forecast`_

  - `Parameters seasonality trend analysis`_
  - `Usage seasonality trend analysis`_
  - `Output seasonality trend analysis`_
  - `Analyse seasonality trend analysis`_

- `incoming insightful charts`_

---------------------------------------------------------------------------------------------------------------------------

Plot Pareto Chart
=================

.. _plot_pareto_chart:

The `plot_pareto_chart` function creates a Pareto chart from a pandas DataFrame, for example one loaded from an Excel file.

.. _Parameters_pareto:

Parameters pareto
-----------------

- **df** (*pandas.DataFrame*): DataFrame containing the data with dim_name (category) and metric (value) columns.
- **dim_name** (*str*): Name of the column representing the category (e.g., SKU).
- **metric** (*str*): Name of the column containing the values to analyze (e.g., Sales).
- **save_to_excel** (*int, 0 or 1*): (optional) Set to 1 to export the chart data; defaults to 0.

.. _Usage_pareto:

Usage pareto
------------

Here's how to use the `plot_pareto_chart` function:

.. code-block:: python

    import pandas as pd
    from pareto_chart_lib.pareto import plot_pareto_chart

    # Read data from Excel file
    df = pd.read_excel('data_pareto.xlsx')

    # Create a Pareto chart
    plot_pareto_chart(df, 'SKU', 'Sales')

.. _Output_pareto:

Output pareto
-------------

.. image:: https://github.com/user-attachments/assets/f2147e62-dc28-486c-8176-b5d763811c47
   :width: 830px
   :alt: Pareto Chart Output

.. _Analyse_pareto:

Analyse pareto
--------------

**Chart Components**:

- **Bar Chart (Blue Bars)**: Shows the sales of each SKU. The SKUs are sorted in descending order of sales, with the best-selling SKU on the left.
- **Cumulative Percentage Curve (Orange Line)**: Represents the cumulative percentage of total sales as you move from left to right across the SKUs. The percentage curve helps identify the SKUs that contribute to a significant portion of the total sales.
- **Horizontal Lines**: Dotted lines at 80% and 95% cumulative sales percentage mark important thresholds.
- **Annotations**: The chart marks specific SKUs (SKU 10 and SKU 32) that correspond to the 80% and 95% cumulative sales levels.

**Table**:

- **Level**: Indicates the cumulative percentage levels (80% and 95%).
- **Total Sales**: The total number of sales up to the specified cumulative percentage.
- **Total SKUs to X%**: The number of SKUs contributing to the specified cumulative percentage.
- **Percent of SKU**: The percentage of SKUs contributing to the specified cumulative percentage of sales.
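
The table columns above can be reproduced with plain pandas. The following is only a sketch of the calculation, not the library's own code: ``pareto_summary`` is a hypothetical helper, the output column names are chosen for illustration, and the sample data is invented.

.. code-block:: python

    import pandas as pd


    def pareto_summary(df, dim_name, metric, levels=(0.80, 0.95)):
        """Hypothetical helper (not part of analytics-huytln)."""
        # Total per category, largest first, as in a Pareto chart.
        totals = df.groupby(dim_name)[metric].sum().sort_values(ascending=False)
        cum_share = totals.cumsum() / totals.sum()

        rows = []
        for level in levels:
            # Number of categories needed for the cumulative share to reach the level.
            n_items = min(int((cum_share < level).sum()) + 1, len(totals))
            rows.append({
                "Level": f"{level:.0%}",
                "Total Sales": totals.iloc[:n_items].sum(),
                "Total SKUs": n_items,
                "Percent of SKU": round(100 * n_items / len(totals), 2),
            })
        return pd.DataFrame(rows)


    # Invented sample data for illustration.
    sample = pd.DataFrame({
        "SKU": ["A", "B", "C", "D", "E"],
        "Sales": [60, 25, 8, 4, 3],
    })
    summary = pareto_summary(sample, "SKU", "Sales")
    print(summary)

Each row answers the question "how many of the top SKUs are needed to reach this share of total sales?", which is what the annotations on the chart mark.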

**Analysis**:

- **80% of Sales**:

  - SKU 10 is the last SKU contributing to 80% of total sales.
  - Only 7 SKUs (5.00% of the total SKUs) are responsible for generating 80% of the sales. This indicates that a small number of SKUs are driving the majority of the sales, which is consistent with the Pareto principle (80/20 rule).

- **95% of Sales**:

  - SKU 32 is the last SKU contributing to 95% of total sales.
  - 30 SKUs (21.43% of the total SKUs) contribute to 95% of the sales.

**Conclusion**:

This Pareto chart visually emphasizes that a small fraction of SKUs contributes to a large fraction of total sales. This insight can help prioritize inventory management, marketing efforts, and sales strategies focusing on the top-performing SKUs.

Heatmap by 2 Dimensions
=======================

.. _heatmap_by_2_dimensions:

The `plot_heatmap_by_2_dimensions` function plots a metric (e.g., sales) against two dimensions (e.g., time and SKU) and highlights the points with the highest values. Any two columns can be used as dimensions, so the same function can show, for example, sales by time and SKU or sales by discount level and SKU.

.. _Parameters_heatmap:

Parameters heatmap
------------------

- **df** (*pandas.DataFrame*): DataFrame containing the two dimension columns (dim_name_x, dim_name_y) and the metric (value) column.
- **dim_name_x** (*str*): Name of the column plotted on the horizontal axis (e.g., timing or discount percentage).
- **dim_name_y** (*str*): Name of the column plotted on the vertical axis (e.g., SKU).
- **metric** (*str*): Name of the column containing the values to analyze (e.g., Sales).
- **highlight** (*int*): Number of top points to highlight.
- **save_to_excel** (*int, 0 or 1*): (optional) Set to 1 to export the chart data; defaults to 0.

.. _Usage_heatmap:

Usage heatmap
-------------

Here's how to use the `plot_heatmap_by_2_dimensions` function:

.. code-block:: python

    import pandas as pd
    from heatmap_by_2_dimensions.heatmap_by_2_dimensions import plot_heatmap_by_2_dimensions

    # Read data from Excel file
    df = pd.read_excel('data_order_by_time.xlsx')

    # Create a heatmap by timing and SKU with the top 10 highest sales points highlighted
    plot_heatmap_by_2_dimensions(df, 'timing', 'SKU', 'Sales', 10)
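
To make the idea of "top points" concrete, the sketch below aggregates a metric over two dimensions with pandas and picks the largest values. It assumes the metric is summed per pair of dimension values, which this README does not state explicitly; the sample data is invented.

.. code-block:: python

    import pandas as pd

    # Invented order lines: several rows can share the same timing and SKU.
    orders = pd.DataFrame({
        "timing": ["Mon 09", "Mon 09", "Mon 10", "Tue 09", "Tue 09", "Tue 10"],
        "SKU": ["A", "A", "B", "A", "B", "B"],
        "Sales": [5, 3, 2, 7, 3, 4],
    })

    # One point per (timing, SKU) pair. Summation is an assumption.
    points = orders.groupby(["timing", "SKU"], as_index=False)["Sales"].sum()

    # The two highest points, comparable to highlight=2.
    top_points = points.nlargest(2, "Sales")

    # The time slot with the highest total sales across all SKUs.
    busiest_slot = points.groupby("timing")["Sales"].sum().nlargest(1)

    print(top_points)
    print(busiest_slot)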

.. _Output_heatmap:

Output heatmap
--------------

.. image:: https://github.com/user-attachments/assets/208cf8bd-70ff-4734-9a56-d3d96679d1f2
   :width: 704px
   :alt: Heatmap Output

.. _Analyse_heatmap:

Analyse heatmap
---------------

**Chart Components**:

- **X-axis (Timing)**: The timing is represented as a concatenation of the day of the week and hour.
- **Y-axis (SKU)**: Represents different SKUs, with each row dedicated to a specific SKU. 
- **Scatter Plot (Dots)**:

  - **Data Points**: Each dot represents a sale of a specific SKU at a particular time.
  - **Color and Size**: The dots vary in size and color, representing the quantity of items sold. Larger dots indicate higher quantities or larger sales amounts.

- **Vertical Lines (Red)**: These lines represent the times with the highest total sales across all SKUs.
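
A timing label that combines the day of the week and the hour can be derived from a timestamp column before plotting. The sketch below shows one way to build it with pandas; the ``order_time`` column name and the ``"Mon 09"`` label format are assumptions made for illustration.

.. code-block:: python

    import pandas as pd

    # Invented orders with a full timestamp per row.
    orders = pd.DataFrame({
        "order_time": pd.to_datetime([
            "2024-08-19 09:15",
            "2024-08-19 14:40",
            "2024-08-24 09:05",
        ]),
        "SKU": ["A", "B", "A"],
        "Sales": [5, 2, 7],
    })

    # Abbreviated English day name plus two-digit hour, e.g. "Mon 09".
    day = orders["order_time"].dt.day_name().str[:3]
    hour = orders["order_time"].dt.strftime("%H")
    orders["timing"] = day + " " + hour

    print(orders[["timing", "SKU", "Sales"]])

Labels built this way sort alphabetically rather than chronologically, so an explicit ordering may be needed if the axis should follow the calendar.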

**Table**:

- **Time Periods**: The chart could be segmented by specific time periods (days or hours) to analyze how sales performance fluctuates during these periods.
- **Top SKUs**: The distribution of sales across different SKUs can help identify top-performing SKUs at various times, similar to how a Pareto chart highlights top contributors.

**Analysis**:

- **Sales Concentration**:

  - There are visible clusters of sales activity at certain times, indicating peak periods where specific SKUs are more popular.
  - The distribution suggests that certain SKUs have consistent sales across different times, while others may peak during specific hours or days.

- **Timing Patterns**:

  - The timing axis shows a dense clustering of sales at specific periods, which might correlate with customer behavior, promotional activities, or operational factors.
  - The overlap of timing labels suggests that further aggregation or a different representation (e.g., hourly or daily aggregates) could provide clearer insights.

- **Impact of Vertical Lines**:

  - The red vertical lines mark the time slots with the highest total sales, so they can serve as reference points for comparing sales before and after these peaks.
  - These peaks might reflect the impact of certain events, such as promotions, holidays, or restocking, on sales patterns.

**Conclusion**:

This scatter plot provides a comprehensive view of sales distribution across different SKUs and times. The clustering of dots and the variations in size and color reveal key insights into sales performance, indicating peak periods and top-performing SKUs. The vertical lines and timing axis add another layer of insight into sales trends and periods of interest.

Trend Analysis and Forecast
===========================

.. _trend_analysis_and_forecast:

The `plot_trend_analysis_normal` function performs trend analysis and forecasts future values using linear regression on time series data.

.. _Parameters_trend_analysis:

Parameters Trend analysis
-------------------------

- **df** (*pandas.DataFrame*): DataFrame containing the time series data with columns for time dimension and metric values.
- **time_dimension** (*str*): Name of the column that holds the dates (e.g., Date).
- **metric** (*str*): Name of the column containing the values to analyze (e.g., Sales).
- **forecast_periods** (*int*): (optional) Number of periods to forecast into the future (default is 12).
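
To illustrate what forecasting a number of periods with linear regression means, the sketch below fits a straight line to a short monthly series with NumPy and extends it three periods ahead. It is a stand-alone illustration under assumed monthly data, not the library's implementation.

.. code-block:: python

    import numpy as np
    import pandas as pd

    # Invented monthly history.
    history = pd.DataFrame({
        "Date": pd.date_range("2023-01-01", periods=6, freq="MS"),
        "Sales": [100.0, 110.0, 120.0, 130.0, 140.0, 150.0],
    })
    forecast_periods = 3

    # Fit Sales = slope * x + intercept, where x is the period index.
    x = np.arange(len(history))
    slope, intercept = np.polyfit(x, history["Sales"].to_numpy(), deg=1)

    # Extend the period index and the dates past the last observation.
    future_x = np.arange(len(history), len(history) + forecast_periods)
    future_dates = pd.date_range(
        history["Date"].iloc[-1], periods=forecast_periods + 1, freq="MS"
    )[1:]

    forecast = pd.DataFrame({
        "Date": future_dates,
        "Sales": slope * future_x + intercept,
    })
    print(forecast)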

.. _Usage_trend_analysis:

Usage Trend analysis
--------------------

Here's how to use the `plot_trend_analysis_normal` function:

.. code-block:: python

    import pandas as pd
    from linear_forecast_lib.linear_forecast import plot_trend_analysis_normal

    # Read data from Excel file
    df = pd.read_excel('data_trend.xlsx')

    # Create a trend analysis chart
    plot_trend_analysis_normal(df, 'Date', 'Sales', 12)

.. _Output_trend_analysis:

Output Trend analysis
---------------------
.. image:: https://github.com/user-attachments/assets/b966432c-2b24-4850-933e-8ba2ee5f9e35
   :width: 941
   :alt: Trend Analyse Normal

- Trend Line: Displays the observed values and the forecasted values along with the linear regression line.

- Analysis Table: Provides key metrics and comments on the regression analysis.

.. _Analyse_trend_analysis:

Analyse Trend analysis
----------------------
**Chart Components**:

- **Trend Line (Observed and Forecast)**: Shows the actual values and the forecasted values, with the forecasted values indicated by a dashed line.
- **Regression Equation**: Displays the linear regression equation on the chart.
- **Analysis Table**: Includes metrics such as Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), p-value, Slope, R-squared, Mean Value, and Trend.

**Analysis Details**:

- **Mean Absolute Error (MAE)**: Indicates the average deviation of predictions from actual values.
- **Mean Absolute Percentage Error (MAPE)**: Shows the average percentage error of predictions.
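- **p-value**: Tests the significance of the regression model. A small value (commonly below 0.05) indicates that the observed trend is unlikely to be due to chance alone.
- **Slope**: Represents the rate of change in the metric per time period.
- **R-squared**: Measures the goodness of fit for the regression model, as the share of the variance in the metric explained by the trend line.
- **Mean Value**: Average value of the metric over the time period.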
- **Trend**: Indicates whether the trend is positive or negative.
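
The metrics listed above follow from standard formulas. The sketch below computes most of them with NumPy for a small invented series; it is meant as a reference for the definitions, not as the library's code. The p-value is left out because it needs a statistics library (for example, ``scipy.stats.linregress`` reports one).

.. code-block:: python

    import numpy as np

    # Invented observations for five consecutive periods.
    actual = np.array([100.0, 120.0, 115.0, 140.0, 150.0])
    x = np.arange(len(actual))

    # Least-squares line through the observations.
    slope, intercept = np.polyfit(x, actual, deg=1)
    predicted = slope * x + intercept

    # Mean Absolute Error and Mean Absolute Percentage Error.
    # MAPE is undefined if any actual value is zero.
    mae = np.mean(np.abs(actual - predicted))
    mape = np.mean(np.abs((actual - predicted) / actual)) * 100

    # R-squared: 1 minus residual variation over total variation.
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    r_squared = 1 - ss_res / ss_tot

    mean_value = actual.mean()
    trend = "positive" if slope > 0 else "negative"

    print(f"slope={slope:.2f} MAE={mae:.2f} MAPE={mape:.2f}% R2={r_squared:.2f}")
    print(f"mean={mean_value:.2f} trend={trend}")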

**Conclusions**:

- This trend analysis chart helps visualize the overall trend in the data and provides a forecast for future values. The analysis table gives a detailed breakdown of the regression metrics and their significance, which can be used to understand the performance and reliability of the forecast.

Seasonality Trend Analysis and Forecast
=======================================

.. _seasonality_trend_analysis_and_forecast:

The `plot_trend_analysis_seasonality` function performs trend analysis and forecasts future values using linear regression on time series data, while accounting for seasonal patterns at a daily, weekly, monthly, or quarterly frequency.

.. _Parameters_seasonality_trend_analysis:

Parameters Seasonality Trend Analysis
-------------------------------------

- **df** (*pandas.DataFrame*): DataFrame containing the time series data with columns for time dimension and metric values.
- **time_dimension** (*str*): Name of the column that holds the dates (e.g., Date).
- **metric** (*str*): Name of the column containing the values to analyze (e.g., Sales).
- **forecast_periods** (*int*): (optional) Number of periods to forecast into the future (default is 12).
- **seasonality** (*str*): (optional) Type of seasonality to consider (default is 'M'). Possible values:

  + 'D' for daily
  + 'W' for weekly
  + 'M' for monthly
  + 'Q' for quarterly
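
This README does not describe how the library models seasonality. One common way to combine a linear trend with monthly seasonality is to add one indicator column per month to the regression, as sketched below with NumPy on synthetic data. The ``design_matrix`` helper and the data are hypothetical and only illustrate the general technique.

.. code-block:: python

    import numpy as np
    import pandas as pd


    def design_matrix(t, months):
        """Hypothetical helper: intercept, time index, and month indicators.

        January is the baseline, so indicators cover months 2 to 12.
        """
        indicators = (np.asarray(months)[:, None] == np.arange(2, 13)).astype(float)
        return np.column_stack([np.ones(len(t)), t, indicators])


    # Synthetic series: linear trend plus a fixed uplift every December.
    dates = pd.date_range("2022-01-01", periods=24, freq="MS")
    t = np.arange(len(dates))
    sales = 100.0 + 2.0 * t + np.where(dates.month == 12, 30.0, 0.0)

    # Ordinary least squares on trend and seasonal indicators.
    coef, *_ = np.linalg.lstsq(design_matrix(t, dates.month), sales, rcond=None)

    # Forecast the next 12 months with the fitted coefficients.
    future_dates = pd.date_range(dates[-1], periods=13, freq="MS")[1:]
    future_t = np.arange(len(t), len(t) + 12)
    forecast = design_matrix(future_t, future_dates.month) @ coef

    print(pd.Series(forecast, index=future_dates).round(1))

The fitted slope captures the overall trend, while each indicator coefficient captures how much a given month deviates from the January baseline.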

.. _Usage_seasonality_trend_analysis:

Usage Seasonality Trend Analysis
--------------------------------

Here's how to use the `plot_trend_analysis_seasonality` function:

.. code-block:: python

    import pandas as pd
    from linear_forecast_lib.linear_forecast import plot_trend_analysis_seasonality

    # Read data from Excel file
    df = pd.read_excel('data_trend.xlsx')

    # Create a seasonality trend analysis chart with monthly seasonality
    plot_trend_analysis_seasonality(df, 'Date', 'Sales', 12, 'M')

.. _Output_seasonality_trend_analysis:

Output Seasonality Trend Analysis
---------------------------------
.. image:: https://github.com/user-attachments/assets/898d293c-ebfb-4723-b788-30a87d8c7272
   :width: 943
   :alt: Trend Analysis Seasonality

- Trend Line: Displays the observed values and the forecasted values along with the linear regression line.

- Analysis Table: Provides key metrics and comments on the regression analysis.

.. _Analyse_seasonality_trend_analysis:

Analyse Seasonality Trend Analysis
----------------------------------
**Chart Components**:

- **Trend Line (Observed and Forecast)**: Shows the actual values and the forecasted values, with the forecasted values indicated by a dashed line.
- **Seasonality Feature**: Displays the influence of seasonal patterns on the trend.
- **Regression Equation**: Displays the linear regression equation on the chart.
- **Analysis Table**: Includes metrics such as Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), p-value, Slope, R-squared, Mean Value, and Trend.

**Analysis Details**:

- **Mean Absolute Error (MAE)**: Indicates the average deviation of predictions from actual values.
- **Mean Absolute Percentage Error (MAPE)**: Shows the average percentage error of predictions.
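- **p-value**: Tests the significance of the regression model. A small value (commonly below 0.05) indicates that the observed trend is unlikely to be due to chance alone.
- **Slope**: Represents the rate of change in the metric per time period.
- **R-squared**: Measures the goodness of fit for the regression model, as the share of the variance in the metric explained by the model.
- **Mean Value**: Average value of the metric over the time period.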
- **Trend**: Indicates whether the trend is positive or negative.

**Conclusions**:

- This seasonality trend analysis chart helps visualize the overall trend and seasonal patterns in the data. It provides a forecast for future values, considering both the trend and seasonality. The analysis table gives a detailed breakdown of the regression metrics and their significance, which can be used to understand the performance and reliability of the forecast.


incoming insightful charts
==========================

.. _incoming_insightful_charts:

- Correlation Heatmap
- Customer Segmentation
- Revenue Growth Tracker
- Sales Funnel Analysis
- Operational Efficiency Heatmap
- Candlestick Chart
- Sankey Multiple Levels

Build and Publish
=================

.. code-block:: bash

    git clone https://github.com/trinhlenhathuy/analytics_huytln.git

    cd analytics_huytln

    python setup.py sdist bdist_wheel

    twine upload --config-file .pypirc dist/*


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/trinhlenhathuy/analytics_huytln",
    "name": "analytics-huytln",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "data visualization, charts, pareto, heatmap",
    "author": "Huy Tr\u1ecbnh L\u00ea Nh\u1eadt",
    "author_email": "trinhlenhathuy@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/57/9f/f414408d2ff50abc61975af9a278b7c736d5847bf4a20e23887b72779fdf/analytics_huytln-0.3.2.tar.gz",
    "platform": null,
    "description": "Analytics by Huytln - Unlock powerful, customizable, and insightful visualizations with just one line of code.\r\n==============================================================================================================\r\n\r\n.. image:: https://img.shields.io/pypi/pyversions/analytics-huytln\r\n   :alt: Python Versions\r\n\r\n.. image:: https://img.shields.io/pypi/l/analytics-huytln\r\n   :alt: License\r\n\r\n.. image:: https://img.shields.io/pypi/v/analytics-huytln\r\n   :alt: PyPI Version\r\n\r\n`analytics-huytln` is a Python library designed to assist in data analysis through various charts and data visualizations. \r\nThe goal of this library is to make it easier for analysts and data scientists to explore, analyze, and present their data.\r\n\r\nContents Overview\r\n-----------------\r\nWhat's new in the version\r\n-------------------------\r\n\r\n.. image:: https://img.shields.io/pypi/v/analytics-huytln\r\n   :alt: PyPI Version\r\n\r\n- Deploy \"Trend Analysis Chart & Forecast\" by linear regression\r\n   + Normal forecast\r\n   + Seasonality\r\n\r\n.. image:: https://img.shields.io/badge/version-0.2.7-blue\r\n   :alt: Version 0.2.7\r\n   :target: https://pypi.org/project/analytics-huytln/\r\n\r\n-  Allow the plotting function to be applied to a DataFrame with more columns than the required number of columns.\r\n-  Allow to export raw data from chart\r\n-  Adjust heatmap by timing function -> plot_heatmap_by_2_dimensions function to allow flexible input of two dimensions.\r\n\r\n.. 
contents::\r\n   :depth: 3\r\n   :local:\r\n\r\nFeatures\r\n--------\r\n\r\n- **Easy Data Analysis**: Provides powerful tools for data analysis with convenient functions and methods.\r\n- **Chart Creation**: Supports various common charts for data visualization.\r\n  \r\n  - Pareto chart\r\n  - Heatmap by 2 dimensions\r\n  - Trend Analysis & Forecast\r\n\r\n- **High Compatibility**: Works well with popular data formats such as CSV, Excel, JSON.\r\n- **User-Friendly**: Offers user-friendly functions that are easy to use and integrate into existing projects.\r\n\r\nInstallation\r\n------------\r\n\r\nInstall `analytics-huytln` via pip:\r\n\r\n.. code-block:: bash\r\n\r\n    pip install --upgrade analytics-huytln\r\n\r\nModules Usage\r\n-------------\r\n\r\n- `Plot Pareto Chart`_\r\n\r\n  - `Parameters pareto`_\r\n  - `Usage pareto`_\r\n  - `Output pareto`_\r\n  - `Analyse pareto`_\r\n\r\n- `Heatmap by 2 Dimensions`_\r\n\r\n  - `Parameters heatmap`_\r\n  - `Usage heatmap`_\r\n  - `Output heatmap`_\r\n  - `Analyse heatmap`_\r\n\r\n- `Trend Analysis and Forecast`_\r\n\r\n  - `Parameters Trend analysis`_\r\n  - `Usage Trend analysis`_\r\n  - `Output Trend analysis`_\r\n  - `Analyse Trend analysis`_\r\n\r\n- `Seasonality Trend Analysis and Forecast`_\r\n\r\n  - `Parameters seasonality trend analysis`_\r\n  - `Usage seasonality trend analysis`_\r\n  - `Output seasonality trend analysis`_\r\n  - `Analyse seasonality trend analysis`_\r\n\r\n- `incoming insightful charts`_\r\n\r\n---------------------------------------------------------------------------------------------------------------------------\r\n\r\nPlot Pareto Chart\r\n=================\r\n\r\n.. _plot_pareto_chart:\r\n\r\nThe `plot_pareto_chart` function creates a Pareto chart from Excel data.\r\n\r\n.. 
_Parameters_pareto:\r\n\r\nParameters pareto\r\n-----------------\r\n\r\n- **df** (*pandas.DataFrame*): DataFrame containing the data with dim_name (category) and metric (value) columns.\r\n- **dim_name** (*str*): Name of the column representing the category (e.g., SKU).\r\n- **metric** (*str*): Name of the column containing the values to analyze (e.g., Sales).\r\n- **save_to_excel** (*binary*): (optional) 1 to export data, default by 0\r\n\r\n.. _Usage_pareto:\r\n\r\nUsage pareto\r\n------------\r\n\r\nHere's how to use the `plot_pareto_chart` function:\r\n\r\n.. code-block:: python\r\n\r\n    import pandas as pd\r\n    from pareto_chart_lib.pareto import plot_pareto_chart\r\n\r\n    # Read data from Excel file\r\n    df = pd.read_excel('data_pareto.xlsx')\r\n\r\n    # Create a Pareto chart\r\n    plot_pareto_chart(df, 'SKU', 'Sales')\r\n\r\n.. _Output_pareto:\r\n\r\nOutput pareto\r\n-------------\r\n\r\n.. image:: https://github.com/user-attachments/assets/f2147e62-dc28-486c-8176-b5d763811c47\r\n   :width: 830px\r\n   :alt: Pareto Chart Output\r\n\r\n.. _Analyse_pareto:\r\n\r\nAnalyse pareto\r\n--------------\r\n\r\n**Chart Components**:\r\n\r\n- **Histogram (Blue Bars)**: Represents the number of sales for each SKU. The SKUs are sorted in descending order of sales, with the most sold SKU on the left.\r\n- **Cumulative Percentage Curve (Orange Line)**: Represents the cumulative percentage of total sales as you move from left to right across the SKUs. 
The percentage curve helps identify the SKUs that contribute to a significant portion of the total sales.\r\n- **Horizontal Lines**: Dotted lines at 80% and 95% cumulative sales percentage mark important thresholds.\r\n- **Annotations**: The chart marks specific SKUs (SKU 10 and SKU 32) that correspond to the 80% and 95% cumulative sales levels.\r\n\r\n**Table**:\r\n\r\n- **Level**: Indicates the cumulative percentage levels (80% and 95%).\r\n- **Total Sales**: The total number of sales up to the specified cumulative percentage.\r\n- **Total SKUs to X%**: The number of SKUs contributing to the specified cumulative percentage.\r\n- **Percent of SKU**: The percentage of SKUs contributing to the specified cumulative percentage of sales.\r\n\r\n**Analysis**:\r\n\r\n- **80% of Sales**:\r\n    - SKU 10 is the last SKU contributing to 80% of total sales.\r\n    - Only 7 SKUs (5.00% of the total SKUs) are responsible for generating 80% of the sales. This indicates that a small number of SKUs are driving the majority of the sales, which is consistent with the Pareto principle (80/20 rule).\r\n\r\n- **95% of Sales**:\r\n    - SKU 32 is the last SKU contributing to 95% of total sales.\r\n    - 30 SKUs (21.43% of the total SKUs) contribute to 95% of the sales.\r\n\r\n**Conclusion**:\r\n\r\nThis Pareto chart visually emphasizes that a small fraction of SKUs contributes to a large fraction of total sales. This insight can help prioritize inventory management, marketing efforts, and sales strategies focusing on the top-performing SKUs.\r\n\r\nHeatmap by 2 Dimensions\r\n=======================\r\n\r\n.. _heatmap_by_2_dimensions:\r\n\r\nThe `plot_heatmap_by_2_dimensions` function creates a visual representation of sales data, illustrating the relationship between two dimensions (e.g., time and SKU) and highlighting significant sales periods. This function allows for flexible input of two dimensions to explore and emphasize their correlation effectively.\r\n\r\n.. 
_Parameters_heatmap:\r\n\r\nParameters heatmap\r\n------------------\r\n\r\n- **df** (*pandas.DataFrame*): DataFrame containing the data with dim_name (category) and metric (value) columns.\r\n- **dim_name_x** (*str*): Name of the horizontal column representing the category 1 (e.g., Timing, percent of discount).\r\n- **dim_name_y** (*str*): Name of the vertical column representing the category 2 (e.g., SKU).\r\n- **metric** (*str*): Name of the column containing the values to analyze (e.g., Sales).\r\n- **highlight** (*int*): The number of top points to be highlighted.\r\n- **save_to_excel** (*binary*): (optional) 1 to export data, default by 0\r\n\r\n.. _Usage_heatmap:\r\n\r\nUsage heatmap\r\n-------------\r\n\r\nHere's how to use the `plot_heatmap_by_2_dimensions` function:\r\n\r\n.. code-block:: python\r\n\r\n    import pandas as pd\r\n    from heatmap_by_2_dimensions.heatmap_by_2_dimensions import plot_heatmap_by_2_dimensions\r\n\r\n    # Read data from Excel file\r\n    df = pd.read_excel('data_order_by_time.xlsx')\r\n\r\n    # Create a heatmap by timing and SKU with the top 10 highest sales points highlighted\r\n    plot_heatmap_by_2_dimensions(df, 'timing', 'SKU', 'Sales', 10)\r\n\r\n.. _Output_heatmap:\r\n\r\nOutput heatmap\r\n--------------\r\n\r\n.. image:: https://github.com/user-attachments/assets/208cf8bd-70ff-4734-9a56-d3d96679d1f2\r\n   :width: 704px\r\n   :alt: Heatmap Output\r\n\r\n.. _Analyse_heatmap:\r\n\r\nAnalyse heatmap\r\n---------------\r\n\r\n**Chart Components**:\r\n\r\n- **X-axis (Timing)**: The timing is represented as a concatenation of the day of the week and hour.\r\n- **Y-axis (SKU)**: Represents different SKUs, with each row dedicated to a specific SKU. \r\n- **Scatter Plot (Dots)**:\r\n    - **Data Points**: Each dot represents a sale of a specific SKU at a particular time.\r\n    - **Color and Size**: The dots vary in size and color, representing the quantity of items sold. 
Larger dots indicate higher quantities or larger sales amounts.\r\n    - **Vertical Lines (Red)**: These lines represent the times with the highest total sales across all SKUs.\r\n\r\n**Table**:\r\n\r\n- **Time Periods**: The chart could be segmented by specific time periods (days or hours) to analyze how sales performance fluctuates during these periods.\r\n- **Top SKUs**: The distribution of sales across different SKUs can help identify top-performing SKUs at various times, similar to how a Pareto chart highlights top contributors.\r\n\r\n**Analysis**:\r\n\r\n- **Sales Concentration**:\r\n    - There are visible clusters of sales activity at certain times, indicating peak periods where specific SKUs are more popular.\r\n    - The distribution suggests that certain SKUs have consistent sales across different times, while others may peak during specific hours or days.\r\n\r\n- **Timing Patterns**:\r\n    - The timing axis shows a dense clustering of sales at specific periods, which might correlate with customer behavior, promotional activities, or operational factors.\r\n    - The overlap of timing labels suggests that further aggregation or a different representation (e.g., hourly or daily aggregates) could provide clearer insights.\r\n\r\n- **Impact of Vertical Lines**:\r\n    - The red vertical lines likely mark significant time thresholds, which could be used to analyze how sales change before and after these periods.\r\n    - These lines might highlight the impact of certain events, such as promotions, holidays, or restocking, on sales patterns.\r\n\r\n**Conclusion**:\r\n\r\nThis scatter plot provides a comprehensive view of sales distribution across different SKUs and times. The clustering of dots and the variations in size and color reveal key insights into sales performance, indicating peak periods and top-performing SKUs. 
The vertical lines and timing axis add another layer of insight into sales trends and periods of interest.\r\n\r\nTrend Analysis and Forecast\r\n===========================\r\n\r\n.. _trend_analysis_and_forecast:\r\n\r\nThe `plot_trend_analysis_normal` function performs trend analysis and forecasts future values using linear regression on time series data.\r\n\r\n.. _Parameters_trend_analysis:\r\n\r\nParameters Trend analysis\r\n-------------------------\r\n\r\n- **df** (*pandas.DataFrame*): DataFrame containing the time series data with columns for time dimension and metric values.\r\n- **time_dimension** (*Date*): Name of the column representing the time dimension (e.g., Date).\r\n- **metric** (*str*): Name of the column containing the values to analyze (e.g., Sales).\r\n- **forecast_periods** (*int*): (optional) Number of periods to forecast into the future (default is 12).\r\n\r\n.. _Usage_trend_analysis:\r\n\r\nUsage Trend analysis\r\n--------------------\r\n\r\nHere's how to use the `plot_trend_analysis_normal` function:\r\n\r\n.. code-block:: python\r\n\r\n   import pandas as pd\r\n   from linear_forecast_lib.linear_forecast import plot_trend_analysis_normal\r\n\r\n    # Read data from Excel file\r\n   df = pd.read_excel('data_trend.xlsx')\r\n   \r\n   # Create a trend analysis chart\r\n   plot_trend_analysis_normal(df, 'Date', 'Sales', 12)\r\n\r\n.. _Output_trend_analysis:\r\n\r\nOutput Trend analysis\r\n---------------------\r\n.. image:: https://github.com/user-attachments/assets/b966432c-2b24-4850-933e-8ba2ee5f9e35\r\n   :width: 941\r\n   :alt: Trend Analyse Normal\r\n\r\n- Trend Line: Displays the observed values and the forecasted values along with the linear regression line.\r\n\r\n- Analysis Table: Provides key metrics and comments on the regression analysis.\r\n\r\n.. 
_Analyse_trend_analysis:\r\n\r\nAnalyse Trend analysis\r\n----------------------\r\n**Chart Components**:\r\n\r\n- **Trend Line (Observed and Forecast)**: Shows the actual values and the forecasted values, with the forecasted values indicated by a dashed line.\r\n- **Regression Equation**: Displays the linear regression equation on the chart.\r\n- **Analysis Table**: Includes metrics such as Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), p-value, Slope, R-squared, Mean Value, and Trend.\r\n\r\n**Analysis Details**:\r\n\r\n- **Mean Absolute Error (MAE)**: Indicates the average deviation of predictions from actual values.\r\n- **Mean Absolute Percentage Error (MAPE)**: Shows the average percentage error of predictions.\r\n- **p-value**: Tests the significance of the regression model.\r\n- **Slope**: Represents the rate of change in the metric over time.\r\n- **R-squared**: Measures the goodness of fit for the regression model.\r\n- **Mean Valued**: Average value of the metric over the time period.\r\n- **Trend**: Indicates whether the trend is positive or negative.\r\n\r\n**Conclusions**:\r\n\r\n- This trend analysis chart helps visualize the overall trend in the data and provides a forecast for future values. The analysis table gives a detailed breakdown of the regression metrics and their significance, which can be used to understand the performance and reliability of the forecast.\r\n\r\nSeasonality Trend Analysis and Forecast\r\n=======================================\r\n\r\n.. _seasonality_trend_analysis_and_forecast:\r\n\r\nThe plot_trend_analysis_seasonality function performs trend analysis and forecasts future values using linear regression on time series data, incorporating seasonal patterns such as daily, weekly, monthly, or quarterly.\r\n\r\n.. 
_Parameters_seasonality_trend_analysis:\r\n\r\nParameters Seasonality Trend Analysis\r\n-------------------------------------\r\n\r\n- **df** (*pandas.DataFrame*): DataFrame containing the time series data with columns for time dimension and metric values.\r\n- **time_dimension** (*str*): Name of the column representing the time dimension (e.g., Date).\r\n- **metric** (*str*): Name of the column containing the values to analyze (e.g., Sales).\r\n- **forecast_periods** (*int*): (optional) Number of periods to forecast into the future (default is 12).\r\n- **seasonality** (*str*): (optional) Specifies the type of seasonality to consider. Possible values (default is 'M'):\r\n\r\n   + 'D' for daily\r\n   + 'W' for weekly\r\n   + 'M' for monthly\r\n   + 'Q' for quarterly\r\n\r\n.. _Usage_seasonality_trend_analysis:\r\n\r\nUsage Seasonality Trend Analysis\r\n--------------------------------\r\n\r\nHere's how to use the `plot_trend_analysis_seasonality` function:\r\n\r\n.. code-block:: python\r\n\r\n   import pandas as pd\r\n   from linear_forecast_lib.linear_forecast import plot_trend_analysis_seasonality\r\n\r\n   # Read data from Excel file\r\n   df = pd.read_excel('data_trend.xlsx')\r\n\r\n   # Create a trend analysis chart with monthly seasonality and a 12-period forecast\r\n   plot_trend_analysis_seasonality(df, 'Date', 'Sales', 12, 'M')\r\n\r\n.. _Output_seasonality_trend_analysis:\r\n\r\nOutput Seasonality Trend Analysis\r\n---------------------------------\r\n\r\n.. image:: https://github.com/user-attachments/assets/898d293c-ebfb-4723-b788-30a87d8c7272\r\n   :width: 943\r\n   :alt: Trend Analysis Seasonality\r\n\r\n- Trend Line: Displays the observed values and the forecasted values along with the linear regression line.\r\n\r\n- Analysis Table: Provides key metrics and comments on the regression analysis.\r\n\r\n.. 
_Analyse_seasonality_trend_analysis:\r\n\r\nAnalyse Seasonality Trend Analysis\r\n----------------------------------\r\n**Chart Components**:\r\n\r\n- **Trend Line (Observed and Forecast)**: Shows the actual values and the forecasted values, with the forecasted values indicated by a dashed line.\r\n- **Seasonality Feature**: Displays the influence of seasonal patterns on the trend.\r\n- **Regression Equation**: Displays the linear regression equation on the chart.\r\n- **Analysis Table**: Includes metrics such as Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), p-value, Slope, R-squared, Mean Value, and Trend.\r\n\r\n**Analysis Details**:\r\n\r\n- **Mean Absolute Error (MAE)**: Indicates the average deviation of predictions from actual values.\r\n- **Mean Absolute Percentage Error (MAPE)**: Shows the average percentage error of predictions.\r\n- **p-value**: Tests the significance of the regression model.\r\n- **Slope**: Represents the rate of change in the metric over time.\r\n- **R-squared**: Measures the goodness of fit for the regression model.\r\n- **Mean Value**: Average value of the metric over the time period.\r\n- **Trend**: Indicates whether the trend is positive or negative.\r\n\r\n**Conclusions**:\r\n\r\n- This seasonality trend analysis chart helps visualize the overall trend and seasonal patterns in the data. It provides a forecast for future values, considering both the trend and seasonality. The analysis table gives a detailed breakdown of the regression metrics and their significance, which can be used to understand the performance and reliability of the forecast.\r\n\r\n\r\nUpcoming Insightful Charts\r\n==========================\r\n\r\n.. 
_incoming_insightful_charts:\r\n\r\n- Correlation Heatmap\r\n- Customer Segmentation\r\n- Revenue Growth Tracker\r\n- Sales Funnel Analysis\r\n- Operational Efficiency Heatmap\r\n- Candlestick Chart\r\n- Sankey Multiple Levels\r\n\r\nLet me know if you need further analysis or any specific insights!\r\n==================================================================\r\n\r\n.. code-block:: bash\r\n\r\n    git clone https://github.com/trinhlenhathuy/analytics_huytln.git\r\n\r\n    cd analytics_huytln\r\n\r\n    python setup.py sdist bdist_wheel\r\n\r\n    twine upload --config-file .pypirc dist/*\r\n\r\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A simple library to plot insightful charts",
    "version": "0.3.2",
    "project_urls": {
        "Homepage": "https://github.com/trinhlenhathuy/analytics_huytln"
    },
    "split_keywords": [
        "data visualization",
        " charts",
        " pareto",
        " heatmap"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8ac1208e4daf90393f9cb5d42fa525d3238469cf57f1ee5be7177b64687a1fba",
                "md5": "34b480c8bcaf4f7b9736e254250d1376",
                "sha256": "71e46aadcfadc0e1274597dd79639482a568c55c60e57ec4079d797d5ee46058"
            },
            "downloads": -1,
            "filename": "analytics_huytln-0.3.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "34b480c8bcaf4f7b9736e254250d1376",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 13404,
            "upload_time": "2024-08-24T08:12:20",
            "upload_time_iso_8601": "2024-08-24T08:12:20.122195Z",
            "url": "https://files.pythonhosted.org/packages/8a/c1/208e4daf90393f9cb5d42fa525d3238469cf57f1ee5be7177b64687a1fba/analytics_huytln-0.3.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "579ff414408d2ff50abc61975af9a278b7c736d5847bf4a20e23887b72779fdf",
                "md5": "ed6716e332abfd8ba5ea76beef00c332",
                "sha256": "279a650d60ce34944288343ab2b2904b79e11d84837cac8c481beabb109b998b"
            },
            "downloads": -1,
            "filename": "analytics_huytln-0.3.2.tar.gz",
            "has_sig": false,
            "md5_digest": "ed6716e332abfd8ba5ea76beef00c332",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 12752,
            "upload_time": "2024-08-24T08:12:21",
            "upload_time_iso_8601": "2024-08-24T08:12:21.427655Z",
            "url": "https://files.pythonhosted.org/packages/57/9f/f414408d2ff50abc61975af9a278b7c736d5847bf4a20e23887b72779fdf/analytics_huytln-0.3.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-24 08:12:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "trinhlenhathuy",
    "github_project": "analytics_huytln",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "analytics-huytln"
}
        