ab-testing-module

Name: ab-testing-module
Version: 3.1.7
Home page: https://github.com/knowusuboaky/ab_testing_module
Summary: The AB Testing Module is a comprehensive suite designed for analyzing and reporting A/B test experiments, featuring functions for statistical analysis, advanced modeling, and data visualization to transform experimental results into actionable insights.
Upload time: 2024-03-18 08:25:19
Docs URL: None
Author: Kwadwo Daddy Nyame Owusu - Boakye
Requires Python: >=3.6
Keywords: statistics, data analysis, a/b testing, statistical modeling, data visualization, analysis toolbox, experimental analysis, hypothesis testing, effect size, power analysis, data science, machine learning, plotly visualizations, pandas data manipulation, numpy calculations, scipy statistics, statsmodels regression, matplotlib plotting, seaborn charts, data exploration, quantitative analysis, research tools, analytical reporting, decision support, inferential statistics, predictive modeling
Requirements: No requirements were recorded.

# AB Testing Module

## Overview

The AB Testing Module is a suite for analyzing and reporting A/B test experiments. It provides functions for statistical analysis (`ab_test`), advanced modeling (`modeling`), and data visualization (`data_viz`) that turn experimental results into actionable insights.


## Features

### A/B Testing Function Documentation

#### Purpose

The `ab_test` function runs A/B tests and the accompanying statistical analysis on a dataset. It handles both binary and continuous outcome data and selects the statistical tests appropriate to the data to determine whether the groups differ significantly.

#### Function Map

<img src="https://github.com/knowusuboaky/ab_testing_module/blob/main/README_files/figure-markdown/mermaid-figure-1.png?raw=true" width="1526" height="459" alt="Optional Alt Text">

#### Parameters

- `data`: Pandas DataFrame containing the dataset for analysis.
- `group_column`: String specifying the column in `data` that contains the group labels.
- `value_column`: String specifying the column with the values to analyze.
- `control_group`: (Optional) String specifying the control group's label for comparison.
- `alpha`: (Optional) Float defining the significance level for the statistical tests, defaulting to 0.05.
- `handle_outliers`: (Optional) List specifying the method and strategy for handling outliers.
- `mc_correction`: (Optional) String specifying the method for multiple comparisons correction.

#### Functionality

- Validates input data and columns.
- Handles outliers if specified.
- Selects and performs the appropriate statistical tests based on data characteristics.
- Calculates effect sizes and conducts power analysis for the tests conducted.
- Applies multiple comparisons correction if specified.
- Generates interpretations of test results.
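
To make the effect-size step concrete, here is a minimal sketch of Cohen's d, a common standardized effect size for continuous outcomes. It uses only the Python standard library and does not call `ab_test`; which effect-size measure the function reports internally is not stated here, so treat the choice of Cohen's d and the helper name `cohens_d` as assumptions.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample variance."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Illustrative data only (not taken from the package).
control = [10.0, 12.0, 11.0, 13.0, 9.0]
treatment = [12.0, 14.0, 13.0, 15.0, 11.0]

d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.3f}")
```

A positive value means the treatment group's mean is higher than the control group's, measured in units of the pooled standard deviation.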

#### Output

Returns a Pandas DataFrame that summarizes each test performed: the test name, p-value, effect size, power, and an interpretation.
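
As a rough illustration of what a reported power value means, the sketch below approximates the power of a two-sided, two-sample comparison with equal group sizes using the normal approximation. It is standard-library Python and independent of the package; the helper name `approx_power` is hypothetical, and the exact method `ab_test` uses for power analysis is an assumption not confirmed here.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, two-sample test.

    effect_size is a standardized difference (such as Cohen's d).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)            # two-sided critical value
    shift = effect_size * sqrt(n_per_group / 2)  # expected z under the alternative
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power = approx_power(effect_size=0.5, n_per_group=64)
print(f"approximate power: {power:.2f}")
```

With no true effect (`effect_size=0`), the function returns `alpha`, which is the false-positive rate of the test.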


#### Installation

Install the package from PyPI:

``` bash
pip install ab_testing_module==3.1.7
```

#### Load Package

``` python
from ab_testing_module import ab_test
```

#### Base Operations

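``` python
# Run an A/B (or A/B/C) test. `df` is a pandas DataFrame with a
# 'group' label column and an 'outcome' value column.
df_results = ab_test(
    data=df,
    group_column='group',
    value_column='outcome',
    control_group='control',
    alpha=0.05,
    handle_outliers=None,   # or e.g. ['IQR', 'remove']
    mc_correction=None,     # or e.g. 'bonferroni'
)
```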


The `ab_test` function performs statistical analysis to compare outcomes across different groups in an A/B testing framework.


- **`data`**: The dataset to analyze, provided as a Pandas DataFrame. It should contain at least two columns: one for grouping (e.g., experimental vs. control groups) and one for the outcomes of interest (e.g., conversion rates, scores).

- **`group_column`**: A string specifying the name of the column in `data` that contains the group labels. This column is used to differentiate between the groups involved in the A/B test (e.g., 'group').

- **`value_column`**: A string indicating the name of the column in `data` that contains the values or outcomes to be analyzed (e.g., 'outcome'). These values are the focus of the statistical comparison between groups.

- **`control_group`**: (Optional) A string specifying the label of the control group within the `group_column`. This argument is essential when the analysis involves comparing each test group against a common control group to assess the effect of different treatments or conditions.

- **`alpha`**: (Optional) A float representing the significance level used in the statistical tests. The default value is `0.05`, which is a common threshold for determining statistical significance. Lowering this value makes the criteria for significance stricter, while increasing it makes the criteria more lenient.

- **`handle_outliers`**: (Optional) A list specifying the method and strategy for handling outliers in the `value_column`. The first element of the list is the method ('IQR' for interquartile range or 'Z-score'), and the second element is the strategy ('remove', 'impute', or 'cap'). If `None`, no outlier handling is performed.

  - `IQR`: Identifies outliers based on the interquartile range. Typically, values below Q1-1.5*IQR or above Q3+1.5*IQR are considered outliers.
  - `Z-score`: Identifies outliers based on the Z-score, with values typically beyond 3 standard deviations from the mean considered outliers.
  - `remove`: Outliers are removed from the dataset.
  - `impute`: Outliers are replaced with a specified statistic (e.g., the median).
  - `cap`: Outliers are capped at a specified maximum and/or minimum value.

- **`mc_correction`**: (Optional) A string specifying the method for multiple comparisons correction when conducting multiple statistical tests. This argument is critical for controlling the family-wise error rate or the false discovery rate in experiments involving multiple group comparisons. Options include:

  - `None`: No correction is applied.
  - `'bonferroni'`: The Bonferroni correction, which adjusts p-values by multiplying them by the number of comparisons (capping the result at 1).
  - `'fdr'`: The False Discovery Rate correction, which controls the expected proportion of rejected null hypotheses that are false positives.
  - `'holm'`: The Holm-Bonferroni method, a step-down procedure that is less conservative than the standard Bonferroni correction.
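
The three correction options can be sketched in plain Python as follows. This is a minimal illustration of the textbook procedures, not the package's code. In particular, using the Benjamini-Hochberg procedure for `'fdr'` is an assumption, and the function names are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment: smallest p-value gets the largest multiplier."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg step-up adjustment (one common FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, idx in enumerate(order):
        rank = m - k  # ascending rank, 1 = smallest p-value
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


p_values = [0.01, 0.04, 0.03]  # illustrative raw p-values
bonf = bonferroni(p_values)
holm_adj = holm(p_values)
bh_adj = benjamini_hochberg(p_values)
print(bonf, holm_adj, bh_adj)
```

For the same raw p-values, the Holm-adjusted values are never larger than the Bonferroni-adjusted ones, which is why Holm is described as less conservative.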


The function returns a Pandas DataFrame (`df_results`) containing the results of the statistical tests, including test names, p-values, effect sizes, power analyses (where applicable), and detailed interpretations of the results. The DataFrame provides a comprehensive summary of the findings from the A/B test analysis.
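
To illustrate the `handle_outliers` options described above, the following sketch applies the IQR rule (values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR) with each of the three strategies. It uses only the standard library and plain lists, so it shows the idea rather than the package's implementation. The helper names, the quartile interpolation method, and the use of the median for `impute` are assumptions.

```python
from statistics import median, quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def handle_outliers_iqr(values, strategy):
    """Apply 'remove', 'cap', or 'impute' to values outside the IQR fences."""
    low, high = iqr_bounds(values)
    if strategy == "remove":
        return [v for v in values if low <= v <= high]
    if strategy == "cap":
        return [min(max(v, low), high) for v in values]
    if strategy == "impute":
        fill = median(values)  # assumption: the median is the replacement statistic
        return [v if low <= v <= high else fill for v in values]
    raise ValueError(f"unknown strategy: {strategy!r}")


values = [10, 12, 11, 13, 12, 11, 95]  # 95 is an obvious outlier
removed = handle_outliers_iqr(values, "remove")
capped = handle_outliers_iqr(values, "cap")
imputed = handle_outliers_iqr(values, "impute")
print(removed, capped, imputed)
```

Removing outliers changes the sample size, while capping and imputing keep it fixed. This matters when group sizes feed into later power calculations.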


### Advanced Modeling Function Documentation

#### Purpose

The `modeling` function performs advanced statistical modeling. It fits different types of statistical models to explore relationships between variables, quantify effects, and describe the structure of the data.

#### Function Map

<img src="https://github.com/knowusuboaky/ab_testing_module/blob/main/README_files/figure-markdown/mermaid-figure-2.png?raw=true" width="1526" height="459" alt="Optional Alt Text">

#### Parameters

- `data`: Pandas DataFrame containing the dataset.
- `group_column`: String specifying the column with the group information.
- `value_column`: String specifying the column with the dependent variable.
- `control_group`: (Optional) String specifying the label of the control group.

#### Functionality

- Automatically determines the model type based on the dependent variable's characteristics.
- Fits the chosen model and generates summaries, including parameter estimates and their significance.
- Performs effect size and power analysis where applicable.
- Provides detailed interpretations of the modeling results.
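
One plausible way to choose a model type from the dependent variable is sketched below: a binary (0/1) outcome suggests logistic regression, and anything else falls back to linear regression. This is only an illustration of the idea. The helper `choose_model_type` is hypothetical, and the rules that `modeling` actually applies are not documented here.

```python
def choose_model_type(values):
    """Hypothetical rule: pick a model family from the outcome's distinct values."""
    if not values:
        raise ValueError("values must not be empty")
    if set(values) <= {0, 1}:
        return "logistic regression"  # binary outcome, e.g. converted or not
    return "linear regression"        # continuous outcome, e.g. revenue


binary_choice = choose_model_type([0, 1, 1, 0, 1])
continuous_choice = choose_model_type([3.2, 4.1, 5.0])
print(binary_choice, "|", continuous_choice)
```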

#### Output

- Model summaries for each fitted model.
- Structured interpretations of each model's results.
- Effect sizes and results of power analyses for the conducted models.


#### Installation

Install the package from PyPI (skip this step if it is already installed):

``` bash
pip install ab_testing_module==3.1.7
```

#### Load Package

``` python
from ab_testing_module import modeling
```

#### Base Operations

``` python
# Fit the models. `df` is a pandas DataFrame with 'group' and 'outcome' columns.
(model_summaries, interpretations,
 model_summaries_df, interpretations_df) = modeling(
    data=df,
    group_column='group',
    value_column='outcome',
    control_group='control',
)
```


The `modeling` function conducts advanced statistical modeling to explore relationships between variables in the context of A/B testing.


- **`data`** (DataFrame): The dataset for modeling, including independent variables and the dependent variable of interest.
- **`group_column`** (str): The column in `data` that identifies the grouping of data points (e.g., different conditions or treatments).
- **`value_column`** (str): The column in `data` representing the dependent variable or outcome of interest.
- **`control_group`** (str, optional): Specifies the label of the control group for comparative modeling purposes.


The function returns a tuple containing four elements, designed to provide both the raw statistical outputs from the modeling process and their synthesized interpretations for easier understanding:

- **`model_summaries`**: A list or other collection type that includes detailed summaries for each of the statistical models fitted during the analysis. These summaries typically contain information on model coefficients, p-values, confidence intervals, and other pertinent statistical metrics.

- **`interpretations`**: This output consists of narrative interpretations for each fitted model. It aims to translate the technical statistical findings into more accessible insights, highlighting significant results and their potential implications.

- **`model_summaries_df`** (`Pandas DataFrame`): A DataFrame consolidating the statistical summaries of all models into a tabular format. This structure facilitates a straightforward comparison across models, offering a clear view of the estimated effects and their statistical significance.

- **`interpretations_df`** (`Pandas DataFrame`): Similar to `model_summaries_df`, this DataFrame organizes the interpretations of each model's results into a structured table. It provides a narrative summary of the findings in an easily digestible format, suitable for reporting or further discussion.

### Data Visualization Function Documentation

#### Purpose

The `data_viz` function creates a series of data visualizations based on a provided dataset. It aims to facilitate data exploration, presentation, and the comprehension of statistical analyses through visual means.


#### Function Map

<img src="https://github.com/knowusuboaky/ab_testing_module/blob/main/README_files/figure-markdown/mermaid-figure-3.png?raw=true" width="1526" height="459" alt="Optional Alt Text">

#### Parameters

- `data`: Pandas DataFrame with the dataset to be visualized.
- `group_column`: String specifying the column with group labels.
- `value_column`: String specifying the column with values to analyze.
- `viz_types`: List of strings indicating the types of visualizations to generate.


#### Functionality

- Selects and customizes visualizations based on `viz_types`.
- Handles grouping and value considerations for data segmentation.
- Generates data-driven interpretations for each visualization.
- Employs interactive and aesthetically pleasing visualization techniques.

#### Output

- Displays requested visualizations.
- Provides interpretations for each visualization to highlight key findings.


#### Installation

Install the package from PyPI (skip this step if it is already installed):

``` bash
pip install ab_testing_module==3.1.7
```

#### Load Package

``` python
from ab_testing_module import data_viz
```

#### Base Operations

``` python
# Create the requested plots. `df` is a pandas DataFrame with
# 'group' and 'outcome' columns.
visualizations = data_viz(
    data=df,
    group_column='group',
    value_column='outcome',
    viz_types=['boxplot', 'violinplot', 'histogram', 'countplot', 'heatmap'],
)
```


The `data_viz` function creates a series of visualizations to aid in the exploratory data analysis and presentation of findings from the dataset.


- **`data`** (DataFrame): The dataset to be visualized. It should include the variable(s) for grouping and the variable of interest.
- **`group_column`** (str): The column name in `data` that contains the labels for different groups or categories within the data.
- **`value_column`** (str): The column name in `data` that contains the values or outcomes to be visualized and analyzed.
- **`viz_types`** (list of str): A list specifying the types of visualizations to generate, such as `['boxplot', 'violinplot', 'histogram', 'countplot', 'heatmap']`.
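
To show what a grouped `'boxplot'` summarizes, the sketch below computes the five-number summary (minimum, quartiles, maximum) of the value column for each group. It uses plain records and the standard library instead of a DataFrame and does not call `data_viz`. The function name, the sample data, and the quartile method are assumptions for illustration.

```python
from collections import defaultdict
from statistics import quantiles

# Illustrative records with the same 'group' and 'outcome' columns used above.
control = [10.0, 12.0, 11.0, 13.0, 9.0]
treatment = [12.0, 14.0, 13.0, 15.0, 11.0]
records = (
    [{"group": "control", "outcome": v} for v in control]
    + [{"group": "treatment", "outcome": v} for v in treatment]
)


def boxplot_summary(rows, group_column, value_column):
    """Five-number summary of value_column for each label in group_column."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_column]].append(row[value_column])
    summary = {}
    for label, values in grouped.items():
        q1, med, q3 = quantiles(values, n=4, method="inclusive")
        summary[label] = {
            "min": min(values), "q1": q1, "median": med,
            "q3": q3, "max": max(values),
        }
    return summary


summary = boxplot_summary(records, "group", "outcome")
for label, stats in summary.items():
    print(label, stats)
```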



## Ideal Use Cases

The AB Testing Module supports a wide range of data analysis scenarios, particularly those involving A/B testing, statistical modeling, and data visualization. Below are the ideal use cases for each of its key functions: `ab_test`, `modeling`, and `data_viz`.

### `ab_test` Function

#### Ideal Use Cases

- **Comparing Conversion Rates**: Ideal for analyzing the effectiveness of two different webpage designs on user conversion rates. The `ab_test` function can statistically determine if one design leads to significantly higher conversions than the other.
- **Evaluating Marketing Strategies**: Useful for assessing the impact of different marketing campaigns or strategies on sales or customer engagement metrics.
- **Product Feature Testing**: When introducing new features or changes to a product, `ab_test` can help evaluate the change's impact on user behavior or satisfaction.
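
For the conversion-rate use case, a two-proportion z-test is one standard way to compare two designs. The sketch below implements it with the standard library only. The function name and the counts are made up for illustration, and whether `ab_test` selects this particular test for binary outcomes is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: design A converts 100 of 1000 visitors, design B 130 of 1000.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With these counts the p-value falls below the default `alpha` of 0.05, so the difference would be reported as statistically significant at that level.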

### `modeling` Function

#### Ideal Use Cases

- **Predictive Modeling**: For scenarios where the goal is to predict outcomes based on a set of variables, such as forecasting sales based on historical data and market conditions.
- **Causal Inference**: In situations where understanding the causal relationship between variables is crucial, such as determining the effect of educational interventions on student performance.
- **Complex Comparative Analysis**: Suitable for analyzing datasets with multiple groups and variables, where simple statistical tests are insufficient. For example, comparing the effects of various teaching methods across different schools or demographic groups.

### `data_viz` Function

#### Ideal Use Cases

- **Data Exploration**: Before diving into complex analyses, `data_viz` can help uncover patterns, trends, and outliers in the data, guiding further investigation.
- **Reporting and Presentation**: Creating compelling visual representations of analysis results for reports, presentations, or dashboards. It's particularly useful for conveying findings to non-technical stakeholders.
- **Comparative Analysis Visualization**: When the goal is to visually compare outcomes across different groups, such as visualizing the distribution of customer satisfaction ratings before and after implementing a customer service improvement plan.


## Contributing

We welcome contributions, suggestions, and feedback to make this library
even better. Feel free to fork the repository, submit pull requests, or
open issues.

## Documentation & Examples

For documentation and usage examples, visit the GitHub repository:
https://github.com/knowusuboaky/ab_testing_module

**Author**: Kwadwo Daddy Nyame Owusu - Boakye\
**Email**: kwadwo.owusuboakye@outlook.com\
**License**: MIT

            
