# ab-test-simulator
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
## Install
``` sh
pip install ab_test_simulator
```
## Imports
``` python
from ab_test_simulator.generator import (
generate_binary_data,
generate_continuous_data,
data_to_contingency,
)
from ab_test_simulator.power import (
simulate_power_binary,
sample_size_chi2,
simulate_power_continuous,
continuous_sample_size,
)
from ab_test_simulator.plotting import (
plot_power,
plot_distribution,
plot_betas,
)
```
## Binary target (e.g. conversion rate experiments)
### Sample size
We can calculate the required sample size with the function
`sample_size_chi2`. The inputs needed are:
- Conversion rate of the control: cr0
- Conversion rate of the variant at the minimal detectable effect: cr1
  (for example, if we have a conversion rate of 1% and want to detect a
  relative effect of at least 20%, we would set cr0=0.010 and cr1=0.012)
- Significance threshold: alpha. Usually set to 0.05, this defines our
  tolerance for falsely detecting an effect when in reality there is
  none (alpha=0.05 means that in 5% of cases we will detect an effect
  even though the samples for control and variant are drawn from the
  exact same distribution).
- Statistical power: power. Usually set to 0.8. This means that if the
  true effect equals the minimal effect specified above, we have an 80%
  probability of identifying it as statistically significant (and hence
  a 20% probability of missing it).
- one_sided: Whether the test is one-sided (one_sided=True) or two-sided
  (one_sided=False). As a rule of thumb, if there are very strong
  reasons to believe that the variant cannot be inferior to the control,
  we can use a one-sided test. In case of doubt, a two-sided test is the
  safer choice.
Let us calculate the sample size for the following example:
``` python
n_sample = sample_size_chi2(
cr0=0.01,
cr1=0.012,
alpha=0.05,
power=0.8,
one_sided=True,
)
print(f"Required sample size per variant is {int(n_sample)}.")
```
Required sample size per variant is 33560.
``` python
n_sample_two_sided = sample_size_chi2(
cr0=0.01,
cr1=0.012,
alpha=0.05,
power=0.8,
one_sided=False,
)
print(
f"For the two-sided experiment, required sample size per variant is {int(n_sample_two_sided)}."
)
```
For the two-sided experiment, required sample size per variant is 42606.
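As a rough cross-check, the classic normal-approximation formula for
comparing two proportions gives numbers in the same ballpark. The helper
below is a sketch written for this README, not part of the library, and
it only approximates the chi-squared-based calculation above, so small
deviations are expected:
``` python
from math import ceil, sqrt

from scipy.stats import norm


def approx_sample_size_binary(cr0, cr1, alpha=0.05, power=0.8, one_sided=True):
    # z-quantiles for the significance threshold and the target power
    z_alpha = norm.ppf(1 - alpha) if one_sided else norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    p_bar = (cr0 + cr1) / 2  # pooled conversion rate under the null hypothesis
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(cr0 * (1 - cr0) + cr1 * (1 - cr1))
    ) ** 2
    return ceil(numerator / (cr1 - cr0) ** 2)


print(approx_sample_size_binary(0.01, 0.012, one_sided=True))   # close to 33560
print(approx_sample_size_binary(0.01, 0.012, one_sided=False))  # close to 42606
```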
### Power simulations
What happens if we use a smaller sample size? And how can we build
intuition for the required sample size?
Let us analyze the statistical power with synthetic data. We can do this
with the `simulate_power_binary` function. We are using some default
arguments here; see [this
page](https://k111git.github.io/ab-test-simulator/power.html) for more
information.
``` python
# simulation = simulate_power_binary()
```
Note: The simulation object returns the total sample size, so we need to
split it per variant.
``` python
# simulation
```
Finally, we can plot the results (note: the plot function shows the
sample size per variant):
``` python
# plot_power(
# simulation,
# added_lines=[{"sample_size": sample_size_chi2(), "label": "Chi2"}],
# )
```
### The problem of peeking
wip
## Continuous target (e.g. average)
Here we assume normally distributed data, which usually holds for the
sample means thanks to the central limit theorem.
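As a quick numerical illustration of this assumption (a stand-alone
sketch, not part of the library): even if the per-user metric itself is
heavily skewed, the distribution of the sample mean becomes
approximately normal for large samples.
``` python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(1)
# heavily skewed per-user metric, e.g. revenue modeled as exponential
raw = rng.exponential(scale=5.0, size=1_000_000)
# means of many samples of size 1000 drawn from the same distribution
sample_means = rng.exponential(scale=5.0, size=(10_000, 1_000)).mean(axis=1)
print(f"skewness of the raw data:     {skew(raw):.2f}")           # ~2 for an exponential
print(f"skewness of the sample means: {skew(sample_means):.2f}")  # close to 0
```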
### Sample size
We can calculate the required sample size with the function
`continuous_sample_size`. The inputs needed are:
- mu1: Mean of the control group
- mu2: Mean of the variant group assuming the minimal detectable effect
  (e.g. if the mean is 5 and we want to detect an effect as small as
  0.05, we would set mu1=5.00 and mu2=5.05)
- sigma: Standard deviation (we assume the same value for variant and
  control; it should be estimated from historical data)
- alpha, power, one_sided: as in the binary case
Let us calculate an example:
``` python
n_sample = continuous_sample_size(
mu1=5.0, mu2=5.05, sigma=1, alpha=0.05, power=0.8, one_sided=True
)
print(f"Required sample size per variant is {int(n_sample)}.")
```
Required sample size per variant is 4946.
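For the continuous case, the standard z-based formula for comparing two
independent normal means can serve as a sanity check. It is an
assumption that the library uses exactly this formula, but the result
should agree closely with the value above:
``` python
from scipy.stats import norm

mu1, mu2, sigma, alpha, power = 5.0, 5.05, 1.0, 0.05, 0.8
z_alpha = norm.ppf(1 - alpha)  # one-sided threshold
z_beta = norm.ppf(power)
# per-variant sample size for a two-sample comparison of means
n = 2 * (sigma * (z_alpha + z_beta) / (mu2 - mu1)) ** 2
print(f"Approximate sample size per variant: {n:.0f}")  # ~4946
```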
Let us also run some simulations. These show results for the t-test as
well as for Bayesian testing (one-sided only).
``` python
# simulation = simulate_power_continuous()
```
``` python
# plot_power(
# simulation,
# added_lines=[
# {"sample_size": continuous_sample_size(), "label": "Formula"}
# ],
# )
```
## Data Generators
We can also use the data generators to create example data that we can
analyze or visualize as if it came from a real experiment.
Distribution without effect:
``` python
df_continuous = generate_continuous_data(effect=0)
# plot_distribution(df_continuous)
```
Distribution with effect:
``` python
df_continuous = generate_continuous_data(effect=1)
# plot_distribution(df_continuous)
```
## Visualizations
Plot beta distributions for a contingency table:
``` python
df = generate_binary_data()
df_contingency = data_to_contingency(df)
# plot_betas(df_contingency, xmin=0, xmax=0.04)
```
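Conceptually, this kind of plot shows, for each variant, the Beta
posterior of the conversion rate given its observed successes and
failures (assuming a uniform Beta(1, 1) prior). Here is a stand-alone
sketch with hypothetical counts, independent of the library's internal
data format:
``` python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta

# hypothetical counts: (conversions, non-conversions) per variant
counts = {"control": (100, 9_900), "variant": (120, 9_880)}

x = np.linspace(0, 0.04, 500)
for name, (conversions, non_conversions) in counts.items():
    # posterior of the conversion rate under a uniform Beta(1, 1) prior
    plt.plot(x, beta.pdf(x, conversions + 1, non_conversions + 1), label=name)
plt.xlabel("conversion rate")
plt.ylabel("density")
plt.legend()
plt.show()
```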