# ab-test-simulator
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
## Install
``` sh
pip install ab_test_simulator
```
## Imports
``` python
from ab_test_simulator.generator import (
generate_binary_data,
generate_continuous_data,
data_to_contingency,
)
from ab_test_simulator.power import (
simulate_power_binary,
sample_size_chi2,
simulate_power_continuous,
continuous_sample_size,
)
from ab_test_simulator.plotting import (
plot_power,
plot_distribution,
plot_betas,
)
```
## Binary target (e.g. conversion rate experiments)
### Sample size
We can calculate the required sample size with the function
`sample_size_chi2`. The inputs needed are:
- Conversion rate of the control: cr0
- Conversion rate of the variant at the minimal detectable effect: cr1
  (for example, if we have a conversion rate of 1% and want to detect a
  relative effect of at least 20%, we would set cr0=0.010 and cr1=0.012)
- Significance threshold: alpha. Usually set to 0.05, this defines our
  tolerance for falsely detecting an effect when in reality there is
  none (alpha=0.05 means that in 5% of cases we will detect an effect
  even though the samples for control and variant are drawn from the
  exact same distribution).
- Statistical power: power. Usually set to 0.8. This means that if the
  true effect equals the minimal effect specified above, we have an 80%
  probability of identifying it as statistically significant (and hence
  a 20% probability of missing it).
- one_sided: Whether the test is one-sided (one_sided=True) or two-sided
  (one_sided=False). As a rule of thumb, if there are very strong
  reasons to believe that the variant cannot be inferior to the control,
  we can use a one-sided test. In case of doubt, a two-sided test is the
  safer choice.
Let us calculate the sample size for the following example:
``` python
n_sample = sample_size_chi2(
cr0=0.01,
cr1=0.012,
alpha=0.05,
power=0.8,
one_sided=True,
)
print(f"Required sample size per variant is {int(n_sample)}.")
```
Required sample size per variant is 33560.
``` python
n_sample_two_sided = sample_size_chi2(
cr0=0.01,
cr1=0.012,
alpha=0.05,
power=0.8,
one_sided=False,
)
print(
f"For the two-sided experiment, required sample size per variant is {int(n_sample_two_sided)}."
)
```
For the two-sided experiment, required sample size per variant is 42606.
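As a rough cross-check, the classic normal-approximation formula for
comparing two proportions gives numbers in the same ballpark. The helper
below is a sketch written for this README, not part of the library, and
it only approximates the chi-squared-based calculation above, so small
deviations are expected:
``` python
from math import ceil, sqrt

from scipy.stats import norm


def approx_sample_size_binary(cr0, cr1, alpha=0.05, power=0.8, one_sided=True):
    # z-quantiles for the significance threshold and the target power
    z_alpha = norm.ppf(1 - alpha) if one_sided else norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    p_bar = (cr0 + cr1) / 2  # pooled conversion rate under the null hypothesis
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(cr0 * (1 - cr0) + cr1 * (1 - cr1))
    ) ** 2
    return ceil(numerator / (cr1 - cr0) ** 2)


print(approx_sample_size_binary(0.01, 0.012, one_sided=True))   # close to 33560
print(approx_sample_size_binary(0.01, 0.012, one_sided=False))  # close to 42606
```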
### Power simulations
What happens if we use a smaller sample size? And how can we build
intuition for the required sample size?
Let us analyze the statistical power with synthetic data. We can do this
with the `simulate_power_binary` function. We are using some default
arguments here; see [this
page](https://k111git.github.io/ab-test-simulator/power.html) for more
information.
``` python
# simulation = simulate_power_binary()
```
Note: The simulation object returns the total sample size, so we need to
split it per variant.
``` python
# simulation
```
Finally, we can plot the results (note: the plot function shows the
sample size per variant):
``` python
# plot_power(
# simulation,
# added_lines=[{"sample_size": sample_size_chi2(), "label": "Chi2"}],
# )
```
### The problem of peeking
wip
## Continuous target (e.g. average)
Here we assume normally distributed data, which usually holds for the
sample means thanks to the central limit theorem.
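As a quick numerical illustration of this assumption (a stand-alone
sketch, not part of the library): even if the per-user metric itself is
heavily skewed, the distribution of the sample mean becomes
approximately normal for large samples.
``` python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(1)
# heavily skewed per-user metric, e.g. revenue modeled as exponential
raw = rng.exponential(scale=5.0, size=1_000_000)
# means of many samples of size 1000 drawn from the same distribution
sample_means = rng.exponential(scale=5.0, size=(10_000, 1_000)).mean(axis=1)
print(f"skewness of the raw data:     {skew(raw):.2f}")           # ~2 for an exponential
print(f"skewness of the sample means: {skew(sample_means):.2f}")  # close to 0
```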
### Sample size
We can calculate the required sample size with the function
`continuous_sample_size`. The inputs needed are:
- mu1: Mean of the control group
- mu2: Mean of the variant group assuming the minimal detectable effect
  (e.g. if the mean is 5 and we want to detect an effect as small as
  0.05, we would set mu1=5.00 and mu2=5.05)
- sigma: Standard deviation (we assume the same value for variant and
  control; it should be estimated from historical data)
- alpha, power, one_sided: as in the binary case
Let us calculate an example:
``` python
n_sample = continuous_sample_size(
mu1=5.0, mu2=5.05, sigma=1, alpha=0.05, power=0.8, one_sided=True
)
print(f"Required sample size per variant is {int(n_sample)}.")
```
Required sample size per variant is 4946.
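For the continuous case, the standard z-based formula for comparing two
independent normal means can serve as a sanity check. It is an
assumption that the library uses exactly this formula, but the result
should agree closely with the value above:
``` python
from scipy.stats import norm

mu1, mu2, sigma, alpha, power = 5.0, 5.05, 1.0, 0.05, 0.8
z_alpha = norm.ppf(1 - alpha)  # one-sided threshold
z_beta = norm.ppf(power)
# per-variant sample size for a two-sample comparison of means
n = 2 * (sigma * (z_alpha + z_beta) / (mu2 - mu1)) ** 2
print(f"Approximate sample size per variant: {n:.0f}")  # ~4946
```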
Let us also run some simulations. These show results for the t-test as
well as for Bayesian testing (one-sided only).
``` python
# simulation = simulate_power_continuous()
```
``` python
# plot_power(
# simulation,
# added_lines=[
# {"sample_size": continuous_sample_size(), "label": "Formula"}
# ],
# )
```
## Data Generators
We can also use the data generators to create example data that we can
analyze or visualize as if it came from a real experiment.
Distribution without effect:
``` python
df_continuous = generate_continuous_data(effect=0)
# plot_distribution(df_continuous)
```
Distribution with effect:
``` python
df_continuous = generate_continuous_data(effect=1)
# plot_distribution(df_continuous)
```
## Visualizations
Plot beta distributions for a contingency table:
``` python
df = generate_binary_data()
df_contingency = data_to_contingency(df)
# plot_betas(df_contingency, xmin=0, xmax=0.04)
```
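Conceptually, this kind of plot shows, for each variant, the Beta
posterior of the conversion rate given its observed successes and
failures (assuming a uniform Beta(1, 1) prior). Here is a stand-alone
sketch with hypothetical counts, independent of the library's internal
data format:
``` python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta

# hypothetical counts: (conversions, non-conversions) per variant
counts = {"control": (100, 9_900), "variant": (120, 9_880)}

x = np.linspace(0, 0.04, 500)
for name, (conversions, non_conversions) in counts.items():
    # posterior of the conversion rate under a uniform Beta(1, 1) prior
    plt.plot(x, beta.pdf(x, conversions + 1, non_conversions + 1), label=name)
plt.xlabel("conversion rate")
plt.ylabel("density")
plt.legend()
plt.show()
```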