# TwoSamplesBinomial: Two-sample testing for count data
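This package provides two-sample tests for count data. The tests are typically used as part of a multiple testing approach to compare two or more frequency tables: each test returns one P-value per table entry. Combine with ``multiple-hypothesis-testing`` to
obtain a global test for the significance of the difference between the
tables.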
References:
- [1] D. L. Donoho and A. Kipnis. (2022) Higher criticism to compare two large frequency tables, with sensitivity to possible rare and weak differences. Annals of Statistics.
- [2] C. B. Dean. (1992) Testing for Overdispersion in Poisson and Binomial Regression Models. Journal of the American Statistical Association.
## Methods:
- ``bin_allocation_test`` (the test from [1])
- ``bin_variance_test`` (the test from [2])
- ``bin_variance_test_df`` (the same as ``bin_variance_test``, plus additional information)
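To make the idea behind an allocation-type test concrete, here is a minimal sketch in plain Python. It is not the package's implementation. It assumes that, under the null, the count of a feature in the first sample given its total count in both samples is binomial with success probability `n1 / (n1 + n2)`; the package and [1] may define this probability differently (for example, by excluding the feature's own counts). The helper names `binom_pmf` and `allocation_pvalue` are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def allocation_pvalue(x1, x2, n1, n2):
    """Two-sided exact binomial P-value for one feature (hypothetical helper).

    x1, x2: counts of the feature in sample 1 and sample 2.
    n1, n2: total counts of sample 1 and sample 2.
    Assumed null: x1 given (x1 + x2) is Binomial(x1 + x2, n1 / (n1 + n2)).
    """
    n = x1 + x2
    if n == 0:
        return 1.0  # feature never observed: no evidence against the null
    p = n1 / (n1 + n2)
    dev = abs(x1 - n * p)
    tol = 1e-9  # guards the comparison against floating-point rounding
    # Total probability of outcomes at least as far from the mean as x1.
    pval = sum(
        binom_pmf(k, n, p) for k in range(n + 1) if abs(k - n * p) >= dev - tol
    )
    return min(1.0, pval)


# One P-value per feature of two small frequency tables.
smp1 = [5, 0, 3, 12]
smp2 = [4, 1, 3, 2]
n1, n2 = sum(smp1), sum(smp2)
pvals = [allocation_pvalue(a, b, n1, n2) for a, b in zip(smp1, smp2)]
print(pvals)
```

The loop over all outcomes keeps the definition visible; a vectorized version would replace it with array operations.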
### Additional auxiliary functions of independent interest:
- ``poisson_test`` Vectorized one-sided Poisson test with an option to do a randomized test
- ``binom_test`` Vectorized one-sided binomial test with an option to do a randomized test
- ``binom_test_two_sided`` Vectorized two-sided binomial test with an option to do a randomized test
- ``binom_test_two_sided_slow`` Vectorized two-sided binomial test using ``scipy.stats.binom_test``
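The "randomized test" option mentioned above can be illustrated with a short sketch. This is plain, non-vectorized Python written for clarity, not the package's code; the function name, its signature, and the direction of the one-sided alternative (larger counts) are assumptions. A randomized P-value replaces the full probability of the observed outcome with a uniformly random fraction of it, which makes the P-value uniformly distributed under the null despite the discreteness of the binomial.

```python
import random
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_test_one_sided(x, n, p, randomize=False, rng=random):
    """P-value for H0: X ~ Binomial(n, p) against 'X tends to be larger'.

    randomize=False: returns P(X >= x).
    randomize=True:  returns P(X > x) + U * P(X = x) with U ~ Uniform(0, 1).
    """
    upper = sum(binom_pmf(k, n, p) for k in range(x + 1, n + 1))  # P(X > x)
    at_x = binom_pmf(x, n, p)  # P(X = x)
    u = rng.random() if randomize else 1.0
    return min(1.0, upper + u * at_x)


# Non-randomized P-value: P(X >= 8) for X ~ Binomial(10, 0.5).
print(binom_test_one_sided(8, 10, 0.5))

# Randomized P-value: lies between P(X > 8) and P(X >= 8).
print(binom_test_one_sided(8, 10, 0.5, randomize=True, rng=random.Random(0)))
```

The same construction applies to a Poisson test by swapping the probability mass function.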
## Example:
```
from twosample import bin_allocation_test, bin_variance_test
from multitest import MultiTest
import numpy as np
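N = 100    # number of features (table entries)
n = 500    # number of draws in each sample
eps = 0.1  # expected fraction of perturbed features
mu = 0.01  # size of the perturbation
P = np.ones(N) / N                # uniform distribution over N features
Q = P.copy()
Q[np.random.rand(N) < eps] += mu  # perturb a random subset of features
Q = Q / Q.sum()                   # renormalize Q to sum to 1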
smp1 = np.random.multinomial(n, P) # sample from P
smp2 = np.random.multinomial(n, Q) # sample from Q
pvals_alloc = bin_allocation_test(smp1, smp2) # binomial allocation P-values
pvals_var = bin_variance_test(smp1, smp2) # variance test P-values
mt_alloc = MultiTest(pvals_alloc)
mt_var = MultiTest(pvals_var)
print("HC(binomial_allocation) = ", mt_alloc.hc()[0])
print("HC(variance) = ", mt_var.hc()[0])
```
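The example reduces each vector of P-values to a single higher criticism (HC) statistic. As a rough guide to what such a statistic measures, the sketch below computes one commonly used form of HC in plain Python. It is an illustration under stated assumptions, not the code of `MultiTest`: the function name `higher_criticism`, the `gamma` parameter, and the exact normalization are assumptions, and the package may use a different variant or index range.

```python
from math import sqrt


def higher_criticism(pvals, gamma=0.2):
    """One common form of the HC statistic (sketch, hypothetical helper).

    Compares the i-th smallest P-value with i/n, its approximate expected
    value under the global null, over the smallest gamma-fraction of P-values.
    """
    n = len(pvals)
    sorted_p = sorted(pvals)
    k_max = max(1, int(gamma * n))
    best = float("-inf")
    for i in range(1, k_max + 1):
        u = i / n
        if u >= 1.0:
            continue  # the standardization is undefined at u = 1
        z = sqrt(n) * (u - sorted_p[i - 1]) / sqrt(u * (1 - u))
        best = max(best, z)
    return best


# P-values that sit exactly at their null expectation give HC = 0.
null_like = [i / 100 for i in range(1, 101)]
print(higher_criticism(null_like))

# A few unusually small P-values push the statistic up.
with_signal = [1e-6] * 5 + [i / 100 for i in range(6, 101)]
print(higher_criticism(with_signal))
```

Large values indicate that more P-values are small than the global null would predict, which is the signature of rare and weak differences between the two tables.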
Raw data
{
"_id": null,
"home_page": "https://github.com/alonkipnis/higher-criticism-test",
"name": "two-sample-binomial",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "",
"author": "Alon Kipnis",
"author_email": "alonkipnis@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/f7/2f/97bc0c770f57710d0c969da351c37557dfad4692921abb34b0a339463d7e/two-sample-binomial-0.0.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "Several two-samples tests for count data",
"version": "0.0.4",
"project_urls": {
"Download": "https://github.com/alonkipnis/higher-criticism-test",
"Homepage": "https://github.com/alonkipnis/higher-criticism-test"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "6428f855f8d0163ad20a7b0a6cd28d2426a0aeaceeaeb41d1de28483eed9a505",
"md5": "f88773f4cf76b97966aaa3454c0a0231",
"sha256": "703c023fcd78cb99753fb696a3157665de37863551e131ef6b123e2772e2f5dc"
},
"downloads": -1,
"filename": "two_sample_binomial-0.0.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f88773f4cf76b97966aaa3454c0a0231",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 5621,
"upload_time": "2024-03-13T09:46:00",
"upload_time_iso_8601": "2024-03-13T09:46:00.411550Z",
"url": "https://files.pythonhosted.org/packages/64/28/f855f8d0163ad20a7b0a6cd28d2426a0aeaceeaeb41d1de28483eed9a505/two_sample_binomial-0.0.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f72f97bc0c770f57710d0c969da351c37557dfad4692921abb34b0a339463d7e",
"md5": "70eca8f6ab733f6db59ce13fc0bfdf1c",
"sha256": "3de158e7806cf5a7cd2cbb619bcdf8ae91781c3dfeb9796fcd901e20c12210f2"
},
"downloads": -1,
"filename": "two-sample-binomial-0.0.4.tar.gz",
"has_sig": false,
"md5_digest": "70eca8f6ab733f6db59ce13fc0bfdf1c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 5282,
"upload_time": "2024-03-13T09:46:04",
"upload_time_iso_8601": "2024-03-13T09:46:04.105240Z",
"url": "https://files.pythonhosted.org/packages/f7/2f/97bc0c770f57710d0c969da351c37557dfad4692921abb34b0a339463d7e/two-sample-binomial-0.0.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-03-13 09:46:04",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "alonkipnis",
"github_project": "higher-criticism-test",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "two-sample-binomial"
}