two-sample-binomial


- Name: two-sample-binomial
- Version: 0.0.4
- Home page: https://github.com/alonkipnis/higher-criticism-test
- Summary: Several two-samples tests for count data
- Upload time: 2024-03-13 09:46:04
- Author: Alon Kipnis
- Requires Python: >=3.6
- Requirements: none recorded

# TwoSamplesBinomial: Two-sample testing for count data
This package provides two-sample tests for count data, typically used within a multiple-testing approach to compare two or more frequency tables. Combine it with ``multiple-hypothesis-testing`` to
obtain a global test for the significance of the difference between the
tables.

References:
- [1] D. L. Donoho and A. Kipnis (2022). Higher criticism to compare two large frequency tables, with sensitivity to possible rare and weak differences. Annals of Statistics.
- [2] C. B. Dean (1992). Testing for overdispersion in Poisson and binomial regression models. Journal of the American Statistical Association.


## Methods:
- ``bin_allocation_test``: the test from [1]
- ``bin_variance_test``: the test from [2]
- ``bin_variance_test_df``: the same as ``bin_variance_test``, plus additional information
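
To make the idea behind a per-category allocation test concrete, here is a minimal standard-library sketch. It is not the package's implementation: the function names, the choice of null probability (the first sample's share of all counts), and the "outcomes no more likely than the observed one" rule for the two-sided P-value are all assumptions made for illustration. ``bin_allocation_test`` may differ in any of these details.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability that a Binomial(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_binom_pvalue(x, n, p):
    """Exact two-sided P-value: total mass of outcomes no more likely than x."""
    if n == 0:
        return 1.0  # no observations in this category, so no evidence
    observed = binom_pmf(x, n, p)
    # A small relative tolerance guards against floating-point ties.
    total = sum(
        binom_pmf(k, n, p)
        for k in range(n + 1)
        if binom_pmf(k, n, p) <= observed * (1 + 1e-7)
    )
    return min(1.0, total)

def allocation_pvalues_sketch(counts1, counts2):
    """Hypothetical helper: one P-value per category.

    Null hypothesis (an assumption of this sketch): the x1 + x2 occurrences
    of a category split between the samples in proportion to sample sizes.
    """
    total1, total2 = sum(counts1), sum(counts2)
    p_null = total1 / (total1 + total2)
    return [
        two_sided_binom_pvalue(x1, x1 + x2, p_null)
        for x1, x2 in zip(counts1, counts2)
    ]

pvals = allocation_pvalues_sketch([12, 3, 40, 0], [10, 15, 38, 0])
```

A category whose counts are split very unevenly between the two samples gets a small P-value, while a category that appears in neither sample gets a P-value of 1.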


### Additional auxiliary functions of independent interest:
 - ``poisson_test``: vectorized one-sided Poisson test with an option to do a randomized test
 - ``binom_test``: vectorized one-sided binomial test with an option to do a randomized test
 - ``binom_test_two_sided``: vectorized two-sided binomial test with an option to do a randomized test
 - ``binom_test_two_sided_slow``: vectorized two-sided binomial test using scipy.stats.binom_test
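
The "randomized test" option can be illustrated with a short sketch. For a discrete statistic, the ordinary P-value is not exactly uniform under the null; adding a uniform fraction of the probability mass at the observed value fixes that. The function below is a scalar, standard-library illustration with hypothetical names and arguments; it is not the package's vectorized ``binom_test``, whose signature may differ.

```python
import random
from math import comb

def binom_pmf(k, n, p):
    """Probability that a Binomial(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def one_sided_binom_pvalue(x, n, p, randomize=False, rng=random):
    """P-value against the alternative 'success probability exceeds p'.

    Hypothetical helper for illustration only.
    """
    upper_tail = sum(binom_pmf(k, n, p) for k in range(x + 1, n + 1))  # P(X > x)
    at_x = binom_pmf(x, n, p)                                          # P(X = x)
    if randomize:
        # Randomized P-value: P(X > x) + U * P(X = x), with U ~ Uniform(0, 1).
        return upper_tail + rng.random() * at_x
    # Ordinary P-value: P(X >= x).
    return upper_tail + at_x

plain = one_sided_binom_pvalue(8, 10, 0.5)
randomized = one_sided_binom_pvalue(8, 10, 0.5, randomize=True,
                                    rng=random.Random(0))
```

The randomized value always lies between P(X > x) and P(X >= x), so it is never larger than the ordinary P-value.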

## Example:
```python
from twosample import bin_allocation_test, bin_variance_test
from multitest import MultiTest
import numpy as np

N = 100    # number of categories
n = 500    # number of draws per sample
eps = 0.1  # expected fraction of perturbed categories
mu = 0.01  # size of the perturbation

P = np.ones(N) / N                # uniform null distribution
Q = P.copy()
Q[np.random.rand(N) < eps] += mu  # perturb a random subset of categories
Q = Q / Q.sum()                   # renormalize to a probability vector

smp1 = np.random.multinomial(n, P)  # sample from P
smp2 = np.random.multinomial(n, Q)  # sample from Q

pvals_alloc = bin_allocation_test(smp1, smp2)  # binomial allocation P-values
pvals_var = bin_variance_test(smp1, smp2)      # variance test P-values

mt_alloc = MultiTest(pvals_alloc)
mt_var = MultiTest(pvals_var)

print("HC(binomial_allocation) = ", mt_alloc.hc()[0])
print("HC(variance) = ", mt_var.hc()[0])
```
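
The example combines the per-category P-values into a single Higher Criticism (HC) statistic through ``MultiTest``. As a rough guide to what such a statistic measures, the sketch below computes one common variant of HC using only the standard library. The function name, the ``gamma`` default, and the choice of normalization are assumptions; ``MultiTest.hc()`` may use different conventions.

```python
from math import sqrt

def higher_criticism_sketch(pvalues, gamma=0.2):
    """Hypothetical helper: one common variant of the HC statistic.

    HC = max over i <= gamma * n of
         sqrt(n) * (i / n - p_(i)) / sqrt(p_(i) * (1 - p_(i))),
    where p_(1) <= ... <= p_(n) are the sorted P-values.
    """
    n = len(pvalues)
    sorted_p = sorted(pvalues)
    max_index = max(1, int(gamma * n))  # only the smallest P-values are scanned
    best = float("-inf")
    for i, p in enumerate(sorted_p[:max_index], start=1):
        if 0.0 < p < 1.0:  # skip degenerate values to avoid dividing by zero
            z = sqrt(n) * (i / n - p) / sqrt(p * (1 - p))
            best = max(best, z)
    return best

# Evenly spread P-values (no signal) versus a set with two very small ones.
null_like = [(k - 0.5) / 10 for k in range(1, 11)]
with_signal = [0.001, 0.002] + [k / 10 for k in range(1, 9)]

hc_null = higher_criticism_sketch(null_like, gamma=0.5)
hc_signal = higher_criticism_sketch(with_signal, gamma=0.5)
```

A few unusually small P-values push the statistic up sharply, which is why HC is suited to detecting rare and weak differences between the tables.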

            
