xbern-confidence-intervals

Name	xbern-confidence-intervals
Version	1.0.0
home_page	https://github.com/krishnap25/xbern_confidence_intervals
Summary	Compute adaptive confidence intervals for XBern distributions
upload_time	2024-08-17 06:23:52
maintainer	None
docs_url	None
author	Krishna Pillutla
requires_python	>=3.8
license	None
keywords
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# Adaptive Confidence Intervals for Exchangeable Bernoulli (XBern) Means

**NOTE**: This library has been extracted from the open-source code at [this link](https://github.com/google-research/federated/tree/master/lidp_auditing). Please refer to the license there (Apache 2.0 license) for details, which is also applicable to this package.

An XBern or exchangeable Bernoulli distribution is a probability distribution
over binary vectors which is exchangeable, i.e., the probability mass does not
change when the coordinates of the vector are permuted.
This package implements the confidence intervals for the mean (first moment)
of an XBern distribution proposed by
[Pillutla et al. (NeurIPS 2023)](https://arxiv.org/pdf/2305.18447.pdf).
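
The definition above can be written out as follows. The notation (the vector `x`, the permutation `\pi`, the mean `\mu`, and the samples `x^{(i)}`) is introduced here for illustration and is not taken from the package or the paper.

```latex
% Exchangeability of a random binary vector x = (x_1, ..., x_k):
% for every permutation \pi of {1, ..., k} and every b in {0,1}^k,
\Pr(x_1 = b_1, \dots, x_k = b_k)
  = \Pr\bigl(x_{\pi(1)} = b_1, \dots, x_{\pi(k)} = b_k\bigr).

% Consequently all coordinates share the same marginal mean
\mu = \mathbb{E}[x_1] = \dots = \mathbb{E}[x_k],

% and, given n i.i.d. samples x^{(1)}, ..., x^{(n)}, a natural estimate is
\hat{\mu} = \frac{1}{nk} \sum_{i=1}^{n} \sum_{j=1}^{k} x^{(i)}_j .
```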

For n binary vectors in k dimensions, the width of the confidence interval can vary
from 1 / sqrt(n) (if the k dimensions are copies of each other)
to 1 / sqrt(nk) (if the k dimensions are independent). We give adaptive
confidence intervals that automatically adapt to the actual level of correlation between the
k dimensions. The resulting confidence intervals recover the 1 / sqrt(n) rate in the high-correlation regime and achieve an improved 1 / sqrt(nk) + 1 / n^{3/4} rate in the low-correlation regime.
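
The two extreme rates can be checked with a small simulation. The sketch below uses only NumPy and does not call this package; the values of `n`, `k`, `p`, and `trials` are arbitrary choices for illustration. It measures the spread of the sample mean across repeated draws and compares it with the `sqrt(p(1-p)/n)` and `sqrt(p(1-p)/(nk))` predictions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, p, trials = 200, 10, 0.3, 1000  # illustrative values only

# Extreme 1: the k coordinates are exact copies of one Bernoulli(p) draw.
first_col = rng.random((trials, n, 1)) < p
copies = np.repeat(first_col, k, axis=2)  # shape (trials, n, k)

# Extreme 2: the k coordinates are independent Bernoulli(p) draws.
indep = rng.random((trials, n, k)) < p  # shape (trials, n, k)

# Spread of the sample mean (averaged over all n * k entries) across trials.
std_copies = copies.mean(axis=(1, 2)).std()
std_indep = indep.mean(axis=(1, 2)).std()

# Standard deviations predicted by the two rates.
theory_copies = np.sqrt(p * (1 - p) / n)       # ~ 1 / sqrt(n)
theory_indep = np.sqrt(p * (1 - p) / (n * k))  # ~ 1 / sqrt(nk)

print(f"copies:      empirical {std_copies:.4f}, theory {theory_copies:.4f}")
print(f"independent: empirical {std_indep:.4f}, theory {theory_indep:.4f}")
```

The ratio of the two spreads should be close to `sqrt(k)`, which is the gap an adaptive interval aims to exploit when the coordinates are only weakly correlated.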


# Table of Contents
- [Installation](#installation)
- [Requirements](#requirements)
- [Functionality](#functionality)
- [Citation](#citation)


# Installation

For a direct install, run 
```bash
pip install xbern-confidence-intervals
```
or, for an editable install from source, navigate to the top-level directory of a local copy of the repository and run
```bash
pip install -e .
```


# Requirements
The installation command above installs the main requirements, which are:
- numpy >= 1.22.4
- pandas >= 1.4.4
- scipy >= 1.7.3


# Functionality

We give a quick overview of the API here. Please see [xbern_demo.ipynb](https://github.com/krishnap25/xbern_confidence_intervals/xbern_demo.ipynb) for a full tour of the features.

The API works as follows:

```python
import numpy as np
import xbern_confidence_intervals as xbern_ci

n, k = 1000, 10  # n samples of k components each.

# The input is a boolean array of shape (n, k).
# The components are correlated in general but we take them
#  to be independent for a quick demonstration of this package.
samples = (np.random.rand(n, k) > 0.99)
beta = 0.05  # failure probability

# For asymptotic Wilson intervals (function name inferred from the
# vectorized variant shown further down):
left, right = xbern_ci.get_wilson_confidence_intervals(samples, beta)
# One-sided confidence intervals that satisfy Pr(mean < left) < beta or
# Pr(mean > right) < beta in the limit n -> infinity.
# Here, left/right are pd.Series with different confidence estimates as index.

# For non-asymptotic Bernstein intervals:
left, right = xbern_ci.get_bernstein_confidence_intervals(samples, beta)
# One-sided confidence intervals that satisfy Pr(mean < left) < beta or
# Pr(mean > right) < beta, which holds at each n.
```
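
For intuition about what a one-sided interval at failure probability `beta` looks like, the sketch below computes the textbook Wilson score bounds for a plain Bernoulli mean (the k = 1 case). It is not the package's implementation, and the helper name `wilson_one_sided` is hypothetical; the XBern intervals additionally account for correlation between the k coordinates.

```python
import math
from statistics import NormalDist


def wilson_one_sided(successes, n, beta):
    """Textbook one-sided Wilson score bounds for a Bernoulli mean.

    Hypothetical helper for illustration; not part of the package.
    Each bound holds asymptotically with failure probability beta.
    """
    z = NormalDist().inv_cdf(1.0 - beta)  # one-sided normal quantile
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / n + z * z / (4 * n * n)
    )
    # Clip to [0, 1] to guard against floating-point round-off.
    return max(0.0, center - half), min(1.0, center + half)


left, right = wilson_one_sided(successes=10, n=1000, beta=0.05)
print(f"lower bound: {left:.4f}, upper bound: {right:.4f}")
```

Unlike the simple `p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)` interval, the Wilson bounds stay inside [0, 1] and remain informative when the observed rate is close to 0, as in the rare-event demo above.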

We also provide a vectorized implementation of the Wilson intervals
that repeats the above calculations for each element in a batch:

```python
# Reuses np, n, k and beta from the previous snippet.
batch_size = 5  # illustrative value

# The first dimension represents the batch:
samples = (np.random.rand(batch_size, n, k) > 0.99)

left, right = xbern_ci.xbern_ci.get_wilson_confidence_intervals_vectorized(
    samples, beta
)
# Here, left/right are pd.DataFrame with confidence estimates as index and
# the batch entries on the columns.
```

# Citation

If you find this package useful, please cite
```
@inproceedings{pillutla2023unleashing,
  author       = {Krishna Pillutla and
                  Galen Andrew and
                  Peter Kairouz and
                  H. Brendan McMahan and
                  Alina Oprea and
                  Sewoong Oh},
  title        = {{Unleashing the Power of Randomization in Auditing Differentially Private
                  ML}},
  booktitle      = {NeurIPS},
  year         = {2023},
}
```

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/krishnap25/xbern_confidence_intervals",
    "name": "xbern-confidence-intervals",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": null,
    "author": "Krishna Pillutla",
    "author_email": "krishnap@dsai.iitm.ac.in",
    "download_url": "https://files.pythonhosted.org/packages/5b/4c/8fa45e197b7aad665d3a449a89c6bc1e16c676e8a595de87e8fc75110bab/xbern_confidence_intervals-1.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Compute adaptive confidence intervals for XBern distributions",
    "version": "1.0.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/krishnap25/xbern_confidence_intervals/issues",
        "Homepage": "https://github.com/krishnap25/xbern_confidence_intervals"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5b4c8fa45e197b7aad665d3a449a89c6bc1e16c676e8a595de87e8fc75110bab",
                "md5": "aef9dd388209431e21293bb27a3437a3",
                "sha256": "312a8aeb3020d77a5015497494d98ec9aa29a885a284b5bb0d95069821ca12d3"
            },
            "downloads": -1,
            "filename": "xbern_confidence_intervals-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "aef9dd388209431e21293bb27a3437a3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 12815,
            "upload_time": "2024-08-17T06:23:52",
            "upload_time_iso_8601": "2024-08-17T06:23:52.294331Z",
            "url": "https://files.pythonhosted.org/packages/5b/4c/8fa45e197b7aad665d3a449a89c6bc1e16c676e8a595de87e8fc75110bab/xbern_confidence_intervals-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-17 06:23:52",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "krishnap25",
    "github_project": "xbern_confidence_intervals",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "xbern-confidence-intervals"
}
