### PLD_subsampling
Implements and evaluates privacy amplification by subsampling for Privacy Loss Distribution (PLD) probability mass functions (PMFs). Generates CDF plots and epsilon ratio plots comparing analytical ground truth, `dp-accounting`, and our direct subsampling implementation.
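For intuition, the loss transformation behind amplification by subsampling can be sketched in a few lines. For the pair (qP + (1-q)Q, Q) with sampling probability q, a base privacy loss ℓ = log(P/Q) maps to log(1 + q(e^ℓ - 1)). The function name `subsampled_loss` and the branch threshold below are illustrative assumptions; this is not the package's `stable_subsampling_loss`, only a minimal sketch of the same kind of mapping.

```python
import math


def subsampled_loss(loss: float, q: float) -> float:
    """Map a base privacy loss to log(1 + q * (exp(loss) - 1)).

    Hypothetical helper for illustration; not part of PLD_subsampling.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    if q == 1.0:
        # No subsampling: the loss is unchanged.
        return loss
    if loss <= 1.0:
        # expm1/log1p keep precision when the loss is close to zero.
        return math.log1p(q * math.expm1(loss))
    # For large losses, factor out exp(loss) so nothing overflows.
    return loss + math.log(q + (1.0 - q) * math.exp(-loss))


if __name__ == "__main__":
    for base_loss in (-5.0, 0.0, 0.5, 5.0, 800.0):
        print(base_loss, subsampled_loss(base_loss, q=0.1))
```

Splitting into two branches is one way to avoid both cancellation near zero and overflow for large losses; other thresholds would work equally well.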
### Package layout
- `PLD_subsampling/`
  - `PLD_subsampling_impl.py`: Core subsampling primitives
    - `stable_subsampling_loss`: numerically stable loss mapping
    - `exclusive_ccdf_from_pdf`: CCDF helper (exclusive tail)
    - `subsample_losses`: transforms a PMF on a uniform loss grid
  - `wrappers/dp_accounting_wrappers.py`: Thin wrappers around `dp-accounting` (construct PLDs, amplify PLDs separately for remove/add), plus PMF/PLD utilities
    - `amplify_pld_separate_directions(base_pld, sampling_prob) -> PrivacyLossDistribution`: returns a PLD with amplified remove/add PMFs
    - `scale_pmf_infinity_mass(pmf, delta) -> PMF`: increases the infinity mass by `delta` and scales all finite probabilities by `(1-β-δ)/(1-β)`, where β is the current infinity mass, preserving the PMF type (dense/sparse)
    - `scale_pld_infinity_mass(pld, delta) -> PrivacyLossDistribution`: applies the same infinity-mass change to both directions of a PLD and returns a new PLD
  - `testing/`
    - `analytic_Gaussian.py`: Analytical PLD and epsilon(δ) formulas for the Gaussian mechanism
    - `test_utils.py`: Experiment runners (`run_experiment`, `run_multiple_experiments`)
    - `plot_utils.py`: Plotting (CDF with focused x-range, epsilon ratio)
  - `main.py`: Runs experiments and saves figures to `plots/`
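As an illustration of what an "exclusive tail" CCDF means, the sketch below computes P(X > x_i) for every grid point of a PMF, so the point mass at x_i itself is excluded. The name `exclusive_ccdf` and the plain-list interface are assumptions made for this example; the package's `exclusive_ccdf_from_pdf` may differ in signature and numerical details.

```python
from itertools import accumulate


def exclusive_ccdf(probs):
    """Return [P(X > x_i) for each i], given probs[i] = P(X = x_i).

    Hypothetical helper for illustration; not part of PLD_subsampling.
    """
    if not probs:
        return []
    # Inclusive tail sums accumulated from the right: P(X >= x_i).
    inclusive = list(accumulate(reversed(probs)))[::-1]
    # Shift by one position to drop the point mass at x_i itself.
    return inclusive[1:] + [0.0]


if __name__ == "__main__":
    print(exclusive_ccdf([0.2, 0.3, 0.5]))
```

Accumulating from the right keeps the small tail probabilities from being swamped by the large head mass, which is one reason to build a CCDF this way rather than as one minus the CDF.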
### Quickstart
1) Create a virtual environment and install dependencies
```bash
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt
```
2) Editable install for local development (optional)
```bash
pip install -e .
```
3) Run experiments and generate plots
```bash
python -m PLD_subsampling.main
```
Figures are written to `plots/` (treat this directory as build output).
### Usage examples
Scale infinity mass on a PMF and a PLD (see `PLD_subsampling/example.ipynb` for a full demo):
```python
from PLD_subsampling.wrappers.dp_accounting_wrappers import (
    scale_pmf_infinity_mass,
    scale_pld_infinity_mass,
)
from dp_accounting.pld import privacy_loss_distribution

# Build a fresh PLD
pld = privacy_loss_distribution.from_gaussian_mechanism(
    standard_deviation=1.0,
    sensitivity=1.0,
    value_discretization_interval=1e-4,
    sampling_prob=0.1,
    pessimistic_estimate=True,
)

# Scale a single PMF
pmf_scaled = scale_pmf_infinity_mass(pld._pmf_remove, delta=1e-4)

# Scale both directions of the PLD
pld_scaled = scale_pld_infinity_mass(pld, delta=1e-4)
```
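The arithmetic behind the infinity-mass scaling can be checked by hand. The sketch below applies the documented factor `(1-β-δ)/(1-β)` to a toy list of finite probabilities; the function name `scale_with_infinity_mass` and the list-based representation are assumptions for illustration and do not mirror the `dp-accounting` PMF classes.

```python
def scale_with_infinity_mass(finite_probs, beta, delta):
    """Return (scaled finite probabilities, new infinity mass).

    Hypothetical helper for illustration; not part of PLD_subsampling.
    beta is the current infinity mass, delta the mass to add to it.
    """
    if not (0.0 <= beta < 1.0 and 0.0 <= delta <= 1.0 - beta):
        raise ValueError("need 0 <= beta < 1 and 0 <= delta <= 1 - beta")
    # Shrink the finite part so the total mass stays equal to 1.
    factor = (1.0 - beta - delta) / (1.0 - beta)
    return [p * factor for p in finite_probs], beta + delta


if __name__ == "__main__":
    finite = [0.25, 0.5, 0.15]  # sums to 1 - beta with beta = 0.1
    scaled, new_beta = scale_with_infinity_mass(finite, beta=0.1, delta=0.09)
    print(scaled, new_beta, sum(scaled) + new_beta)
```

Because every finite probability is multiplied by the same factor, the shape of the finite part is unchanged and only its total mass shrinks from `1-β` to `1-β-δ`.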
### Notes
- CDF plots automatically focus the main x-axis on the transition region and add slight y-padding to show the 0 and 1 limits clearly.
- Epsilon-ratio plots show the ratio of each method's epsilon to the analytical ground-truth (GT) epsilon, plotted against the analytical epsilon on a log-scale axis.
- All heavy computations use vectorized NumPy operations with careful numerical handling in tail regions.
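The analytical epsilon used as ground truth for a non-subsampled Gaussian mechanism has a well-known closed form for δ(ε), which can be inverted numerically. The sketch below is a minimal standalone version under that assumption; the names `gaussian_delta` and `gaussian_epsilon` are hypothetical and this is not the code in `analytic_Gaussian.py`.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gaussian_delta(eps: float, sigma: float, sensitivity: float = 1.0) -> float:
    """delta(eps) for the Gaussian mechanism with noise scale sigma."""
    a = sensitivity / (2.0 * sigma)
    b = eps * sigma / sensitivity
    return normal_cdf(a - b) - math.exp(eps) * normal_cdf(-a - b)


def gaussian_epsilon(delta: float, sigma: float, sensitivity: float = 1.0,
                     tol: float = 1e-12) -> float:
    """Invert delta(eps) by bisection; delta(eps) is decreasing in eps."""
    if gaussian_delta(0.0, sigma, sensitivity) <= delta:
        return 0.0
    lo, hi = 0.0, 1.0
    # Grow the bracket until delta(hi) drops to the target or below.
    while gaussian_delta(hi, sigma, sensitivity) > delta:
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, sigma, sensitivity) > delta:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    print(gaussian_epsilon(delta=1e-5, sigma=1.0))
```

This sketch does not guard against overflow of `exp(eps)` for very small `sigma`, so it is only suitable for moderate parameter ranges.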
### Build a package
```bash
python -m pip install --upgrade build
python -m build
```
Artifacts will be created under `dist/`. To upload to PyPI/TestPyPI, use `twine` with an API token.
### Package metadata
- Name: `PLD-subsampling` (version 0.1.2)
- Summary: PLD PMF/PLD transformations: subsampling amplification and infinity-mass scaling utilities
- Author: Moshe Shenfeld
- License: Apache-2.0
- Requires Python: >=3.9
- Keywords: differential-privacy, privacy-loss-distribution, subsampling, dp, pld
- Homepage: https://github.com/moshenfeld/PLD_subsampling
- Repository: https://github.com/moshenfeld/PLD_subsampling.git
- Uploaded: 2025-08-19 19:38:02
- Release files:
  - `pld_subsampling-0.1.2-py3-none-any.whl` (wheel, 14468 bytes, sha256 `611a8cef959c50969c68487bd284ae77517bd1c5ae223b93a940c4a879fe3669`)
  - `pld_subsampling-0.1.2.tar.gz` (sdist, 13474 bytes, sha256 `85a70652457671433b22b5565dd5ba687c61c8f9311864aee72358ca99590e98`)
- Pinned requirements: `numpy==2.3.2`, `scipy==1.16.1`, `matplotlib==3.10.5`, `dp-accounting==0.5.0`