inference-gym

Name	inference-gym
Version	0.0.5
home_page	https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym
Summary	The Inference Gym is the place to exercise inference methods to help make them faster, leaner and more robust.
upload_time	2025-01-24 19:56:04
maintainer	None
docs_url	None
author	Google LLC
requires_python	>=3.6
license	Apache 2.0
keywords	tensorflow jax probability statistics bayesian machine learning
requirements	No requirements were recorded.

# Inference Gym

## Overview

The Inference Gym is the place to exercise inference methods to help make them
faster, leaner and more robust. The goal of the Inference Gym is to provide
a set of probabilistic inference problems with a standardized interface, making
it easy to test new inference techniques across a variety of challenging tasks.

Currently it provides a repository of probabilistic models that can be used to
benchmark (the computational and statistical performance of) inference
algorithms. Probabilistic models are implemented as subclasses of the
[`Model`][model] class, which at a minimum provides the following:

- A description of the shapes and dtypes of the parameters of the model.
- Event space bijectors, which map from the unconstrained real space to the
  support of the model's associated density.
- The ability to compute the unnormalized log density at a given parameter
  setting.
- The name of the model.
- Sample transformations, which, when applied to samples from the model's
  density, yield quantities with a useful interpretation.
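
As a rough illustration of the interface listed above, here is a minimal pure-Python sketch. `ToyPositiveScaleModel` is a hypothetical stand-in written for this example; it is not the library's `Model` class, and the real class exposes TensorFlow Probability bijectors and tensors rather than plain Python functions and floats.

```python
import math


class ToyPositiveScaleModel:
  """Hypothetical stand-in for illustration; NOT the Inference Gym `Model`.

  The target is an unnormalized Gamma(shape=3, rate=2) density over a single
  positive scalar parameter.
  """

  name = 'toy_positive_scale'
  event_shape = ()  # A scalar parameter.
  dtype = float

  def unnormalized_log_prob(self, x):
    # log(x^(3 - 1) * exp(-2 x)), with the normalizing constant dropped.
    return 2.0 * math.log(x) - 2.0 * x

  def default_event_space_bijector(self, z):
    # Maps an unconstrained real number z to the support (0, inf).
    return math.exp(z)

  @property
  def sample_transformations(self):
    # Named functions that turn raw samples into interpretable quantities.
    return {'identity': lambda x: x, 'log_scale': math.log}


model = ToyPositiveScaleModel()
x = model.default_event_space_bijector(-0.5)  # Any real input gives x > 0.
log_density = model.unnormalized_log_prob(x)
transformed = {
    key: fn(x) for key, fn in model.sample_transformations.items()}
```

The point of the sketch is only that an inference method needs nothing beyond these pieces: a shape and dtype to allocate state, a bijector to work in unconstrained space, and a log density to evaluate.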

Each model can additionally provide:

- Ground truth quantities associated with each sample transformation. These can
  include the mean, variance and other statistics. If they are estimated via
  Monte Carlo methods, a standard error is also provided. The ground truth
  values can be used to measure the bias of an inference algorithm.
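
One way such a bias check could look is sketched below. The helper `bias_z_score` and all numbers are hypothetical and chosen only for illustration; the idea is to express the gap between an algorithm's estimate and the ground truth mean in units of the combined standard error, so that Monte Carlo noise in either quantity is not mistaken for bias.

```python
import math


def bias_z_score(estimate, estimate_std_error, ground_truth_mean,
                 ground_truth_std_error=0.0):
  """Returns (estimate - ground truth) in units of combined standard error."""
  combined = math.sqrt(estimate_std_error**2 + ground_truth_std_error**2)
  return (estimate - ground_truth_mean) / combined


# Hypothetical values: an estimate from the algorithm under test, and a
# ground truth mean that was itself estimated by Monte Carlo.
z = bias_z_score(
    estimate=1.53,
    estimate_std_error=0.02,
    ground_truth_mean=1.50,
    ground_truth_std_error=0.01)

# A |z| well above 2 or 3 would suggest bias rather than sampling noise.
looks_unbiased = abs(z) < 2.0
```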

## Getting started

Check out the [tutorial].

## Usage

```bash
pip install tfp-nightly inference_gym
# Install at least one of the following
pip install tf-nightly  # For the TensorFlow backend.
pip install jax jaxlib  # For the JAX backend.
# Install to support external datasets
pip install tfds-nightly
```

```python
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from inference_gym import using_tensorflow as inference_gym

model = inference_gym.targets.GermanCreditNumericLogisticRegression()

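# `inference_method` is a placeholder for the inference algorithm under test;
# it is not provided by the Inference Gym.
samples = inference_method(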
  model.unnormalized_log_prob,
  model.default_event_space_bijector,
  model.event_shape,
  model.dtype)

plt.figure()
plt.suptitle(str(model))  # 'German Credit Numeric Logistic Regression'
for i, (name, sample_transformation) in enumerate(
    model.sample_transformations.items()):
  transformed_samples = sample_transformation(samples)
  bias_sq = tf.square(
      tf.reduce_mean(transformed_samples, 0) -
      sample_transformation.ground_truth_mean)
  ess = compute_ess(  # E.g. tfp.mcmc.effective_sample_size if using MCMC.
      transformed_samples,
      tf.square(sample_transformation.ground_truth_standard_deviation))
  plt.subplot(len(model.sample_transformations), 2, 2 * i + 1)
  plt.title('{} bias^2'.format(sample_transformation))  # e.g. 'Identity bias^2'
  plt.bar(np.arange(bias_sq.shape[-1]), bias_sq)
  plt.subplot(len(model.sample_transformations), 2, 2 * i + 2)
  plt.title('{} ess'.format(sample_transformation))
  plt.bar(np.arange(ess.shape[-1]), ess)
```

See also [`VectorModel`][vector_model], which can be used to simplify the
interface requirements for the inference method.
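
The `inference_method` placeholder in the usage snippet is left to the reader. As a hedged sketch of what such a method might do, the pure-Python example below runs random-walk Metropolis in unconstrained space for a scalar target. The function name, its signature, and the toy Gamma target are assumptions made for this example; they do not mirror any Inference Gym or TensorFlow Probability API.

```python
import math
import random


def random_walk_metropolis(target_log_prob, forward, forward_log_det_jacobian,
                           num_steps=20000, step_size=1.0, seed=0):
  """Samples a scalar target by random-walk Metropolis in unconstrained space."""
  rng = random.Random(seed)

  def unconstrained_log_prob(z):
    # Change of variables: density of z is p(forward(z)) * |d forward / dz|.
    return target_log_prob(forward(z)) + forward_log_det_jacobian(z)

  z = 0.0
  current = unconstrained_log_prob(z)
  samples = []
  for _ in range(num_steps):
    proposal = z + step_size * rng.gauss(0.0, 1.0)
    proposed = unconstrained_log_prob(proposal)
    # Accept with probability min(1, exp(proposed - current)).
    if rng.random() < math.exp(min(0.0, proposed - current)):
      z, current = proposal, proposed
    samples.append(forward(z))  # Report samples in the constrained space.
  return samples


# Toy target: unnormalized Gamma(shape=3, rate=2) density on x > 0.
# Its analytic mean is shape / rate = 1.5.
samples = random_walk_metropolis(
    target_log_prob=lambda x: 2.0 * math.log(x) - 2.0 * x,
    forward=math.exp,                      # z in R  ->  x in (0, inf)
    forward_log_det_jacobian=lambda z: z)  # log |d exp(z) / dz| = z
sample_mean = sum(samples) / len(samples)
```

Working in unconstrained space is the reason the model exposes an event space bijector: the sampler can propose any real number, and the bijector guarantees that every reported sample lies in the support of the density.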


## What makes for a good Inference Gym Model?

A good model should ideally do one or more of these:

- Help build intuition (usually 1D or 2D for ease of visualization)
- Represent a generally important application of Bayesian inference
- Pose a challenge for inference, e.g.
  - high dimensionality
  - poor or pathological conditioning
  - mixing continuous and discrete latents
  - multimodality
  - non-identifiability
  - expensive gradients

Naturally, a model shouldn’t have all of these properties at once, so that
users can more easily run experiments to isolate which complication has what
effect on the inference procedure. This list isn’t exhaustive.
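
To make one of these challenges concrete, the sketch below defines a toy bimodal density in pure Python. It is a hypothetical example written for this section, not one of the models shipped with the Inference Gym. The gap in log density between a mode and the midpoint gives a rough sense of the barrier a local sampler must cross to move between modes.

```python
import math


def bimodal_unnormalized_log_prob(x, separation=6.0):
  """Equal-weight mixture of N(-separation / 2, 1) and N(+separation / 2, 1)."""
  a = -0.5 * (x + separation / 2.0) ** 2
  b = -0.5 * (x - separation / 2.0) ** 2
  # Numerically stable log(exp(a) + exp(b)).
  m = max(a, b)
  return m + math.log(math.exp(a - m) + math.exp(b - m))


at_mode = bimodal_unnormalized_log_prob(3.0)
at_barrier = bimodal_unnormalized_log_prob(0.0)
# Larger values mean the modes are harder to travel between.
barrier_depth = at_mode - at_barrier
```

Because the only complication here is multimodality, any failure of a sampler on this target can be attributed to that property alone.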

## Making changes

### Adding a new model

It's easiest to mimic an existing example. Here's a small table to help you
find an example. If your model isn't described well by these possibilities,
feel free to ask for help.

| Bayesian Model? | Real dataset? | Analytic Ground Truth? | Stan Implementation? | Multiple RVs? | Example Model                                                               |
|-----------------|---------------|------------------------|----------------------|---------------|-----------------------------------------------------------------------------|
| Yes             | Real          | No                     | Yes                  | Yes           | [`GermanCreditNumericSparseLogisticRegression`][sparse_logistic_regression] |
| Yes             | Real          | No                     | Yes                  | No            | [`GermanCreditNumericLogisticRegression`][logistic_regression]              |
| Yes             | Synthetic     | No                     | Yes                  | Yes           | [`SyntheticItemResponseTheory`][irt]                                        |
| No              | None          | Yes                    | No                   | No            | [`IllConditionedGaussian`][gaussian]                                        |

A Bayesian model in the table above refers to models whose density over the
parameters is computed using the product of a prior and a likelihood function
(i.e. using Bayes' theorem). These models should inherit from the
[`BayesianModel`][bayesian_model] class, as it provides some utilities for such
models.
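
The prior-times-likelihood structure can be sketched in pure Python as follows, using a toy logistic regression. All function names and the data are hypothetical; this is not the `BayesianModel` API, only an illustration of how the unnormalized log density decomposes into a log prior plus a log likelihood.

```python
import math


def softplus(t):
  # Numerically stable log(1 + exp(t)).
  return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def log_prior(weights, scale=1.0):
  # Independent zero-mean normal prior, up to an additive constant.
  return sum(-0.5 * (w / scale) ** 2 for w in weights)


def log_likelihood(weights, features, labels):
  total = 0.0
  for x, y in zip(features, labels):
    logit = sum(w * xi for w, xi in zip(weights, x))
    # log Bernoulli(y | sigmoid(logit)) = y * logit - log(1 + exp(logit)).
    total += y * logit - softplus(logit)
  return total


def unnormalized_log_prob(weights, features, labels):
  # Bayes' theorem in log space, dropping the log evidence term.
  return log_prior(weights) + log_likelihood(weights, features, labels)


# Hypothetical toy data: three examples with two features each.
features = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]
labels = [1, 0, 1]
value_at_zero = unnormalized_log_prob([0.0, 0.0], features, labels)
value_at_fit = unnormalized_log_prob([0.0, 1.0], features, labels)
```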

Currently we have a little tooling to help use `cmdstanpy` to generate ground
truth values (in the correct format) for models without analytic ground truth.
Using this requires adding a model implementation inside the
[`inference_gym/tools/stan`][ground_truth_dir]
directory.

New (and existing) models should follow the [Model Contract][contract].

### Adding a new real dataset

We strongly encourage you to add your dataset to TensorFlow Datasets first.
Then, you can follow the example of the `German Credit (numeric)` dataset used
in the `GermanCreditNumericLogisticRegression` model.

### Adding a new synthetic dataset

Follow the example of the [`SyntheticItemResponseTheory`][irt] model.

### Generating ground truth files

See [`inference_gym/tools/get_ground_truth.py`][get_ground_truth].

[model]: https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym/inference_gym/targets/model.py
[get_ground_truth]: https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym/inference_gym/tools/get_ground_truth.py
[ground_truth_dir]: https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym/inference_gym/tools/stan
[bayesian_model]: https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym/inference_gym/targets/bayesian_model.py
[sparse_logistic_regression]: https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym/inference_gym/targets/sparse_logistic_regression.py
[logistic_regression]: https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym/inference_gym/targets/logistic_regression.py
[irt]: https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym/inference_gym/targets/item_response_theory.py
[gaussian]: https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym/inference_gym/targets/ill_conditioned_gaussian.py
[vector_model]: https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym/inference_gym/targets/vector_model.py
[tutorial]: https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym/notebooks/inference_gym_tutorial.ipynb
[contract]: https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym/model_contract.md

## Citing Inference Gym

To cite the Inference Gym:

```none
@software{inferencegym2020,
  author = {Pavel Sountsov and Alexey Radul and contributors},
  title = {Inference Gym},
  url = {https://pypi.org/project/inference_gym},
  version = {0.0.4},
  year = {2020},
}
```

Make sure to update the `version` attribute to match the actual version you're
using.

Raw data

{
    "_id": null,
    "home_page": "https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym",
    "name": "inference-gym",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "tensorflow jax probability statistics bayesian machine learning",
    "author": "Google LLC",
    "author_email": "no-reply@google.com",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache 2.0",
    "summary": "The Inference Gym is the place to exercise inference methods to help make them faster, leaner and more robust.",
    "version": "0.0.5",
    "project_urls": {
        "Homepage": "https://github.com/tensorflow/probability/tree/main/spinoffs/inference_gym"
    },
    "split_keywords": [
        "tensorflow",
        "jax",
        "probability",
        "statistics",
        "bayesian",
        "machine",
        "learning"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "20d7b872ca17316bf495604aa67e68ab01e872c0c117c5439f90a94f6ca8c1a3",
                "md5": "df5909e3726904147f3330a7c7cf6321",
                "sha256": "13794d80264839b3c1925d0f5800941b5f573a208e875f8dfec5051d0c80ff01"
            },
            "downloads": -1,
            "filename": "inference_gym-0.0.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "df5909e3726904147f3330a7c7cf6321",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 390898,
            "upload_time": "2025-01-24T19:56:04",
            "upload_time_iso_8601": "2025-01-24T19:56:04.880237Z",
            "url": "https://files.pythonhosted.org/packages/20/d7/b872ca17316bf495604aa67e68ab01e872c0c117c5439f90a94f6ca8c1a3/inference_gym-0.0.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-24 19:56:04",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "tensorflow",
    "github_project": "probability",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "inference-gym"
}
