# GPry
### A Package for Bayesian inference of expensive likelihoods with Gaussian Processes.
| Authors | Jonas El Gammal, Jesus Torrado, Nils Schoeneberg and Christian Fidler |
|---------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Source code | [Source code on GitHub](https://github.com/jonaselgammal/GPry) |
| Documentation | [Documentation on Read the Docs](https://gpry.readthedocs.io/en/latest/) |
| License | [LGPL](https://www.gnu.org/licenses/lgpl-3.0.en.html) + mandatory bug reporting asap + <br>mandatory [arXiv'ing](https://arxiv.org) of publications using it (see [LICENCE.txt](https://github.com/jonaselgammal/GPry/blob/main/LICENSE) for exceptions).<br>The documentation is licensed under the [GFDL](https://www.gnu.org/licenses/fdl-1.3.en.html). |
| Support | For questions drop me an [email](mailto:jonas.el.gammal@rwth-aachen.de). For issues/bugs please use GitHub's functions. |
| Installation | ``pip install gpry`` |
GPry was developed as part of my master's thesis and can be seen as an alternative to established samplers like MCMC and Nested Sampling.
It targets a specific class of posterior distributions (with the initial goal of speeding up inference in cosmology), aiming to efficiently obtain marginal quantities from (computationally) expensive likelihoods.
Although our background is in cosmology, GPry will work with any likelihood which can be called as a Python function. It uses [Cobaya's](https://github.com/CobayaSampler/cobaya) model framework, so all of Cobaya's built-in likelihoods work too.
### What kinds of likelihoods/posteriors work with GPry?
The requirements that your likelihood/posterior has to fulfil in order for this algorithm to be efficient and give correct results are as follows:
- The likelihood/posterior should be *smooth* (at least continuous), and ideally you should know how smooth (how many times differentiable).
- The likelihood/posterior evaluation should be *slow*. What "slow" means depends on the number of dimensions and the expected shape of the posterior distribution, but as a rule of thumb: if your MCMC takes longer to converge than you're willing to wait, you should give GPry a shot.
- The likelihood should be *low-dimensional* (d < 20 as a rule of thumb). In higher dimensions you might still see considerable speed-ups if your likelihood is sufficiently slow, but the computational overhead of the algorithm grows substantially.
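As a concrete (hypothetical) illustration, a likelihood meeting these criteria might look like the toy function below: smooth, two-dimensional, and artificially slowed down to stand in for an expensive theory code. The function name and the sleep are illustrative assumptions, not part of GPry's API.

```python
import time

# Hypothetical likelihood suited to GPry: smooth (infinitely
# differentiable), low-dimensional (d = 2), and slow (the sleep
# stands in for an expensive theory computation).
def log_likelihood(x, y):
    time.sleep(0.01)  # placeholder for the costly part
    return -0.5 * (x ** 2 + (y - 1.0) ** 2)
```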
### What does GPry do?
Unlike algorithms such as MCMC, which sample a posterior distribution, GPry is designed to interpolate it using Gaussian Process regression and active sampling. This converges to the posterior shape with far fewer posterior evaluations than sampling algorithms, because it places a prior on the functional shape of the posterior.
This doesn't mean that your posterior has to have a certain shape, but rather that we assume the posterior is a continuous, differentiable function with a single characteristic length scale along each dimension (don't take this too literally, though: GPry will still work with many likelihoods which do not have a single characteristic length scale at all).
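To illustrate the idea (this is a minimal sketch, not GPry's actual implementation), a GP with an RBF kernel — which encodes exactly the single-length-scale assumption above — can interpolate a toy 1-d log-posterior from a handful of evaluations:

```python
import numpy as np

# Toy 1-d log-posterior: a Gaussian bump (smooth, single length scale).
def log_post(x):
    return -0.5 * (x - 1.0) ** 2

# A handful of training evaluations (the "expensive" calls).
X = np.array([-2.0, -1.0, 0.0, 0.5, 1.5, 3.0])
y = log_post(X)

# RBF kernel with a single characteristic length scale.
def rbf(a, b, length=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

K = rbf(X, X) + 1e-10 * np.eye(len(X))  # jitter for numerical stability
alpha = np.linalg.solve(K, y)

# GP posterior mean at a new point interpolates the log-posterior.
x_new = np.array([1.0])
mean = rbf(x_new, X) @ alpha
```

At the training points the GP reproduces the evaluated values exactly; in between, the mean fills in the posterior shape without further likelihood calls.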
Furthermore, GPry implements a number of tricks to mitigate some of the pitfalls associated with interpolating functions with GPs. The most important ones are:
- A novel **acquisition function** for efficient sampling of the parameter space. This procedure is inspired by Bayesian optimization.
- A batch acquisition algorithm which enables evaluating the likelihood/posterior in parallel using multiple cores. This is based on the **Kriging-believer** algorithm. A nice bonus is that it also decreases the time for fitting the GP's hyperparameters.
- To prevent sampling regions which fall well outside the 2σ contours, and to account for the fact that many theory codes simply return 0 far away from the fiducial values instead of computing the actual likelihood (which leads to −∞ in the log-posterior), we shrink the prior using an **SVM classifier** which divides the parameter space into a "finite" region of interest and an "infinite" (uninteresting) region.
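The Kriging-believer idea above can be sketched as follows. All names here are assumptions for illustration, and a generic UCB-style acquisition (mean plus scaled standard deviation) stands in for GPry's actual acquisition function:

```python
import numpy as np

def rbf(a, b, length=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

def gp_mean_var(X, y, x_new, jitter=1e-8):
    """GP posterior mean and variance with a unit-variance RBF kernel."""
    K = rbf(X, X) + jitter * np.eye(len(X))
    k_star = rbf(x_new, X)
    mean = k_star @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum("ij,ji->i", k_star, np.linalg.solve(K, k_star.T))
    return mean, np.clip(var, 0.0, None)

def log_post(x):
    return -0.5 * x ** 2

X = np.array([-3.0, 0.5, 2.0])
y = log_post(X)
candidates = np.linspace(-4, 4, 201)

# Kriging believer: select a batch of points with NO new true likelihood
# calls, by "believing" the GP mean at each point already selected.
batch = []
Xb, yb = X.copy(), y.copy()
for _ in range(3):
    mean, var = gp_mean_var(Xb, yb, candidates)
    acq = mean + 2.0 * np.sqrt(var)  # generic UCB stand-in, not GPry's
    i = int(np.argmax(acq))
    batch.append(candidates[i])
    Xb = np.append(Xb, candidates[i])
    yb = np.append(yb, mean[i])      # believed value, not a true call
# The batch points can now be evaluated in parallel on separate cores.
```

Because the believed point's predictive variance collapses to zero, the acquisition immediately stops re-proposing it, so the batch spreads out across the parameter space.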
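The finite/infinite split can be illustrated with a toy stand-in. Note that a simple bounding-interval rule replaces the actual SVM classifier GPry uses; the data and threshold here are invented for the sketch:

```python
import numpy as np

# Training samples: outside |x| > 2.5 the (toy) theory code returns
# log-posterior = -inf instead of a real value, as described above.
samples = np.array([-5.0, -3.0, -1.0, 0.0, 1.0, 3.0, 5.0])
logp = np.where(np.abs(samples) > 2.5, -np.inf, -0.5 * samples ** 2)

# "Finite" class: samples with a finite log-posterior. A bounding
# interval stands in for GPry's learned SVM decision boundary.
finite = samples[np.isfinite(logp)]
lo, hi = finite.min(), finite.max()

def in_region(x):
    """Classify a proposed point as inside the region of interest."""
    return lo <= x <= hi
```

New candidate points classified as "infinite" are then never proposed to the expensive likelihood at all.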
### What benefits does GPry offer compared to MCMC, Nested Sampling, ...?
To put it bluntly: mostly the number of samples required to converge to the correct posterior shape. The performance gain is therefore most pronounced when a single likelihood/posterior evaluation is very costly (e.g. when it requires running expensive theory calculations or processing large amounts of data).
### Why does GPry require so few samples to converge?
Unlike most samplers, GPry does not select sampling locations statistically, but instead uses a deterministic function which is optimized to always sample the location that adds the most information. Furthermore, unlike samplers which essentially build a histogram of the sampling locations (like MCMC) and are oblivious to the posterior values themselves, GPry uses all the information it can get from the samples by interpolating. There is no such thing as a rejected step: every sample contributes to the GP interpolation.
### Where's the catch?
Like every other sampler GPry isn't perfect and has some limitations:
- GPs don't scale well with the number of training samples, since training the GP involves inverting a kernel matrix, and the computational cost of this inversion scales with the cube of the number of training samples. As the number of required training samples grows, the overhead of the algorithm therefore increases considerably.
- While we have tested GPry on mildly multimodal posterior distributions, multimodality is not a supported feature and should be treated with caution. The algorithm is fairly greedy and can easily miss a mode, especially if the separation between modes is large.
- The algorithm is generally robust towards "weirdly" shaped posterior distributions; however, the structure of the kernel still assumes a single characteristic correlation length. In very pathological cases (for instance a very wide distribution with a tiny spike in it), GPry will struggle to capture the mode correctly.
- This code is novel and hasn't benefited from years of user feedback and maintenance. Bugs can (and probably will) occur. If you find any, please report them to us!
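The cubic scaling mentioned above means that doubling the training set makes the kernel-matrix solve roughly eight times more expensive. A back-of-the-envelope cost model (illustrative only):

```python
# O(n^3) cost model for the dense kernel-matrix factorisation:
# relative cost of the GP fit as the training set grows.
def relative_cost(n, n_ref=100):
    return (n / n_ref) ** 3

# Doubling the training set => ~8x the linear-algebra cost.
```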
**We are actively working on mitigating some of those issues and we will keep on developing this code so look out for new versions!**