CmdStanCache
=============
Quicker model iterations and enhanced productivity for Stan MCMC by
* caching model compilation in a smart way
* caching sampling results in a smart way
No more waiting to resample the same model with the same data.
Install
-------
First install `CmdStanPy <https://cmdstanpy.readthedocs.io/>`_ and
CmdStan and make sure they work.
::
$ pip install cmdstancache
Usage
-----
::

    import cmdstancache

    model = """
    data {
      int N;
    }
    parameters {
      // array[N] syntax (Stan 2.26+); the older "real x[N]" form was removed in Stan 2.33
      array[N] real<lower=-10.0, upper=10.0> x;
    }
    model {
      // Rosenbrock-type target density
      for (i in 1:N-1) {
        target += -2 * (100 * square(x[i+1] - square(x[i])) + square(1 - x[i]));
      }
    }
    """
    data = dict(N=2)

    stan_variables, method_variables = cmdstancache.run_stan(
        model,
        data=data,
        # any other sample() parameters go here
        seed=42,
    )
**Now comes the trick**:
* If you run this code twice, the second time the stored result is read.
* If you add or modify a code comment, the same result is returned without having to rerun.
.. image:: https://coveralls.io/repos/github/JohannesBuchner/CmdStanCache/badge.svg?branch=main
:target: https://coveralls.io/github/JohannesBuchner/CmdStanCache?branch=main
.. image:: https://github.com/JohannesBuchner/CmdStanCache/actions/workflows/testing.yml/badge.svg
:target: https://github.com/JohannesBuchner/CmdStanCache/actions/workflows/testing.yml
.. image:: https://img.shields.io/pypi/v/cmdstancache.svg
:target: https://pypi.python.org/pypi/cmdstancache
How it works
-------------
cmdstancache keeps a cache of code and data that has previously been used for MCMC sampling.
If it already has the results, it returns them from the cache.
Here are the details:
1. The code is normalised (stripped of comments and indents)
2. A hash of the normalised code is computed
3. The model code is stored in ~/.stan_cache/<codehash>.stan
4. The model is compiled, if it is not already there
5. The data are sorted by key, exported to JSON, and a hash is computed
6. The data are stored in ~/.stan_cache/<datahash>.json
7. cmdstanpy MCMC is run with code=<codehash>.stan and data=<datahash>.json
8. fit.stan_variables() and fit.method_variables() are returned
9. joblib memoizes steps 7 and 8, avoiding resampling when the same data and code hash are seen.
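
The steps above can be sketched with the standard library alone. This is a minimal illustration, not the code cmdstancache actually runs: the function names, the regex-based comment stripping, the choice of SHA-256, and the in-memory dictionary standing in for joblib are all assumptions made for the example.

```python
import hashlib
import json
import re

_memory = {}  # in-memory stand-in for the joblib cache mentioned in step 9


def normalise_code(code):
    """Strip comments and indentation (step 1); naive, ignores string literals."""
    code = re.sub(r"/\*.*?\*/", "", code, flags=re.DOTALL)  # block comments
    code = re.sub(r"//[^\n]*", "", code)                    # line comments
    stripped = (line.strip() for line in code.splitlines())
    return "\n".join(line for line in stripped if line)


def hash_text(text):
    """Hex digest of a string; SHA-256 is an assumption, not the library's choice."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def hash_data(data):
    """Serialise data with sorted keys (step 5) and return (hash, json_text)."""
    payload = json.dumps(data, sort_keys=True)
    return hash_text(payload), payload


def cached_run(code, data, sampler):
    """Call sampler(code, data) only for unseen (code hash, data hash) pairs."""
    key = (hash_text(normalise_code(code)), hash_data(data)[0])
    if key not in _memory:
        _memory[key] = sampler(code, data)
    return _memory[key]


model_a = """
data {
  int N;  // number of parameters
}
"""
model_b = """
/* same model, new comment and indentation */
data {
        int N;
}
"""

calls = []


def fake_sampler(code, data):
    """Hypothetical placeholder for the expensive MCMC run."""
    calls.append(data)
    return {"x": [0.0] * data["N"]}


first = cached_run(model_a, {"N": 2}, fake_sampler)
second = cached_run(model_b, {"N": 2}, fake_sampler)
print(len(calls))  # the sampler ran once; the second call hit the cache
```

Because both models normalise to the same text, they share one cache key, which is why editing a comment does not trigger a rerun. Changing the data or any non-comment code produces a new key and a fresh run.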
Plotting
--------
Make a quick corner plot of only the scalar model variables::
cmdstancache.plot_corner(stan_variables)
In case some chains are stuck, and you want to remove their samples for plotting::
cleaned_variables = remove_stuck_chains(stan_variables, method_variables)
plot = plot_corner(cleaned_variables)
Plotting is optional, so the corner dependency is only pulled in when installing with::
$ pip install cmdstancache[plot]
Contributors
-------------
* @JohannesBuchner
Contributions are welcome.
Raw data
{
"_id": null,
"home_page": "https://github.com/JohannesBuchner/CmdStanCache",
"name": "cmdstancache",
"maintainer": "",
"docs_url": null,
"requires_python": ">3.0, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*",
"maintainer_email": "",
"keywords": "",
"author": "Johannes Buchner",
"author_email": "johannes.buchner.acad@gmx.com",
"download_url": "https://files.pythonhosted.org/packages/dd/6b/31630564a7ec16d6b3c53745a411a8755188fea9e6a385048bdd6bb2bfc2/cmdstancache-1.2.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GPL",
"summary": "Smart cache for Stan models and runs",
"version": "1.2.2",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "dd6b31630564a7ec16d6b3c53745a411a8755188fea9e6a385048bdd6bb2bfc2",
"md5": "e356e67d60673b98d3a4b3161a62b6a1",
"sha256": "0ec885f01df441b5f16b7ef1c14547b1f44a2e9c98cf94655b0563402dc0099b"
},
"downloads": -1,
"filename": "cmdstancache-1.2.2.tar.gz",
"has_sig": true,
"md5_digest": "e356e67d60673b98d3a4b3161a62b6a1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">3.0, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*",
"size": 18066,
"upload_time": "2023-03-27T19:32:02",
"upload_time_iso_8601": "2023-03-27T19:32:02.722635Z",
"url": "https://files.pythonhosted.org/packages/dd/6b/31630564a7ec16d6b3c53745a411a8755188fea9e6a385048bdd6bb2bfc2/cmdstancache-1.2.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-03-27 19:32:02",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "JohannesBuchner",
"github_project": "CmdStanCache",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"lcname": "cmdstancache"
}