# Generalized Inverse Normal distribution
The `ginormal` package provides density functions and random variate generation for the generalized inverse normal (GIN) distribution introduced by [Robert (1991)](#2). The GIN distribution generalizes the distribution of the reciprocal of a normal random variable, that is, of $Z = 1/X$ where $X \sim \text{Normal}(\mu, \sigma^2)$. It is *different* from the generalized inverse Gaussian (GIG) distribution [(Jørgensen, 2012)](#3) despite the similar name (see [below](#digression)).
The GIN distribution is supported on the entire real line $z \in (-\infty, \infty)$ and takes three parameters:
- $\alpha > 1$, a degrees-of-freedom parameter,
- $\mu \in (-\infty, \infty)$, which acts like a location parameter and shifts the density left or right,
- $\tau > 0$, which acts like a scale parameter and controls the spread of the density.
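As a quick check of the reciprocal-normal connection, the change of variables $z = 1/x$ gives $Z = 1/X$ the density $|z|^{-2} \phi\left((1/z - \mu)/\sigma\right)/\sigma$, where $\phi$ is the standard normal density. This has the GIN form with $\alpha = 2$ and $\tau = \sigma$. The standard-library sketch below integrates that density over an interval and compares the result with the matching normal probability. The helper names are illustrative and are not part of `ginormal`.

```python
import math


def reciprocal_normal_pdf(z, mu, sigma):
    """Density of Z = 1/X for X ~ Normal(mu, sigma^2), by change of variables."""
    x = 1.0 / z
    normal_pdf = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)
    return normal_pdf / (z * z)  # Jacobian |dx/dz| = 1 / z^2


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of Normal(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def midpoint_integral(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


mu, sigma = 1.0, 1.0
a, b = 0.5, 2.0

# P(a < Z < b) computed from the reciprocal density ...
prob_from_density = midpoint_integral(lambda z: reciprocal_normal_pdf(z, mu, sigma), a, b)
# ... should equal P(1/b < X < 1/a) for the underlying normal variable.
prob_from_normal = normal_cdf(1.0 / a, mu, sigma) - normal_cdf(1.0 / b, mu, sigma)

print(prob_from_density, prob_from_normal)
```

The two printed numbers should agree up to the integration error, which illustrates that $\alpha = 2$ is the plain reciprocal-normal case and other values of $\alpha$ change the power on $|z|$.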
This package is the first to provide an efficient sampling algorithm for drawing from the GIN distribution. We provide similar routines for the GIN distribution truncated to the positive or negative reals. Further details of the distribution, theoretical guarantees and pseudo-code for the sampling algorithms, as well as an application to Bayesian estimation of network formation models can be found in the working paper [Ding, Estrada and Montoya-Blandón (2023)](#1).
## Installation
To install the package for use in Python, run the following command in a terminal:
```
python -m pip install ginormal
```
## Examples
Examples of how to use the `ginormal` package routines are available in the [GitHub repository](https://github.com/smonto2/ginormal/blob/main/example.py).
## Routines
Provided with the package are four main routines:
1. `dgin(z, alpha, mu, tau, log=True, quasi=False)`
2. `dtgin(z, alpha, mu, tau, sign, log=True, quasi=False)`
3. `rgin(size, alpha, mu, tau, algo)`
4. `rtgin(size, alpha, mu, tau, sign, algo)`
The first two compute densities and the last two generate random variates. Density routines take the quantile `z`; the parameters `alpha`, `mu` and `tau`; and two optional logical arguments:
- `log`: whether to return the logarithm of the density. Defaults to `True`.
- `quasi`: whether to return the value of the kernel (or quasi-density) instead of the normalized density. Defaults to `False`.
Generation routines take the same parameters plus a required `size` argument giving the number of random variates to generate. These routines only accept `alpha` larger than 2. They take an additional argument `algo`, which can be either `"hormann"` or `"leydold"` and defaults to `"hormann"`, our preferred method. See [below for details](#rvgeneration) on both points.
Routines with "`t`" in their name handle the truncated variants. They take an additional logical argument `sign`, where `sign=True` truncates to positive numbers $(z > 0)$ and `sign=False` to negative numbers $(z < 0)$.
## Density functions
Let $Z \sim \text{GIN}(\alpha, \mu, \tau)$. The GIN density function is given by
$$f_Z(z) = \frac{1}{C(\alpha, \mu, \tau)} |z|^{-\alpha}\exp\left[-\frac{1}{2\tau^2} \left( \frac{1}{z} - \mu \right)^2 \right] \equiv \frac{g(z; \alpha, \mu, \tau)}{C(\alpha, \mu, \tau)}$$
where $g(z; \alpha, \mu, \tau)$ is the kernel or quasi-density and the proportionality constant can be written in closed form as
```math
C(\alpha, \mu, \tau) = (\sqrt{2} \tau)^{\alpha-1} \exp\left(- \frac{\mu^2}{2\tau^2} \right) \Gamma\left(\frac{\alpha-1}{2}\right) {}_1F_1\left(\frac{\alpha-1}{2}; \frac{1}{2}; \frac{\mu^2}{2\tau^2}\right)
```
where $\Gamma(x)$ is the [Gamma function](https://mathworld.wolfram.com/GammaFunction.html) and $`{}_1F_1(a; b; x)`$ is the [confluent hypergeometric function](https://mathworld.wolfram.com/ConfluentHypergeometricFunctionoftheFirstKind.html). In addition to the density and generation routines for the GIN distribution, we provide similar routines for the GIN distribution truncated to positive or negative numbers. These are denoted by $\text{GIN}^{+}$ when truncated to $(0, \infty)$ and by $\text{GIN}^{-}$ when truncated to $(-\infty, 0)$. Let $Z^{+} \sim \text{GIN}^{+}(\alpha, \mu, \tau)$ and $Z^{-} \sim \text{GIN}^{-}(\alpha, \mu, \tau)$. Their densities are given by
$$f_{Z^{+}}(z) = \frac{g(z; \alpha, \mu, \tau)}{C^{+}(\alpha, \mu, \tau)} \mathbb{I}(z > 0)$$
$$f_{Z^{-}}(z) = \frac{g(z; \alpha, \mu, \tau)}{C^{-}(\alpha, \mu, \tau)} \mathbb{I}(z < 0)$$
with proportionality constants
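$$C^{+}(\alpha, \mu, \tau) = \tau^{\alpha-1} e^{-\frac{\mu^2}{4\tau^2}} \Gamma(\alpha - 1) D_{-(\alpha-1)}\left(-\frac{\mu}{\tau}\right)$$
$$C^{-}(\alpha, \mu, \tau) = \tau^{\alpha-1} e^{-\frac{\mu^2}{4\tau^2}} \Gamma(\alpha - 1) D_{-(\alpha-1)}\left(\frac{\mu}{\tau}\right)$$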
where $\mathbb{I}(\cdot)$ is the indicator function that is 1 when its argument is true and 0 otherwise, and $D_\nu(x)$ is the [parabolic cylinder function](https://mathworld.wolfram.com/ParabolicCylinderFunction.html). [^1]
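The constants in this section can be checked numerically with the standard library alone. The sketch below substitutes $x = 1/z$, which turns each constant into an integral of $|x|^{\alpha-2} \exp\left[-(x - \mu)^2 / (2\tau^2)\right]$, approximates $C^{+}$ and $C^{-}$ by the midpoint rule, and compares their sum with the closed form for $C$. It evaluates ${}_1F_1$ by its power series with $\mu^2/(2\tau^2)$ as the last argument, which is the value a term-by-term expansion of the integral produces. All function names are illustrative and not part of `ginormal`, and the integration assumes $\alpha \geq 2$ so that the integrand is bounded at zero.

```python
import math


def hyp1f1(a, b, x, terms=200):
    """Power series of the confluent hypergeometric function 1F1(a; b; x)."""
    term, total = 1.0, 1.0
    for k in range(terms):
        term *= (a + k) * x / ((b + k) * (k + 1))
        total += term
    return total


def gin_constant(alpha, mu, tau):
    """Closed-form normalizing constant C(alpha, mu, tau) of the GIN kernel."""
    a = (alpha - 1.0) / 2.0
    x = mu * mu / (2.0 * tau * tau)
    return (math.sqrt(2.0) * tau) ** (alpha - 1.0) * math.exp(-x) * math.gamma(a) * hyp1f1(a, 0.5, x)


def half_line_constant(alpha, mu, tau, n=20000):
    """Midpoint-rule approximation of C+ after the substitution x = 1/z."""
    upper = max(mu, 0.0) + 12.0 * tau  # the integrand is negligible beyond this point
    h = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x ** (alpha - 2.0) * math.exp(-((x - mu) ** 2) / (2.0 * tau * tau))
    return total * h


alpha, mu, tau = 3.0, 1.0, 1.0
c_plus = half_line_constant(alpha, mu, tau)
c_minus = half_line_constant(alpha, -mu, tau)  # reflecting x -> -x flips the sign of mu
c_closed = gin_constant(alpha, mu, tau)

print(c_plus, c_minus, c_plus + c_minus, c_closed)
```

The last two printed values should agree, since the full-line constant is the sum of the two truncated ones.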
## Random variable generation
<a id="rvgeneration"> </a> [Ding, Estrada and Montoya-Blandón (2023)](#1) provide an efficient sampling algorithm for the GIN distribution and its truncated variants for the case of $\alpha > 2$. This restriction is not a concern if the goal is to perform Bayesian estimation using this distribution (see [below for more details](#digression) and Remark 2 in the paper). Generation is done using the ratio-of-uniforms method with mode shift ([Kinderman and Monahan, 1977](#4)), which requires the computation of the minimal bounding rectangle. We implement two alternatives found in the literature:
1. [Leydold (2001)](#5), which requires information on the proportionality constants.
2. [Hörmann and Leydold (2014)](#6), which requires solving a cubic equation. This is our preferred method and the default in the package.
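To make the ratio-of-uniforms idea concrete, here is a minimal standard-library sketch for the positive truncation $\text{GIN}^{+}$. It is not the package's implementation and follows neither of the two published variants exactly. The mode $m$ comes from the quadratic $\alpha\tau^2 z^2 + \mu z - 1 = 0$, and the rectangle bounds come from bisection on the first-order condition of $(z - m)^2 g(z)$, which for this kernel is a cubic derived for this sketch. The names `sample_gin_plus`, `log_kernel` and `_bisect` are hypothetical. The sketch relies on $\alpha > 2$ so that $(z - m)^2 g(z)$ vanishes as $z \to \infty$ and the rectangle is bounded.

```python
import math
import random


def log_kernel(z, alpha, mu, tau):
    """Log of the GIN kernel g(z) for z > 0."""
    d = 1.0 / z - mu
    return -alpha * math.log(z) - d * d / (2.0 * tau * tau)


def _bisect(f, lo, hi, iters=200):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sample_gin_plus(size, alpha, mu, tau, rng=None):
    """Draw `size` variates from GIN+ by ratio-of-uniforms with mode shift."""
    if alpha <= 2.0:
        raise ValueError("this sketch requires alpha > 2")
    rng = rng or random.Random()
    t2 = tau * tau
    # Mode of g on (0, inf): positive root of alpha*tau^2*z^2 + mu*z - 1 = 0.
    m = (-mu + math.sqrt(mu * mu + 4.0 * alpha * t2)) / (2.0 * alpha * t2)
    log_gm = log_kernel(m, alpha, mu, tau)

    def cubic(z):
        # Stationarity condition of (z - m)^2 * g(z), multiplied through by tau^2 * z^3 * (z - m).
        return (2.0 - alpha) * t2 * z ** 3 + (m * alpha * t2 - mu) * z ** 2 + (1.0 + m * mu) * z - m

    def v_at(z):
        # v = (z - m) * sqrt(g(z) / g(m)); the kernel is rescaled so that u lies in (0, 1].
        return (z - m) * math.exp(0.5 * (log_kernel(z, alpha, mu, tau) - log_gm))

    hi = 2.0 * m
    while cubic(hi) > 0.0:  # cubic(m) > 0 and cubic(z) -> -inf, so a sign change exists
        hi *= 2.0
    v_min = v_at(_bisect(cubic, 0.0, m))
    v_max = v_at(_bisect(cubic, m, hi))

    draws = []
    while len(draws) < size:
        u = rng.random()
        if u == 0.0:
            continue
        z = m + rng.uniform(v_min, v_max) / u
        # Accept when (u, v) lies in the region u <= sqrt(g(z) / g(m)).
        if z > 0.0 and 2.0 * math.log(u) <= log_kernel(z, alpha, mu, tau) - log_gm:
            draws.append(z)
    return draws


print(sample_gin_plus(5, alpha=3.0, mu=0.0, tau=1.0, rng=random.Random(0)))
```

A simple sanity check is available for $\alpha = 3$, $\mu = 0$, $\tau = 1$: in that case $1/Z$ follows a Rayleigh distribution with unit scale, so the sample mean of the reciprocals should be close to $\sqrt{\pi/2}$.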
## Digression: Difference between GIN and GIG distributions
<a id="digression"> </a> While the kernels, and therefore the sampling techniques, for the GIN and GIG distributions are similar, the two distributions differ in important ways. The main difference is conceptual: both attempt to generalize the idea of an inverse normal distribution, but in different ways. The GIG distribution does so by choosing cumulants that are inverses of those of the normal distribution. The GIN distribution does so by directly using the density of the reciprocal after a change of variables. Another important difference comes from their use as conjugate priors in Bayesian analysis:
- $\theta \sim \text{GIN}(\alpha, \mu, \tau)$ is the conjugate prior if observations are random samples from $Y \sim \text{Normal}(\theta, \theta^2)$
- $\theta \sim \text{GIG}(\alpha, \mu, \tau)$ is the conjugate prior if observations are random samples from $Y \sim \text{Normal}(\theta, \theta)$
These are both mixture models with a similar structure, but they carry different interpretations and thus require different posterior sampling algorithms. This interpretation also shows why the restriction $\alpha > 2$ is not binding if the goal is to perform Bayesian analysis. A prior $\theta \sim \text{GIN}(\alpha_0, \mu_0, \tau_0)$ with $\alpha_0 = 1 + \varepsilon$ is non-informative when $\varepsilon > 0$ is arbitrarily small. However, the posterior distribution will have degrees-of-freedom parameter $\alpha_N = N + 1 + \varepsilon$, where $N$ is the sample size. As $N \geq 1$ implies $\alpha_N > 2$, a conjugate Bayesian analysis always draws from the GIN distribution with $\alpha > 2$.
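The update behind this argument can be sketched in a few lines. Multiplying the GIN kernel by the likelihood of $N$ draws from $\text{Normal}(\theta, \theta^2)$ and completing the square in $1/\theta$ gives a GIN posterior with $\alpha_N = \alpha_0 + N$, $\tau_N^{-2} = \tau_0^{-2} + \sum_i y_i^2$ and $\mu_N = \tau_N^2 \left(\mu_0 \tau_0^{-2} + \sum_i y_i\right)$. These formulas are derived here from the densities above rather than quoted from the paper, and the function name `gin_posterior` is hypothetical and not part of the package.

```python
import math


def gin_posterior(alpha0, mu0, tau0, ys):
    """Posterior GIN parameters for theta given draws ys from Normal(theta, theta^2)."""
    n = len(ys)
    # Coefficient of 1 / theta^2 in the exponent, i.e. 1 / tau_N^2.
    precision = 1.0 / (tau0 * tau0) + sum(y * y for y in ys)
    alpha_n = alpha0 + n  # each observation contributes one factor |theta|^(-1)
    mu_n = (mu0 / (tau0 * tau0) + sum(ys)) / precision
    tau_n = 1.0 / math.sqrt(precision)
    return alpha_n, mu_n, tau_n


eps = 1e-3  # nearly non-informative prior: alpha_0 = 1 + eps
alpha_n, mu_n, tau_n = gin_posterior(1.0 + eps, 0.0, 1.0, [1.0, 2.0])
print(alpha_n, mu_n, tau_n)
```

With two observations the posterior degrees-of-freedom parameter is $3 + \varepsilon$, already above the threshold of 2 that the generation routines require.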
[^1]: Python implementations of both the confluent hypergeometric and parabolic cylinder functions are available in the `scipy` module. In R, package [`BAS`](https://cran.r-project.org/package=BAS) contains the confluent hypergeometric function. For the parabolic cylinder function, we use a Fortran subroutine provided in the SPECFUN library [(Zhang and Jin, 1996)](#7) and our own R translation of this function.
## References
1. <a id="1"> [Ding, C., Estrada, J., and Montoya-Blandón, S. (2023). Bayesian Inference of Network Formation Models with Payoff Externalities. Working Paper.](https://www.smontoyablandon.com/publication/networks/network_externalities.pdf) </a>
2. <a id="2"> [Robert, C. (1991). Generalized inverse normal distributions. Statistics & Probability Letters, 11(1), 37-41.](https://doi.org/10.1016/0167-7152%2891%2990174-P) </a>
3. <a id="3"> [Jørgensen, B. (2012). Statistical properties of the generalized inverse Gaussian distribution (Vol. 9). Springer Science & Business Media.](https://link.springer.com/book/10.1007/978-1-4612-5698-4) </a>
4. <a id="4"> [Kinderman, A. J., and Monahan, J. F. (1977). Computer generation of random variables using the ratio of uniform deviates. ACM Transactions on Mathematical Software (TOMS), 3(3), 257-260.](https://doi.org/10.1145/355744.355750) </a>
5. <a id="5"> [Leydold, J. (2001). A simple universal generator for continuous and discrete univariate T-concave distributions. ACM Transactions on Mathematical Software (TOMS), 27(1), 66-82.](https://doi.org/10.1145/382043.382322) </a>
6. <a id="6"> [Hörmann, W., and Leydold, J. (2014). Generating generalized inverse Gaussian random variates. Statistics and Computing, 24, 547-557.](https://doi.org/10.1007/s11222-013-9387-3) </a>
7. <a id="7"> Zhang, S., and Jin, J. (1996). Computation of Special Functions. Wiley. ISBN: 0-471-11963-6, LC: QA351.C45. </a>
Raw data
{
"_id": null,
"home_page": null,
"name": "ginormal",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "Santiago Montoya-Bland\u00f3n <Santiago.Montoya-Blandon@glasgow.ac.uk>",
"keywords": "statistics, distribution, generalized inverse normal, random variable generation",
"author": null,
"author_email": "Santiago Montoya-Bland\u00f3n <Santiago.Montoya-Blandon@glasgow.ac.uk>, Cheng Ding <cheng.ding.emory@gmail.com>, Juan Estrada <jjestra@emory.edu>, Zhilang Xia <zhilang.xia@glasgow.ac.uk>",
"download_url": "https://files.pythonhosted.org/packages/28/d4/d721099ba0a4dbe1363a1b7875e09c85689a2cb1e4909a52b16646f1849f/ginormal-0.0.13.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Generalized Inverse Normal distribution density and generation",
"version": "0.0.13",
"project_urls": {
"Bug tracking": "https://github.com/smonto2/ginormal/issues",
"Homepage": "https://github.com/smonto2/ginormal"
},
"split_keywords": [
"statistics",
" distribution",
" generalized inverse normal",
" random variable generation"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "be164e60b5284122373828f01956cda94e96926b443efcbb86657705b95ae8f3",
"md5": "4af47743b4e92ddb7317f299ca5ffdb0",
"sha256": "c5908e5cd16072fbb9c6a195c58b61084d55e90e5cabb00e80e24a5de9d4194b"
},
"downloads": -1,
"filename": "ginormal-0.0.13-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4af47743b4e92ddb7317f299ca5ffdb0",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 12443,
"upload_time": "2024-05-11T21:04:09",
"upload_time_iso_8601": "2024-05-11T21:04:09.735750Z",
"url": "https://files.pythonhosted.org/packages/be/16/4e60b5284122373828f01956cda94e96926b443efcbb86657705b95ae8f3/ginormal-0.0.13-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "28d4d721099ba0a4dbe1363a1b7875e09c85689a2cb1e4909a52b16646f1849f",
"md5": "55d78f6bc335762f9bf9f719ce52f16e",
"sha256": "7d30217d15f2a669ecb7e2571873030d671fa61dd3f48c6f91b777b4f9ce98ad"
},
"downloads": -1,
"filename": "ginormal-0.0.13.tar.gz",
"has_sig": false,
"md5_digest": "55d78f6bc335762f9bf9f719ce52f16e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 12768,
"upload_time": "2024-05-11T21:04:11",
"upload_time_iso_8601": "2024-05-11T21:04:11.577915Z",
"url": "https://files.pythonhosted.org/packages/28/d4/d721099ba0a4dbe1363a1b7875e09c85689a2cb1e4909a52b16646f1849f/ginormal-0.0.13.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-11 21:04:11",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "smonto2",
"github_project": "ginormal",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "ginormal"
}