# Fitting Network Scale-up Models
## Overview
This package fits several different network scale-up models (NSUM) to Aggregated Relational Data (ARD). ARD represents survey responses about how many people each respondent knows in different subpopulations through "How many X's do you know?" questions. Specifically, if <img src="https://latex.codecogs.com/svg.latex?N_i" alt="N_i"> respondents are asked how many people they know in <img src="https://latex.codecogs.com/svg.latex?N_k" alt="N_k"> subpopulations, then ARD is an <img src="https://latex.codecogs.com/svg.latex?N_i" alt="N_i"> by <img src="https://latex.codecogs.com/svg.latex?N_k" alt="N_k"> matrix, where the <img src="https://latex.codecogs.com/svg.latex?(i,j)" alt="(i,j)"> element represents how many people respondent <img src="https://latex.codecogs.com/svg.latex?i" alt="i"> reports knowing in subpopulation <img src="https://latex.codecogs.com/svg.latex?j" alt="j">. NSUM leverages these responses to estimate the unknown size of hard-to-reach populations. See Laga et al. (2021) for more details.
In this package, we provide functions to estimate subpopulation sizes and accompanying parameters (e.g., degrees) using the models from four papers:
- Killworth, P. D., Johnsen, E. C., McCarty, C., Shelley, G. A., and Bernard, H. R. (1998) plug-in MLE
- Killworth, P. D., McCarty, C., Bernard, H. R., Shelley, G. A., and Johnsen, E. C. (1998) MLE
- Zheng, T., Salganik, M. J., and Gelman, A. (2006) overdispersed model
- Laga, I., Bao, L., and Niu, X. (2021) uncorrelated, correlated, and covariate models
## Requirements
This package requires the following Python libraries:
- `numpy >= 1.24`
- `pandas >= 2.1`
- `scipy >= 1.11`
- `cmdstanpy >= 1.1`
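The package can be installed from PyPI (`pip install networkscaleup`). The examples below assume the estimators are importable from the top-level package; this import path is an assumption, and the exact module layout may differ:

```python
import numpy as np

# Assumed import path for the estimators used in this README;
# consult the package documentation if the layout differs.
from networkscaleup import killworth, overdispersed, overdispersedStan, correlatedStan
```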
## Killworth Models

### PIMLE
The plug-in MLE estimator from Killworth, P. D., Johnsen, E. C., McCarty, C., Shelley, G. A., and Bernard, H. R. (1998) is a two-stage estimator that first estimates the degree <img src="https://latex.codecogs.com/svg.latex?d_i" alt="d_i"> of each respondent by maximizing the following likelihood:
$$L(d_i;y,\{N_k\}) = \prod_{k=1}^{L} {d_i \choose y_{ik}} \left(\frac{N_k}{N}\right)^{y_{ik}}\left(1-\frac{N_k}{N}\right)^{d_i-y_{ik}}$$
where <img src="https://latex.codecogs.com/svg.latex?L" alt="L"> is the number of subpopulations with known <img src="https://latex.codecogs.com/svg.latex?N_k" alt="N_k">. For the second stage, the model plugs the estimated <img src="https://latex.codecogs.com/svg.latex?d_i" alt="d_i"> into the equation
$$\frac{y_{ik}}{d_i} = \frac{N_k}{N}$$
and solves for the unknown <img src="https://latex.codecogs.com/svg.latex?N_k" alt="N_k"> for each respondent. These values are then averaged to obtain a single estimate of <img src="https://latex.codecogs.com/svg.latex?N_k" alt="N_k">.
To summarize, stage 1 estimates <img src="https://latex.codecogs.com/svg.latex?\smash{\hat{d}_i}" alt="\hat{d}_i"> by
<div align="center">
<img src="https://latex.codecogs.com/svg.latex?\hat{d}_i%20=%20N%20\cdot%20\frac{\sum_{k=1}^{L}y_{ik}}{\sum_{k=1}^{L}N_{k}}" alt="\hat{d}_i formula">
</div>
and then these estimates are used in stage 2 to estimate the unknown <img src="https://latex.codecogs.com/svg.latex?\hat{N}_k" alt="\hat{N}_k"> by
<div align="center">
<img src="https://latex.codecogs.com/svg.latex?%5Chat%7BN%7D_k%5E%7B%5Cmathrm%7BPIMLE%7D%7D%20%3D%20%5Cfrac%7BN%7D%7Bn%7D%5Csum_%7Bi%3D1%7D%5E%7Bn%7D%5Cfrac%7By_%7Bik%7D%7D%7B%5Chat%7Bd%7D_i%7D" alt="\hat{N}_{k}^{PIMLE} formula">
</div>
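In the examples below, `ard` is an n-by-K matrix of responses, `sizes` holds the subpopulation sizes (of which indices 0, 1, and 3 are treated as known), and `N` is the total population size. A minimal simulated setup, with purely illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100_000   # total population size
n, K = 50, 5  # number of respondents and subpopulations
sizes = np.array([1000, 2500, 400, 3000, 800])  # subpopulation sizes

# Simulate degrees, then Binomial ARD responses y_ik ~ Binomial(d_i, N_k / N)
degrees = rng.poisson(150, size=n)
ard = rng.binomial(degrees[:, None], sizes[None, :] / N)
```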
The following demonstrates how to use the `killworth` function to compute PIMLE estimates of unknown subpopulation sizes:
```python
pimle_est = killworth(ard,
                      known_sizes = sizes[[0, 1, 3]],
                      known_ind = [0, 1, 3],
                      N = N,
                      model = "PIMLE")
```
Note that the function may provide a warning saying that at least one <img src="https://latex.codecogs.com/svg.latex?\hat{d}_i" alt="\hat{d}_i"> was 0. This occurs when a respondent reports knowing no one in any of the known subpopulations. This is an issue for the PIMLE since <img src="https://latex.codecogs.com/svg.latex?\hat{d}_i" alt="\hat{d}_i"> appears in the denominator of <img src="https://latex.codecogs.com/svg.latex?\hat{N}_k^{PIMLE}" alt="\hat{N}_k^{PIMLE}">. Thus, we ignore the responses from respondents with <img src="https://latex.codecogs.com/svg.latex?\hat{d}_i" alt="\hat{d}_i"> = 0.
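For intuition, the two PIMLE stages reduce to a few lines of numpy. This is a sketch of the arithmetic using the simulated inputs above, not the package's internals:

```python
# Stage 1: plug-in degree estimates from the known subpopulations
known_ind = [0, 1, 3]
d_hat = N * ard[:, known_ind].sum(axis=1) / sizes[known_ind].sum()

# Stage 2: average y_ik / d_hat_i over respondents, dropping those with d_hat_i = 0
keep = d_hat > 0
k = 2  # an unknown subpopulation index
N_hat_pimle = (N / keep.sum()) * (ard[keep, k] / d_hat[keep]).sum()
```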
### MLE
The MLE estimator from Killworth, P. D., McCarty, C., Bernard, H. R., Shelley, G. A., and Johnsen, E. C. (1998) is also a two-stage model with an identical first stage, i.e.,
<div align="center">
<img src="https://latex.codecogs.com/svg.latex?%5Chat%7Bd%7D_i%20%3D%20N%20%5Ccdot%20%5Cfrac%7B%5Csum_%7Bk%3D1%7D%5E%7BL%7Dy_%7Bik%7D%7D%7B%5Csum_%7Bk%3D1%7D%5E%7BL%7DN_%7Bk%7D%7D" alt="\hat{d}_i formula">
</div>
However, the second stage estimates <img src="https://latex.codecogs.com/svg.latex?\hat{N}_k" alt="\hat{N}_k"> by maximizing the binomial likelihood with respect to <img src="https://latex.codecogs.com/svg.latex?N_k" alt="N_k">, fixing <img src="https://latex.codecogs.com/svg.latex?d_i" alt="d_i"> at the estimated <img src="https://latex.codecogs.com/svg.latex?\hat{d}_i" alt="\hat{d}_i">. Thus, the estimate for the unknown subpopulation size is given by
<div align="center">
<img src="https://latex.codecogs.com/svg.latex?%5Chat%7BN%7D_k%5E%7B%5Cmathrm%7BMLE%7D%7D%20%3D%20N%20%5Ccdot%20%5Cfrac%7B%5Csum_%7Bi%3D1%7D%5E%7Bn%7D%20y_%7Bik%7D%7D%7B%5Csum_%7Bi%3D1%7D%5E%7Bn%7D%20%5Chat%7Bd%7D_i%7D" alt="\hat{N}_k^{MLE} formula">
</div>
The following demonstrates how to use the `killworth` function to compute MLE estimates of unknown subpopulation sizes:
```python
mle_est = killworth(ard,
                    known_sizes = np.ravel(sizes)[[0, 1, 3]],
                    known_ind = [0, 1, 3],
                    N = N,
                    model = "MLE")
```
Note that no zero-degree warning is needed here, since the denominator is the sum of the <img src="https://latex.codecogs.com/svg.latex?\hat{d}_i" alt="\hat{d}_i"> across all respondents rather than each individual <img src="https://latex.codecogs.com/svg.latex?\hat{d}_i" alt="\hat{d}_i">.
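The MLE's second stage has an even simpler closed form. Reusing `d_hat` and `k` from the PIMLE sketch above:

```python
# Ratio of total reported ties to total estimated degree, scaled by N
N_hat_mle = N * ard[:, k].sum() / d_hat.sum()
```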
## Bayesian Models
Now we introduce the two Bayesian estimators implemented in this package.
### Overdispersed Model
The overdispersed model proposed in Zheng et al. (2006) assumes the following likelihood:
$$y_{ik} \sim \text{Negative-Binomial}(\text{mean}=e^{\alpha_i+\beta_k},\text{overdispersion}=\omega_k)$$
Please see the original manuscript for more details on the model structure and priors.
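Here <img src="https://latex.codecogs.com/svg.latex?\omega_k" alt="\omega_k"> acts as a variance-to-mean ratio, with <img src="https://latex.codecogs.com/svg.latex?\omega_k" alt="\omega_k"> = 1 recovering the Poisson. The following sketch shows one way to draw from this mean/overdispersion parameterization with numpy by converting to the standard (n, p) form; it is for intuition only and is not the package's sampler:

```python
import numpy as np

def rnbinom_mean_overdisp(mu, omega, rng):
    """Draw a negative binomial variate with mean mu and variance omega * mu (omega > 1)."""
    p = 1.0 / omega           # variance / mean = 1 / p
    n = mu * p / (1.0 - p)    # equivalently mu / (omega - 1)
    return rng.negative_binomial(n, p)

rng = np.random.default_rng(1)
# e.g., a single y_ik with alpha_i + beta_k = 1.7 and omega_k = 3
y_ik = rnbinom_mean_overdisp(mu=np.exp(1.7), omega=3.0, rng=rng)
```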
This package fits this overdispersed model either via the Gibbs-Metropolis algorithm provided in the original manuscript (`overdispersed`) or via Stan (`overdispersedStan`). We suggest using the Stan version, since convergence and effective sample sizes are more satisfactory in the Stan implementation and it does not require tuning the jumping scales of the Metropolis updates.
To identify the <img src="https://latex.codecogs.com/svg.latex?\alpha_i" alt="\alpha_i"> and <img src="https://latex.codecogs.com/svg.latex?\beta_k" alt="\beta_k"> as log-degrees and log-prevalences, respectively, the overdispersed model requires scaling the parameters. To scale them, the user must supply at least one subpopulation of known size together with the column index corresponding to that subpopulation. Additionally, two secondary groups may be supplied to adjust for differences between genders or other binary group classifications. More details of the scaling procedure can be found in the original manuscript.
The following demonstrates how to use the `overdispersed` and `overdispersedStan` functions to compute estimates of unknown subpopulation sizes using the Gibbs-Metropolis and Stan implementations of the overdispersed model. Note that in practice, both `warmup` and `iter` should be set to higher values:
```python
overdisp_gibbs_metrop_est = overdispersed(
    ard,
    known_sizes = sizes[[0, 1, 3]],
    known_ind = [0, 1, 3],
    G1_ind = 0,
    G2_ind = 1,
    B2_ind = 3,
    N = N,
    warmup = 500,
    iter = 1000,
    verbose = True,
    init = "MLE")

overdisp_stan = overdispersedStan(
    ard,
    known_sizes = sizes[[0, 1, 3]],
    known_ind = [0, 1, 3],
    G1_ind = 0,
    G2_ind = 1,
    B2_ind = 3,
    N = N,
    chains = 2,
    cores = 2,
    warmup = 250,
    iter = 500)
```
### Correlated Models
The correlated model proposed in Laga et al. (2023) assumes the following likelihood:
<div align="center">
<img src="https://latex.codecogs.com/svg.latex?y_%7Bik%7D%20%5Csim%20%5Ctext%7BPoisson%7D%5CBig%28%5Cexp%28%5Cdelta_i%20%2B%20%5Crho_k%20%2B%20%5Cbeta_%7B%5Ctext%7Bglobal%7D%7D%20z_%7Bi%2C%5Ctext%7Bglobal%7D%7D%20%2B%20%5Cbeta_%7Bk%2C%5Ctext%7Bsubpop%7D%7D%20z_%7Bi%2C%5Ctext%7Bsubpop%7D%7D%20%2B%20%5Calpha_k%20x_%7Bik%7D%20%2B%20b_%7Bik%7D%29%5CBig%29" alt="overdispersed_poisson">
</div>
where critically,
$$\textbf{b}_i \sim \mathcal{N}_k(\mu, \Sigma)$$
so that the responses for each respondent are correlated across subpopulations. Again, <img src="https://latex.codecogs.com/svg.latex?\delta_i" alt="\delta_i"> and <img src="https://latex.codecogs.com/svg.latex?\rho_k" alt="\rho_k"> need to be scaled. They can be scaled using the same procedure as for the overdispersed model (providing indices corresponding to different groups), using all known subpopulation sizes, or weighting groups according to their correlation with other groups (cf. the `scaling` argument in the examples below). More details about these scaling procedures are provided in Laga et al. (2023).
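To see what the correlated term does, here is a minimal simulation sketch of this likelihood with no covariates, taking the mean vector to be zero and an arbitrary positive-definite <img src="https://latex.codecogs.com/svg.latex?\Sigma" alt="\Sigma"> (reusing `sizes` and `N` from the earlier sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 50, 5

delta = rng.normal(5.0, 0.5, size=n)  # log-degree-like effects
rho = np.log(sizes / N)               # log-prevalence-like effects

A = rng.normal(size=(K, K))
Sigma = 0.05 * (A @ A.T + K * np.eye(K))  # arbitrary positive-definite covariance

b = rng.multivariate_normal(np.zeros(K), Sigma, size=n)  # rows correlated across subpopulations
y = rng.poisson(np.exp(delta[:, None] + rho[None, :] + b))
```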
In this package, model parameters are estimated via Stan. Note that while the full model likelihood depends on <img src="https://latex.codecogs.com/svg.latex?X%2C%20Z_%7B%5Ctext%7Bglobal%7D%7D" alt="X, Z_{global}">, and <img src="https://latex.codecogs.com/svg.latex?Z_{subpop}" alt="Z_{subpop}">, any combination of these covariates can be provided. Additionally, we can assume that <img src="https://latex.codecogs.com/svg.latex?\Sigma" alt="\Sigma"> is a diagonal matrix (i.e., no correlation) by setting the argument `model = "uncorrelated"` in the `correlatedStan` function.
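In the calls below, `x`, `z_subpop`, and `z_global` are covariate arrays supplied by the user. The shapes here are assumptions for illustration (a respondent-by-subpopulation matrix for `x` and one respondent-level column for each `z` term, reusing `rng`, `n`, and `K` from the sketch above); consult the function documentation for the exact shapes expected:

```python
x = rng.binomial(1, 0.5, size=(n, K)).astype(float)  # respondent-subpopulation covariates x_ik
z_global = rng.normal(size=(n, 1))   # respondent covariates with a single global coefficient
z_subpop = rng.normal(size=(n, 1))   # respondent covariates with subpopulation-specific coefficients
```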
The following demonstrates how to use the `correlatedStan` function to compute estimates of unknown subpopulation sizes using the Stan implementations of the correlated and uncorrelated models. Note that in practice, both `warmup` and `iter` should be set to higher values:
```python
correlated_cov_stan = correlatedStan(
    ard,
    known_sizes = sizes[[0, 1, 3]],
    known_ind = [0, 1, 3],
    model = "correlated",
    scaling = "weighted",
    x = x,
    z_subpop = z_subpop,
    z_global = z_global,
    N = N,
    chains = 2,
    cores = 2,
    warmup = 250,
    iter = 500,
)

correlated_nocov_stan = correlatedStan(
    ard,
    known_sizes = sizes[[0, 1, 3]],
    known_ind = [0, 1, 3],
    model = "correlated",
    scaling = "all",
    N = N,
    chains = 2,
    cores = 2,
    warmup = 250,
    iter = 500,
)

uncorrelated_cov_stan = correlatedStan(
    ard,
    known_sizes = sizes[[0, 1, 3]],
    known_ind = [0, 1, 3],
    model = "uncorrelated",
    scaling = "all",
    x = x,
    z_subpop = z_subpop,
    z_global = z_global,
    N = N,
    chains = 2,
    cores = 2,
    warmup = 250,
    iter = 500,
)

uncorrelated_x_stan = correlatedStan(
    ard,
    known_sizes = sizes[[0, 1, 3]],
    known_ind = [0, 1, 3],
    model = "uncorrelated",
    scaling = "all",
    x = x,
    N = N,
    chains = 2,
    cores = 2,
    warmup = 250,
    iter = 500,
)
```