Name | stepmix |
Version | 2.2.1 |
home_page | None |
Summary | A Python package for stepwise estimation of latent class models with measurement and structural components. The package can also be used to fit mixture models with various observed random variables. |
upload_time | 2024-02-15 15:29:59 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.7 |
license | None |
keywords | clustering, mixtures, lca, em, latent-class-analysis, expectation–maximization |
requirements | No requirements were recorded. |
StepMix
==============================
<a href="https://pypi.org/project/stepmix/"><img src="https://badge.fury.io/py/stepmix.svg" alt="PyPI version"></a>
[![Build](https://github.com/Labo-Lacourse/stepmix/actions/workflows/pytest.yaml/badge.svg)](https://github.com/Labo-Lacourse/stepmix/actions/workflows/pytest.yaml)
[![Documentation Status](https://readthedocs.org/projects/stepmix/badge/?version=latest)](https://stepmix.readthedocs.io/en/latest/index.html)
<a href="https://github.com/psf/black"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
[![Downloads](https://static.pepy.tech/badge/stepmix)](https://pepy.tech/project/stepmix)
[![Downloads](https://static.pepy.tech/badge/stepmix/month)](https://pepy.tech/project/stepmix)
[![arXiv](https://img.shields.io/badge/arXiv-2304.03853-b31b1b.svg)](https://arxiv.org/abs/2304.03853)
*For StepMixR, please refer to <a href="https://github.com/Labo-Lacourse/stepmixr">this repository.</a>*
A Python package following the scikit-learn API for generalized mixture modeling. The package supports categorical
data (Latent Class Analysis) and continuous data (Gaussian Mixtures/Latent Profile Analysis). StepMix can be used for
both clustering and supervised learning.
Additional features include:
* Support for missing values through Full Information Maximum Likelihood (FIML);
* Multiple stepwise Expectation-Maximization (EM) estimation methods based on pseudolikelihood theory;
* Covariates and distal outcomes;
* Parametric and non-parametric bootstrapping.
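As a rough illustration of the non-parametric bootstrap listed above, the sketch below resamples a dataset with replacement and reads a percentile confidence interval off the resulting estimates. It is a generic, pure-Python example and does not use StepMix's own bootstrap routines; the function name `bootstrap_percentile_ci` and the sample values are made up for illustration.

```python
import random
import statistics


def bootstrap_percentile_ci(values, statistic, n_repetitions=1000, alpha=0.05, seed=0):
    """Percentile confidence interval from resampling `values` with replacement."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_repetitions):
        # Draw n observations with replacement and recompute the statistic.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    estimates.sort()
    lower_index = int((alpha / 2) * n_repetitions)
    upper_index = min(n_repetitions - 1, int((1 - alpha / 2) * n_repetitions))
    return estimates[lower_index], estimates[upper_index]


# Illustrative values only (not real data).
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]
lower, upper = bootstrap_percentile_ci(sample, statistics.mean)
print(f"95% CI for the mean: [{lower:.2f}, {upper:.2f}]")
```

With a mixture model, the statistic would be a fitted parameter and each repetition would refit the model on the resampled rows, which is considerably more expensive than the mean used here.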
![](https://drive.google.com/uc?export=view&id=1mB9-Y2N3biqHRyRVX5cvIdixBpoiyCG_)
# Reference
If you find StepMix useful, please consider citing our [arXiv preprint](https://arxiv.org/abs/2304.03853):
```
@article{morin2023stepmix,
title={StepMix: A Python Package for Pseudo-Likelihood Estimation of Generalized Mixture Models with External Variables},
author={Morin, Sacha and Legault, Robin and Lalibert{\'e}, F{\'e}lix and Bakk, Zsuzsa and Gigu{\`e}re, Charles-{\'E}douard and de la Sablonni{\`e}re, Roxane and Lacourse, {\'E}ric},
journal={arXiv preprint arXiv:2304.03853},
year={2023}
}
```
# Install
You can install StepMix with pip, preferably in a virtual environment:
```
pip install stepmix
```
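After installing, one way to check which version the current interpreter sees is the standard library's `importlib.metadata` (Python 3.8 or later). The helper name `installed_version` is ours and is not part of StepMix.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if StepMix is installed in this environment, else None.
print(installed_version("stepmix"))
```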
# Quickstart
The example below fits a StepMix mixture to categorical variables in a preloaded data matrix. StepMix accepts either a
`numpy.array` or a `pandas.DataFrame`. Categories should be integer-encoded and 0-indexed.
```python
from stepmix.stepmix import StepMix
# Categorical StepMix Model with 3 latent classes
model = StepMix(n_components=3, measurement="categorical")
model.fit(data)
# Allow missing values
model_nan = StepMix(n_components=3, measurement="categorical_nan")
model_nan.fit(data_nan)
```
For binary data you can also use `measurement="binary"` or `measurement="binary_nan"`. For continuous data, you can fit a Gaussian Mixture with diagonal covariances using `measurement="continuous"` or `measurement="continuous_nan"`.
Set `verbose=1` for detailed output.
Please refer to the StepMix tutorials to learn how to combine continuous and categorical data in the same model.
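The Quickstart asks for integer-encoded, 0-indexed categories. A minimal sketch of one way to produce such codes from raw labels in plain Python follows. The helper `encode_zero_indexed` and the example answers are hypothetical, and the sketch does not handle missing values.

```python
def encode_zero_indexed(column):
    """Map hashable, sortable labels to integers 0..K-1, in sorted label order."""
    categories = sorted(set(column))
    mapping = {label: code for code, label in enumerate(categories)}
    return [mapping[label] for label in column], mapping


# Illustrative survey answers (not real data).
answers = ["agree", "neutral", "disagree", "agree"]
codes, mapping = encode_zero_indexed(answers)
print(codes)    # [0, 2, 1, 0]
print(mapping)  # {'agree': 0, 'disagree': 1, 'neutral': 2}
```

Keeping the returned `mapping` makes it possible to translate fitted class-conditional probabilities back to the original labels later.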
# Tutorials
Detailed tutorials are available in notebooks:
1. [Generalized Mixture Models with StepMix](https://colab.research.google.com/drive/1T8017QsMCiy62z2QHOvmbzE-tCECO-w7?):
    an in-depth look at how mixture models can be defined with StepMix. The tutorial uses the Iris Dataset as an example
    and covers:
    1. Gaussian Mixtures (Latent Profile Analysis);
    2. Binary Mixtures (LCA);
    3. Categorical Mixtures (LCA);
    4. Mixed Categorical and Continuous Mixtures;
    5. Missing Values through Full-Information Maximum Likelihood.
2. [Stepwise Estimation with StepMix](https://colab.research.google.com/drive/1xJB4y6eaprBMw98lB7kflWz8MfQcT2cI?usp=drive_link):
    a tutorial demonstrating how to define measurement and structural models. The tutorial discusses:
    1. LCA models with distal outcomes;
    2. LCA models with covariates;
    3. 1-step, 2-step and 3-step estimation;
    4. Corrections (BCH or ML) and other options for 3-step estimation;
    5. Putting it All Together: A Complete Model with Missing Values.
3. [Model Selection](https://colab.research.google.com/drive/1btXHCx90eCsnUlQv_yN-9AzKDhJP_JkG?usp=drive_link):
    1. Selecting the number of components in a mixture model (```n_components```) with cross-validation;
    2. Selecting the number of components with the Parametric Bootstrapped Likelihood Ratio Test (BLRT);
    3. Fit indices: AIC, BIC and other metrics.
4. [Parameters, Bootstrapping and CI](https://colab.research.google.com/drive/14DJCqFTUaYp3JtLAeAMYmGHFLCHE-r7z):
    a tutorial discussing how to:
    1. Access StepMix parameters;
    2. Bootstrap StepMix estimators;
    3. Quickly plot confidence intervals.
5. [Supervised and Semi-Supervised Learning with StepMix](https://colab.research.google.com/drive/1GKkdKkCsHWnB4ocjkx8oQdf-gUxHWjeB?usp=sharing):
    1. Binary Classification;
    2. Multiclass Classification;
    3. Semi-Supervised Learning;
    4. Cross-Validation.
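For readers unfamiliar with the fit indices named in the Model Selection tutorial, the sketch below computes AIC and BIC from their textbook definitions, given a total (not per-sample) log-likelihood. The function names, the parameter count for a binary latent class model, and the numbers in the example are illustrative assumptions; StepMix's own implementation may count parameters or report values differently.

```python
import math


def n_free_parameters_binary_lca(n_classes, n_items):
    """Textbook count: (K - 1) class weights plus K * J item probabilities."""
    return (n_classes - 1) + n_classes * n_items


def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_parameters - 2 * log_likelihood


def bic(log_likelihood, n_parameters, n_samples):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_parameters * math.log(n_samples) - 2 * log_likelihood


# Made-up inputs, for illustration only.
k = n_free_parameters_binary_lca(n_classes=3, n_items=4)
print(k)                                   # 14
print(aic(-1000.0, n_parameters=10))       # 2020.0
print(bic(-1000.0, n_parameters=10, n_samples=500))
```

Because the BIC penalty grows with `ln n`, it tends to favor fewer components than AIC on larger samples.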
![](https://drive.google.com/uc?export=view&id=1gajwp-NTu9kSdK_7DBhpiX0SebEx5WMF)
Raw data
{
"_id": null,
"home_page": null,
"name": "stepmix",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "clustering,mixtures,lca,em,latent-class-analysis,expectation\u2013maximization",
"author": null,
"author_email": "Sacha Morin <sacha.morin@mila.quebec>, Robin Legault <robin.legault@umontreal.ca>, Charles-\u00c9douard Gigu\u00e8re <ce.giguere@gmail.com>, \u00c9ric Lacourse <eric.lacourse@umontreal.ca>, Roxane de la Sablonni\u00e8re <roxane.de.la.sablonniere@umontreal.ca>",
"download_url": "https://files.pythonhosted.org/packages/84/6e/9a3032c734b8a13060ee72a778843f5e7e10e56813c59f58be4c0128b44a/stepmix-2.2.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A Python package for stepwise estimation of latent class models with measurement and structural components. The package can also be used to fit mixture models with various observed random variables.",
"version": "2.2.1",
"project_urls": {
"Homepage": "https://stepmix.readthedocs.io/en/latest/"
},
"split_keywords": [
"clustering",
"mixtures",
"lca",
"em",
"latent-class-analysis",
"expectation\u2013maximization"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "f82382f2483b7f5d440217a785622983c2bb894acbdeb4b6c5d170132d16f065",
"md5": "0477af8f711521222bb5a3b3bfe69e17",
"sha256": "c8dd4a1426d9bd8d514b49a668c624bb8049bb686354ea2ad534f05e1229517c"
},
"downloads": -1,
"filename": "stepmix-2.2.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0477af8f711521222bb5a3b3bfe69e17",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 44184,
"upload_time": "2024-02-15T15:29:52",
"upload_time_iso_8601": "2024-02-15T15:29:52.124830Z",
"url": "https://files.pythonhosted.org/packages/f8/23/82f2483b7f5d440217a785622983c2bb894acbdeb4b6c5d170132d16f065/stepmix-2.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "846e9a3032c734b8a13060ee72a778843f5e7e10e56813c59f58be4c0128b44a",
"md5": "466b38e4fd9e0b326aee72ae0a16406a",
"sha256": "8157fe272a5d0df0070ce1745557aac183862b10fccda591fe28a564b890e544"
},
"downloads": -1,
"filename": "stepmix-2.2.1.tar.gz",
"has_sig": false,
"md5_digest": "466b38e4fd9e0b326aee72ae0a16406a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 60260,
"upload_time": "2024-02-15T15:29:59",
"upload_time_iso_8601": "2024-02-15T15:29:59.450170Z",
"url": "https://files.pythonhosted.org/packages/84/6e/9a3032c734b8a13060ee72a778843f5e7e10e56813c59f58be4c0128b44a/stepmix-2.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-02-15 15:29:59",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "stepmix"
}