stepmix


Name: stepmix
Version: 2.2.1
Summary: A Python package for stepwise estimation of latent class models with measurement and structural components. The package can also be used to fit mixture models with various observed random variables.
Upload time: 2024-02-15 15:29:59
Requires Python: >=3.7
Keywords: clustering, mixtures, lca, em, latent-class-analysis, expectation–maximization
StepMix
==============================
<a href="https://pypi.org/project/stepmix/"><img src="https://badge.fury.io/py/stepmix.svg" alt="PyPI version"></a>
[![Build](https://github.com/Labo-Lacourse/stepmix/actions/workflows/pytest.yaml/badge.svg)](https://github.com/Labo-Lacourse/stepmix/actions/workflows/pytest.yaml)
[![Documentation Status](https://readthedocs.org/projects/stepmix/badge/?version=latest)](https://stepmix.readthedocs.io/en/latest/index.html)
<a href="https://github.com/psf/black"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
[![Downloads](https://static.pepy.tech/badge/stepmix)](https://pepy.tech/project/stepmix)
[![Downloads](https://static.pepy.tech/badge/stepmix/month)](https://pepy.tech/project/stepmix)
[![arXiv](https://img.shields.io/badge/arXiv-2304.03853-b31b1b.svg)](https://arxiv.org/abs/2304.03853)

*For StepMixR, please refer to <a href="https://github.com/Labo-Lacourse/stepmixr">this repository.</a>*

A Python package following the scikit-learn API for generalized mixture modeling. The package supports categorical 
data (Latent Class Analysis) and continuous data (Gaussian Mixtures/Latent Profile Analysis). StepMix can be used for
both clustering and supervised learning.

Additional features include:
* Support for missing values through Full Information Maximum Likelihood (FIML); 
* Multiple stepwise Expectation-Maximization (EM) estimation methods based on pseudolikelihood theory;
* Covariates and distal outcomes;
* Parametric and non-parametric bootstrapping.

![](https://drive.google.com/uc?export=view&id=1mB9-Y2N3biqHRyRVX5cvIdixBpoiyCG_)

# Reference
If you find StepMix useful, please consider citing our [arXiv preprint](https://arxiv.org/abs/2304.03853):
```
@article{morin2023stepmix,
  title={StepMix: A Python Package for Pseudo-Likelihood Estimation of Generalized Mixture Models with External Variables},
  author={Morin, Sacha and Legault, Robin and Lalibert{\'e}, F{\'e}lix and Bakk, Zsuzsa and Gigu{\`e}re, Charles-{\'E}douard and de la Sablonni{\`e}re, Roxane and Lacourse, {\'E}ric},
  journal={arXiv preprint arXiv:2304.03853},
  year={2023}
}
```


# Install
You can install StepMix with pip, preferably in a virtual environment: 
```
pip install stepmix
``` 
# Quickstart
The example below fits a StepMix mixture with categorical variables on a preloaded data matrix. StepMix accepts either a `numpy.array` or a 
`pandas.DataFrame`. Categories should be integer-encoded and 0-indexed.

```python
from stepmix.stepmix import StepMix

# Categorical StepMix Model with 3 latent classes
model = StepMix(n_components=3, measurement="categorical")
model.fit(data)

# Allow missing values
model_nan = StepMix(n_components=3, measurement="categorical_nan")
model_nan.fit(data_nan)
```
For binary data you can also use `measurement="binary"` or `measurement="binary_nan"`. For continuous data, you can fit a Gaussian Mixture with diagonal covariances using `measurement="continuous"` or `measurement="continuous_nan"`.
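
As a rough sketch of the continuous case, the snippet below builds a small synthetic two-cluster dataset with the standard library and wraps the fit in a helper. The helper names `make_two_cluster_data` and `fit_gaussian_mixture` and the synthetic data are illustrative assumptions, not part of StepMix. The call to `predict` is assumed from the scikit-learn-style API described above.

```python
import random


def make_two_cluster_data(n_per_cluster=100, seed=0):
    """Return a list of [x, y] rows drawn around two well-separated centers."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (5.0, 5.0)]
    rows = []
    for cx, cy in centers:
        for _ in range(n_per_cluster):
            rows.append([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)])
    return rows


def fit_gaussian_mixture(rows, n_components=2):
    """Fit a diagonal-covariance Gaussian mixture with StepMix (hypothetical helper)."""
    # Imported here so the data helper above runs without StepMix installed.
    import numpy as np
    from stepmix.stepmix import StepMix

    data = np.array(rows)
    model = StepMix(n_components=n_components, measurement="continuous", verbose=1)
    model.fit(data)
    # predict is assumed to return one class label per row, as in scikit-learn.
    return model, model.predict(data)


rows = make_two_cluster_data()
```

Calling `model, labels = fit_gaussian_mixture(rows)` should then return the fitted estimator together with one label per row. Which integer ends up attached to which cluster is arbitrary.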

Set `verbose=1` for detailed output.

Please refer to the StepMix tutorials to learn how to combine continuous and categorical data in the same model.
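
The Quickstart asks for integer-encoded, 0-indexed categories. A minimal sketch of that preprocessing step, using only the standard library, might look like the following. The helper names `encode_column` and `encode_rows` and the toy `survey` table are hypothetical and are not part of StepMix.

```python
def encode_column(values):
    """Map the labels of one column to 0-indexed integer codes."""
    categories = sorted(set(values))
    code_of = {label: code for code, label in enumerate(categories)}
    return [code_of[v] for v in values], categories


def encode_rows(rows):
    """Encode every column of a row-major table independently."""
    columns = list(zip(*rows))
    encoded_columns = [encode_column(col)[0] for col in columns]
    return [list(row) for row in zip(*encoded_columns)]


# Toy data: two categorical variables recorded as strings.
survey = [
    ["low", "yes"],
    ["high", "no"],
    ["mid", "yes"],
    ["low", "no"],
]
encoded = encode_rows(survey)
```

Codes here follow alphabetical order of the labels, so they carry no ordinal meaning. The resulting nested list can be wrapped in a `numpy.array` before being passed to `fit`.
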
# Tutorials
Detailed tutorials are available in notebooks: 
1. [Generalized Mixture Models with StepMix](https://colab.research.google.com/drive/1T8017QsMCiy62z2QHOvmbzE-tCECO-w7?): 
an in-depth look at how mixture models can be defined with StepMix. The tutorial uses the Iris Dataset as an example
and covers:
   1. Gaussian Mixtures (Latent Profile Analysis);
   2. Binary Mixtures (LCA);
   3. Categorical Mixtures (LCA);
   4. Mixed Categorical and Continuous Mixtures;
   5. Missing Values through Full-Information Maximum Likelihood.
2. [Stepwise Estimation with StepMix](https://colab.research.google.com/drive/1xJB4y6eaprBMw98lB7kflWz8MfQcT2cI?usp=drive_link):
    a tutorial demonstrating how to define measurement and structural models. The tutorial discusses:
   1. LCA models with distal outcomes;
   2. LCA models with covariates; 
   3. 1-step, 2-step and 3-step estimation;
   4. Corrections (BCH or ML) and other options for 3-step estimation;
   5. Putting it All Together: A Complete Model with Missing Values.
3. [Model Selection](https://colab.research.google.com/drive/1btXHCx90eCsnUlQv_yN-9AzKDhJP_JkG?usp=drive_link):
    1. Selecting the number of components in a mixture model (`n_components`) with cross-validation;
    2. Selecting the number of components with the Parametric Bootstrapped Likelihood Ratio Test (BLRT);
    3. Fit indices: AIC, BIC and other metrics.
4. [Parameters, Bootstrapping and CI](https://colab.research.google.com/drive/14DJCqFTUaYp3JtLAeAMYmGHFLCHE-r7z):
   a tutorial discussing how to:
   1. Access StepMix parameters;
   2. Bootstrap StepMix estimators;
   3. Quickly plot confidence intervals.
5. [Supervised and Semi-Supervised Learning with StepMix](https://colab.research.google.com/drive/1GKkdKkCsHWnB4ocjkx8oQdf-gUxHWjeB?usp=sharing):
   1. Binary Classification;
   2. Multiclass Classification;
   3. Semi-Supervised Learning;
   4. Cross-Validation.
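
For readers unfamiliar with the fit indices named in the Model Selection tutorial, the sketch below computes the textbook AIC and BIC from a total log-likelihood and picks the candidate with the lowest BIC. The log-likelihoods and parameter counts are made-up illustrative numbers, and the functions are standalone helpers rather than StepMix's own API, which the tutorial covers.

```python
import math


def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_parameters - 2 * log_likelihood


def bic(log_likelihood, n_parameters, n_samples):
    """Bayesian information criterion: k ln(n) - 2 ln(L). Lower is better."""
    return n_parameters * math.log(n_samples) - 2 * log_likelihood


# Hypothetical results: n_components -> (total log-likelihood, free parameters)
candidates = {2: (-1250.0, 9), 3: (-1200.0, 14), 4: (-1195.0, 19)}
n_samples = 500

bic_by_k = {k: bic(ll, p, n_samples) for k, (ll, p) in candidates.items()}
best_k = min(bic_by_k, key=bic_by_k.get)
```

With these numbers the fourth class improves the log-likelihood only slightly, so the BIC penalty on extra parameters favors the three-class model.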

![](https://drive.google.com/uc?export=view&id=1gajwp-NTu9kSdK_7DBhpiX0SebEx5WMF)

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "stepmix",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "clustering,mixtures,lca,em,latent-class-analysis,expectation\u2013maximization",
    "author": null,
    "author_email": "Sacha Morin <sacha.morin@mila.quebec>, Robin Legault <robin.legault@umontreal.ca>, Charles-\u00c9douard Gigu\u00e8re <ce.giguere@gmail.com>, \u00c9ric Lacourse <eric.lacourse@umontreal.ca>, Roxane de la Sablonni\u00e8re <roxane.de.la.sablonniere@umontreal.ca>",
    "download_url": "https://files.pythonhosted.org/packages/84/6e/9a3032c734b8a13060ee72a778843f5e7e10e56813c59f58be4c0128b44a/stepmix-2.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A Python package for stepwise estimation of latent class models with measurement and structural components. The package can also be used to fit mixture models with various observed random variables.",
    "version": "2.2.1",
    "project_urls": {
        "Homepage": "https://stepmix.readthedocs.io/en/latest/"
    },
    "split_keywords": [
        "clustering",
        "mixtures",
        "lca",
        "em",
        "latent-class-analysis",
        "expectation\u2013maximization"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f82382f2483b7f5d440217a785622983c2bb894acbdeb4b6c5d170132d16f065",
                "md5": "0477af8f711521222bb5a3b3bfe69e17",
                "sha256": "c8dd4a1426d9bd8d514b49a668c624bb8049bb686354ea2ad534f05e1229517c"
            },
            "downloads": -1,
            "filename": "stepmix-2.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "0477af8f711521222bb5a3b3bfe69e17",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 44184,
            "upload_time": "2024-02-15T15:29:52",
            "upload_time_iso_8601": "2024-02-15T15:29:52.124830Z",
            "url": "https://files.pythonhosted.org/packages/f8/23/82f2483b7f5d440217a785622983c2bb894acbdeb4b6c5d170132d16f065/stepmix-2.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "846e9a3032c734b8a13060ee72a778843f5e7e10e56813c59f58be4c0128b44a",
                "md5": "466b38e4fd9e0b326aee72ae0a16406a",
                "sha256": "8157fe272a5d0df0070ce1745557aac183862b10fccda591fe28a564b890e544"
            },
            "downloads": -1,
            "filename": "stepmix-2.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "466b38e4fd9e0b326aee72ae0a16406a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 60260,
            "upload_time": "2024-02-15T15:29:59",
            "upload_time_iso_8601": "2024-02-15T15:29:59.450170Z",
            "url": "https://files.pythonhosted.org/packages/84/6e/9a3032c734b8a13060ee72a778843f5e7e10e56813c59f58be4c0128b44a/stepmix-2.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-15 15:29:59",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "stepmix"
}
        