latentcor: Fast Computation of Latent Correlations for Mixed Data
=================================================================

.. image:: https://readthedocs.org/projects/latentcor-py/badge/?version=latest
   :target: https://latentcor-py.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status

.. image:: https://img.shields.io/pypi/v/latentcor.svg
   :target: https://pypi.python.org/pypi/latentcor_py

.. image:: https://app.travis-ci.com/mingzehuang/latentcor_py.svg?branch=master
   :target: https://app.travis-ci.com/mingzehuang/latentcor_py

.. image:: https://codecov.io/gh/mingzehuang/latentcor_py/branch/master/graph/badge.svg?token=SF57J6ZW0B
   :target: https://codecov.io/gh/mingzehuang/latentcor_py

* Free software: `GNU General Public License v3 <https://github.com/mingzehuang/latentcor_py/blob/master/LICENSE>`_
* Documentation: https://latentcor-py.readthedocs.io

Introduction
------------
:code:`latentcor` is a Python package for estimating latent correlations from mixed data types (continuous, binary, truncated, and ternary) under the latent Gaussian copula model. For references on the estimation framework, see:
* `Fan, J., Liu, H., Ning, Y., and Zou, H. (2017), “High Dimensional Semiparametric Latent Graphical Model for Mixed Data.” <https://doi.org/10.1111/rssb.12168>`_ *JRSS B*. **Continuous/binary** types.
* `Quan X., Booth J.G. and Wells M.T. “Rank-based approach for estimating correlations in mixed ordinal data.” <https://arxiv.org/abs/1809.06255>`_ *arXiv*. **Ternary** type.
* `Yoon G., Carroll R.J. and Gaynanova I. (2020). “Sparse semiparametric canonical correlation analysis for data of mixed types.” <https://doi.org/10.1093/biomet/asaa007>`_ *Biometrika*. **Truncated** type for zero-inflated data.
* `Yoon G., Müller C.L. and Gaynanova I. (2021). “Fast computation of latent correlations.” <https://doi.org/10.1080/10618600.2021.1882468>`_. **Approximation method of computation**, see `math framework <https://latentcor-py.readthedocs.io/en/latest/math.html#>`_ for details.
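
To make the rank-based idea concrete, here is a minimal sketch for the simplest case, two continuous variables. Under a Gaussian copula, Kendall's tau and the latent correlation are linked by ``tau = (2 / pi) * arcsin(r)``, so the latent correlation can be recovered as ``r = sin(pi / 2 * tau)``. The helper names :code:`kendall_tau` and :code:`latent_cor_continuous` are illustrative only and are not part of the :code:`latentcor` API; the other variable types need the more involved bridge functions described in the references above.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


def latent_cor_continuous(x, y):
    """Invert the continuous/continuous bridge: r = sin(pi / 2 * tau)."""
    return math.sin(math.pi / 2 * kendall_tau(x, y))


# Simulate a latent Gaussian pair with known correlation rho.
rng = random.Random(0)
rho = 0.6
n = 600
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rho * a + math.sqrt(1 - rho**2) * rng.gauss(0, 1) for a in z1]

# Observed variables are unknown monotone transforms of the latent ones.
x1 = [math.exp(a) for a in z1]
x2 = [b**3 for b in z2]

r_hat = latent_cor_continuous(x1, x2)
print(f"true rho = {rho}, rank-based estimate = {r_hat:.3f}")
```

Because Kendall's tau depends only on ranks, the estimate is unaffected by the monotone transforms applied to the latent variables.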
Statement of need
-----------------
No Python software package is currently available that provides accurate and fast correlation estimation from mixed variable data in a unified manner.
The Python package :code:`latentcor`, introduced here, is thus the first stand-alone Python package for computing latent correlations that
takes into account all variable types (continuous/binary/ordinal/zero-inflated), comes with an optimized memory footprint, and is computationally efficient,
making latent correlation estimation almost as fast as rank-based correlation estimation.
Installation
------------
The easiest way to install :code:`latentcor` is using :code:`pip`.

.. code-block::

    pip install latentcor

Example
-------
Let's import :code:`gen_data`, :code:`get_tps` and :code:`latentcor` from :code:`latentcor`.

.. code-block::

    from latentcor import gen_data, get_tps, latentcor

First, we generate a pair of variables of different types with sample size :code:`n=100` to serve as example data. The first variable is ternary and the second is continuous.
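
.. code-block::

    # Simulate 100 observations: one ternary and one continuous variable.
    simdata = gen_data(n=100, tps=["ter", "con"])
    # Show the first six rows of the simulated data matrix.
    print(simdata['X'][:6, :])
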
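
For intuition about what such data look like, the sketch below builds a ternary/continuous pair by hand: it draws a correlated latent Gaussian pair and cuts the first coordinate at two thresholds. This is an assumption-laden illustration of the latent Gaussian copula idea, not the implementation of :code:`gen_data`; the function name :code:`gen_ternary_continuous`, the cut points, and the latent correlation are all hypothetical choices.

```python
import math
import random


def gen_ternary_continuous(n, rho, cuts=(-0.5, 0.5), seed=0):
    """Draw n (ternary, continuous) pairs from a latent Gaussian pair.

    The cut points and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(1 - rho**2) * rng.gauss(0, 1)
        # Count how many thresholds z1 exceeds: gives 0, 1 or 2.
        ternary = sum(z1 > c for c in cuts)
        rows.append((ternary, z2))
    return rows


rows = gen_ternary_continuous(n=100, rho=0.5)
for row in rows[:6]:
    print(row)
```

Only the discretized first coordinate is observed, which is why its correlation with the second variable has to be estimated at the latent level.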
Then we can estimate the latent correlation matrix of these two variables using the :code:`latentcor` function.
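
.. code-block::

    # Estimate the latent correlation, declaring each column's type.
    estimate = latentcor(simdata['X'], tps=["ter", "con"])
    # 'R' holds the estimated latent correlation matrix.
    print(estimate['R'])
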
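
As a rough illustration of why a latent estimate is useful, the sketch below compares the Pearson correlation of the latent Gaussian pair with the Pearson correlation computed after the first variable has been discretized into three levels. It does not call :code:`latentcor`; the simulation settings (latent correlation 0.8, cut points at -0.5 and 0.5) and the helper :code:`pearson` are assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)
rho, n = 0.8, 2000
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rho * a + math.sqrt(1 - rho**2) * rng.gauss(0, 1) for a in z1]

# Observed ternary version of z1 (0, 1 or 2), using illustrative cut points.
x1 = [sum(a > c for c in (-0.5, 0.5)) for a in z1]

r_latent = pearson(z1, z2)    # correlation of the unobserved Gaussian pair
r_observed = pearson(x1, z2)  # naive correlation on the observed data
print(f"latent: {r_latent:.3f}, observed: {r_observed:.3f}")
```

In this setup the naive correlation on the discretized data comes out smaller in magnitude than the latent one, which is the gap that latent correlation estimation is meant to close.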
Community Guidelines
--------------------

* Contributions and suggestions to the software are always welcome. Please consult our `contribution guidelines <https://github.com/mingzehuang/latentcor_py/blob/master/CONTRIBUTING.rst>`_ prior to submitting a pull request.
* Report issues or problems with the software using GitHub's `issue tracker <https://github.com/mingzehuang/latentcor_py/issues>`_.
* The easiest way to replicate the development environment of :code:`latentcor` is using :code:`pip`:

.. code-block::

    pip install -r requirements_dev.txt

Credits
-------
This package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.
.. _Cookiecutter: https://github.com/audreyr/cookiecutter
.. _`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage
=======
History
=======
0.1.0 (2021-12-28)
------------------
* First version.
0.1.1 (2022-01-06)
------------------
* Fix some typos.
0.1.2 (2022-01-06)
------------------
* Fix a bug in the :code:`use_nearPD` argument of the :code:`latentcor` function.
0.1.3 (2022-01-07)
------------------
* Fix syntax errors for :code:`jupyter-execute` in README.txt.
0.1.4 (2022-05-23)
------------------
* Fix error for continuous estimation.
0.2.0 (2022-08-16)
------------------
* Increase the maximum number of iterations for the positive definiteness adjustment.
* Return function outputs as a dictionary.
0.2.1 (2022-08-22)
------------------
* Return the latent correlation matrix as a pandas.DataFrame.
* Polish output heatmap.
0.2.2 (2022-08-22)
------------------
* Update README file.
0.2.3 (2022-08-22)
------------------
* Correct update history.
0.2.4 (2022-09-07)
------------------
* Correct incompatible versions.
0.2.5 (2023-11-05)
------------------
* Regenerate interpolants for approximation method and fix version compatibility for Python 3.7.
Raw data
{
"_id": null,
"home_page": "https://github.com/mingzehuang/latentcor_py",
"name": "latentcor",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "latentcor",
"author": "Mingze Huang, Christian L. M\u00fcller, Irina Gaynanova",
"author_email": "mingzehuang@gmail.com, christian.mueller@stat.uni-muenchen.de, irinag@stat.tamu.edu",
"download_url": "https://files.pythonhosted.org/packages/85/cd/93f07b8b587e4343f97dc6929eef700c58466ad74bebc0ba3f74a71e4121/latentcor-0.2.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GNU General Public License v3",
"summary": "Fast Computation of Latent Correlations for Mixed Data",
"version": "0.2.5",
"project_urls": {
"Homepage": "https://github.com/mingzehuang/latentcor_py"
},
"split_keywords": [
"latentcor"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "70fa5711a6e3205006a17a8f542f54b017bcea86bd4af041d0e6967cd4db829f",
"md5": "3c630331fc825b1ea2a54c2eaa91eba6",
"sha256": "4ceb7f2c8ea95ed143e92dec84a05f1beca0cc30a76c0558bd3650190ba09a01"
},
"downloads": -1,
"filename": "latentcor-0.2.5-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "3c630331fc825b1ea2a54c2eaa91eba6",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": ">=3.7",
"size": 4038490,
"upload_time": "2023-11-05T23:36:31",
"upload_time_iso_8601": "2023-11-05T23:36:31.286331Z",
"url": "https://files.pythonhosted.org/packages/70/fa/5711a6e3205006a17a8f542f54b017bcea86bd4af041d0e6967cd4db829f/latentcor-0.2.5-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "85cd93f07b8b587e4343f97dc6929eef700c58466ad74bebc0ba3f74a71e4121",
"md5": "379c70c637c2f765f3307a94c85096f5",
"sha256": "4b220216f5b86a404cbc51a9552eb4355f7c8729a43dd575bccb2d22d614d679"
},
"downloads": -1,
"filename": "latentcor-0.2.5.tar.gz",
"has_sig": false,
"md5_digest": "379c70c637c2f765f3307a94c85096f5",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 4056194,
"upload_time": "2023-11-05T23:36:40",
"upload_time_iso_8601": "2023-11-05T23:36:40.498136Z",
"url": "https://files.pythonhosted.org/packages/85/cd/93f07b8b587e4343f97dc6929eef700c58466ad74bebc0ba3f74a71e4121/latentcor-0.2.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-11-05 23:36:40",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "mingzehuang",
"github_project": "latentcor_py",
"travis_ci": true,
"coveralls": false,
"github_actions": false,
"tox": true,
"lcname": "latentcor"
}