muvi

- Name: muvi
- Version: 0.1.5
- Home page: https://github.com/MLO-lab/MuVI
- Summary: MuVI: A multi-view latent variable model with domain-informed structured sparsity for integrating noisy feature sets.
- Upload time: 2024-11-22 12:08:10
- Author: Arber Qoku
- Requires Python: `<3.11,>=3.9`
- Keywords: multi-view, multi-omics, feature sets, latent variable model, structured sparsity, variational inference, single-cell

# MuVI

A multi-view latent variable model with domain-informed structured sparsity that integrates noisy domain expertise in the form of feature sets.

[Examples](examples/1_basic_tutorial.ipynb) | [Paper](https://proceedings.mlr.press/v206/qoku23a/qoku23a.pdf) | [BibTeX](citation.bib)

[![Build](https://github.com/mlo-lab/muvi/actions/workflows/build.yml/badge.svg)](https://github.com/mlo-lab/muvi/actions/workflows/build.yml/)
[![Coverage](https://codecov.io/gh/mlo-lab/muvi/branch/master/graph/badge.svg)](https://codecov.io/gh/mlo-lab/muvi)

## Basic usage

The `MuVI` class is the main entry point for loading the data and performing the inference:

```py
import numpy as np
import pandas as pd
import anndata as ad
import mudata as md
import muvi

# Load processed input data (missing values are allowed)
# Matrix of dimensions n_samples x n_rna_features
rna_df = pd.read_csv(...)
# Matrix of dimensions n_samples x n_prot_features
prot_df = pd.read_csv(...)

# Load prior feature sets, e.g. gene sets
gene_sets = muvi.fs.from_gmt(...)
# Binary matrix of dimensions n_gene_sets x n_rna_features
gene_sets_mask = gene_sets.to_mask(rna_df.columns)

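# Create a MuVI object by passing both input data and prior information
device = "cpu"  # assumption: a device string, e.g. "cuda:0" if a GPU is available
model = muvi.MuVI(
    observations={"rna": rna_df, "prot": prot_df},
    prior_masks={"rna": gene_sets_mask},
    # ... (further arguments elided)
    device=device,
)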

# Alternatively, create a MuVI model from AnnData (single-view)
rna_adata = ad.AnnData(rna_df, dtype=np.float32)
# varm expects features in rows, hence the transpose of the mask
rna_adata.varm["gene_sets_mask"] = gene_sets_mask.T
model = muvi.tl.from_adata(
    rna_adata,
    prior_mask_key="gene_sets_mask",
    # ... (further arguments elided)
    device=device,
)

# Alternatively, create a MuVI model from MuData (multi-view)
prot_adata = ad.AnnData(prot_df, dtype=np.float32)
mdata = md.MuData({"rna": rna_adata, "prot": prot_adata})
model = muvi.tl.from_mdata(
    mdata,
    prior_mask_key="gene_sets_mask",
    # ... (further arguments elided)
    device=device,
)

# Fit the model for a given number of training epochs
model.fit(batch_size, n_epochs, ...)

# Continue with the downstream analysis (see below)
```
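
To make the shape of the prior information concrete, the following dependency-free sketch parses feature sets from GMT-formatted text (one set per line: name, description, then tab-separated members) and builds the binary `n_gene_sets x n_features` mask described above. It is not the `muvi.fs` implementation; `parse_gmt` and `build_mask` are hypothetical helper names, and the gene sets are made up for illustration.

```python
from io import StringIO


def parse_gmt(handle):
    """Parse GMT text into a dict mapping set name -> set of member features."""
    feature_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *members = fields
        feature_sets[name] = {m for m in members if m}
    return feature_sets


def build_mask(feature_sets, features):
    """Return (set_names, mask) with mask[i][j] True if features[j] is in set i."""
    names = list(feature_sets)
    mask = [[f in feature_sets[n] for f in features] for n in names]
    return names, mask


# Toy input: two made-up gene sets and four measured features
gmt_text = "SET_A\tna\tTP53\tMYC\nSET_B\tna\tMYC\tEGFR\tKRAS\n"
feature_sets = parse_gmt(StringIO(gmt_text))
features = ["TP53", "MYC", "EGFR", "ACTB"]
set_names, mask = build_mask(feature_sets, features)

for name, row in zip(set_names, mask):
    print(name, [int(v) for v in row])
```

Members that are not among the measured features (here `KRAS`) simply do not appear in the mask, and features that belong to no set (here `ACTB`) get an all-zero column.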

## Submodules

The package consists of three additional submodules for analysing the results post-training:

- [`muvi.tl`](muvi/tools/utils.py) provides tools for downstream analysis, e.g.,
  - compute `muvi.tl.variance_explained` across all factors and views
  - `muvi.tl.test` the significance of the association between the prior feature sets and the inferred factors
  - cluster the samples in the latent space, e.g. with `muvi.tl.leiden`
  - `muvi.tl.save` the model in order to `muvi.tl.load` it at a later point in time
- [`muvi.pl`](muvi/tools/plotting.py) works in tandem with `muvi.tl` by providing visualization methods such as
  - `muvi.pl.variance_explained` (see above)
  - plotting the latent space via `muvi.pl.tsne`, `muvi.pl.scatter` or `muvi.pl.stripplot`
  - investigating factors in terms of their inferred loadings with `muvi.pl.inspect_factor`
- [`muvi.fs`](muvi/tools/feature_sets.py) provides the data structures and methods for loading, processing and storing the prior information from feature sets
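
As a rough illustration of what "variance explained across all factors" means for a factor model, the sketch below computes an R² score per factor for a single view, using the common definition `1 - ||Y - z_k w_k^T||² / ||Y||²`. Whether `muvi.tl.variance_explained` uses exactly this definition is an assumption; the function name is reused for readability only, and the matrices are toy values.

```python
def variance_explained(y, z, w):
    """R^2 per factor for data y (n x d), factor scores z (n x k), loadings w (k x d)."""
    n, d, k = len(y), len(y[0]), len(w)
    # Total sum of squares of the (ideally centered) data
    total = sum(y[i][j] ** 2 for i in range(n) for j in range(d))
    scores = []
    for f in range(k):
        # Residual after reconstructing the data from factor f alone
        resid = sum(
            (y[i][j] - z[i][f] * w[f][j]) ** 2 for i in range(n) for j in range(d)
        )
        scores.append(1.0 - resid / total)
    return scores


# Toy example: y is exactly z @ w, with two factors and two features
z = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [[2.0, 0.0], [0.0, 1.0]]
y = [[2.0, 0.0], [0.0, 1.0], [2.0, 1.0]]

r2 = variance_explained(y, z, w)
print([round(v, 3) for v in r2])
```

In this toy case the first factor accounts for most of the signal because its loading is larger. In practice the data would be centered per feature first, and the score would be computed separately for every view.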

## Tutorials

Check out our [basic tutorial](examples/1_basic_tutorial.ipynb) to get familiar with `MuVI`, or jump straight to a [single-cell multiome](examples/3a_single-cell_multi-omics_integration.ipynb) analysis!

`R` users can readily export a trained `MuVI` model into `R` with a single line of code and resume the analysis with the [`MOFA2`](https://biofam.github.io/MOFA2) package.

```py
muvi.ext.save_as_hdf5(model, "muvi.hdf5", save_metadata=True)
```

See [this vignette](https://raw.githack.com/MLO-lab/MuVI/master/examples/4_single-cell_multi-omics_integration_R.html) for more details!

## Installation

We suggest using [conda](https://docs.conda.io/en/latest/miniconda.html) to manage your environments and [pip](https://pypi.org/project/pip/) to install `muvi` as a Python package. Follow these steps to get `muvi` up and running!

1. Create a Python environment with `conda`:

```bash
conda create -n muvi python=3.9
```

2. Activate the freshly created environment:

```bash
conda activate muvi
```

3. Install `muvi` with `pip`:

```bash
python3 -m pip install muvi
```

4. Alternatively, install the latest development version directly from GitHub with `pip`:

```bash
python3 -m pip install git+https://github.com/MLO-lab/MuVI.git
```

Make sure to install a GPU version of [PyTorch](https://pytorch.org/) to significantly speed up the inference.
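
The package metadata for version 0.1.5 declares `requires_python` as `<3.11,>=3.9`, so installation can fail on other interpreters. A small, optional check along these lines can be run inside the activated environment before installing; `is_supported` is a hypothetical helper, not part of `muvi`.

```python
import sys


def is_supported(version_info=sys.version_info):
    """Return True if the interpreter version satisfies >=3.9,<3.11."""
    return (3, 9) <= tuple(version_info[:2]) < (3, 11)


if is_supported():
    print("Python version is within the declared range.")
else:
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} "
        "is outside the declared range >=3.9,<3.11."
    )
```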

## Citation

If you use `MuVI` in your work, please cite it using this [BibTeX](citation.bib) entry:

> **Encoding Domain Knowledge in Multi-view Latent Variable Models: A Bayesian Approach with Structured Sparsity**
>
> Arber Qoku and Florian Buettner
>
> _International Conference on Artificial Intelligence and Statistics (AISTATS)_ 2023
>
> <https://proceedings.mlr.press/v206/qoku23a.html>


            
