# scBiG for representation learning of single-cell gene expression data based on bipartite graph embedding
## Overview
![alt](overview.png)
scBiG is a graph autoencoder. Its encoder, built from multi-layer graph convolutional networks, extracts high-order representations of cells and genes from the cell-gene bipartite graph. Its decoder, based on the zero-inflated negative binomial (ZINB) model, uses these representations to reconstruct the gene expression matrix. Through this model-driven, self-supervised training, scBiG learns low-dimensional representations of both cells and genes that can be used in diverse downstream analyses.
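To make the ZINB decoder more concrete, here is a minimal sketch of the ZINB log-probability for a single count, written with the standard library only. It is not scBiG's implementation. The function names and the parameter names `mu` (mean), `theta` (dispersion), and `pi` (dropout probability) are illustrative assumptions.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial (mean mu, dispersion theta)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a zero-inflated negative binomial.

    pi is the probability of an extra ("dropout") zero; requires mu > 0,
    theta > 0 and 0 <= pi < 1. These names are illustrative, not scBiG's API.
    """
    nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero comes either from dropout or from the NB component itself.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    # A non-zero count can only come from the NB component.
    return math.log(1.0 - pi) + nb


# Sanity check: the probabilities over all counts should sum to (nearly) one.
total_probability = sum(
    math.exp(zinb_log_pmf(x, mu=2.0, theta=1.5, pi=0.3)) for x in range(200)
)
```

A decoder of this kind is typically trained by minimizing the negative of this log-probability summed over all entries of the expression matrix.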
## Installation
Install `scBiG` from PyPI with:
```bash
pip install scbig
```
Alternatively, clone this repository and run the following from its root:
```bash
pip install -e .
```
## Quick start
Load the data to be analyzed:
```python
import scanpy as sc

# `data` is your cells x genes count matrix (for example, a NumPy array)
adata = sc.AnnData(data)
```
Perform data pre-processing:
```python
import numpy as np

# Basic filtering
sc.pp.filter_genes(adata, min_cells=3)
sc.pp.filter_cells(adata, min_genes=200)
# Keep a copy of the filtered, unnormalized data
adata.raw = adata.copy()
# Total-count normalize and compute the cell size factor
sc.pp.normalize_per_cell(adata)
adata.obs['cs_factor'] = adata.obs.n_counts / np.median(adata.obs.n_counts)
# Log-transform the data and compute the gene size factor
sc.pp.log1p(adata)
adata.var['gs_factor'] = np.max(adata.X, axis=0, keepdims=True).reshape(-1)
```
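To show what the two size factors contain, the following sketch reproduces the arithmetic on a toy dense matrix with plain NumPy. It assumes that `sc.pp.normalize_per_cell` with default arguments rescales every cell to the median total count. The toy matrix and variable names are illustrative only.

```python
import numpy as np

# Toy count matrix: 3 cells (rows) x 3 genes (columns).
counts = np.array([[4.0, 0.0, 6.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 3.0, 2.0]])

# Total counts per cell, as stored in adata.obs.n_counts.
n_counts = counts.sum(axis=1)

# Cell size factor: each cell's total count relative to the median total.
cs_factor = n_counts / np.median(n_counts)

# Total-count normalization: every cell is rescaled to the median total.
normalized = counts / cs_factor[:, None]

# Log-transform, then take the per-gene maximum as the gene size factor.
log_normalized = np.log1p(normalized)
gs_factor = log_normalized.max(axis=0)
```

In this toy example every normalized cell sums to the median total count of 5, and `gs_factor` holds one value per gene.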
Run the scBiG method:
```python
from scbig import run_scbig
adata = run_scbig(adata)
```
The output `adata` contains the cell embeddings in `adata.obsm['feat']` and the gene embeddings in `adata.varm['feat']`. These embeddings can serve as input to downstream analyses.
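As one example of a downstream analysis, the sketch below finds each cell's nearest neighbors by cosine similarity in embedding space. A small random array stands in for the cell embeddings here. In practice you would pass `adata.obsm['feat']` instead. The helper `top_k_cosine_neighbors` is a hypothetical name and is not part of scBiG.

```python
import numpy as np


def top_k_cosine_neighbors(emb, k):
    """Return, for each row of emb, the indices of its k most cosine-similar rows."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)  # guard against zero vectors
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude each row from its own neighbors
    return np.argsort(-sim, axis=1)[:, :k]


# Stand-in embeddings: two well-separated groups of three cells each.
rng = np.random.default_rng(0)
cell_emb = np.vstack([
    rng.normal(loc=[5.0, 0.0], scale=0.1, size=(3, 2)),
    rng.normal(loc=[0.0, 5.0], scale=0.1, size=(3, 2)),
])

neighbors = top_k_cosine_neighbors(cell_emb, k=2)
```

Here each cell's two nearest neighbors come from its own group. The same function can be applied to gene embeddings to look for genes with similar representations.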
Please refer to `tutorial.ipynb` for a detailed description of scBiG's usage.
Raw data
{
"_id": null,
"home_page": "https://github.com/sldyns/scBiG",
"name": "scbig",
"maintainer": "Kun Qian",
"docs_url": null,
"requires_python": "",
"maintainer_email": "kun_qian@foxmail.com",
"keywords": "single-cell RNA-sequencing,Graph node embedding,Dimensionality reduction",
"author": "Kun Qian, Ting Li",
"author_email": "kun_qian@foxmail.com",
"download_url": "https://files.pythonhosted.org/packages/74/87/560dd65870c6773c6b620ff8e3efc151814e9cbeccd788244c6184aedaf3/scbig-0.1.1.tar.gz",
"platform": "any",
"bugtrack_url": null,
"license": "MIT Licence",
"summary": "scBiG for representation learning of single-cell gene expression data based on bipartite graph embedding",
"version": "0.1.1",
"project_urls": {
"Homepage": "https://github.com/sldyns/scBiG"
},
"split_keywords": [
"single-cell rna-sequencing",
"graph node embedding",
"dimensionality reduction"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e3a5f88ae94d9dfd0889ef9d6183b205a941c741d1428891cddc0f830ba61e80",
"md5": "36b197a0fdb89517988ce3779d07fbbb",
"sha256": "a7ab17c91862e9fddde5ff0380dfb5a0e6c2020842849fbccdfb6da490f304dd"
},
"downloads": -1,
"filename": "scbig-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "36b197a0fdb89517988ce3779d07fbbb",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 16560,
"upload_time": "2023-11-29T12:49:16",
"upload_time_iso_8601": "2023-11-29T12:49:16.191536Z",
"url": "https://files.pythonhosted.org/packages/e3/a5/f88ae94d9dfd0889ef9d6183b205a941c741d1428891cddc0f830ba61e80/scbig-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7487560dd65870c6773c6b620ff8e3efc151814e9cbeccd788244c6184aedaf3",
"md5": "5f335d0092663d9be1050e4e78077122",
"sha256": "26d49d8d7fc363f7d85a5a0f26fca0ac56d808eeda8d1bb68135bfb7dbfd0002"
},
"downloads": -1,
"filename": "scbig-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "5f335d0092663d9be1050e4e78077122",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 14455,
"upload_time": "2023-11-29T12:49:18",
"upload_time_iso_8601": "2023-11-29T12:49:18.424312Z",
"url": "https://files.pythonhosted.org/packages/74/87/560dd65870c6773c6b620ff8e3efc151814e9cbeccd788244c6184aedaf3/scbig-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-11-29 12:49:18",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "sldyns",
"github_project": "scBiG",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "scbig"
}