# Causal discovery of gene regulatory programs from single-cell genomics
**CASCADE** stands for **C**ausality-**A**ware **S**ingle-**C**ell **A**daptive
**D**iscover/**D**eduction/**D**esign **E**ngine. It is a deep learning-based
bioinformatics tool for causal gene regulatory network discovery, counterfactual
perturbation effect prediction, and targeted intervention design based on
high-content single-cell perturbation screens.

Trained on single-cell perturbation data, CASCADE models the causal gene
regulatory network as a directed acyclic graph (DAG) and leverages
differentiable causal discovery (DCD) to transform the search over discrete
network structures into a tractable continuous optimization problem. To scale
causal discovery to thousands of genes, CASCADE incorporates a scaffold graph
built from context-agnostic, coarse prior regulatory knowledge, which constrains
the search space and improves computational efficiency in an evidence-guided
manner. Additionally, technical confounding covariates and gene-wise
perturbation latent variables encoded from gene ontology (GO) annotations are
included to account for effects not explained by the causal structure. The
complete CASCADE model is constructed within a Bayesian framework, allowing for
the estimation of causal uncertainty under the limited data regimes typical of
practical biological experiments.
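
As a rough illustration of how differentiable causal discovery replaces a
discrete DAG search with a continuous one, the sketch below evaluates a
polynomial acyclicity penalty of the kind used in the DCD literature,
`tr((I + W*W / d)^d) - d`, which is zero exactly when the weighted graph `W`
has no directed cycle. The scaffold is modeled as a binary mask that removes
edges without prior support. This is not CASCADE's implementation: the function
names, the specific penalty, and the toy matrices are all assumptions chosen
for illustration.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def apply_scaffold(weights: Matrix, scaffold: Matrix) -> Matrix:
    """Keep only candidate edges allowed by the (hypothetical) scaffold mask."""
    n = len(weights)
    return [[weights[i][j] * scaffold[i][j] for j in range(n)] for i in range(n)]


def acyclicity_penalty(weights: Matrix) -> float:
    """Polynomial acyclicity penalty: tr((I + W*W / d)^d) - d.

    The value is 0 for a DAG and positive when a directed cycle exists.
    """
    d = len(weights)
    # The element-wise square keeps entries non-negative so cycles cannot cancel.
    base = [
        [(1.0 if i == j else 0.0) + weights[i][j] ** 2 / d for j in range(d)]
        for i in range(d)
    ]
    power = base
    for _ in range(d - 1):
        power = matmul(power, base)
    return sum(power[i][i] for i in range(d)) - d


# Toy example with three genes; weights[i][j] is the edge gene i -> gene j.
cyclic = [
    [0.0, 0.8, 0.0],
    [0.0, 0.0, 0.5],
    [0.7, 0.0, 0.0],  # gene 2 -> gene 0 closes a cycle
]
# Hypothetical scaffold with no prior support for the edge gene 2 -> gene 0.
scaffold = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]

print(acyclicity_penalty(cyclic) > 0)
print(abs(acyclicity_penalty(apply_scaffold(cyclic, scaffold))) < 1e-12)
```

Because the penalty is a smooth function of the edge weights, it can be added
to a training loss and driven toward zero by gradient-based optimization, while
the mask shrinks the set of edges that need to be considered at all.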

Using the inferred causal regulatory network, CASCADE supports two types of
downstream inference. First, it performs counterfactual deduction of unseen
perturbation effects by iteratively propagating perturbation effects following
the topological order of the causal graph. Second, because this deduction
process remains end-to-end differentiable, it can be inverted into intervention
design: the gene intervention is treated as an optimizable parameter and trained
to minimize the deviation between the counterfactual outcome and the desired
target transcriptomes.

For more details, please check out our preprint at TODO.
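
The two downstream tasks described above can be sketched on a toy linear model.
The first function propagates values gene by gene in topological order while
clamping intervened genes; the second picks an intervention whose predicted
outcome is closest to a target. Note that the design step here swaps in an
exhaustive search over single-gene knockouts instead of the gradient-based
optimization described above, and the network, weights, baseline values, and
function names are all hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Dict, Optional

# Hypothetical toy network: parents[gene] maps each regulator to an edge weight.
parents: Dict[str, Dict[str, float]] = {
    "A": {},
    "B": {"A": 0.5},
    "C": {"A": 1.0, "B": -2.0},
}
# Hypothetical baseline (exogenous) expression of each gene.
baseline: Dict[str, float] = {"A": 2.0, "B": 1.0, "C": 3.0}


def deduce(intervention: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Propagate values through the DAG, clamping intervened genes."""
    intervention = intervention or {}
    # TopologicalSorter takes node -> predecessors, so regulators come first.
    order = TopologicalSorter({g: set(p) for g, p in parents.items()}).static_order()
    expr: Dict[str, float] = {}
    for gene in order:
        if gene in intervention:
            # An intervention overrides the influence of the gene's regulators.
            expr[gene] = intervention[gene]
        else:
            expr[gene] = baseline[gene] + sum(
                w * expr[reg] for reg, w in parents[gene].items()
            )
    return expr


def design_knockout(target: Dict[str, float]) -> str:
    """Pick the single-gene knockout whose outcome is closest to the target."""

    def deviation(gene: str) -> float:
        outcome = deduce({gene: 0.0})
        return sum((outcome[g] - v) ** 2 for g, v in target.items())

    return min(parents, key=deviation)


control = deduce()
knockout_b = deduce({"B": 0.0})
print(control)
print(knockout_b)
print(design_knockout({"C": 5.0}))
```

In this toy setting, knocking out `B` removes its repressive effect on `C`, so
a target that asks for high `C` expression is best matched by that knockout.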
## Install
CASCADE is implemented in the ``cascade-reg`` package. It can be installed
directly using pip:
```sh
pip install cascade-reg
```
To avoid potential dependency conflicts, installing within a
[conda environment](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html)
is recommended.

A conda build will be available in the future.
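
After installation, one quick way to confirm that the package is visible to the
active environment is to query its distribution metadata. The sketch below uses
only the Python standard library; the helper name `installed_version` is
hypothetical and not part of `cascade-reg`.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("cascade-reg")
if version is None:
    print("cascade-reg is not installed in this environment")
else:
    print(f"cascade-reg {version} is installed")
```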
## How to use
See our [documentation site](https://cascade-reg.readthedocs.io) for
instructions on using the ``cascade-reg`` package.
## Replicate results
1. Check out the repository to branch `replicate`:
   ```sh
   git checkout replicate
   ```
2. Create a local conda environment using the `env.sh` script:
   ```sh
   ./env.sh create
   ```
3. Activate the local conda environment:
   ```sh
   mamba activate ./conda
   ```
4. Use the scripts in `data/download` to prepare the necessary data
5. Use the scripts in `data/scaffold` to prepare the scaffold graphs
6. Use the pipeline in `evaluation` to run the systematic benchmarks
7. Use the notebooks in `experiments` for the intervention design case studies
## Development
The instructions below are intended for development only.
### Environment setup
Use the following commands to manage the development environment:
```sh
./env.sh create # Create new environment based on config files
./env.sh export # Export environment changes to config files
./env.sh update # Update environment based on config files
```
Use the following commands to activate and deactivate the environment:
```sh
mamba activate ./conda
mamba deactivate
```
### Build documentation
```sh
sphinx-build -b html -D language=en docs docs/_build/html/en
```