step-kit

Name: step-kit
Version: 0.2.6
Home page: https://github.com/SGGb0nd/step
Summary: STEP, an acronym for Spatial Transcriptomics Embedding Procedure, is a deep learning-based tool for the analysis of single-cell RNA (scRNA-seq) and spatially resolved transcriptomics (SRT) data. STEP introduces a unified approach to process and analyze multiple samples of scRNA-seq data as well as align several sections of SRT data, disregarding location relationships. Furthermore, STEP conducts integrative analysis across different modalities like scRNA-seq and SRT.
Upload time: 2024-11-05 05:13:49
Author: SGGb0nd
Requires Python: <4.0,>=3.10
License: Apache-2.0
Keywords: spatial transcriptomics, single-cell RNA-seq, deep learning, scRNA-seq, SRT, STEP
Requirements: No requirements were recorded.

# STEP: Spatial Transcriptomics Embedding Procedure
[![Docs](https://github.com/SGGb0nd/step/actions/workflows/mkdocs.yaml/badge.svg)](https://github.com/SGGb0nd/step/actions/workflows/mkdocs.yaml)
[![Pages](https://github.com/SGGb0nd/step/actions/workflows/pages/pages-build-deployment/badge.svg)](https://github.com/SGGb0nd/step/actions/workflows/pages/pages-build-deployment)
[![PyPI version](https://badge.fury.io/py/step-kit.svg)](https://badge.fury.io/py/step-kit)
[![DOI](http://img.shields.io/badge/DOI-10.1101/2024.04.15.589470-B31B1B.svg)](https://doi.org/10.1101/2024.04.15.589470)

![image](http://43.134.105.224/images/STEP_fig_1a.webp)
![image](http://43.134.105.224/images/STEP_fig_1b.webp)

## Introduction
STEP, an acronym for Spatial Transcriptomics Embedding Procedure, is a foundation deep learning/AI architecture for the analysis of spatially resolved transcriptomics (SRT) data, and it is also compatible with scRNA-seq data. STEP is rooted in precisely capturing the three major variations that occur in SRT (and scRNA-seq) data: **Transcriptional Variations**, **Batch Variations**, and **Spatial Variations**, with corresponding modular designs: **Backbone model**: a Transformer-based model together with gene-module sequence mapping; **Batch-effect model**: a pair of inverse transformations that use the *batch-embedding* concept for decoupled batch-effect elimination; **Spatial model**: a GCN-based spatial filter/smoother that works on the embeddings extracted by the Backbone model, in contrast to other methods that use a GCN as a feature extractor. Thus, with the proper combination of these models, STEP introduces a unified approach to systematically process and analyze single or multiple samples of SRT data, disregarding location relationships between sections (i.e., both contiguous and non-contiguous sections), to reveal multi-scale biological heterogeneities (cell types and spatial domains) in multi-resolution SRT data. Furthermore, STEP can also conduct integrative analysis on scRNA-seq and SRT data.
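
As a rough illustration of this modular design, the sketch below composes a Transformer backbone over gene-module tokens, a batch-embedding-based pair of inverse transformations, and a graph-based smoothing of the resulting embeddings. It is a minimal conceptual sketch in plain PyTorch; the class names, dimensions, and smoothing scheme are illustrative assumptions, not STEP's actual API.

```python
import torch
import torch.nn as nn


class Backbone(nn.Module):
    """Transformer encoder over a sequence of gene-module tokens (assumed design)."""

    def __init__(self, n_genes: int, n_modules: int = 32, d_model: int = 64):
        super().__init__()
        # Map the gene expression vector of each cell/spot to a sequence of module tokens.
        self.to_tokens = nn.Linear(n_genes, n_modules * d_model)
        self.n_modules, self.d_model = n_modules, d_model
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:      # x: (cells, genes)
        tokens = self.to_tokens(x).view(-1, self.n_modules, self.d_model)
        return self.encoder(tokens).mean(dim=1)               # (cells, d_model)


class BatchEffect(nn.Module):
    """A pair of inverse transformations driven by a learned batch embedding."""

    def __init__(self, n_batches: int, d_model: int = 64):
        super().__init__()
        self.batch_emb = nn.Embedding(n_batches, d_model)

    def remove(self, z, batch):   # shift embeddings into a batch-free space
        return z - self.batch_emb(batch)

    def restore(self, z, batch):  # inverse transformation: re-inject the batch signal
        return z + self.batch_emb(batch)


def smooth(z: torch.Tensor, adj: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Graph-based smoothing of embeddings with a row-normalized spatial adjacency."""
    adj_norm = adj / adj.sum(dim=1, keepdim=True).clamp(min=1)
    return (1 - alpha) * z + alpha * adj_norm @ z


# Toy usage: 100 spots, 500 genes, 2 batches, and a random spatial adjacency.
x = torch.rand(100, 500)
batch = torch.randint(0, 2, (100,))
adj = (torch.rand(100, 100) < 0.05).float()
z = smooth(BatchEffect(2).remove(Backbone(500)(x), batch), adj)
print(z.shape)  # torch.Size([100, 64])
```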

## Key Features

-  Integration of multiple scRNA-seq and single-cell-resolution SRT samples to reveal cell-type-level heterogeneities.
-  Alignment of multiple SRT sections, whether contiguous or non-contiguous, to identify spatial domains across sections.
-  Scalability across data resolutions, i.e., a wide range of SRT technologies and platforms, including **Visium HD**, **Visium**, **MERFISH**, **STARmap**, **Stereo-seq**, **ST**, etc.
-  Scalability to large datasets with high numbers of cells and spatial locations.
-  Integrative analysis across modalities (scRNA-seq and SRT) and cell-type deconvolution for non-single-cell-resolution SRT data.

## Other Capabilities
-  Capability to produce not only batch-corrected embeddings but also batch-corrected gene expression profiles for scRNA-seq data.
-  Capability to perform spatial mapping of reference scRNA-seq data points to the spatial locations of SRT data based on the learned co-embeddings and kNN (see the sketch after this list).
-  Comprehensive `adata` processing via the specifically designed `BaseDataset` class and its view version, the `MaskedDataset` class, for SRT data, which can be easily integrated with the PyTorch `DataLoader` class for training and validation.
-  Low computational cost and high efficiency on large-scale SRT data: 4-8 GB of GPU memory is sufficient to process a dataset with 100,000 spatial locations and a 2,000+ sample size in fast mode, and training for 2,000 iterations takes less than 6 minutes (tested on an NVIDIA RTX 3090).
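
A minimal sketch of the kNN-based spatial mapping idea from the list above, assuming scikit-learn and random placeholder co-embeddings; the variable names and the coordinate-averaging scheme are illustrative assumptions, not STEP's actual API:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
z_spots = rng.normal(size=(200, 32))    # placeholder co-embeddings of 200 SRT spots
z_cells = rng.normal(size=(1000, 32))   # placeholder co-embeddings of 1000 scRNA-seq cells
spot_xy = rng.uniform(size=(200, 2))    # spatial coordinates of the spots

# For each cell, find its k nearest spots in the shared embedding space and
# average their coordinates to obtain a mapped spatial location.
k = 5
index = NearestNeighbors(n_neighbors=k).fit(z_spots)
_, idx = index.kneighbors(z_cells)      # idx: (1000, k) spot indices per cell
cell_xy = spot_xy[idx].mean(axis=1)     # (1000, 2) mapped locations
print(cell_xy.shape)
```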

## System Requirements
-  **Software Requirements**: Python 3.10+, CUDA 11.6+, PyTorch 1.13.1+, and DGL 1.1.3+.
-  **Hardware Requirements**: an NVIDIA GPU with CUDA support (recommended) and 8 GB+ of GPU memory. As much RAM and as many CPU cores as possible for storing and processing large-scale data (this is required by the data itself rather than by STEP).

## Installation

### Quick Installation (Recommended for Python 3.10, Linux, CUDA 11.7)

```bash
pip install step-kit[cu117]
```

This command installs STEP along with compatible versions of PyTorch and DGL.

### Basic Installation

```bash
pip install step-kit
```

⚠️ **Critical Note**: 
- This basic installation **does not** include PyTorch and DGL.
- STEP **will not function** without PyTorch and DGL properly installed.
- You must install these dependencies separately before using STEP.

### Manual Dependency Installation

If you need to install specific versions of PyTorch and DGL:

1. Install DGL (example for latest DGL compatible with PyTorch 2.3.x and CUDA 11.8):
   ```bash
   pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/cu118/repo.html
   ```

2. Install PyTorch (example for PyTorch 2.3.1 with CUDA 11.8):
   ```bash
   pip install torch==2.3.1+cu118 -f https://download.pytorch.org/whl/cu118/torch_stable.html
   ```

For more information on installing PyTorch and DGL, visit:
- PyTorch: https://pytorch.org/
- DGL: https://www.dgl.ai/

Remember to choose versions compatible with your system configuration and CUDA version (if applicable).
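
After installation, a quick sanity check like the one below (an assumed snippet, not part of STEP) can confirm that PyTorch and DGL import correctly and that their CUDA builds match your system:

```python
import torch
import dgl

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("dgl:", dgl.__version__)

# Moving a tiny DGL graph to the GPU fails immediately if the installed DGL
# build does not match the installed CUDA/PyTorch combination.
g = dgl.graph(([0, 1], [1, 2]))
if torch.cuda.is_available():
    g = g.to("cuda")
print(g)
```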

## Contribution

We welcome contributions! Please see [`CONTRIBUTING.md`](./CONTRIBUTING.md) for more details!

## License

STEP is licensed under the Apache-2.0 License; see [LICENSE](./LICENSE) for details.

## Contact

If you have any questions, please feel free to contact us [by email](mailto:lilounan1997@gmail.com) or to open an issue on this repository.

## Citation
The preprint of STEP is available at [bioRxiv](https://www.biorxiv.org/content/early/2024/04/20/2024.04.15.589470.full.pdf). If you use STEP in your research, please cite:

```bibtex
@article{Li2024.04.15.589470,
  title = {{{STEP}}: {{Spatial}} Transcriptomics Embedding Procedure for Multi-Scale Biological Heterogeneities Revelation in Multiple Samples},
  author = {Li, Lounan and Li, Zhong and Li, Yuanyuan and Yin, Xiao-ming and Xu, Xiaojiang},
  year = {2024},
  journal = {bioRxiv : the preprint server for biology},
  eprint = {https://www.biorxiv.org/content/early/2024/04/20/2024.04.15.589470.full.pdf},
  publisher = {Cold Spring Harbor Laboratory},
  doi = {10.1101/2024.04.15.589470},
}
```

            
