| Field | Value |
| --- | --- |
| Name | paste-bio |
| Version | 1.4.0 |
| home_page | https://github.com/raphael-group/paste |
| Summary | A computational method to align and integrate spatial transcriptomics experiments. |
| upload_time | 2023-05-26 20:41:42 |
| author | Max Land |
| requires_python | >=3.6 |
| requirements | No requirements were recorded. |
[![PyPI](https://img.shields.io/pypi/v/paste-bio.svg)](https://pypi.org/project/paste-bio)
[![Downloads](https://pepy.tech/badge/paste-bio)](https://pepy.tech/project/paste-bio)
[![Documentation Status](https://readthedocs.org/projects/paste-bio/badge/?version=latest)](https://paste-bio.readthedocs.io/en/stable/?badge=stable)
[![Anaconda](https://anaconda.org/bioconda/paste-bio/badges/version.svg)](https://anaconda.org/bioconda/paste-bio/badges/version.svg)
[![bioconda-downloads](https://anaconda.org/bioconda/paste-bio/badges/downloads.svg)](https://anaconda.org/bioconda/paste-bio/badges/downloads.svg)
# PASTE
![PASTE Overview](https://github.com/raphael-group/paste/blob/main/docs/source/_static/images/paste_overview.png)
PASTE is a computational method that leverages both gene expression similarity and spatial distances between spots to align and integrate spatial transcriptomics data. In particular, there are two methods:
1. `pairwise_align`: align spots between a pair of slices.
2. `center_align`: integrate multiple slices into one center slice.
You can read the full paper [here](https://www.nature.com/articles/s41592-022-01459-6).
Additional examples and the code to reproduce the paper's analyses can be found [here](https://github.com/raphael-group/paste_reproducibility). Preprocessed datasets used in the paper can be found on [zenodo](https://doi.org/10.5281/zenodo.6334774).
### Recent News
* PASTE is now published in [Nature Methods](https://www.nature.com/articles/s41592-022-01459-6)!
* The code to reproduce the analyses can be found [here](https://github.com/raphael-group/paste_reproducibility).
* As of version 1.2.0, PASTE supports GPU execution via PyTorch. For more details, see the GPU section of the [Tutorial notebook](docs/source/notebooks/getting-started.ipynb).
### Installation
The easiest way is to install PASTE from PyPI: https://pypi.org/project/paste-bio/.
`pip install paste-bio`
Or you can install PASTE from Bioconda: https://anaconda.org/bioconda/paste-bio.
`conda install -c bioconda paste-bio`
Check out Tutorial.ipynb for an example of how to use PASTE.
Alternatively, you can clone the repository and try the following example in a
notebook or on the command line.
### Quick Start
To use PASTE, you need at least two slices of spatial-omics data (both
expression and coordinates) in
AnnData format (i.e. as read in by scanpy/squidpy). We have included a breast
cancer dataset from [1] in the [sample_data folder](sample_data/) of this repo,
which the example below uses to show how to use PASTE.
```python
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import numpy as np
import scanpy as sc
import paste as pst
# Load Slices
data_dir = './sample_data/' # change this path to the data you wish to analyze
# Assume that the coordinates of slices are named slice_name + "_coor.csv"
def load_slices(data_dir, slice_names=["slice1", "slice2"]):
    slices = []
    for slice_name in slice_names:
        slice_i = sc.read_csv(data_dir + slice_name + ".csv")
        slice_i_coor = np.genfromtxt(data_dir + slice_name + "_coor.csv", delimiter = ',')
        slice_i.obsm['spatial'] = slice_i_coor
        # Preprocess slices
        sc.pp.filter_genes(slice_i, min_counts = 15)
        sc.pp.filter_cells(slice_i, min_counts = 100)
        slices.append(slice_i)
    return slices
slices = load_slices(data_dir)
slice1, slice2 = slices
# Pairwise align the slices
pi12 = pst.pairwise_align(slice1, slice2)
# To visualize the alignment you can stack the slices
# according to the alignment pi
slices, pis = [slice1, slice2], [pi12]
new_slices = pst.stack_slices_pairwise(slices, pis)
slice_colors = ['#e41a1c','#377eb8']
plt.figure(figsize=(7,7))
for i in range(len(new_slices)):
    pst.plot_slice(new_slices[i],slice_colors[i],s=400)
plt.legend(handles=[mpatches.Patch(color=slice_colors[0], label='1'),mpatches.Patch(color=slice_colors[1], label='2')])
plt.gca().invert_yaxis()
plt.axis('off')
plt.show()
# Center align slices
## We have to reload the slices as pairwise_alignment modifies the slices.
slices = load_slices(data_dir)
slice1, slice2 = slices
# Construct a center slice
## choose one of the slices as the coordinate reference for the center slice,
## i.e. the center slice will have the same number of spots as this slice and
## the same coordinates.
initial_slice = slice1.copy()
slices = [slice1, slice2]
lmbda = len(slices)*[1/len(slices)] # set hyperparameter to be uniform
## Possible to pass in an initial pi (as keyword argument pis_init)
## to improve performance, see Tutorial.ipynb notebook for more details.
center_slice, pis = pst.center_align(initial_slice, slices, lmbda)
## The low dimensional representation of our center slice is held
## in the matrices W and H, which can be used for downstream analyses
W = center_slice.uns['paste_W']
H = center_slice.uns['paste_H']
```
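The alignment returned by `pairwise_align` can be read as a soft mapping between spots. The sketch below uses a small hand-written matrix in place of `pi12`; the values and the helper name `best_matches` are illustrative and are not PASTE output or part of its API. It assumes, as in the command-line output description, that rows index spots of the first slice and columns index spots of the second.

```python
import numpy as np

def best_matches(pi):
    """For each row (spot in slice 1), return the column (spot in slice 2)
    that receives the most mass, and the share of the row's mass it holds."""
    pi = np.asarray(pi, dtype=float)
    row_mass = pi.sum(axis=1)
    match = pi.argmax(axis=1)
    # Rows with zero mass get a share of 0 instead of a division error.
    share = np.divide(pi.max(axis=1), row_mass,
                      out=np.zeros_like(row_mass), where=row_mass > 0)
    return match, share

# Toy 3x2 alignment (hypothetical values); entries sum to 1.
toy_pi = np.array([[0.30, 0.03],
                   [0.05, 0.28],
                   [0.00, 0.34]])
match, share = best_matches(toy_pi)
print(match, share)
```

A low share for a spot suggests its mass is spread over several spots in the other slice, so a hard one-to-one assignment may be a poor summary for it.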
### GPU implementation
PASTE is now compatible with GPUs via PyTorch. All we need to do is add the following two parameters to our main functions:
```python
import ot  # POT (Python Optimal Transport), which provides the backend objects

pi12 = pst.pairwise_align(slice1, slice2, backend = ot.backend.TorchBackend(), use_gpu = True)

center_slice, pis = pst.center_align(initial_slice, slices, lmbda, backend = ot.backend.TorchBackend(), use_gpu = True)
```
For more details, see the GPU section of the [Tutorial notebook](docs/source/notebooks/getting-started.ipynb).
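On machines where a GPU may not be present, it can help to decide at runtime whether to pass these two parameters. The following is a minimal sketch, not part of PASTE: the helper name `gpu_kwargs` is hypothetical, and it assumes that an installed PyTorch reporting CUDA availability is a sufficient check.

```python
import importlib.util

def gpu_kwargs():
    """Return the two GPU keyword arguments if PyTorch, POT and CUDA
    are all usable; otherwise return an empty dict (CPU defaults)."""
    if importlib.util.find_spec("torch") is None:
        return {}
    if importlib.util.find_spec("ot") is None:
        return {}
    import torch
    if not torch.cuda.is_available():
        return {}
    import ot
    return {"backend": ot.backend.TorchBackend(), "use_gpu": True}

kwargs = gpu_kwargs()
print(sorted(kwargs))
```

The result can then be unpacked into either call, for example `pst.pairwise_align(slice1, slice2, **kwargs)`.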
### Command Line
We provide the option of running PASTE from the command line.
First, clone the repository:
`git clone https://github.com/raphael-group/paste.git`
Next, note that each slice requires two separate files: the gene expression data followed by the spatial data (both as .csv). The code uses each such pair to initialize one slice object.
Sample execution (based on this repo): `python paste-cmd-line.py -m center -f ./sample_data/slice1.csv ./sample_data/slice1_coor.csv ./sample_data/slice2.csv ./sample_data/slice2_coor.csv ./sample_data/slice3.csv ./sample_data/slice3_coor.csv`
Note: `pairwise` will return pairwise alignment between each consecutive pair of slices (e.g. \[slice1,slice2\], \[slice2,slice3\]).
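Because the `-f` argument expects expression and coordinate files interleaved per slice, it can be convenient to build the command programmatically. The sketch below is only an illustration: the helper `build_paste_command` is hypothetical, and it assumes the `<slice>.csv` / `<slice>_coor.csv` naming used in the sample data.

```python
from pathlib import Path

def build_paste_command(data_dir, slice_names, mode="pairwise"):
    """Build the argument list for paste-cmd-line.py, listing each slice's
    expression file immediately followed by its coordinate file."""
    files = []
    for name in slice_names:
        files.append((Path(data_dir) / f"{name}.csv").as_posix())
        files.append((Path(data_dir) / f"{name}_coor.csv").as_posix())
    return ["python", "paste-cmd-line.py", "-m", mode, "-f", *files]

cmd = build_paste_command("sample_data", ["slice1", "slice2", "slice3"], mode="center")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the root of the cloned repository.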
| Flag | Name | Description | Default Value |
| --- | --- | --- | --- |
| -m | mode | Select either `pairwise` or `center` | (str) `pairwise` |
| -f | files | Path to data files (.csv) | None |
| -d | direc | Directory to store output files | Current Directory |
| -a | alpha | Alpha parameter for PASTE | (float) `0.1` |
| -c | cost | Expression dissimilarity cost (`kl` or `Euclidean`) | (str) `kl` |
| -p | n_components | n_components for NMF step in `center_align` | (int) `15` |
| -l | lmbda | Lambda parameter in `center_align` | (floats) probability vector of length `n` |
| -i | intial_slice | Specify which file is also the initial slice in `center_align` | (int) `1` |
| -t | threshold | Convergence threshold for `center_align` | (float) `0.001` |
| -x | coordinates | Output new coordinates (toggle to turn on) | `False` |
| -w | weights | Weights files of spots in each slice (.csv) | None |
| -s | start | Initial alignments for OT. If not given uses uniform (.csv structure similar to alignment output) | None |
`pairwise_align` outputs a (.csv) file containing a mapping of spots between each consecutive pair of slices. The rows correspond to spots of the first slice, and the columns to spots of the second.
`center_align` outputs two files containing the low dimensional representation (NMF decomposition) of the center slice gene expression, and files containing a mapping of spots from the center slice (rows) to each input slice (columns).
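To make the NMF output concrete, the sketch below shows how two such factor matrices relate to an expression matrix. It uses random stand-ins rather than PASTE output, and assumes that `W` has shape (spots × components) and `H` has shape (components × genes); check the shapes of your own files before relying on this.

```python
import numpy as np

rng = np.random.default_rng(0)
n_spots, n_genes, n_components = 6, 10, 3

# Random non-negative stand-ins for the two factor matrices (assumed shapes).
W = rng.random((n_spots, n_components))
H = rng.random((n_components, n_genes))

# The product approximates the center slice's spots-by-genes expression matrix.
approx_expression = W @ H

# The largest entry in each row of W gives a crude per-spot component label.
dominant_component = W.argmax(axis=1)
print(approx_expression.shape, dominant_component)
```

Rows of `W` can also be used directly as low dimensional features for downstream steps such as clustering.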
### Sample Dataset
The sample spatial transcriptomics dataset consists of four breast cancer slices, courtesy of:
[1] Ståhl, Patrik & Salmén, Fredrik & Vickovic, Sanja & Lundmark, Anna & Fernandez Navarro, Jose & Magnusson, Jens & Giacomello, Stefania & Asp, Michaela & Westholm, Jakub & Huss, Mikael & Mollbrink, Annelie & Linnarsson, Sten & Codeluppi, Simone & Borg, Åke & Pontén, Fredrik & Costea, Paul & Sahlén, Pelin Akan & Mulder, Jan & Bergmann, Olaf & Frisén, Jonas. (2016). Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science. 353. 78-82. 10.1126/science.aaf2403.
Note: Original data is (.tsv), but we converted it to (.csv).
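If your own data is tab-separated, a conversion along these lines is one option. This is a minimal sketch using only the standard library, not the script used to prepare the sample data; the helper name `tsv_to_csv` and the toy table are made up for illustration.

```python
import csv
import io

def tsv_to_csv(src, dst):
    """Copy rows from a tab-separated text stream to a comma-separated one.
    Returns the number of rows written."""
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, lineterminator="\n")
    n_rows = 0
    for row in reader:
        writer.writerow(row)
        n_rows += 1
    return n_rows

# Toy spots-by-genes table (hypothetical values), kept in memory for the demo.
src = io.StringIO("\tGeneA\tGeneB\nspot1\t3\t0\nspot2\t1\t7\n")
dst = io.StringIO()
n_rows = tsv_to_csv(src, dst)
print(dst.getvalue())
```

For files on disk, open both with `newline=""` as the `csv` module documentation recommends, and pass the file objects to the same function.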
### References
Ron Zeira, Max Land, Alexander Strzalkowski and Benjamin J. Raphael. "Alignment and integration of spatial transcriptomics data". Nature Methods (2022). https://doi.org/10.1038/s41592-022-01459-6
Raw data
{
"_id": null,
"home_page": "https://github.com/raphael-group/paste",
"name": "paste-bio",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "",
"author": "Max Land",
"author_email": "max.ruikang.land@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/ab/9d/ecf6f5dd4a8b0d4b5205559a520e07379d4632df3d880005c51e351cd5ba/paste-bio-1.4.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "A computational method to align and integrate spatial transcriptomics experiments.",
"version": "1.4.0",
"project_urls": {
"Bug Tracker": "https://github.com/raphael-group/paste/issues",
"Homepage": "https://github.com/raphael-group/paste"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9fa30f6ad33e70e6abb97c3b860d04a8b800f9a0106b31f8cc76e80ab6815412",
"md5": "3baa4747930592dea0c804d9731d28af",
"sha256": "372946e9978871ce31bf0d3a6f1bab1e04451b9df914e5ebdca7ed79846cfad9"
},
"downloads": -1,
"filename": "paste_bio-1.4.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3baa4747930592dea0c804d9731d28af",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 14592,
"upload_time": "2023-05-26T20:41:39",
"upload_time_iso_8601": "2023-05-26T20:41:39.419822Z",
"url": "https://files.pythonhosted.org/packages/9f/a3/0f6ad33e70e6abb97c3b860d04a8b800f9a0106b31f8cc76e80ab6815412/paste_bio-1.4.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ab9decf6f5dd4a8b0d4b5205559a520e07379d4632df3d880005c51e351cd5ba",
"md5": "45afba19195c06a500174040d1a06615",
"sha256": "fa525de92cfe1b179f2ce797514b352c489ac6b12692c3cfaa1e6862fc7cdbbe"
},
"downloads": -1,
"filename": "paste-bio-1.4.0.tar.gz",
"has_sig": false,
"md5_digest": "45afba19195c06a500174040d1a06615",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 16781,
"upload_time": "2023-05-26T20:41:42",
"upload_time_iso_8601": "2023-05-26T20:41:42.039328Z",
"url": "https://files.pythonhosted.org/packages/ab/9d/ecf6f5dd4a8b0d4b5205559a520e07379d4632df3d880005c51e351cd5ba/paste-bio-1.4.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-05-26 20:41:42",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "raphael-group",
"github_project": "paste",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "paste-bio"
}