IRescue


- Name: IRescue
- Version: 1.0.3
- Summary: Interspersed Repeats single-cell quantifier
- Upload time: 2023-02-22 16:01:13
- Author email: Benedetto Polimeni <polimeni@ingm.org>
- Requires Python: >=3.7
- License: MIT License Copyright (c) 2022 Benedetto Polimeni Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- Keywords: bioinformatics, transposable-elements, scrna-seq, single-cell, single-cell-rna-seq
![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/bodegalab/irescue/python-publish.yml?logo=github&label=build)
[![PyPI](https://img.shields.io/pypi/v/irescue?logo=python)](https://pypi.org/project/irescue/)
[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat&logo=anaconda)](https://bioconda.github.io/recipes/irescue/README.html)
[![container](https://img.shields.io/badge/dynamic/json?url=https://quay.io/api/v1/repository/biocontainers/irescue/tag/&label=container&query=$.tags.0.name&prefix=quay.io/biocontainers/irescue:)](https://quay.io/repository/biocontainers/irescue?tab=tags)
[![Paper](https://img.shields.io/badge/DOI-10.1101%2F2022.09.16.508229-9cf)](https://doi.org/10.1101/2022.09.16.508229)

# IRescue - <ins>I</ins>nterspersed <ins>Re</ins>peats <ins>s</ins>ingle-<ins>c</ins>ell q<ins>u</ins>antifi<ins>e</ins>r

<img align="right" height="160" src="docs/logo.png">
IRescue is a tool for quantifying the expression of transposable element (TE) subfamilies in single-cell RNA sequencing (scRNA-seq) data. Its core feature is that it considers all multiple alignments (i.e. non-primary alignments) of reads/UMIs mapping to multiple TEs in a BAM file, in order to accurately infer the TE subfamily of origin. IRescue implements a UMI error-correction, deduplication and quantification strategy that includes such alignment events. IRescue's output is compatible with most scRNA-seq analysis toolkits, such as Seurat or Scanpy.

## Content

- [Installation](#installation)
  - [Using conda](#conda)
  - [Using pip](#pip)
  - [Container (Docker/Singularity)](#container)
- [Usage](#usage)
  - [Quick start](#quick_start)
  - [Output files](#output_files)
  - [Load IRescue data with Seurat](#seurat)
- [Cite](#cite)

## <a name="installation"></a>Installation

### <a name="conda"></a>Using conda (recommended)

We recommend using conda, as it installs all the required packages along with IRescue.

```bash
conda create -n irescue -c conda-forge -c bioconda irescue
```

### <a name="pip"></a>Using pip

If using conda is not possible or desirable, IRescue can be installed with pip. In that case, the following requirements must be installed manually: `python>=3.7`, `samtools>=1.12` and `bedtools>=2.30.0`.

```bash
pip install irescue
```
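
After a pip install, it can be useful to confirm that the external tools are available and recent enough. The following is a minimal sketch, not part of IRescue: the helper names (`parse_version`, `meets_minimum`, `check_tools`) are hypothetical, and it assumes that `samtools --version` and `bedtools --version` print a dotted version number as the first number in their output.

```python
import re
import shutil
import subprocess

# Minimum versions stated above for a pip-based install.
REQUIRED = {"samtools": (1, 12), "bedtools": (2, 30, 0)}


def parse_version(text):
    """Extract the first dotted version number from a tool's --version output."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, minimum):
    """Compare version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    found = found + (0,) * (width - len(found))
    minimum = minimum + (0,) * (width - len(minimum))
    return found >= minimum


def check_tools(required=REQUIRED):
    """Return a dict mapping each tool to 'ok', 'missing' or 'too old'."""
    report = {}
    for tool, minimum in required.items():
        if shutil.which(tool) is None:
            report[tool] = "missing"
            continue
        result = subprocess.run([tool, "--version"], capture_output=True, text=True)
        found = parse_version(result.stdout)
        report[tool] = "ok" if meets_minimum(found, minimum) else "too old"
    return report
```

Calling `check_tools()` in the environment where IRescue is installed returns one status per tool, so a missing or outdated dependency shows up before a run is started.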

### <a name="container"></a>Container (Docker/Singularity)

Docker and Singularity containers are available for each conda release of IRescue. Choose the `TAG` corresponding to the desired IRescue version [from the Biocontainers repository](https://quay.io/repository/biocontainers/irescue?tab=tags) and pull or execute the container with Docker or Singularity:

```bash
# Get latest biocontainers tag (with curl and python3, otherwise check the above link for the desired version/tag)
TAG=$(curl -s -X GET https://quay.io/api/v1/repository/biocontainers/irescue/tag/ | python3 -c 'import json,sys;obj=json.load(sys.stdin);print(obj["tags"][0]["name"])')

# Run with Docker
docker run quay.io/biocontainers/irescue:$TAG irescue --help

# Run with Singularity
singularity exec https://depot.galaxyproject.org/singularity/irescue:$TAG irescue --help
```
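
The tag lookup in the shell snippet can also be written as a small Python script, which makes the failure case (no tags returned) explicit. This is a sketch under the same assumption as the one-liner, namely that the first entry under `"tags"` in the registry response is the one you want; the function names are hypothetical.

```python
import json
import urllib.request

# Same endpoint as the curl command above.
TAGS_URL = "https://quay.io/api/v1/repository/biocontainers/irescue/tag/"


def first_tag_name(payload):
    """Return the name of the first entry under "tags", as the shell one-liner does."""
    tags = payload.get("tags") or []
    if not tags:
        raise ValueError("no tags found in the registry response")
    return tags[0]["name"]


def fetch_first_tag(url=TAGS_URL, timeout=30):
    """Download the tag list and return the first tag name (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return first_tag_name(json.load(response))
```

The value returned by `fetch_first_tag()` can then be used in place of `$TAG` in the Docker or Singularity commands.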

## <a name="usage"></a>Usage

### <a name="quick_start"></a>Quick start

The only required input is a BAM file annotated with cell barcode and UMI sequences as tags (by default, `CB` tag for cell barcode and `UR` tag for UMI; override with `--CBtag` and `--UMItag`). You can obtain it by aligning your reads using [STARsolo](https://github.com/alexdobin/STAR/blob/master/docs/STARsolo.md).
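
To illustrate what "annotated with tags" means, the sketch below parses the optional fields of a single SAM text record (for example, one line printed by `samtools view`) and checks for the barcode and UMI tags. It is only an illustration of the input format, not how IRescue reads the BAM file; the function names are hypothetical.

```python
def sam_tags(sam_line):
    """Return the optional fields of one SAM text record as {tag: value}."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # Optional fields follow the 11 mandatory SAM columns, formatted as TAG:TYPE:VALUE.
    for field in fields[11:]:
        tag, _type, value = field.split(":", 2)
        tags[tag] = value
    return tags


def has_barcode_and_umi(sam_line, cb_tag="CB", umi_tag="UR"):
    """Check that a record carries both the cell barcode and the UMI tag."""
    tags = sam_tags(sam_line)
    return cb_tag in tags and umi_tag in tags
```

If the records in your BAM file use different tag names, pass them to `has_barcode_and_umi` here and to IRescue through `--CBtag` and `--UMItag`.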

The RepeatMasker annotation is downloaded automatically for the chosen genome assembly (e.g. `-g hg38`). Alternatively, provide your own annotation in BED format (e.g. `-r TE.bed`).

```bash
irescue -b genome_alignments.bam -g hg38
```

If you have already obtained gene-level counts (using STARsolo, Cell Ranger, Alevin, Kallisto or other tools), we advise providing the list of whitelisted cell barcodes as a text file, e.g.: `-w barcodes.tsv`. This significantly improves performance.

IRescue performs best using at least 4 threads, e.g.: `-p 8`.
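
When IRescue is launched from a script or pipeline, the options described above can be assembled into an argument list. The helper below is a hypothetical sketch that uses only the flags mentioned in this section (`-b`, `-g`, `-r`, `-w`, `-p`, `--CBtag`, `--UMItag`); requiring at least one of `-g` or `-r` is an assumption made for the example, not documented behavior.

```python
def build_irescue_cmd(bam, genome=None, te_bed=None, whitelist=None,
                      threads=None, cb_tag=None, umi_tag=None):
    """Assemble an irescue command line as a list suitable for subprocess.run."""
    # Assumption for this sketch: an annotation source must be given explicitly.
    if genome is None and te_bed is None:
        raise ValueError("provide a genome assembly (-g) or a TE annotation in BED format (-r)")
    cmd = ["irescue", "-b", str(bam)]
    if genome is not None:
        cmd += ["-g", str(genome)]
    if te_bed is not None:
        cmd += ["-r", str(te_bed)]
    if whitelist is not None:
        cmd += ["-w", str(whitelist)]
    if threads is not None:
        cmd += ["-p", str(int(threads))]
    if cb_tag is not None:
        cmd += ["--CBtag", str(cb_tag)]
    if umi_tag is not None:
        cmd += ["--UMItag", str(umi_tag)]
    return cmd
```

For example, `build_irescue_cmd("genome_alignments.bam", genome="hg38", whitelist="barcodes.tsv", threads=8)` yields a list that can be passed to `subprocess.run(cmd, check=True)`.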

### <a name="output_files"></a>Output files

IRescue generates TE counts in a sparse matrix format, readable by [Seurat](https://github.com/satijalab/seurat) or [Scanpy](https://github.com/scverse/scanpy):

```
IRescue_out/
├── barcodes.tsv.gz
├── features.tsv.gz
└── matrix.mtx.gz
```
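
For a quick look at the output without Seurat or Scanpy, the three files can be read with the Python standard library alone. This is a minimal sketch that assumes the matrix follows the usual 10x-style layout (Matrix Market coordinate format, rows are features, columns are barcodes, first column of each TSV holds the identifier); `load_counts` is a hypothetical helper, and for real analyses a dedicated reader is preferable.

```python
import gzip
import os


def read_first_column(path):
    """Read a gzipped TSV file and return the first column of each non-empty line."""
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n").split("\t")[0] for line in handle if line.strip()]


def load_counts(out_dir):
    """Load an IRescue-style output directory into (features, barcodes, counts).

    counts maps (feature, barcode) pairs to the stored value; absent pairs are zero.
    """
    barcodes = read_first_column(os.path.join(out_dir, "barcodes.tsv.gz"))
    features = read_first_column(os.path.join(out_dir, "features.tsv.gz"))
    counts = {}
    with gzip.open(os.path.join(out_dir, "matrix.mtx.gz"), "rt") as handle:
        header = handle.readline()
        if not header.startswith("%%MatrixMarket matrix coordinate"):
            raise ValueError("expected a Matrix Market coordinate file")
        # Skip comment lines, then read the "rows columns entries" size line.
        line = handle.readline()
        while line.startswith("%"):
            line = handle.readline()
        n_rows, n_cols, _n_entries = (int(x) for x in line.split())
        # Assumed orientation: rows are features, columns are barcodes.
        if (n_rows, n_cols) != (len(features), len(barcodes)):
            raise ValueError("matrix dimensions do not match features and barcodes")
        for line in handle:
            if not line.strip():
                continue
            row, col, value = line.split()
            # Matrix Market indices are 1-based.
            counts[(features[int(row) - 1], barcodes[int(col) - 1])] = float(value)
    return features, barcodes, counts
```

Calling `load_counts("IRescue_out")` returns the feature names, the barcodes and a sparse dictionary of counts, which is enough to check dimensions or inspect a few TE subfamilies.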

### <a name="seurat"></a>Load IRescue data with Seurat

To integrate TE counts into an existing Seurat object containing gene expression data, add them as an additional assay:

```R
# import TE counts from IRescue output directory
te.data <- Seurat::Read10X('./IRescue_out/', gene.column = 1, cell.column = 1)

# create Seurat assay from TE counts
te.assay <- Seurat::CreateAssayObject(te.data)

# subset the assay to the cells already present in the Seurat object (in case it has been filtered)
te.assay <- subset(te.assay, cells = intersect(colnames(te.assay), colnames(seurat_object)))

# add the assay to the Seurat object
seurat_object[['TE']] <- te.assay
```

The result will be something like this:
```
An object of class Seurat 
32276 features across 42513 samples within 2 assays 
Active assay: RNA (31078 features, 0 variable features)
 1 other assay present: TE
```

## <a name="cite"></a>Cite

Polimeni B, Marasca F, Ranzani V, Bodega B.
IRescue: single cell uncertainty-aware quantification of transposable elements expression.
bioRxiv 2022.09.16.508229; doi: https://doi.org/10.1101/2022.09.16.508229

            

        