# TBpore
*Mycobacterium tuberculosis* genomic analysis from Nanopore sequencing data
[![Python CI](https://github.com/mbhall88/tbpore/actions/workflows/ci.yaml/badge.svg)](https://github.com/mbhall88/tbpore/actions/workflows/ci.yaml)
[![codecov](https://codecov.io/gh/mbhall88/tbpore/branch/main/graph/badge.svg)](https://codecov.io/gh/mbhall88/tbpore)
[![PyPI](https://img.shields.io/pypi/v/tbpore)](https://pypi.org/project/tbpore/)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/tbpore)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[TOC]: #
# Table of Contents
- [Synopsis](#synopsis)
- [Citation](#citation)
- [Installation](#installation)
- [Configuring the decontamination database index](#configuring-the-decontamination-database-index)
- [Performance](#performance)
- [Usage](#usage)
# Synopsis
`tbpore` has two main goals.
First, it processes Nanopore *Mycobacterium tuberculosis* sequencing data to describe
variants with respect to the canonical TB strain H37Rv and to predict antibiotic
resistance (command `tbpore process`).
Variants are described by decontaminating the reads, calling variants with
[bcftools](https://github.com/samtools/bcftools), and filtering the calls.
Antibiotic resistance is predicted
with [mykrobe](https://github.com/Mykrobe-tools/mykrobe).
Second, it clusters TB samples based on their genotypes and a given
distance threshold (command `tbpore cluster`).
## Citation
TBpore is a slimmed-down version of
the [full pipeline](https://github.com/mbhall88/head_to_head_pipeline) used
in our paper 👇
> Hall, M. B. et al. Evaluation of Nanopore sequencing for Mycobacterium tuberculosis drug susceptibility testing and outbreak investigation: a genomic analysis. *The Lancet Microbe* 0, (2022) doi: [10.1016/S2666-5247(22)00301-9][doi].
[doi]: https://doi.org/10.1016/S2666-5247(22)00301-9
## Installation
### conda
[![Conda (channel only)](https://img.shields.io/conda/vn/bioconda/tbpore)](https://anaconda.org/bioconda/tbpore)
[![bioconda version](https://anaconda.org/bioconda/tbpore/badges/platforms.svg)](https://anaconda.org/bioconda/tbpore)
![Conda](https://img.shields.io/conda/dn/bioconda/tbpore)
Prerequisite: [`conda`][conda] (and bioconda channel [correctly set up][channels])
```shell
$ conda install tbpore
```
### pip
![PyPI](https://img.shields.io/pypi/v/tbpore)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/tbpore)
The Python components of `tbpore` are available to install through [PyPI].
```shell
pip install tbpore
```
**However**, you will need to install the following dependencies, which cannot be
installed through PyPI.
#### Dependencies
* [`rasusa`](https://github.com/mbhall88/rasusa)
* [`psdm`](https://github.com/mbhall88/psdm) version 0.1.x
* [`samtools`](https://github.com/samtools/samtools) version 1.13
* [`bcftools`](https://github.com/samtools/bcftools) version 1.13
* [`mykrobe`](https://github.com/Mykrobe-tools/mykrobe) version 0.12.x
* [`minimap2`](https://github.com/lh3/minimap2) version 2.22
* [`seqkit`](https://bioinf.shenwei.me/seqkit/) version 2.x
We make no guarantees about the performance of `tbpore` with versions other than those
specified above. In particular, the `bcftools` version is very important. The latest
versions of the other dependencies can likely be used.
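Because the `bcftools` version matters, it can help to confirm what is on `PATH` before a run. The following is a minimal sketch and not part of `tbpore`: `has_version` is a hypothetical helper name, and it assumes the tool prints its version on the first line of `<tool> --version`, as `samtools` and `bcftools` do. The check is a loose substring match, so `1.13` would also accept `1.13.1`.

```shell
# Hypothetical helper: succeed when the first line of `<tool> --version`
# contains the expected version string.
# Returns 0 on a match, 1 on a mismatch, 2 when the tool is not on PATH.
has_version() {
    local tool="$1" expected="$2" first_line
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: not found on PATH" >&2
        return 2
    fi
    first_line="$("$tool" --version 2>&1 | head -n 1)"
    case "$first_line" in
        *"$expected"*) return 0 ;;
        *) echo "$tool: expected $expected, got: $first_line" >&2
           return 1 ;;
    esac
}

# Example: has_version bcftools 1.13 && has_version samtools 1.13
```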
### Container
Docker images are provided through biocontainers.
#### `singularity`
Prerequisite: [`singularity`][singularity]
```shell
$ URI="docker://quay.io/biocontainers/tbpore:<tag>"
$ singularity exec "$URI" tbpore --help
```
See [here][tags] for valid values for `<tag>`.
#### `docker`
[![Docker Repository on Quay](https://quay.io/repository/biocontainers/tbpore/status "Docker Repository on Quay")](https://quay.io/repository/biocontainers/tbpore)
Prerequisite: [Docker]
```shell
$ docker pull quay.io/biocontainers/tbpore:<tag>
$ docker run quay.io/biocontainers/tbpore:<tag> tbpore --help
```
See [here][tags] for valid values for `<tag>`.
## Configuring the decontamination database index
After installing TBpore, you will need to download the decontamination database index.
```shell
$ tbpore download
```
By default, this will download the index
to `${HOME}/.tbpore/decontamination_db/remove_contam.map-ont.mmi`, as this is the
default location `tbpore process` will search for.
If you prefer to download the index to another location, this can be done with
```shell
$ tbpore download -o other/location/db.mmi
```
Keep in mind, if you specify a non-default location, you will need to use the `--db`
option when running `tbpore process`.
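One way to avoid forgetting `--db` is to record the index location once and reuse it. This is a sketch under stated assumptions: `TBPORE_DB`, `fetch_db`, and `process_with_db` are hypothetical names, not part of `tbpore`; only the `download --output` and `process --db` options come from the help text below.

```shell
# Hypothetical variable: a single place to record where the index lives.
# Falls back to the default location used by `tbpore download`.
TBPORE_DB="${TBPORE_DB:-$HOME/.tbpore/decontamination_db/remove_contam.map-ont.mmi}"

# Download the index only if it is not already present at $TBPORE_DB.
fetch_db() {
    if [ -s "$TBPORE_DB" ]; then
        echo "index already present: $TBPORE_DB" >&2
        return 0
    fi
    mkdir -p "$(dirname "$TBPORE_DB")"
    tbpore download --output "$TBPORE_DB"
}

# Run `tbpore process` against that index; all other arguments pass through.
process_with_db() {
    tbpore process --db "$TBPORE_DB" "$@"
}
```

With `TBPORE_DB` set to the desired path, `fetch_db` followed by `process_with_db -o results reads.fastq.gz` would download the index if needed and then process the reads against it.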
## Performance
### `tbpore process`
Benchmarked on 151 TB ONT samples with 1 thread:
* Runtime: `2103` s average, `4048` s maximum (s = seconds)
* RAM: `12.4` GB average, `13.1` GB maximum (GB = gigabytes)
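These figures allow a rough capacity estimate for a batch of samples. The sketch below reuses the benchmark numbers; the 64 GB host is an assumption chosen for illustration, and costing each wave of concurrent jobs at the average runtime is a simplification.

```shell
samples=151          # benchmark cohort size
avg_runtime_s=2103   # average single-thread runtime per sample, in seconds
max_ram_mb=13100     # 13.1 GB peak, taking 1 GB = 1000 MB
host_ram_mb=64000    # assumption: a 64 GB machine

# Running every sample one after another.
serial_hours=$(( samples * avg_runtime_s / 3600 ))

# How many single-thread jobs fit in memory at once.
max_jobs=$(( host_ram_mb / max_ram_mb ))

# Samples run in waves of max_jobs; each wave is costed at the average runtime.
waves=$(( (samples + max_jobs - 1) / max_jobs ))
parallel_hours=$(( waves * avg_runtime_s / 3600 ))

echo "serial: ~${serial_hours} h; ${max_jobs} concurrent jobs: ~${parallel_hours} h"
```

Shell arithmetic truncates to whole numbers, so the results are lower bounds on the hour counts.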
### `tbpore cluster`
Clustering 151 TB ONT samples:
* Runtime: `286` s
* RAM: `<1` GB
## Usage
### General usage
```
Usage: tbpore [OPTIONS] COMMAND [ARGS]...

Options:
  -h, --help     Show this message and exit.
  -V, --version  Show the version and exit.
  -v, --verbose  Turns on debug-level logger. Option is mutually exclusive
                 with quiet.
  -q, --quiet    Turns off all logging except errors. Option is mutually
                 exclusive with verbose.

Commands:
  cluster   Cluster consensus sequences
  download  Download and validate the decontamination database
  process   Single-sample TB genomic analysis from Nanopore sequencing data
```
### process
```
Usage: tbpore process [OPTIONS] [INPUTS]...

  Single-sample TB genomic analysis from Nanopore sequencing data

  INPUTS: Fastq file(s) and/or a directory containing fastq files. All files
  will be joined into a single fastq file, so ensure they're all part of the
  same sample/isolate.

Options:
  -h, --help                      Show this message and exit.
  -r, --recursive                 Recursively search INPUTS for fastq files
  -S, --name TEXT                 Name of the sample. By default, will use the
                                  first INPUT file with fastq extensions
                                  removed
  -A, --report_all_mykrobe_calls  Report all mykrobe calls (turn on flag -A,
                                  --report_all_calls when calling mykrobe)
  --db PATH                       Path to the decontaminaton database
                                  [default: ${HOME}/.tbpore/decontamination_db/
                                  remove_contam.map-ont.mmi]
  -m, --metadata PATH             Path to the decontaminaton database metadata
                                  file [default: /Users/michaelhall/Projects/
                                  tbpore/data/decontamination_db/remove_contam
                                  .tsv.gz]
  -o, --outdir DIRECTORY          Directory to place output files [default:
                                  .]
  --tmp DIRECTORY                 Specify where to write all (tbpore)
                                  temporary files. [default: <outdir>/.tbpore]
  -t, --threads INTEGER           Number of threads to use in multithreaded
                                  tools [default: 1]
  -d, --cleanup / -D, --no-cleanup
                                  Remove all temporary files on *successful*
                                  completion [default: no-cleanup]
  --cache DIRECTORY               Path to use for the cache [default:
                                  /Users/michaelhall/.cache]
```
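Since `tbpore process` joins all of its inputs into one sample, a cohort is processed one sample at a time. The sketch below shows one way to do that, assuming each `*.fastq.gz` file in a directory holds exactly one sample. `process_sample` and `process_all` are hypothetical helper names; only the `--name`, `--outdir`, and `--threads` options come from the help text above.

```shell
# Hypothetical helper: process one FASTQ file into its own output directory.
process_sample() {
    local fastq="$1" outroot="$2" threads="${3:-1}"
    local sample
    sample="$(basename "$fastq")"
    sample="${sample%%.*}"   # drop everything after the first dot, e.g. .fastq.gz
    mkdir -p "$outroot/$sample"
    tbpore process --name "$sample" --outdir "$outroot/$sample" \
        --threads "$threads" "$fastq"
}

# Hypothetical helper: process every *.fastq.gz file in a directory,
# treating each file as a separate sample.
process_all() {
    local indir="$1" outroot="$2" threads="${3:-1}" fq
    for fq in "$indir"/*.fastq.gz; do
        [ -e "$fq" ] || continue   # the glob matched nothing
        process_sample "$fq" "$outroot" "$threads" || echo "failed: $fq" >&2
    done
}
```

For example, `process_all reads output 4` would write each sample's results under `output/<sample>/`. Sample names that themselves contain dots would be truncated by the extension stripping, so adjust that line if needed.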
### cluster
```
Usage: tbpore cluster [OPTIONS] [INPUTS]...

  Cluster consensus sequences

  Preferably input consensus sequences previously generated with tbpore
  process.

  INPUTS: Two or more consensus fasta sequences. Use glob patterns to input
  several easily (e.g. output/sample_*/*.consensus.fa).

Options:
  -h, --help               Show this message and exit.
  -T, --threshold INTEGER  Clustering threshold [default: 6]
  -o, --outdir DIRECTORY   Directory to place output files [default:
                           .]
  --tmp DIRECTORY          Specify where to write all (tbpore)
                           temporary files. [default: <outdir>/.tbpore]
  -t, --threads INTEGER    Number of threads to use in multithreaded
                           tools [default: 1]
  -d, --cleanup / -D, --no-cleanup
                           Remove all temporary files on *successful*
                           completion [default: no-cleanup]
  --cache DIRECTORY        Path to use for the cache [default:
                           /Users/michaelhall/.cache]
```
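Because `tbpore cluster` needs two or more consensus sequences, a thin wrapper can catch an empty or single-file glob before the tool is called. This is a sketch: `cluster_samples` is a hypothetical helper name, and only the `--threshold` and `--outdir` options come from the help text above.

```shell
# Hypothetical helper: cluster consensus FASTA files with an explicit threshold.
# Usage: cluster_samples THRESHOLD OUTDIR FASTA FASTA [FASTA...]
cluster_samples() {
    if [ "$#" -lt 4 ]; then
        echo "cluster_samples: need a threshold, an outdir, and at least two FASTA files" >&2
        return 1
    fi
    local threshold="$1" outdir="$2"
    shift 2
    tbpore cluster --threshold "$threshold" --outdir "$outdir" "$@"
}
```

Following the glob pattern from the help text, `cluster_samples 6 clusters output/sample_*/*.consensus.fa` would cluster all processed samples at the default threshold of 6.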
### download
```
Usage: tbpore download [OPTIONS]

  Download and validate the decontamination database

Options:
  -h, --help         Show this message and exit.
  -o, --output PATH  Download database to a specified filepath [default: ${HOME}/
                     .tbpore/decontamination_db/remove_contam.map-ont.mmi]
  -f, --force        Force overwrite if the database already exists
```
[channels]: https://bioconda.github.io/#usage
[conda]: https://docs.conda.io/projects/conda/en/latest/user-guide/install/
[PyPI]: https://pypi.org/project/tbpore/
[singularity]: https://sylabs.io/guides/3.6/user-guide/quick_start.html#quick-installation-steps
[tags]: https://quay.io/repository/biocontainers/tbpore?tab=tags
[Docker]: https://docs.docker.com/v17.12/install/