tbpore

- Name: tbpore
- Version: 0.4.1
- Home page: https://github.com/mbhall88/tbpore
- Summary: Mycobacterium tuberculosis genomic analysis from Nanopore sequencing data
- Upload time: 2023-02-01 23:05:42
- Author: Michael Hall
- Requires Python: >=3.8,<4.0
- License: MIT
- Keywords: tuberculosis, nanopore, diagnostics, genomics, variant-calling, resistance-prediction

# TBpore

*Mycobacterium tuberculosis* genomic analysis from Nanopore sequencing data

[![Python CI](https://github.com/mbhall88/tbpore/actions/workflows/ci.yaml/badge.svg)](https://github.com/mbhall88/tbpore/actions/workflows/ci.yaml)
[![codecov](https://codecov.io/gh/mbhall88/tbpore/branch/main/graph/badge.svg)](https://codecov.io/gh/mbhall88/tbpore)
[![PyPI](https://img.shields.io/pypi/v/tbpore)](https://pypi.org/project/tbpore/)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/tbpore)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

[TOC]: #

## Table of Contents

- [Synopsis](#synopsis)
- [Citation](#citation)
- [Installation](#installation)
- [Configuring the decontamination database index](#configuring-the-decontamination-database-index)
- [Performance](#performance)
- [Usage](#usage)

## Synopsis

`tbpore` is a tool with two main goals.
The first is to process Nanopore sequencing data from *Mycobacterium tuberculosis*,
describing variants with respect to the canonical TB strain H37Rv and predicting
antibiotic resistance (command `tbpore process`).
Variants are described by decontaminating the reads, calling variants with
[bcftools](https://github.com/samtools/bcftools), and filtering the calls.
Antibiotic resistance is predicted
with [mykrobe](https://github.com/Mykrobe-tools/mykrobe).
The second is to cluster TB samples based on their genotypes and a given
distance threshold (command `tbpore cluster`).

## Citation

TBpore is a slimmed-down version of
the [full pipeline](https://github.com/mbhall88/head_to_head_pipeline) used
in our paper 👇


> Hall, M. B. et al. Evaluation of Nanopore sequencing for Mycobacterium tuberculosis drug susceptibility testing and outbreak investigation: a genomic analysis. *The Lancet Microbe* 0, (2022) doi: [10.1016/S2666-5247(22)00301-9][doi].

[doi]: https://doi.org/10.1016/S2666-5247(22)00301-9

## Installation

### conda

[![Conda (channel only)](https://img.shields.io/conda/vn/bioconda/tbpore)](https://anaconda.org/bioconda/tbpore)
[![bioconda version](https://anaconda.org/bioconda/tbpore/badges/platforms.svg)](https://anaconda.org/bioconda/tbpore)
![Conda](https://img.shields.io/conda/dn/bioconda/tbpore)

Prerequisite: [`conda`][conda] (and bioconda channel [correctly set up][channels])

```shell
$ conda install tbpore
```

### pip

![PyPI](https://img.shields.io/pypi/v/tbpore)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/tbpore)

The Python components of `tbpore` are available to install through [PyPI].

```shell
pip install tbpore
```

**However**, you will need to install the following dependencies, which cannot be
installed through PyPI.

#### Dependencies

* [`rasusa`](https://github.com/mbhall88/rasusa)
* [`psdm`](https://github.com/mbhall88/psdm) version 0.1.x
* [`samtools`](https://github.com/samtools/samtools) version 1.13
* [`bcftools`](https://github.com/samtools/bcftools) version 1.13
* [`mykrobe`](https://github.com/Mykrobe-tools/mykrobe) version 0.12.x
* [`minimap2`](https://github.com/lh3/minimap2) version 2.22
* [`seqkit`](https://bioinf.shenwei.me/seqkit/) version 2.x

We make no guarantees about the performance of `tbpore` with versions other than those
specified above. In particular, the `bcftools` version is very important. The latest
versions of the other dependencies can likely be used.
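
After a `pip` install, it can help to confirm that the external tools are actually on
your `PATH`. The sketch below is one way to do that. `check_tbpore_deps` is a
hypothetical helper name, not something `tbpore` provides, and it only checks that each
tool can be found, not which version is installed.

```shell
# Hypothetical helper (not part of tbpore): report which tools are on PATH.
# Returns 0 if every tool was found, 1 otherwise.
check_tbpore_deps() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call with the tools from the list above:
# check_tbpore_deps rasusa psdm samtools bcftools mykrobe minimap2 seqkit
```

Since the `bcftools` version matters most, running `bcftools --version` afterwards is a
quick way to confirm it reports 1.13.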

### Container

Docker images are provided through biocontainers.

#### `singularity`

Prerequisite: [`singularity`][singularity]

```shell
$ URI="docker://quay.io/biocontainers/tbpore:<tag>"
$ singularity exec "$URI" tbpore --help
```

See [here][tags] for valid values for `<tag>`.

#### `docker`

[![Docker Repository on Quay](https://quay.io/repository/biocontainers/tbpore/status "Docker Repository on Quay")](https://quay.io/repository/biocontainers/tbpore)

Prerequisite: [Docker]

```shell
$ docker pull quay.io/biocontainers/tbpore:<tag>
$ docker run quay.io/biocontainers/tbpore:<tag> tbpore --help
```

See [here][tags] for valid values for `<tag>`.

## Configuring the decontamination database index

After installing TBpore, you will need to download the decontamination database index.

```
$ tbpore download
```

By default, the index is downloaded
to `${HOME}/.tbpore/decontamination_db/remove_contam.map-ont.mmi`, which is the
location `tbpore process` searches by default.

If you prefer to download the index to another location, this can be done with

```
$ tbpore download -o other/location/db.mmi
```

Keep in mind, if you specify a non-default location, you will need to use the `--db`
option when running `tbpore process`.
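
The two steps can be tied together so the same path is used for both. This is a minimal
sketch, assuming you want to download the index only when it is not already present;
`process_with_custom_db` is a hypothetical wrapper name, and only the `-o` and `--db`
options shown in this README are used.

```shell
# Hypothetical wrapper (not part of tbpore): keep the index at a custom path
# and always pass that path to `tbpore process`.
process_with_custom_db() {
    db="$1"   # custom index location, e.g. other/location/db.mmi
    shift
    # Download the index only if it is not there yet
    if [ ! -f "$db" ]; then
        tbpore download -o "$db" || return 1
    fi
    # Remaining arguments (options and fastq inputs) go to `tbpore process`
    tbpore process --db "$db" "$@"
}

# Example call:
# process_with_custom_db other/location/db.mmi reads.fq.gz
```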

## Performance

### `tbpore process`

Benchmarked on 151 TB ONT samples with 1 thread:

* Runtime: `2103`s avg, `4048`s max (s = seconds);
* RAM: `12.4`GB avg, `13.1`GB max (GB = Gigabytes);

### `tbpore cluster`

Clustering 151 TB ONT samples:

* Runtime: `286`s;
* RAM: `<1`GB;

## Usage

### General usage

```
Usage: tbpore [OPTIONS] COMMAND [ARGS]...

Options:
  -h, --help     Show this message and exit.
  -V, --version  Show the version and exit.
  -v, --verbose  Turns on debug-level logger. Option is mutually exclusive
                 with quiet.
  -q, --quiet    Turns off all logging except errors. Option is mutually
                 exclusive with verbose.

Commands:
  cluster   Cluster consensus sequences
  download  Download and validate the decontamination database
  process   Single-sample TB genomic analysis from Nanopore sequencing data
```

### process

```
Usage: tbpore process [OPTIONS] [INPUTS]...

  Single-sample TB genomic analysis from Nanopore sequencing data

  INPUTS: Fastq file(s) and/or a directory containing fastq files. All files
  will be joined into a single fastq file, so ensure they're all part of the
  same sample/isolate.

Options:
  -h, --help                      Show this message and exit.
  -r, --recursive                 Recursively search INPUTS for fastq files
  -S, --name TEXT                 Name of the sample. By default, will use the
                                  first INPUT file with fastq extensions
                                  removed
  -A, --report_all_mykrobe_calls  Report all mykrobe calls (turn on flag -A,
                                  --report_all_calls when calling mykrobe)
  --db PATH                       Path to the decontaminaton database
                                  [default: ${HOME}/.tbpore/decontamination_db/
                                  remove_contam.map-ont.mmi]
  -m, --metadata PATH             Path to the decontaminaton database metadata
                                  file  [default: /Users/michaelhall/Projects/
                                  tbpore/data/decontamination_db/remove_contam
                                  .tsv.gz]
  -o, --outdir DIRECTORY          Directory to place output files  [default:
                                  .]
  --tmp DIRECTORY                 Specify where to write all (tbpore)
                                  temporary files. [default: <outdir>/.tbpore]
  -t, --threads INTEGER           Number of threads to use in multithreaded
                                  tools  [default: 1]
  -d, --cleanup / -D, --no-cleanup
                                  Remove all temporary files on *successful*
                                  completion  [default: no-cleanup]
  --cache DIRECTORY               Path to use for the cache  [default:
                                  /Users/michaelhall/.cache]
```
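
As an illustration of how these options combine, the sketch below processes one sample
from a directory of fastq files. `process_sample` is a hypothetical helper, and the
`output/sample_<name>` layout is an assumption chosen to match the glob pattern shown in
the `tbpore cluster` help text; every flag used appears in the help output above.

```shell
# Hypothetical helper (not part of tbpore): process one sample into its own
# output directory.
process_sample() {
    name="$1"           # sample name, passed to --name
    indir="$2"          # directory holding this sample's fastq files
    threads="${3:-1}"   # falls back to 1, the same as tbpore's default
    tbpore process \
        --recursive \
        --name "$name" \
        --outdir "output/sample_${name}" \
        --threads "$threads" \
        --cleanup \
        "$indir"
}

# Example call: sample "s1", fastq files under fastq/s1, 4 threads
# process_sample s1 fastq/s1 4
```

`--recursive` is included so that fastq files in nested subdirectories are also found,
and `--cleanup` removes the temporary files once the run completes successfully.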

### cluster

```
Usage: tbpore cluster [OPTIONS] [INPUTS]...

  Cluster consensus sequences

  Preferably input consensus sequences previously generated with tbpore
  process.

  INPUTS: Two or more consensus fasta sequences. Use glob patterns to input
  several easily (e.g. output/sample_*/*.consensus.fa).

Options:
  -h, --help                      Show this message and exit.
  -T, --threshold INTEGER         Clustering threshold  [default: 6]
  -o, --outdir DIRECTORY          Directory to place output files  [default:
                                  .]
  --tmp DIRECTORY                 Specify where to write all (tbpore)
                                  temporary files. [default: <outdir>/.tbpore]
  -t, --threads INTEGER           Number of threads to use in multithreaded
                                  tools  [default: 1]
  -d, --cleanup / -D, --no-cleanup
                                  Remove all temporary files on *successful*
                                  completion  [default: no-cleanup]
  --cache DIRECTORY               Path to use for the cache  [default:
                                  /Users/michaelhall/.cache]
```
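
A clustering run over several processed samples might look like the sketch below.
`cluster_samples` is a hypothetical helper and `clusters` is an assumed output directory
name; the check for at least two inputs simply mirrors the requirement stated in the
help text.

```shell
# Hypothetical helper (not part of tbpore): cluster consensus sequences with
# an explicit distance threshold.
cluster_samples() {
    threshold="$1"   # passed to --threshold; tbpore's default is 6
    shift
    # The help text asks for two or more consensus fasta files
    if [ "$#" -lt 2 ]; then
        echo "need at least two consensus fasta files" >&2
        return 1
    fi
    tbpore cluster --threshold "$threshold" --outdir clusters "$@"
}

# Example call, using the glob pattern from the help text:
# cluster_samples 6 output/sample_*/*.consensus.fa
```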

### download

```
Usage: tbpore download [OPTIONS]

  Download and validate the decontamination database

Options:
  -h, --help         Show this message and exit.
  -o, --output PATH  Download database to a specified filepath  [default: ${HOME}/
                     .tbpore/decontamination_db/remove_contam.map-ont.mmi]
  -f, --force        Force overwrite if the database already exists
```

[channels]: https://bioconda.github.io/#usage

[conda]: https://docs.conda.io/projects/conda/en/latest/user-guide/install/

[PyPI]: https://pypi.org/project/tbpore/

[singularity]: https://sylabs.io/guides/3.6/user-guide/quick_start.html#quick-installation-steps

[tags]: https://quay.io/repository/biocontainers/tbpore?tab=tags

[Docker]: https://docs.docker.com/v17.12/install/


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/mbhall88/tbpore",
    "name": "tbpore",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8,<4.0",
    "maintainer_email": "",
    "keywords": "tuberculosis,nanopore,diagnostics,genomics,variant-calling,resistance-prediction",
    "author": "Michael Hall",
    "author_email": "michael@mbh.sh",
    "download_url": "https://files.pythonhosted.org/packages/22/1d/13567a731ee351405a65cb34902e1b25b34795a0e854197ba82cb163601c/tbpore-0.4.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Mycobacterium tuberculosis genomic analysis from Nanopore sequencing data",
    "version": "0.4.1",
    "split_keywords": [
        "tuberculosis",
        "nanopore",
        "diagnostics",
        "genomics",
        "variant-calling",
        "resistance-prediction"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "066b7cacdba1d684bfc2b51be9d102ef0262b12892ea3ce53671fcd0ee106f1c",
                "md5": "6b2c47f3d45982649be7f8419a246fce",
                "sha256": "16b63b6c1981db321f95262985c08af98e88e01c22df114c1426bd689b8c8575"
            },
            "downloads": -1,
            "filename": "tbpore-0.4.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6b2c47f3d45982649be7f8419a246fce",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<4.0",
            "size": 2205318,
            "upload_time": "2023-02-01T23:05:40",
            "upload_time_iso_8601": "2023-02-01T23:05:40.546034Z",
            "url": "https://files.pythonhosted.org/packages/06/6b/7cacdba1d684bfc2b51be9d102ef0262b12892ea3ce53671fcd0ee106f1c/tbpore-0.4.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "221d13567a731ee351405a65cb34902e1b25b34795a0e854197ba82cb163601c",
                "md5": "346c63e9402f085af7d9d7704aa6946d",
                "sha256": "faf305074f7fc5da95ebd9a40fd66a6df29573a0e6342b74f0cabb2f2cf82cb3"
            },
            "downloads": -1,
            "filename": "tbpore-0.4.1.tar.gz",
            "has_sig": false,
            "md5_digest": "346c63e9402f085af7d9d7704aa6946d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<4.0",
            "size": 2203855,
            "upload_time": "2023-02-01T23:05:42",
            "upload_time_iso_8601": "2023-02-01T23:05:42.389776Z",
            "url": "https://files.pythonhosted.org/packages/22/1d/13567a731ee351405a65cb34902e1b25b34795a0e854197ba82cb163601c/tbpore-0.4.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-02-01 23:05:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "mbhall88",
    "github_project": "tbpore",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "tbpore"
}
        