trimnami


- Name: trimnami
- Version: 0.1.4
- Home page: https://github.com/beardymcjohnface/Trimnami
- Summary: Trim lots of metagenomics samples all at once.
- Upload time: 2024-05-02 06:23:26
- Author: Michael Roach
- Requires Python: >=3.9

![](trimnami.png)

[![](https://img.shields.io/static/v1?label=CLI&message=Snaketool&color=blueviolet)](https://github.com/beardymcjohnface/Snaketool)
[![](https://img.shields.io/static/v1?label=Licence&message=MIT&color=black)](https://opensource.org/license/mit/)
[![](https://img.shields.io/static/v1?label=Install%20with&message=PIP&color=success)](https://pypi.org/project/trimnami/)
[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/trimnami/README.html)
![GitHub last commit (branch)](https://img.shields.io/github/last-commit/beardymcjohnface/Trimnami/main)
[![Unit tests](https://github.com/beardymcjohnface/Trimnami/actions/workflows/python-app.yml/badge.svg)](https://github.com/beardymcjohnface/Trimnami/actions/workflows/python-app.yml)
[![Env builds](https://github.com/beardymcjohnface/Trimnami/actions/workflows/trimnami-build-envs.yml/badge.svg)](https://github.com/beardymcjohnface/Trimnami/actions/workflows/trimnami-build-envs.yml)
[![codecov](https://codecov.io/gh/beardymcjohnface/Trimnami/branch/main/graph/badge.svg?token=E0w8zHLLDq)](https://codecov.io/gh/beardymcjohnface/Trimnami)


---

Trim lots of metagenomics samples all at once.

## Motivation

We keep writing pipelines that start with read trimming.
Rather than copy-pasting code each time,
this standalone Snaketool handles our trimming needs.
The tool will collect sample names and files from a directory or TSV file,
optionally remove host reads, and trim with your favourite read trimmer.
Read trimming methods supported so far:

- Fastp
- Prinseq++
- BBtools for Round A/B viral metagenomics
- Filtlong + Rasusa for long reads

## Install

Trimnami is still in development but can be easily installed with pip:
 
__Easy install__

```shell
pip install trimnami
```

__Developer install__

```shell
git clone https://github.com/beardymcjohnface/Trimnami.git
cd Trimnami/
pip install -e .
```

## Test

Trimnami comes with built-in tests that you can run to check that everything works.

```shell
# test fastp only (default method)
trimnami test

# test all SR methods
trimnami test fastp prinseq roundAB

# test all SR methods with host removal
trimnami testhost fastp prinseq roundAB

# test nanopore method (with host removal)
trimnami testnp
```

## Usage

Trim reads with Fastp or Prinseq++

```shell
# Fastp (default)
trimnami run --reads reads/

# Prinseq++
trimnami run --reads reads/ prinseq

# Why not both!
trimnami run --reads reads/ fastp prinseq
```

Include host removal

```shell
trimnami run --reads reads/ --host host_genome.fasta
```

Long reads with host removal.
Specify `nanopore` as the target and pass the appropriate minimap2 preset with `--minimap`.

```shell
trimnami run \
    --reads reads/ \
    --host host_genome.fasta \
    --minimap map-ont \
    nanopore
```

## Parsing samples with `--reads`

You can pass either a directory of reads or a TSV file to `--reads`.
 - __Directory:__ Trimnami will infer sample names and \_R1/\_R2 pairs from the filenames.
 - __TSV file:__ Trimnami expects 2 or 3 columns: column 1 is the sample name, and columns 2 and 3 are the read files (column 3 only for paired reads).

__[More information and examples here](https://gist.github.com/beardymcjohnface/bb161ba04ae1042299f48a4849e917c8#file-readme-md)__
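
As a rough illustration of the TSV layout described above, the sketch below writes a small file with one paired-end sample (3 columns) and one single-end sample (2 columns), then checks the column counts. The sample names, file paths, and the `samples.tsv` filename are hypothetical, and the absence of a header line is an assumption; check the linked gist for the authoritative format.

```shell
# Hypothetical sample names and paths, for illustration only.
# Assumes tab-separated columns and no header line.
printf 'sampleA\treads/sampleA_R1.fastq.gz\treads/sampleA_R2.fastq.gz\n' > samples.tsv
printf 'sampleB\treads/sampleB.fastq.gz\n' >> samples.tsv

# Sanity check: every line should have 2 or 3 tab-separated columns.
awk -F'\t' '
    NF < 2 || NF > 3 { print "line " NR ": unexpected column count " NF; bad = 1 }
    END { exit bad + 0 }
' samples.tsv
```

A file like this would then be passed in place of a directory, e.g. `trimnami run --reads samples.tsv`.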

## Configure trimming parameters

You can customise the trimming parameters via the config file.
First, copy the default config file to the output directory.

```shell
trimnami config
```

Then edit the config file `trimnami.out/trimnami.config.yaml` in your favourite text editor.
Run Trimnami as normal, or point to your custom config file if you have moved it.

```shell
trimnami run ... --configfile /my/awesome/config.yaml
```

## Outputs

Trimmed reads are saved in per-method subfolders of the output directory.
For example, trimming with Fastp or Prinseq++ puts the trimmed reads in `trimnami.out/fastp/` or `trimnami.out/prinseq/`.
Paired reads yield three files:
the R1 and R2 paired reads, and any singletons from trimming or host removal.
Subsampling produces extra files of subsampled trimmed reads.
MultiQC-FastQC reports for any runs will be available in `trimnami.out/reports/`.

### Example outputs
<details>
    <summary>Click to expand</summary>

prinseq

```text
trimnami.out/
└── prinseq
    ├── A13-04-182-06_TAGCTT.paired.R1.fastq.gz
    ├── A13-04-182-06_TAGCTT.paired.R2.fastq.gz
    ├── A13-04-182-06_TAGCTT.paired.S.fastq.gz
    ├── A13-12-250-06_GGCTAC.paired.R1.fastq.gz
    ├── A13-12-250-06_GGCTAC.paired.R2.fastq.gz
    ├── A13-12-250-06_GGCTAC.paired.S.fastq.gz
    └── A13-135-177-06_AGTTCC.single.fastq.gz
```

prinseq with fastqc reports

```text
trimnami.out/
├── prinseq
│   ├── A13-04-182-06_TAGCTT.paired.R1.fastq.gz
│   ├── A13-04-182-06_TAGCTT.paired.R2.fastq.gz
│   ├── A13-04-182-06_TAGCTT.paired.S.fastq.gz
│   ├── A13-12-250-06_GGCTAC.paired.R1.fastq.gz
│   ├── A13-12-250-06_GGCTAC.paired.R2.fastq.gz
│   ├── A13-12-250-06_GGCTAC.paired.S.fastq.gz
│   └── A13-135-177-06_AGTTCC.single.fastq.gz
└── reports
    ├── prinseq.fastqc.html
    └── untrimmed.fastqc.html

```

prinseq with host removal

```text
trimnami.out/
└── prinseq
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R1.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R2.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.S.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R1.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R2.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.S.fastq.gz
    └── A13-135-177-06_AGTTCC.host_rm.single.fastq.gz
```

prinseq with host removal and subsampling

```text
trimnami.out/
└── prinseq
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R1.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R1.subsampled.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R2.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.R2.subsampled.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.S.fastq.gz
    ├── A13-04-182-06_TAGCTT.host_rm.paired.S.subsampled.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R1.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R1.subsampled.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R2.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.R2.subsampled.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.S.fastq.gz
    ├── A13-12-250-06_GGCTAC.host_rm.paired.S.subsampled.fastq.gz
    ├── A13-135-177-06_AGTTCC.host_rm.single.fastq.gz
    └── A13-135-177-06_AGTTCC.host_rm.single.subsampled.fastq.gz
```
</details>
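
If you need to feed the trimmed reads into a downstream step, the naming pattern in the example outputs above can be used to pair files up. The following is a minimal sketch, not part of Trimnami: the function name `list_trimmed_pairs` is hypothetical, and it assumes sample names contain no dots and that files follow the `*.paired.R1.fastq.gz` / `*.paired.R2.fastq.gz` / `*.paired.S.fastq.gz` pattern shown above.

```shell
# Hypothetical helper: print sample, R1, R2, and singleton paths (tab-separated)
# for each paired sample in a Trimnami output subfolder.
# Subsampled files (*.subsampled.fastq.gz) do not match the glob and are skipped.
list_trimmed_pairs() {
    outdir="$1"
    for r1 in "$outdir"/*.paired.R1.fastq.gz; do
        # Skip the unexpanded glob when no files match.
        [ -e "$r1" ] || continue
        prefix="${r1%.R1.fastq.gz}"
        # Assumes the sample name is everything before the first dot.
        sample="$(basename "$r1")"
        sample="${sample%%.*}"
        printf '%s\t%s\t%s\t%s\n' "$sample" "$r1" "$prefix.R2.fastq.gz" "$prefix.S.fastq.gz"
    done
}

list_trimmed_pairs trimnami.out/prinseq
```

Because the glob only requires `.paired.R1.fastq.gz` at the end of the name, it also matches host-removed outputs such as `*.host_rm.paired.R1.fastq.gz`.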

