hitac

- Name: hitac
- Version: 2.2.1
- Summary: Hierarchical taxonomic classifier.
- Upload time: 2023-09-25 11:22:57
- Author: Fabio Malcher Miranda
- Requires Python: >=3.8
- License: BSD 3-Clause
- Keywords: hierarchical taxonomic classifier

# HiTaC

HiTaC is an open-source hierarchical taxonomic classifier for fungal ITS sequences.

[![pipeline](https://github.com/mirand863/hitac/actions/workflows/deploy.yml/badge.svg?branch=main)](https://github.com/mirand863/hitac/actions/workflows/deploy.yml) [![codecov](https://codecov.io/gh/mirand863/hitac/branch/main/graph/badge.svg?token=2G05Q8PQBE)](https://codecov.io/gh/mirand863/hitac) [![Downloads PyPI](https://static.pepy.tech/personalized-badge/hitac?period=total&units=international_system&left_color=grey&right_color=brightgreen&left_text=pypi)](https://pypi.org/project/hitac/) [![Downloads Conda](https://img.shields.io/conda/dn/bioconda/hitac?label=conda)](https://anaconda.org/bioconda/hitac) [![License](https://img.shields.io/badge/License-BSD_3--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause) [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

## Quick links

- [Benchmark](#benchmark)
- [Install standalone version](#install-standalone-version)
- [Quick start for standalone version](#quick-start-for-standalone-version)
- [Install as a QIIME2 plugin](#install-as-a-qiime2-plugin)
- [Quick start for QIIME2 plugin](#quick-start-for-qiime2-plugin)
- [Support](#support)
- [Contributing](#contributing)
- [Getting the latest updates](#getting-the-latest-updates)
- [Citation](#citation)

## Benchmark

HiTaC was evaluated with the [TAXXI benchmark](https://peerj.com/articles/4652/), where it consistently achieved higher accuracy and sensitivity than the other classifiers compared, as shown in the figures below.

![Accuracy](benchmark/results/images/accuracy.svg)

![True positive rate](benchmark/results/images/tpr.svg)

For reproducibility, a Snakemake pipeline was created. Instructions on how to run it and source code are available at [https://github.com/mirand863/hitac/tree/main/benchmark](https://github.com/mirand863/hitac/tree/main/benchmark).

## Install standalone version

### Option 1: Conda

HiTaC can be easily installed in a new conda environment by running the following command:

```shell
conda create -n hitac -c bioconda hitac
```

Afterward, activate the new conda environment with:

```shell
conda activate hitac
```

For conda installation instructions, we refer the reader to [Conda's user guide](https://conda.io/projects/conda/en/latest/user-guide/install/index.html).

### Option 2: Pip

Alternatively, HiTaC can be installed with pip by running:

```shell
pip install hitac
```

### Option 3: Docker

Lastly, HiTaC can be downloaded as a docker image:

```shell
docker pull mirand863/hitac_standalone:latest
```

The downloaded image can then be started with:

```shell
docker run -it mirand863/hitac_standalone:latest /bin/bash
```

## Quick start for standalone version

For an interactive tutorial, we refer the reader to our [Google Colab notebook](https://colab.research.google.com/drive/1tqTvB1Xz5agqFhu4gkB8KlZVHycS3X9w?usp=sharing).

To see the usage of a specific command, run `[command] --help`. For example, `hitac-fit --help` prints:

```shell
usage: hitac-fit [-h] --reference REFERENCE [--kmer KMER] [--threads THREADS] --classifier CLASSIFIER

Fit hierarchical classifier

optional arguments:
  -h, --help            show this help message and exit
  --reference REFERENCE
                        Input FASTA file with reference sequence(s) to train model
  --kmer KMER           K-mer size for feature extraction [default: 6]
  --threads THREADS     Number of threads to train in parallel [default: all]
  --classifier CLASSIFIER
                        Path to store trained hierarchical classifier
```
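
The `--kmer` option sets the k-mer size used for feature extraction. As a rough illustration of what k-mer features are, the sketch below counts overlapping k-mers in a sequence. It is not HiTaC's actual implementation, and the function name `kmer_counts` is hypothetical:

```python
from collections import Counter


def kmer_counts(sequence: str, k: int = 6) -> Counter:
    """Count overlapping k-mers of length k in a nucleotide sequence.

    Illustrative only: HiTaC's own feature extraction may differ.
    """
    sequence = sequence.upper()
    # Slide a window of width k over the sequence, one base at a time.
    return Counter(sequence[i : i + k] for i in range(len(sequence) - k + 1))


if __name__ == "__main__":
    counts = kmer_counts("CCGAGTGAGGGTCCCACGAGG", k=6)
    for kmer, count in counts.most_common(3):
        print(kmer, count)
```

A sequence of length `n` yields `n - k + 1` overlapping k-mers, so sequences shorter than `k` produce no features in this sketch.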

### Input Files

HiTaC accepts reference and query files in FASTA format. The reference file must have the taxonomies annotated as follows:

```shell
>EU272527;tax=d:Fungi,p:Ascomycota,c:Eurotiomycetes,o:Eurotiales,f:Trichocomaceae,g:Paecilomyces,s:Paecilomyces_sinensis;
CCGAGTGAGGGTCCCACGAGGCCCAACCTCCCATCCGTGTTGAACTACACCTGTTGCTTCGGCGGGCCCGCCGTGGTTCA
CGCCCGGCCGCCGGGGGGCCTTGTGCTCCCGGGCCCGCGCCCGCCGAAGACCCCTCGAACGCTGCCCTGAAGGTTGCCGT
CTGAGTATAAAATCAATCATTAAAACTTTCAACAACGGATCTCTTGGTTCCGGCATCGATGAAGAACGCAGCGAAATGCG
ATAAGTAATGTGAATTGCAGAATTCCGTGAATCATCGAATCTTTGAACGCACATTGCGCCCCCTGGCATTCCGGGGGGCA
TGCCTGTCCGAGCGTCATTGCTAACCCTCCAGCCCGGCTGGTGTGTTGGGTCGACGTCCCCCCCGGGGGACGGGCCCGAA
AGGCAGCGGCGGCGCCGCGTCCGATCCTCGAGCGTATGGGGCTTTGTCACGCGCTCTGGTAGGGTCGGCCGGCTGGCCAG
CCAGCGACCTCACGGTCACCTATTTTTTCTCTTAGG
>L54118;tax=d:Fungi,p:Basidiomycota,c:Agaricomycetes,o:Boletales,f:Suillaceae,g:Suillus,s:Suillus_placidus;
ACGAATTCATAATTCGGCGAGGGGAAAGCGGAGGGTTGTAGCTGGCCTTTTTACCGAGGCACGTGCACGCTCTCTTCCGA
ACTCTCGTCGTATGGGCGCGGGGCGACCCGCGTCTTTCATCCCACCTCTTCGTGTAGAAAGTCTTTGAATGTTTTTACCA
TCATCGAGTCGCGACTTCTAGGAGACGCGATTCTTTGAGACAAAAGTTTATTACAACTTTCAGCAATGGATCTCTTGGCT
CTCGCATCGATGAAGAACGCAGCGAATCGCGATATGTAATGTGAATTGCAGATCTACAGTGAATCATCGAATCTTTGAAC
GCACCTTGCGCTCCTCGGTGTTCCGAGGAGCATGCCTGTTTGAGCGTCAGTAAATTCTCAACCCCTCTCGATTTGCTTCG
AGAGGGCGCTTGGATGGTGGGGGCTGCCGGAGACCTGGATTTATCCCTGGACTCGGGCTCTCCTGAAATGCATCGGCTTG
CGGTCGACTTTCGACTTTGCGCGACAAGGCCTTCGGCGTGATAATGATCGCCGTTCGCCGAAGCGCAGGAATGAACGGTC
CCGCGCCTCTAATCCGTCGACGCTTTCGAGCGTCTTCCTCATTGACGTTTGACCTCAAAT
```
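
To check that a reference file follows this annotation scheme before training, the headers can be parsed with a few lines of Python. This is a minimal sketch based only on the example above; `parse_header` is a hypothetical helper, not part of HiTaC:

```python
from typing import Dict, Tuple


def parse_header(header: str) -> Tuple[str, Dict[str, str]]:
    """Split '>ID;tax=d:...,s:...;' into the ID and a rank -> name mapping."""
    # Drop surrounding whitespace and the leading '>' of the FASTA header.
    seq_id, _, tax_field = header.strip().lstrip(">").partition(";tax=")
    ranks = {}
    for entry in tax_field.rstrip(";").split(","):
        if entry:
            # Each entry looks like 'g:Paecilomyces' (rank letter, colon, name).
            rank, _, name = entry.partition(":")
            ranks[rank] = name
    return seq_id, ranks


if __name__ == "__main__":
    header = (
        ">L54118;tax=d:Fungi,p:Basidiomycota,c:Agaricomycetes,o:Boletales,"
        "f:Suillaceae,g:Suillus,s:Suillus_placidus;"
    )
    seq_id, ranks = parse_header(header)
    print(seq_id, ranks["g"], ranks["s"])
```

A header without a `;tax=` field yields an empty mapping, which makes missing annotations easy to spot.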

### Training and predicting taxonomies

To train the model and classify, simply run:

```shell
hitac-fit \
--reference reference.fasta \
--classifier classifier.pkl

hitac-classify \
--classifier classifier.pkl \
--reads reads.fasta \
--classification classification.tsv
```

Additionally, a filter can be trained to remove ranks where the predictions might be inaccurate and to compute the confidence score:

```shell
hitac-fit-filter \
--reference reference.fasta \
--filter filter.pkl

hitac-filter \
--filter filter.pkl \
--reads reads.fasta \
--classification classification.tsv \
--filtered-classification filtered_classification.tsv
```

### Output File

HiTaC generates a TSV file for the predictions. The first column in the TSV file contains the identifier of the test sequence and the second column holds the predictions made by HiTaC. For example:

```shell
EU254776	d:Fungi,p:Ascomycota,c:Sordariomycetes,o:Diaporthales,f:Valsaceae,g:Cryptosporella,s:Cryptosporella_femoralis
FJ711636	d:Fungi,p:Basidiomycota,c:Agaricomycetes,o:Agaricales,f:Marasmiaceae,g:Armillaria,s:Armillaria_tabescens
UDB016040	d:Fungi,p:Basidiomycota,c:Agaricomycetes,o:Russulales,f:Russulaceae,g:Russula,s:Russula_adusta
GU827310	d:Fungi,p:Ascomycota,c:Lecanoromycetes,o:Lecanorales,f:Ramalinaceae,g:Ramalina,s:Ramalina_conduplicans
JN943699	d:Fungi,p:Ascomycota,c:Lecanoromycetes,o:Lecanorales,f:Parmeliaceae,g:Punctelia,s:Punctelia_caseana
```
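
For downstream analysis, the two-column TSV can be loaded with Python's standard `csv` module. The sketch below assumes the layout shown above (no header row, tab-separated); `read_predictions` is a hypothetical helper name:

```python
import csv
import io
from typing import Dict, TextIO


def read_predictions(handle: TextIO) -> Dict[str, str]:
    """Map each sequence identifier to its predicted taxonomy string."""
    predictions = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank or malformed lines that lack both columns.
        if len(row) >= 2:
            predictions[row[0]] = row[1]
    return predictions


if __name__ == "__main__":
    # In practice, pass open("classification.tsv", newline="") instead.
    example = io.StringIO(
        "UDB016040\td:Fungi,p:Basidiomycota,c:Agaricomycetes,o:Russulales,"
        "f:Russulaceae,g:Russula,s:Russula_adusta\n"
    )
    predictions = read_predictions(example)
    # The last comma-separated entry is the deepest predicted rank.
    print(predictions["UDB016040"].split(",")[-1])
```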

## Install as a QIIME2 plugin

### Option 1: Conda

HiTaC can also be installed as a QIIME 2 plugin. To install QIIME 2 version 2023.2 on a GNU/Linux machine, run:

```shell
wget https://data.qiime2.org/distro/core/qiime2-2023.2-py38-linux-conda.yml
conda env create -n hitac --file qiime2-2023.2-py38-linux-conda.yml
# OPTIONAL CLEANUP
rm qiime2-2023.2-py38-linux-conda.yml
```

**Note:** Instructions on how to install on Windows and macOS are available at [QIIME 2 docs](https://docs.qiime2.org/2023.2/install/native/#install-qiime-2-within-a-conda-environment).

Afterward, the new conda environment created in the last step can be activated and HiTaC can be installed:

```shell
conda activate hitac
conda install -c conda-forge -c bioconda hitac
```

For conda installation instructions, we refer the reader to [Conda's user guide](https://conda.io/projects/conda/en/latest/user-guide/install/index.html).

### Option 2: Pip

Alternatively, HiTaC can be installed with pip in an environment where QIIME 2 was previously installed:

```shell
pip install hitac
```

### Option 3: Docker

Lastly, HiTaC and all its dependencies can be downloaded as a docker image:

```shell
docker pull mirand863/hitac_qiime:latest
```

The downloaded image can then be started with:

```shell
docker run -it mirand863/hitac_qiime:latest /bin/bash
```

## Quick start for QIIME2 plugin

For an interactive tutorial, we refer the reader to our [Google Colab notebook](https://colab.research.google.com/drive/12XicbyNhUQB2eVaiJG2b-0HMsOqvQTNs).

To see the usage, run `qiime hitac --help`. For further help with a specific command, run `qiime hitac [command] --help`.

```shell
Usage: qiime hitac [OPTIONS] COMMAND [ARGS]...

  Description: This QIIME 2 plugin wraps HiTaC for hierarchical taxonomic
  classification.

  Plugin website: https://gitlab.com/dacs-hpi/hitac

  Getting user support: Please post to the QIIME 2 forum for help with this
  plugin: https://forum.qiime2.org

Options:
  --version    Show the version and exit.
  --citations  Show citations and exit.
  --help       Show this message and exit.

Commands:
  classify    Hierarchical classification with HiTaC's pre-fitted model
  filter      Hierarchical classification filtering with HiTaC's pre-fitted
              model

  fit         Train HiTaC's hierarchical classifier
  fit-filter  Train HiTaC's hierarchical filter
```

### Input Files

HiTaC accepts the taxonomy in TSV format and the training and test files in FASTA format. All of these files must first be imported into QIIME 2, for example:

```shell
qiime tools import \
--input-path query.fasta \
--output-path query.qza \
--type 'FeatureData[Sequence]'

qiime tools import \
--input-path reference.fasta \
--output-path reference.qza \
--type 'FeatureData[Sequence]'

qiime tools import \
--type 'FeatureData[Taxonomy]' \
--input-format HeaderlessTSVTaxonomyFormat \
--input-path taxonomy.txt \
--output-path taxonomy.qza
```
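
`HeaderlessTSVTaxonomyFormat` expects a two-column, tab-separated file without a header row: the feature identifier followed by the taxonomy string. If the taxonomy is only available as annotations in the reference FASTA headers, the rows of `taxonomy.txt` could be derived along these lines. This is a sketch under assumptions: `header_to_taxonomy_row` is a hypothetical helper, and the `d__Fungi; p__...` taxon style is an assumption that mirrors the exported output shown later in this document, so it should be checked against the tutorial notebook before use:

```python
def header_to_taxonomy_row(header: str) -> str:
    """Turn '>ID;tax=d:Fungi,p:...;' into 'ID<TAB>d__Fungi; p__...'.

    The 'rank__name; ' style is an assumption, not a documented requirement.
    """
    seq_id, _, tax_field = header.strip().lstrip(">").partition(";tax=")
    ranks = [
        # Replace only the first colon, which separates rank and name.
        entry.replace(":", "__", 1)
        for entry in tax_field.rstrip(";").split(",")
        if entry
    ]
    return seq_id + "\t" + "; ".join(ranks)


if __name__ == "__main__":
    header = ">EU272527;tax=d:Fungi,p:Ascomycota,c:Eurotiomycetes;"
    print(header_to_taxonomy_row(header))
```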

### Training and predicting taxonomies

To train the model and classify, simply run:

```shell
qiime hitac fit \
--i-reference-reads reference.qza \
--i-reference-taxonomy taxonomy.qza \
--o-classifier classifier.qza

qiime hitac classify \
--i-classifier classifier.qza \
--i-reads query.qza \
--o-classification classification.qza
```

Additionally, a filter can be trained to remove ranks where the predictions might be inaccurate and to compute the confidence score:

```shell
qiime hitac fit-filter \
--i-reference-reads reference.qza \
--i-reference-taxonomy taxonomy.qza \
--o-filter filter.qza

qiime hitac filter \
--i-filter filter.qza \
--i-reads query.qza \
--i-classification classification.qza \
--o-filtered-classification filtered_classification.qza
```

### Output File

The predictions can be exported from QIIME 2 to a TSV file:

```shell
qiime tools export \
--input-path classification.qza \
--output-path output_dir
```

or, alternatively, if the filter was used:

```shell
qiime tools export \
--input-path filtered_classification.qza \
--output-path output_dir
```

The first column in the TSV file contains the identifier of the test sequence, the second column holds the predictions made by HiTaC, and the third column holds the prediction probability, which is computed only if the filter was applied. For example:

```shell
Feature ID	Taxon	Confidence
EU254776	d__Fungi; p__Ascomycota; c__Sordariomycetes; o__Diaporthales; f__Valsaceae; g__Cryptosporella	-1
FJ711636	d__Fungi; p__Basidiomycota; c__Agaricomycetes; o__Agaricales; f__Marasmiaceae; g__Armillaria	-1
UDB016040	d__Fungi; p__Basidiomycota; c__Agaricomycetes; o__Russulales; f__Russulaceae; g__Russula	-1
GU827310	d__Fungi; p__Ascomycota; c__Lecanoromycetes; o__Lecanorales; f__Ramalinaceae; g__Ramalina	-1
JN943699	d__Fungi; p__Ascomycota; c__Lecanoromycetes; o__Lecanorales; f__Parmeliaceae; g__Punctelia	-1
```
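
The exported table can be read with `csv.DictReader`, using the column names from the header row shown above. The sketch below is illustrative; `read_qiime_taxonomy` is a hypothetical helper, and it assumes ranks are separated by `"; "` as in the example:

```python
import csv
import io
from typing import Dict, List, TextIO


def read_qiime_taxonomy(handle: TextIO) -> List[Dict[str, object]]:
    """Parse an exported taxonomy TSV into a list of records."""
    records = []
    for row in csv.DictReader(handle, delimiter="\t"):
        records.append(
            {
                "id": row["Feature ID"],
                # Split 'd__Fungi; p__Ascomycota; ...' into individual ranks.
                "ranks": row["Taxon"].split("; "),
                "confidence": float(row["Confidence"]),
            }
        )
    return records


if __name__ == "__main__":
    # In practice, open the TSV file found in output_dir instead.
    example = io.StringIO(
        "Feature ID\tTaxon\tConfidence\n"
        "GU827310\td__Fungi; p__Ascomycota; c__Lecanoromycetes; "
        "o__Lecanorales; f__Ramalinaceae; g__Ramalina\t-1\n"
    )
    for record in read_qiime_taxonomy(example):
        print(record["id"], record["ranks"][-1], record["confidence"])
```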

## Support

If you run into any problems or issues, please create a [GitLab issue](https://gitlab.com/mirand863/hitac/-/issues) and we will try our best to help.

We strive to provide good support through our issue tracker on GitLab. However, if you would like to receive private support, such as:

- Phone / video calls to discuss your specific use case and get recommendations
- Private discussions over Slack or Mattermost

please reach out to fabio.malchermiranda@hpi.de.

## Contributing

We are a small team on a mission to improve ITS taxonomic classification, and we will take all the help we can get! If you would like to get involved, here is information on [contribution guidelines and how to test the code locally](CONTRIBUTING.md).

You can contribute in multiple ways, e.g., reporting bugs, writing or translating documentation, reviewing or refactoring code, requesting or implementing new features, etc.

## Getting the latest updates

If you'd like to get updates when we release new versions, please click on the notification button on the top and select "Watch". GitLab will then send you notifications along with a changelog with each new release.

## Citation

If you use HiTaC, please cite:

>Miranda, Fábio M., et al. "HiTaC: Hierarchical Taxonomic Classification of Fungal ITS Sequences." bioRxiv (2020).

```latex
@article{miranda2020hitac,
  title={HiTaC: Hierarchical Taxonomic Classification of Fungal ITS Sequences},
  author={Miranda, F{\'a}bio M and Azevedo, Vasco AC and Renard, Bernhard Y and Piro, Vitor C and Ramos, Rommel TJ},
  journal={bioRxiv},
  year={2020},
  publisher={Cold Spring Harbor Laboratory}
}
```

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "hitac",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "hierarchical taxonomic classifier",
    "author": "Fabio Malcher Miranda",
    "author_email": "fabio.malchermiranda@hpi.de",
    "download_url": "https://files.pythonhosted.org/packages/4a/a8/2e1865d146e85681b682d5e640c4243729d4c25b7102d7a8efd5cb075f62/hitac-2.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "BSD 3-Clause",
    "summary": "Hierarchical taxonomic classifier.",
    "version": "2.2.1",
    "project_urls": {
        "Bug Tracker": "https://gitlab.com/dacs-hpi/hitac/-/issues",
        "Related Software": "https://gitlab.com/dacs-hpi",
        "Source Code": "https://gitlab.com/dacs-hpi/hitac"
    },
    "split_keywords": [
        "hierarchical",
        "taxonomic",
        "classifier"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2f42bfcbee71daeee8650e15cf02163cfeb0dcb225c98202471cd9a3d3ee860d",
                "md5": "bc142e2dea3ea37681bf035c031ecd8f",
                "sha256": "1a0656154957af59c88ee9f7c6e21cb46dc35298d25a51cf583c3f0c24eb14ab"
            },
            "downloads": -1,
            "filename": "hitac-2.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "bc142e2dea3ea37681bf035c031ecd8f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 20016,
            "upload_time": "2023-09-25T11:22:56",
            "upload_time_iso_8601": "2023-09-25T11:22:56.440097Z",
            "url": "https://files.pythonhosted.org/packages/2f/42/bfcbee71daeee8650e15cf02163cfeb0dcb225c98202471cd9a3d3ee860d/hitac-2.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4aa82e1865d146e85681b682d5e640c4243729d4c25b7102d7a8efd5cb075f62",
                "md5": "9128e26999adb866065c318a494ffa5e",
                "sha256": "3015bd17fee7f026ddf08708bf8a935d7a73108bc2cc8e9e9275fdf6b823298c"
            },
            "downloads": -1,
            "filename": "hitac-2.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "9128e26999adb866065c318a494ffa5e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 42358,
            "upload_time": "2023-09-25T11:22:57",
            "upload_time_iso_8601": "2023-09-25T11:22:57.691501Z",
            "url": "https://files.pythonhosted.org/packages/4a/a8/2e1865d146e85681b682d5e640c4243729d4c25b7102d7a8efd5cb075f62/hitac-2.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-09-25 11:22:57",
    "github": false,
    "gitlab": true,
    "bitbucket": false,
    "codeberg": false,
    "gitlab_user": "dacs-hpi",
    "gitlab_project": "hitac",
    "lcname": "hitac"
}
        