amira-amr

Name	amira-amr
Version	0.4.0
home_page	https://github.com/Danderson123/amira
Summary	Amira
upload_time	2024-12-15 09:04:04
maintainer	None
docs_url	None
author	Daniel Anderson
requires_python	<3.13,>=3.9
license	Apache-2.0
keywords	amr prediction genotyping sequence assembly
requirements	joblib matplotlib pandas pysam pytest scipy sourmash tqdm

# Amira

## Introduction

Amira is an antimicrobial resistance (AMR) gene detection tool designed to work directly from bacterial long-read sequences. Amira makes it easy to reliably identify the AMR genes in a bacterial sample, reduces the time taken to get meaningful results, and allows more accurate detection of AMR genes than assembly-based approaches.

## Overview

Amira leverages the full length of long read sequences to differentiate multi-copy genes by their local genomic context. This is done by first identifying the genes on each sequencing read and using the gene calls to construct a *de Bruijn* graph (DBG) in gene space. Following error correction, the reads containing different copies of multi-copy AMR genes can be clustered together based on their path in the graph, then assembled to obtain the nucleotide sequence.
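
As a rough illustration of the idea above, the sketch below builds a toy de Bruijn graph whose nodes are k-mers of gene calls rather than nucleotides, then groups the reads carrying a target gene by the gene k-mers that surround it. This is not Amira's implementation: the read names, gene names, the choice of `k=3`, and the helper functions are all hypothetical, and the sketch skips error correction and strand canonicalisation.

```python
from collections import defaultdict


def gene_kmers(genes, k):
    """Return every window of k consecutive gene calls on one read."""
    return [tuple(genes[i:i + k]) for i in range(len(genes) - k + 1)]


def build_gene_dbg(reads, k=3):
    """Build a toy gene-space de Bruijn graph.

    Nodes are k-mers of gene calls. An edge joins two k-mers that are
    adjacent on at least one read; its value is the set of supporting reads.
    """
    edges = defaultdict(set)
    for read_id, genes in reads.items():
        kmers = gene_kmers(genes, k)
        for left, right in zip(kmers, kmers[1:]):
            edges[(left, right)].add(read_id)
    return dict(edges)


def cluster_reads_by_context(reads, target_gene, k=3):
    """Group reads by the set of gene k-mers that contain target_gene."""
    clusters = defaultdict(list)
    for read_id, genes in reads.items():
        context = frozenset(
            kmer for kmer in gene_kmers(genes, k) if target_gene in kmer
        )
        if context:
            clusters[context].append(read_id)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical gene calls per read; the "+" prefix marks strand.
reads = {
    "read1": ["+geneA", "+blaTEM", "+geneB", "+geneC"],
    "read2": ["+geneA", "+blaTEM", "+geneB", "+geneC"],
    "read3": ["+geneX", "+blaTEM", "+geneY", "+geneZ"],
}

graph = build_gene_dbg(reads, k=3)
clusters = cluster_reads_by_context(reads, "+blaTEM", k=3)
print(clusters)  # [['read1', 'read2'], ['read3']]
```

The two clusters correspond to two copies of the same gene in different genomic contexts; in this toy setting each cluster's reads would then be assembled separately.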

## Prerequisites

Amira requires Python, Poetry (for source installs), and three non-Python tools:

- **Python >=3.9,<3.13**.
- **Poetry** to manage the Python dependencies.
- **Pandora** to identify the genes on each sequencing read.
- **minimap2** for sequence alignment.
- **racon** for allele polishing.

## Installation

Follow these steps to install Amira and its dependencies.

### From source

#### Step 1: Clone the Amira Repository

Open a terminal and run the following command to clone the repository and navigate into it:
```bash
git clone https://github.com/Danderson123/Amira && cd Amira
```
#### Step 2: Install Poetry
Amira’s dependencies are managed with Poetry. Install Poetry by running:
```bash
pip install poetry
```
#### Step 3: Install Python Dependencies
Once Poetry is installed, use it to set up Amira’s dependencies:

```bash
poetry install
```
####  Step 4: Install Non-Python Dependencies
Amira requires Pandora, minimap2 and racon. Follow the links below for instructions on building binaries for each tool:

- [Pandora Installation Guide](https://github.com/iqbal-lab-org/pandora?tab=readme-ov-file#installation)
- [minimap2 Installation Guide](https://github.com/lh3/minimap2)
- [racon Installation Guide](https://github.com/isovic/racon)

After installation, make a note of the paths to these binaries as they will be required when running Amira.
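
One way to record those paths is to resolve them programmatically. The sketch below is a minimal helper, assuming the executables are named `pandora`, `minimap2` and `racon` and are either on your `PATH` or at a location you pass explicitly; `resolve_binary` is a hypothetical name and not part of Amira.

```python
import shutil
from pathlib import Path


def resolve_binary(name, explicit_path=None):
    """Return an absolute path to an executable, or None if it is not found.

    explicit_path takes priority when given; otherwise PATH is searched.
    """
    candidate = explicit_path or shutil.which(name)
    if candidate is None:
        return None
    path = Path(candidate).expanduser().resolve()
    return str(path) if path.is_file() else None


# Executable names are assumptions; adjust them to match your builds.
for tool in ("pandora", "minimap2", "racon"):
    print(f"{tool}: {resolve_binary(tool) or 'not found'}")
```

The printed paths can then be supplied to `--racon-path` and `--minimap2-path` when running Amira.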

### From PyPI

Amira can be installed from PyPI by running:
```bash
pip install amira-amr
```
Amira can then be run with:
```bash
amira --help
```
## Running Amira
Amira can be run on the output of Pandora directly, or from JSON files listing the genes and gene positions on each sequencing read. Instructions and example commands for both modes are given below.

### Running from JSON
To run Amira from the JSON files, use the command below. Replace `<PATH TO RACON BINARY>` with the absolute path to the racon binary you built earlier, replace `<PATH TO MINIMAP2 BINARY>` with the path to the minimap2 binary, and fill in the remaining placeholders with your own files.
```bash
python3 amira/__main__.py \
  --pandoraJSON <PATH TO GENE CALL JSON> \
  --gene-positions <PATH TO GENE POSITION JSON> \
  --pandoraConsensus <PATH TO PANDORA CONSENSUS FASTQ> \
  --readfile <PATH TO READ FASTQ> \
  --output <OUTPUT DIRECTORY> \
  --gene-path <AMR GENE REFERENCE FASTA> \
  --phenotypes <ALLELE-PHENOTYPE MAPPING JSON> \
  --racon-path <PATH TO RACON BINARY> \
  --minimap2-path <PATH TO MINIMAP2 BINARY> \
  --debug --cores <CPUS> --sample-reads --filter-contaminants
```
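
If you launch Amira from a script, the same command can be assembled as an argument list. The sketch below is one possible wrapper, not part of Amira: `build_amira_json_command` is a hypothetical helper, the flags are the ones shown in the command above, and the racon and minimap2 paths are placeholders you would need to change.

```python
import shlex


def build_amira_json_command(gene_calls, gene_positions, consensus, reads,
                             output_dir, gene_fasta, phenotypes,
                             racon_path, minimap2_path, cores=1, debug=False):
    """Assemble the argument list for a JSON-mode Amira run."""
    command = [
        "python3", "amira/__main__.py",
        "--pandoraJSON", gene_calls,
        "--gene-positions", gene_positions,
        "--pandoraConsensus", consensus,
        "--readfile", reads,
        "--output", output_dir,
        "--gene-path", gene_fasta,
        "--phenotypes", phenotypes,
        "--racon-path", racon_path,
        "--minimap2-path", minimap2_path,
        "--cores", str(cores),
        "--sample-reads",
        "--filter-contaminants",
    ]
    if debug:
        command.append("--debug")
    return command


command = build_amira_json_command(
    gene_calls="test_data/gene_calls_with_gene_filtering.json",
    gene_positions="test_data/gene_positions_with_gene_filtering.json",
    consensus="test_data/pandora.consensus.fq.gz",
    reads="test_data/SRR23044220_1.fastq.gz",
    output_dir="amira_output",
    gene_fasta="AMR_alleles_unified.fa",
    phenotypes="AMR_calls.json",
    racon_path="/path/to/racon",        # placeholder path
    minimap2_path="/path/to/minimap2",  # placeholder path
    cores=4,
)

# Print the command for inspection. To execute it, pass the list to
# subprocess.run(command, check=True) from the repository root.
print(shlex.join(command))
```

Passing a list rather than a single string avoids shell quoting problems when paths contain spaces.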

####  JSON example

Some example JSON data can be downloaded from [here](https://drive.google.com/drive/folders/1mQ8JmzVhFiNkgRy5_1iFQrqV2TLNnlQ4). Amira can then be run using this command:
```bash
python3 amira/__main__.py \
  --pandoraJSON test_data/gene_calls_with_gene_filtering.json \
  --gene-positions test_data/gene_positions_with_gene_filtering.json \
  --pandoraConsensus test_data/pandora.consensus.fq.gz \
  --readfile test_data/SRR23044220_1.fastq.gz \
  --output amira_output \
  --gene-path AMR_alleles_unified.fa \
  --phenotypes AMR_calls.json \
  --racon-path <PATH TO RACON BINARY> \
  --minimap2-path <PATH TO MINIMAP2 BINARY> \
  --debug --cores <CPUS> --sample-reads --filter-contaminants
```

### Running with Pandora
[Pandora](https://github.com/iqbal-lab-org/pandora) uses species-specific reference pan-genomes (panRGs) to identify the genes on each sequencing read. For *Escherichia coli*, a pre-built panRG can be downloaded from [here](https://drive.google.com/file/d/15uyl7iQei3Ikd2d6oI_XbARXiKmxl-2d/view). After installing Pandora, you can call the genes on your sequencing reads using this command:
```bash
pandora map -t <THREADS> --min-gene-coverage-proportion 0.5 --max-covg 10000 -o pandora_map_output <PANRG PATH> <PATH TO READ FASTQ>
```
Amira can then be run directly on the output of Pandora using this command:
```bash
python3 amira/__main__.py \
  --pandoraSam pandora_map_output/*.sam \
  --pandoraConsensus pandora_map_output/pandora.consensus.fq.gz \
  --readfile <PATH TO READ FASTQ> \
  --output amira_output \
  --gene-path AMR_alleles_unified.fa \
  --minimum-length-proportion 0.5 \
  --maximum-length-proportion 1.5 \
  --cores <CPUS> \
  --phenotypes AMR_calls.json \
  --filter-contaminants --sample-reads
```
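
When the two steps are driven from a script without a shell, the `*.sam` wildcard is not expanded automatically and has to be resolved in code. The sketch below shows one way to do that. It is an illustration only: the helper names, the file names `panRG.zip`, `reads.fastq.gz` and `sample.sam`, and the assumption that the Pandora output directory contains exactly one SAM file are all hypothetical.

```python
import glob
import os
import shlex
import tempfile


def pandora_map_command(panrg, reads, threads, output_dir="pandora_map_output"):
    """Assemble the `pandora map` call with the options shown above."""
    return ["pandora", "map", "-t", str(threads),
            "--min-gene-coverage-proportion", "0.5",
            "--max-covg", "10000",
            "-o", output_dir, panrg, reads]


def find_pandora_sam(output_dir):
    """Resolve <output_dir>/*.sam, assuming exactly one SAM file exists."""
    matches = sorted(glob.glob(os.path.join(output_dir, "*.sam")))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected one SAM file in {output_dir}, found {len(matches)}"
        )
    return matches[0]


def amira_from_pandora_command(pandora_dir, reads, cores,
                               output_dir="amira_output",
                               gene_fasta="AMR_alleles_unified.fa",
                               phenotypes="AMR_calls.json"):
    """Assemble the Amira call that consumes Pandora's output directory."""
    return ["python3", "amira/__main__.py",
            "--pandoraSam", find_pandora_sam(pandora_dir),
            "--pandoraConsensus",
            os.path.join(pandora_dir, "pandora.consensus.fq.gz"),
            "--readfile", reads,
            "--output", output_dir,
            "--gene-path", gene_fasta,
            "--minimum-length-proportion", "0.5",
            "--maximum-length-proportion", "1.5",
            "--cores", str(cores),
            "--phenotypes", phenotypes,
            "--filter-contaminants", "--sample-reads"]


# Dry run: a temporary directory with an empty stand-in SAM file replaces
# real Pandora output, so the commands are printed rather than executed.
with tempfile.TemporaryDirectory() as pandora_dir:
    open(os.path.join(pandora_dir, "sample.sam"), "w").close()
    print(shlex.join(pandora_map_command("panRG.zip", "reads.fastq.gz", 4, pandora_dir)))
    print(shlex.join(amira_from_pandora_command(pandora_dir, "reads.fastq.gz", 4)))
```

In a real run, the Pandora command would be executed first (for example with `subprocess.run(..., check=True)`), and the Amira command built only after it finishes, since the SAM file does not exist until then.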

### Additional options
For additional options and configurations, run:
```bash
python3 amira/__main__.py --help
```
## Citation
TBD

## Contributing
If you’d like to contribute to Amira, please follow these steps:

1. Fork the repository.
2. Create a new branch for your feature or bugfix (`git checkout -b feature-name`).
3. Commit your changes (`git commit -m "Description of feature"`).
4. Push to the branch (`git push origin feature-name`).
5. Submit a pull request.

## License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.

## Contact
For questions, feedback, or issues, please open an issue on GitHub or contact [Daniel Anderson](<mailto:dander@ebi.ac.uk>).

## Additional Resources
* [Pandora Documentation](https://github.com/iqbal-lab-org/pandora/wiki/Usage)

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/Danderson123/amira",
    "name": "amira-amr",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.13,>=3.9",
    "maintainer_email": null,
    "keywords": "AMR Prediction, Genotyping, Sequence Assembly",
    "author": "Daniel Anderson",
    "author_email": "dander@ebi.ac.uk",
    "download_url": "https://files.pythonhosted.org/packages/86/6c/f05a4809ddb2311186bae3727da42923dc85c82f624ab21db947d429a7a6/amira_amr-0.4.0.tar.gz",
    "platform": null,
    "description": "# Amira\n\n## Introduction\n\nAmira is an AMR gene detection tool designed to work directly from bacterial long read sequences. Amira makes it easy to reliably identify the AMR genes in a bacterial sample, reduces the time taken to get meaningful results and allows more accurate detection of AMR genes than assembly.\n\n## Overview\n\nAmira leverages the full length of long read sequences to differentiate multi-copy genes by their local genomic context. This is done by first identifying the genes on each sequencing read and using the gene calls to construct a *de Bruijn* graph (DBG) in gene space. Following error correction, the reads containing different copies of multi-copy AMR genes can be clustered together based on their path in the graph, then assembled to obtain the nucleotide sequence.\n\n## Prerequisites\n\nAmira requires Python and three additional non-Python tools for optimal functionality:\n\n- **Python >=3.9,<3.13**.\n- **Poetry** to manage the Python dependencies.\n- **Pandora** to identify the genes on each sequencing read.\n- **minimap2** for sequence alignment.\n- **racon** for allele polishing.\n\n## Installation\n\nFollow these steps to install Amira and its dependencies.\n\n### From source\n\n#### Step 1: Clone the Amira Repository\n\nOpen a terminal and run the following command to clone the repository and navigate into it:\n```bash\ngit clone https://github.com/Danderson123/Amira && cd Amira\n```\n#### Step 2: Install Poetry\nAmira\u2019s dependencies are managed with Poetry. Install Poetry by running:\n```bash\npip install poetry\n```\n#### Step 3: Install Python Dependencies\nOnce Poetry is installed, use it to set up Amira\u2019s dependencies:\n\n```bash\npoetry install\n```\n####  Step 4: Install Non-Python Dependencies\nAmira requires Pandora, minimap2 and racon. 
Follow the links below for instructions on building binaries for each tool:\n\n- [Pandora Installation Guide](https://github.com/iqbal-lab-org/pandora?tab=readme-ov-file#installation)\n- [minimap2 Installation Guide](https://github.com/lh3/minimap2)\n- [racon Installation Guide](https://github.com/isovic/racon)\n\nAfter installation, make a note of the paths to these binaries as they will be required when running Amira.\n\n### From PyPI\n\nAmira can be installed from PyPI by running:\n```bash\npip install amira-amr\n```\nAmira can then be run with:\n```bash\namira --help\n```\n## Running Amira\nAmira can be run on the output of Pandora directly, or from JSON files listing the genes and gene positions on each sequencing read. Below are instructions and an example command for running Amira with the JSON files.\n\n### Running from JSON\nTo run Amira from the JSON files, you can use this command. You will need to replace `<PATH TO RACON BINARY>` with the absolute path to the racon binary you made earlier and replace `<PATH TO MINIMAP2 BINARY>` with the path to the minimap2 binary.\n```\npython3 amira/__main__.py --pandoraJSON <PATH TO GENE CALL JSON> --gene-positions <PATH TO GENE POSITION JSON> --pandoraConsensus <PATH TO PANDORA CONSENSUS FASTQ> --readfile <PATH TO READ FASTQ> --output <OUTPUT DIRECTORY> --gene-path <AMR GENE REFERENCE FASTA> --phenotypes <ALLELE-PHENOTYPE MAPPING JSON> --racon-path <PATH TO RACON BINARY> --minimap2-path <PATH TO MINIMAP2 BINARY> --debug --cores <CPUS> --sample-reads --filter-contaminants\n```\n\n####  JSON example\n\nSome example JSON data can be downloaded from [here](https://drive.google.com/drive/folders/1mQ8JmzVhFiNkgRy5_1iFQrqV2TLNnlQ4). 
Amira can then be run using this command:\n```\npython3 amira/__main__.py --pandoraJSON test_data/gene_calls_with_gene_filtering.json --gene-positions test_data/gene_positions_with_gene_filtering.json --pandoraConsensus test_data/pandora.consensus.fq.gz --readfile test_data/SRR23044220_1.fastq.gz --output amira_output --gene-path AMR_alleles_unified.fa --phenotypes AMR_calls.json --racon-path <PATH TO RACON BINARY> --minimap2-path <PATH TO MINIMAP2 BINARY> --debug --cores <CPUS> --sample-reads --filter-contaminants\n```\n\n### Running with Pandora\n[Pandora](https://github.com/iqbal-lab-org/pandora) uses species-specific reference pan-genomes (panRGs) to identify the genes on each sequencing read. For *Escherichia coli*, a pre-built panRG can be downloaded from [here](https://drive.google.com/file/d/15uyl7iQei3Ikd2d6oI_XbARXiKmxl-2d/view). After installing Pandora, you can call the genes on your sequencing reads using this command:\n```bash\npandora map -t <THREADS> --min-gene-coverage-proportion 0.5 --max-covg 10000 -o pandora_map_output <PANRG PATH> <PATH TO READ FASTQ>\n```\nAmira can then be run directly on the output of Pandora using this command:\n```bash\npython3 amira/__main__.py --pandoraSam pandora_map_output/*.sam --pandoraConsensus pandora_map_output/pandora.consensus.fq.gz --readfile <PATH TO READ FASTQ> --output amira_output --gene-path AMR_alleles_unified.fa --minimum-length-proportion 0.5 --maximum-length-proportion 1.5 --cores <CPUS> --phenotypes AMR_calls.json --filter-contaminants --sample-reads\n ```\n\n### Additional options\nFor additional options and configurations, run:\n```bash\npython3 amira/__main__.py --help\n```\n## Citation\nTBD\n\n## Contributing\nIf you\u2019d like to contribute to Amira, please follow these steps:\n\n1. Fork the repository.\n2. Create a new branch for your feature or bugfix (git checkout -b feature-name).\n3. Commit your changes (git commit -m \"Description of feature\").\n4. 
Push to the branch (git push origin feature-name).\n5. Submit a pull request.\n\n## License\nThis project is licensed under the Apache License 2.0 License. See the LICENSE file for details.\n\n## Contact\nFor questions, feedback, or issues, please open an issue on GitHub or contact [Daniel Anderson](<mailto:dander@ebi.ac.uk>).\n\n## Additional Resources\n* [Pandora Documentation](https://github.com/iqbal-lab-org/pandora/wiki/Usage)",
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Amira",
    "version": "0.4.0",
    "project_urls": {
        "Homepage": "https://github.com/Danderson123/amira",
        "Repository": "https://github.com/Danderson123/amira"
    },
    "split_keywords": [
        "amr prediction",
        " genotyping",
        " sequence assembly"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c1477192d8c6209cb3fa86b2fdae6f8fd62d7fa1f2229370f5f19c77dca015c5",
                "md5": "5a79b2bf99a8f8e4543e92a392c8e286",
                "sha256": "b04acff7d0b2d35e0f819a021d80dac7fc50dcc16f251c67fc527a2110502ddc"
            },
            "downloads": -1,
            "filename": "amira_amr-0.4.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5a79b2bf99a8f8e4543e92a392c8e286",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.9",
            "size": 60382,
            "upload_time": "2024-12-15T09:04:01",
            "upload_time_iso_8601": "2024-12-15T09:04:01.892026Z",
            "url": "https://files.pythonhosted.org/packages/c1/47/7192d8c6209cb3fa86b2fdae6f8fd62d7fa1f2229370f5f19c77dca015c5/amira_amr-0.4.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "866cf05a4809ddb2311186bae3727da42923dc85c82f624ab21db947d429a7a6",
                "md5": "f3e5bfab07b0c4ac9501fee1a8a58f04",
                "sha256": "c590517e94d5d62e70569e137c4d1a70d33d03cd98c79b4886605eca2b9146a5"
            },
            "downloads": -1,
            "filename": "amira_amr-0.4.0.tar.gz",
            "has_sig": false,
            "md5_digest": "f3e5bfab07b0c4ac9501fee1a8a58f04",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.13,>=3.9",
            "size": 57641,
            "upload_time": "2024-12-15T09:04:04",
            "upload_time_iso_8601": "2024-12-15T09:04:04.527028Z",
            "url": "https://files.pythonhosted.org/packages/86/6c/f05a4809ddb2311186bae3727da42923dc85c82f624ab21db947d429a7a6/amira_amr-0.4.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-15 09:04:04",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Danderson123",
    "github_project": "amira",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "joblib",
            "specs": [
                [
                    "==",
                    "1.2.0"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    "==",
                    "3.6.2"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    "==",
                    "2.2.2"
                ]
            ]
        },
        {
            "name": "pysam",
            "specs": [
                [
                    "==",
                    "0.22.0"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    "==",
                    "7.2.0"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    "==",
                    "1.12.0"
                ]
            ]
        },
        {
            "name": "sourmash",
            "specs": [
                [
                    "==",
                    "4.8.4"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    "==",
                    "4.66.3"
                ]
            ]
        }
    ],
    "lcname": "amira-amr"
}
