Hammerhead-View


- Name: Hammerhead-View
- Version: 0.2.0
- Home page: https://github.com/lrslab/Hammerhead
- Summary: A tool designed to de novo find potential modification sites.
- Upload time: 2024-05-21 06:02:11
- Author: Xudong Liu
- Requires Python: <=3.11,>=3.7

# Hammerhead
<a href="https://pypi.python.org/pypi/Hammerhead-View" rel="pypi">![PyPI](https://img.shields.io/pypi/v/Hammerhead-View?color=green) </a>
[![License: GPL v2](https://img.shields.io/badge/License-GPL_v2-blue.svg)](https://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html)

<img src="figure_demo/Logo.png" width="700" style="display: block; margin-left: auto; margin-right: auto;">




# Workflow

<img src="figure_demo/Demo_1.png" width="750" style="display: block; margin-left: auto; margin-right: auto;">

Hammerhead was developed specifically to identify potential modification sites from Nanopore R10.4.1 simplex reads. It detects modifications by exploiting the strand-specific error pattern observed in these reads.



The pipeline uses a custom metric, the difference index, to quantify the discrepancy in observed accuracy between the forward and reverse strands at each site. The difference index serves as a proxy for the likelihood of modification: the higher the value, the more likely the site is modified.
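
As a rough illustration of the idea, the sketch below computes a per-site score as the absolute gap between forward-strand and reverse-strand accuracy. The exact formula used by Hammerhead is not given here, so the absolute-difference form and the function names `strand_accuracy` and `difference_index` are assumptions made for illustration only.

```python
def strand_accuracy(matches: int, depth: int) -> float:
    """Fraction of reads on one strand whose base agrees with the reference."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return matches / depth


def difference_index(fwd_matches: int, fwd_depth: int,
                     rev_matches: int, rev_depth: int) -> float:
    """Absolute gap between forward- and reverse-strand accuracy.

    Assumed form for illustration; not necessarily the formula Hammerhead uses.
    """
    fwd = strand_accuracy(fwd_matches, fwd_depth)
    rev = strand_accuracy(rev_matches, rev_depth)
    return abs(fwd - rev)


# A site read correctly on the forward strand but often miscalled on the reverse strand.
print(difference_index(fwd_matches=48, fwd_depth=50, rev_matches=20, rev_depth=50))
```

Under this assumed definition, a site with similar accuracy on both strands scores close to 0, while a site with a strong strand bias scores close to 1.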



# Installation

To use this tool, you first need to install the external tools it relies on for read processing: minimap2, samtools, and bedtools. The following commands install these dependencies.

```shell
# Versions used for testing:
# minimap2	2.17
# samtools	1.17
# bedtools	2.30.0

# Option 1: install the latest available versions
conda install -c bioconda -c conda-forge minimap2 samtools bedtools -y

# Option 2: install the tested versions
conda install -c bioconda -c conda-forge minimap2==2.17 samtools==1.17 bedtools==2.30.0 -y
```
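
Before running the pipeline, it can help to confirm that the three dependencies are visible on your `PATH`. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and is not part of Hammerhead.

```python
import shutil
from typing import Iterable, List

# External tools named in the installation instructions above.
REQUIRED_TOOLS = ("minimap2", "samtools", "bedtools")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

This only checks that the executables exist; it does not verify their versions.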

To install Hammerhead itself, run the following command.
```shell
pip install Hammerhead-View
```




# Quick usage

`Hammerhead` offers two strategies for detecting methylation.

The first strategy selects the sites whose difference index exceeds a cutoff (default: 0.35).

```shell
hammerhead --ref genome.fa --read input.fastq --cpu 4
```

The second strategy ranks the sites by difference index, from largest to smallest, and selects the top N (default: 2000).

```shell
hammerhead --ref genome.fa --read input.fastq --cpu 4 --method top
```
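
The two selection strategies can be sketched in a few lines of Python. This is an illustration of the logic only, not Hammerhead's implementation: the `Site` representation and the function names are hypothetical, and whether the tool treats the cutoff as strict (`>`) or inclusive (`>=`) is an assumption here.

```python
from typing import Dict, List, Tuple

# Hypothetical site representation: (contig name, 1-based position).
Site = Tuple[str, int]


def select_by_cutoff(scores: Dict[Site, float], cutoff: float = 0.35) -> List[Site]:
    """Keep sites whose difference index is above the cutoff (strict comparison assumed)."""
    return sorted(site for site, score in scores.items() if score > cutoff)


def select_top_n(scores: Dict[Site, float], n: int = 2000) -> List[Site]:
    """Keep the n sites with the largest difference index, highest first."""
    ranked = sorted(scores, key=lambda site: scores[site], reverse=True)
    return ranked[:n]


scores = {("chr", 10): 0.80, ("chr", 25): 0.10, ("chr", 40): 0.36, ("chr", 77): 0.35}
print(select_by_cutoff(scores))
print(select_top_n(scores, n=2))
```

The cutoff strategy returns a variable number of sites that depends on the data, whereas the top-N strategy always returns at most N sites regardless of how high their scores are.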



# Example

We provide demo datasets for testing `Hammerhead`. The following commands download them.

```shell
wget https://figshare.com/ndownloader/files/46437190 -O ecoli.fa
wget https://figshare.com/ndownloader/files/46437193 -O test.fastq.gz
```

Then run the following command to start the analysis.

```shell
hammerhead --ref ecoli.fa --read test.fastq.gz --min_depth 5 --min_depth_strand 3
```

**Note:** The arguments in this command are for demonstration only, because the demo data has shallow read coverage, and they may not be optimal for your dataset. When you have sufficient read coverage (generally more than 50-fold), the default arguments are recommended.
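
To judge whether your own data reaches the coverage mentioned in the note, a rough estimate is the total number of read bases divided by the reference length. The sketch below assumes plain or gzip-compressed files, FASTQ records of exactly four lines, and that every read base counts toward coverage (unmapped reads are not excluded). All function names are hypothetical.

```python
import gzip
from pathlib import Path


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    return gzip.open(path, "rt") if path.suffix == ".gz" else open(path, "rt")


def fasta_length(path) -> int:
    """Total number of bases across all sequences in a FASTA file."""
    with _open_text(path) as handle:
        return sum(len(line.strip()) for line in handle if not line.startswith(">"))


def fastq_bases(path) -> int:
    """Total number of read bases in a FASTQ file (four-line records assumed)."""
    with _open_text(path) as handle:
        return sum(len(line.strip()) for i, line in enumerate(handle) if i % 4 == 1)


def fold_coverage(read_path, ref_path) -> float:
    """Rough fold coverage: read bases divided by reference length."""
    return fastq_bases(read_path) / fasta_length(ref_path)
```

For the demo data, this could be called as `fold_coverage("test.fastq.gz", "ecoli.fa")`. The result is an upper-bound estimate, since reads that fail to align still contribute bases.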



# Tool showcase

To show the potential of Hammerhead for identifying modifications in bacteria, two *E. coli* datasets were used to call methylation: whole-genome sequencing (WGS) and whole-genome amplification (WGA) R10.4.1 simplex reads. The genome of this *E. coli* strain carries the *dam* and *dcm* genes, which are associated with G6mATC and C5mCWGG methylation, respectively.

<img src="figure_demo/Demo_2.png" width="750" style="display: block; margin-left: auto; margin-right: auto;">

The distribution of the difference index for sites in the *E. coli* genome. The WGA reads served as a negative control because they lack the native methylation information. Based on the background noise of the WGA reads, sites with a difference index over 0.35 were regarded as potential modification sites.

<img src="figure_demo/Demo_3.png" width="750" style="display: block; margin-left: auto; margin-right: auto;">

The C<u>C</u>WGG and G<u>A</u>TC motifs were enriched in the sequences flanking these potential modification sites (-10 bp to +10 bp).
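
The flanking-window step can be illustrated as follows. This sketch extracts the ±10 bp window around a site and checks whether the site sits at the underlined position of a motif. It is not the motif-enrichment procedure used to produce the figure: it uses 0-based positions, checks only the forward strand, and all names are hypothetical.

```python
import re

# IUPAC codes needed for the two motifs above (W = A or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}


def motif_regex(motif: str) -> "re.Pattern[str]":
    """Translate an IUPAC motif such as CCWGG into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif))


def flank(genome: str, pos: int, width: int = 10) -> str:
    """Sequence from pos - width to pos + width (0-based pos, clipped at the ends)."""
    return genome[max(0, pos - width): pos + width + 1]


def site_in_motif(genome: str, pos: int, motif: str, offset: int) -> bool:
    """True if genome[pos] is the base at index `offset` of a motif occurrence."""
    start = pos - offset
    if start < 0:
        return False
    candidate = genome[start:start + len(motif)]
    return motif_regex(motif).fullmatch(candidate) is not None


genome = "TTGATCTTCCAGGTT"
print(flank(genome, 3, width=2))
print(site_in_motif(genome, 3, "GATC", offset=1))
print(site_in_motif(genome, 9, "CCWGG", offset=1))
```

Windows collected with `flank` could then be passed to a dedicated motif-discovery tool; the direct `site_in_motif` check is only useful once candidate motifs are already known.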

**Note:** Both datasets are available [here](https://figshare.com/articles/dataset/_i_E_coli_i_datasets/24298663). They were basecalled using the modification-aware model, which is available in the `modification_aware_basecalling_model` directory.



To demonstrate the effectiveness of the Hammerhead-based polishing strategy in correcting substitution errors (`G2A` and `C2T`) caused by DNA modifications in assemblies, we present the substitution rates of 15 assemblies. These assemblies were generated from random subsamples of *Acinetobacter pittii* R10.4.1 reads at 40-, 50-, and 60-fold coverage. We compared the results of the following polishing approaches against the reference chromosome:

- No polishing
- Polishing potential modification sites with approximately 10-fold duplex reads
- Polishing entire assemblies with 50-fold next-generation sequencing (NGS) reads



<img src="figure_demo/Demo_4.png" width="750" style="display: block; margin-left: auto; margin-right: auto;">
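
For readers who want to tally substitution types such as `G2A` and `C2T` themselves, the sketch below counts mismatches between two already-aligned, equal-length sequences. It assumes that `G2A` means a reference G appearing as A in the assembly, which is an interpretation of the notation rather than something stated above, and it ignores indels entirely. The function name is hypothetical.

```python
from collections import Counter


def substitution_counts(reference: str, assembly: str) -> Counter:
    """Count substitution types between two aligned, equal-length sequences.

    Keys look like 'G2A' (reference base, '2', assembly base). Positions with
    gaps or ambiguous bases are skipped.
    """
    if len(reference) != len(assembly):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for ref_base, asm_base in zip(reference.upper(), assembly.upper()):
        if ref_base != asm_base and ref_base in "ACGT" and asm_base in "ACGT":
            counts[f"{ref_base}2{asm_base}"] += 1
    return counts


counts = substitution_counts("GGCCAT", "GACTAT")
print(counts["G2A"], counts["C2T"])
```

Dividing each count by the aligned length gives a per-base substitution rate that can be compared across polishing approaches.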

# Documentation

For more details about the usage of Hammerhead and results profiling, please refer to the [documentation](https://hammerhead-documentation.readthedocs.io/en/latest/#).

**All rights reserved.**






            
