Giraffe-View


Name: Giraffe-View
Version: 0.2.3
Home page: https://github.com/lxd98/Giraffe_View
Summary: Giraffe_View is specially designed to provide a comprehensive assessment of the accuracy of long-read sequencing datasets obtained from both the PacBio and Nanopore platforms.
Upload time: 2024-08-08 04:53:56
Author: Xudong Liu
Requires Python: >=3

# <img src="Results/giraffe_logo.png" width="80" style="display: block; margin-left: auto; margin-right: auto;"> Giraffe
<a href="https://pypi.org/project/Giraffe-View/" rel="pypi">![PyPI](https://img.shields.io/pypi/v/Giraffe-View?color=green)</a> <a href="https://opensource.org/license/mit/" rel="license">![License](https://img.shields.io/pypi/l/nanoCEM?color=orange)</a>

**Giraffe** provides a comprehensive assessment of the accuracy of long-read sequencing datasets from the Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) platforms through four distinct functions.

<img src="Results/workflow.png" width="850" style="display: block; margin-left: auto; margin-right: auto;">

`estimate`   Calculation of estimated read accuracy (Q score), length, and GC content.

`observe`     Calculation of observed read accuracy, mismatch proportion, and homopolymer identification (e.g. AAAA).

`gcbias`       Calculation of the relationship between GC content and sequencing depth.

`modbin`       Calculation of the distribution of modification (e.g. 5mC or 6mA methylation) at the regional level.
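
For orientation, a Q score is conventionally Phred-scaled, so a score of Q implies an error probability of 10^(-Q/10). The sketch below shows only this standard conversion; it is not taken from Giraffe's source and makes no claim about how `estimate` aggregates per-base values. The helper name `q_to_accuracy` is hypothetical.

```shell
# Convert a Phred-scaled Q score to the accuracy it implies: 1 - 10^(-Q/10).
# q_to_accuracy is a hypothetical helper, not a Giraffe command.
q_to_accuracy() {
    awk -v q="$1" 'BEGIN { printf "%.4f\n", 1 - 10 ^ (-q / 10) }'
}

q_to_accuracy 20   # Q20 corresponds to 99% accuracy
```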



# Installation

## Installation by [Conda](https://conda.io/projects/conda/en/latest/index.html)

```shell
# install in the current environment
conda install -c raymond_liu giraffe_view -y

# install in a new environment
conda create -n giraffe -c raymond_liu giraffe_view -y
```



## Installation by [PyPI](https://pypi.org/)

Before using this tool, install the dependencies it needs for read processing: [samtools](https://www.htslib.org/), [minimap2](https://github.com/lh3/minimap2), and [bedtools](https://github.com/arq5x/bedtools2). The following commands install these dependencies with conda.

```shell
# Tested versions
# samtools 1.17
# minimap2 2.17-r941
# bedtools 2.30.0

# install in the current environment
conda install -c bioconda -c conda-forge samtools minimap2 bedtools -y

# install in a new environment
conda create -n giraffe -c bioconda -c conda-forge python==3.9 samtools==1.17 minimap2==2.17 bedtools==2.30.0 -y && conda activate giraffe
```

To install this tool, please use the following command.
```shell
pip install Giraffe-View
```
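
After installation it can be useful to confirm that the tools named above are on `PATH`. The following is a minimal sketch; `missing_tools` is a hypothetical helper, and it only checks that each command can be found, not that its version matches the tested ones.

```shell
# Hypothetical helper: print every given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Prints nothing when all four commands are available.
missing_tools samtools minimap2 bedtools giraffe
```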




# Quick usage

 **Giraffe** can be run with a one-button command or by executing individual functions.

## ONE-button pattern

```shell
# Run the "estimate", "observe", and "gcbias" functions on FASTQ files
giraffe --read <read table> --ref <reference> --cpu <number of processes or threads>

# Run the "estimate", "observe", and "gcbias" functions on unaligned SAM/BAM files
giraffe --read <unaligned SAM/BAM table> --ref <reference> --cpu <number of processes or threads>

# Example for input table (sample_ID data_type file_path)
sample_A ONT /home/user/data/S1.fastq
sample_B ONT /home/user/data/S2.fastq
sample_C ONT /home/user/data/S3.fastq
...
```

Here, the data_type can be ONT DNA reads (ONT), ONT direct RNA sequencing reads (ONT_RNA), or PacBio DNA reads (Pacbio).
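
A malformed input table is an easy mistake to make, so a quick check before a long run may help. The sketch below assumes the three columns are separated by whitespace, as in the example above, and that file paths contain no spaces; `check_read_table` is a hypothetical helper, not part of Giraffe.

```shell
# Hypothetical helper: sanity-check a read table before passing it to giraffe.
# Assumes whitespace-separated columns: sample_ID data_type file_path.
check_read_table() {
    awk '
        NF == 0 { next }    # skip blank lines
        NF != 3 {
            printf "line %d: expected 3 columns, found %d\n", NR, NF
            bad = 1
            next
        }
        $2 != "ONT" && $2 != "ONT_RNA" && $2 != "Pacbio" {
            printf "line %d: unknown data_type \"%s\"\n", NR, $2
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

Calling `check_read_table reads.txt` (the file name is only an example) returns a non-zero status and prints one message per offending line.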



## Estimate function

```shell
# For the FASTQ reads
giraffe estimate --read <read table> 

# For the unaligned SAM/BAM files
giraffe estimate --unaligned <unaligned SAM/BAM table>
```



## Observe function

```shell
# For FASTQ reads
giraffe observe --read <read table> --ref <reference>

# For unaligned SAM/BAM files
giraffe observe --unaligned <unaligned SAM/BAM table> --ref <reference>

# For aligned SAM/BAM files
giraffe observe --aligned <aligned SAM/BAM table>
```

**Note:** If you use aligned SAM/BAM files as input, generate them with secondary alignments disabled (**--secondary=no**) and the MD tag enabled (**--MD**) by adding these two parameters to the minimap2 command at the mapping step.
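
As an illustration of the note above, the sketch below shows one way to produce such a file and to check an existing one. `--secondary=no` and `--MD` are minimap2 options; the `map-ont` preset, the file arguments, and both function names are assumptions chosen for the example, so adjust them to your data.

```shell
# Sketch: map reads without secondary alignments and with the MD tag.
# $1 = reference FASTA, $2 = reads FASTQ, $3 = output BAM (all example names).
map_for_observe() {
    minimap2 -ax map-ont --secondary=no --MD "$1" "$2" \
        | samtools sort -o "$3" - \
        && samtools index "$3"
}

# Hypothetical check: read SAM text on stdin and count records that are
# secondary (FLAG bit 0x100) or mapped but lacking an MD:Z tag.
check_sam_for_observe() {
    awk -F '\t' '
        /^@/ { next }                              # skip header lines
        int($2 / 256) % 2 == 1 { secondary++ }     # FLAG bit 0x100 is set
        {
            has_md = 0
            for (i = 12; i <= NF; i++) if ($i ~ /^MD:Z:/) has_md = 1
            if (!has_md && $3 != "*") missing_md++ # unmapped reads have no MD
        }
        END {
            printf "secondary=%d missing_md=%d\n", secondary, missing_md
            exit (secondary + missing_md > 0)
        }
    '
}
```

For an existing file, `samtools view -h aligned.bam | check_sam_for_observe` prints the two counts and returns a non-zero status if either is above zero.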



## GCbias function

```shell
giraffe gcbias --ref <reference> --aligned <aligned SAM/BAM table>
```



## Modbin function

```shell
giraffe modbin --methyl <methylation table> --region <target region>

# Example for methylation file (Chrom Start End Value):
contig_A 132 133 0.92
contig_A 255 256 0.27
contig_A 954 955 0.52
...
```
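
The methylation table can be checked in a similar way before running `modbin`. This sketch assumes whitespace-separated columns and integer coordinates with Start smaller than End, as in the example rows; it makes no assumption about the range of the Value column. `check_methyl_table` is a hypothetical helper, not part of Giraffe.

```shell
# Hypothetical helper: check the four-column layout (Chrom Start End Value).
check_methyl_table() {
    awk '
        NF == 0 { next }    # skip blank lines
        NF != 4 {
            printf "line %d: expected 4 columns, found %d\n", NR, NF
            bad = 1
            next
        }
        ($2 !~ /^[0-9]+$/) || ($3 !~ /^[0-9]+$/) || ($3 + 0 <= $2 + 0) {
            printf "line %d: Start and End must be integers with Start < End\n", NR
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```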



# Example

Here, we provide demo datasets for testing **Giraffe**. The following command downloads them and runs the demo.

```shell
giraffe_run_demo
```

The demo data include three E. coli datasets: a 4.2 MB reference, 79 MB of R10.4.1 reads, and 121 MB of R9.4.1 reads. Two methylation files are also included, from zebrafish blood (23 MB) and kidney (19 KB). The demo takes about 7 minutes and 20 seconds, with a peak memory usage of 391 MB, and tests both the one-command pattern and the four individual functions.



# Tool showcase

The one-command pattern generates a summary in [HTML](https://lxd98.github.io/giraffe.github.io) format. If the X- or Y-axis scale of a figure is unsuitable, the `giraffe_plot` script can be used to replot it.

# Documentation

For more details about the usage of Giraffe and the results it produces, please refer to the [documentation](https://giraffe-documentation.readthedocs.io/en/latest).




            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/lxd98/Giraffe_View",
    "name": "Giraffe-View",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3",
    "maintainer_email": null,
    "keywords": null,
    "author": "Xudong Liu",
    "author_email": "xudongliu98@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/1c/6e/042ed663e613e2bb037dad1a0d7d7b2613986f4574db7e8163ae3d89bd1a/giraffe_view-0.2.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Giraffe_View is specially designed to provide a comprehensive assessment of the accuracy of long-read sequencing datasets obtained from both the PacBio and Nanopore platforms.",
    "version": "0.2.3",
    "project_urls": {
        "Homepage": "https://github.com/lxd98/Giraffe_View"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1bc894ba7724f2ecfb295a4a3dd42fe3d2bea4c66e3830471201393ef62e569c",
                "md5": "ed86ef8aa1abcf5d38f97f005a94420f",
                "sha256": "f28a9cfad79e06a9ed0ca7c7d20f4fe284cb974c8adc1673255d79d3ef8bba81"
            },
            "downloads": -1,
            "filename": "Giraffe_View-0.2.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ed86ef8aa1abcf5d38f97f005a94420f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3",
            "size": 29045,
            "upload_time": "2024-08-08T04:53:54",
            "upload_time_iso_8601": "2024-08-08T04:53:54.998514Z",
            "url": "https://files.pythonhosted.org/packages/1b/c8/94ba7724f2ecfb295a4a3dd42fe3d2bea4c66e3830471201393ef62e569c/Giraffe_View-0.2.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1c6e042ed663e613e2bb037dad1a0d7d7b2613986f4574db7e8163ae3d89bd1a",
                "md5": "41819ffad67f788558f9292169519ca2",
                "sha256": "015090a4b1e889658c206e4f27623e4a91b3601bcb0e88fe817d80d7e2bf252c"
            },
            "downloads": -1,
            "filename": "giraffe_view-0.2.3.tar.gz",
            "has_sig": false,
            "md5_digest": "41819ffad67f788558f9292169519ca2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3",
            "size": 23902,
            "upload_time": "2024-08-08T04:53:56",
            "upload_time_iso_8601": "2024-08-08T04:53:56.847359Z",
            "url": "https://files.pythonhosted.org/packages/1c/6e/042ed663e613e2bb037dad1a0d7d7b2613986f4574db7e8163ae3d89bd1a/giraffe_view-0.2.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-08 04:53:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "lxd98",
    "github_project": "Giraffe_View",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "giraffe-view"
}
        