| Name | Giraffe-View |
| Version | 0.2.3 |
| home_page | https://github.com/lxd98/Giraffe_View |
| Summary | Giraffe_View is specially designed to provide a comprehensive assessment of the accuracy of long-read sequencing datasets obtained from both the PacBio and Nanopore platforms. |
| upload_time | 2024-08-08 04:53:56 |
| maintainer | None |
| docs_url | None |
| author | Xudong Liu |
| requires_python | >=3 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# <img src="Results/giraffe_logo.png" width="80" style="display: block; margin-left: auto; margin-right: auto;"> Giraffe
<a href="https://pypi.org/project/Giraffe-View/" rel="pypi"></a> <a href="https://opensource.org/license/mit/" rel="license"></a>
**Giraffe** is specially designed to provide a comprehensive assessment of the accuracy of long-read sequencing datasets obtained from both the Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) platforms, offering four distinct functions.
<img src="Results/workflow.png" width="850" style="display: block; margin-left: auto; margin-right: auto;">
`estimate` Calculation of estimated read accuracy (Q score), length, and GC content.
`observe` Calculation of observed read accuracy, mismatch proportion, and homopolymer identification (e.g. AAAA).
`gcbias` Calculation of the relationship between GC content and sequencing depth.
`modbin` Calculation of the distribution of modification (e.g. 5mC or 6mA methylation) at the regional level.
# Installation
## Installation by [Conda](https://conda.io/projects/conda/en/latest/index.html)
```shell
# install on the current environment
conda install -c raymond_liu giraffe_view -y
# install on a new environment
conda create -n giraffe -c raymond_liu giraffe_view -y
```
## Installation by [PyPI](https://pypi.org/)
Before using this tool, you need to install the external dependencies it uses for read processing: [samtools](https://www.htslib.org/), [minimap2](https://github.com/lh3/minimap2), and [bedtools](https://github.com/arq5x/bedtools2). The following commands install these dependencies and then the package itself.
```shell
# Tested versions
# samtools 1.17
# minimap2 2.17-r941
# bedtools 2.30.0
# install in the current environment
conda install -c bioconda -c conda-forge samtools minimap2 bedtools -y
# install on a new environment
conda create -n giraffe -c bioconda -c conda-forge python==3.9 samtools==1.17 minimap2==2.17 bedtools==2.30.0 -y && conda activate giraffe
```
To install this tool, please use the following command.
```shell
pip install Giraffe-View
```
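Because the PyPI package does not bring the external tools with it, it can be worth confirming that they are on `PATH` before the first run. The sketch below is a generic shell check and not part of Giraffe; the function name `check_tools` is made up for illustration.

```shell
# Hypothetical helper (not shipped with Giraffe): report any commands
# from the argument list that cannot be found on PATH.
check_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all tools found"
}

# Report on the three external dependencies named above.
check_tools samtools minimap2 bedtools || true
```

If the function prints a `missing:` line, install the listed tools with the conda command above before running `giraffe`.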
# Quick usage
**Giraffe** can be run with a one-button command or by executing individual functions.
## ONE-button pattern
```shell
# Run the "estimate", "observe", and "gcbias" functions on FASTQ files
giraffe --read <read table> --ref <reference> --cpu <number of processes or threads>
# Run the "estimate", "observe", and "gcbias" functions on unaligned SAM/BAM files
giraffe --read <unaligned SAM/BAM table> --ref <reference> --cpu <number of processes or threads>
# Example for input table (sample_ID data_type file_path)
sample_A ONT /home/user/data/S1.fastq
sample_B ONT /home/user/data/S2.fastq
sample_C ONT /home/user/data/S3.fastq
...
```
Here the data_type can be ONT DNA reads (ONT), ONT direct RNA sequencing reads (ONT_RNA), or PacBio DNA reads (Pacbio).
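For more than a handful of files, the input table can be generated rather than typed. The sketch below assumes that every FASTQ file in one directory is named `<sample_ID>.fastq`, that all files share one data_type, and that columns are separated by a single space as in the example above (the separator Giraffe actually requires is not stated here). The function name `make_read_table` is hypothetical.

```shell
# Hypothetical helper: print one "sample_ID data_type file_path" line
# for every *.fastq file in a directory.
# Usage: make_read_table <directory> <data_type>
make_read_table() {
    dir=$1
    data_type=$2
    for path in "$dir"/*.fastq; do
        [ -e "$path" ] || continue          # skip when nothing matches
        sample=$(basename "$path" .fastq)   # file name without extension
        echo "$sample $data_type $path"
    done
}

# Example: make_read_table /home/user/data ONT > read_table.txt
```

Passing an absolute directory, as in the commented example, yields absolute file paths like those in the table above.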
## Estimate function
```shell
# For the FASTQ reads
giraffe estimate --read <read table>
# For the unaligned SAM/BAM files
giraffe estimate --unaligned <unaligned SAM/BAM table>
```
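As a reminder of what a Q score means, the standard Phred definition relates quality to error probability by Q = -10 log10(P_error), so the implied accuracy is 1 - 10^(-Q/10). The sketch below only illustrates that conversion; how Giraffe summarises per-base qualities over a read is not described here, and the function name `q_to_accuracy` is hypothetical.

```shell
# Convert a Phred quality score Q to the implied accuracy 1 - 10^(-Q/10).
# Hypothetical helper for illustration; not a Giraffe command.
q_to_accuracy() {
    awk -v q="$1" 'BEGIN { printf "%.4f\n", 1 - 10 ^ (-q / 10) }'
}

q_to_accuracy 10   # 0.9000
q_to_accuracy 20   # 0.9900
q_to_accuracy 30   # 0.9990
```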
## Observe function
```shell
# For FASTQ reads
giraffe observe --read <read table> --ref <reference>
# For unaligned SAM/BAM files
giraffe observe --unaligned <unaligned SAM/BAM table> --ref <reference>
# For aligned SAM/BAM files
giraffe observe --aligned <aligned SAM/BAM table>
```
**Note:** If you plan to use aligned SAM/BAM files as input, exclude secondary alignments (**--secondary=no**) and add the MD tag (**--MD**) by passing these two parameters to minimap2 when mapping the reads.
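A mapping command that applies both parameters might look like the sketch below. The preset `map-ont`, the file names, the default thread count, and the wrapper function `print_map_cmd` are assumptions for illustration; the function only prints the pipeline so it can be inspected before it is run.

```shell
# Hypothetical wrapper: print a minimap2 + samtools pipeline that drops
# secondary alignments and writes the MD tag.
# Usage: print_map_cmd <reference> <reads> <output.bam> [threads]
print_map_cmd() {
    ref=$1
    reads=$2
    out=$3
    threads=${4:-4}
    echo "minimap2 -ax map-ont --secondary=no --MD -t $threads $ref $reads | samtools sort -@ $threads -o $out -"
}

print_map_cmd ref.fasta S1.fastq S1.bam 8
```

For PacBio reads, a different minimap2 preset such as `map-pb` would replace `map-ont`.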
## GCbias function
```shell
giraffe gcbias --ref <reference> --aligned <aligned SAM/BAM table>
```
## Modbin function
```shell
giraffe modbin --methyl <methylation table> --region <target region>
# Example for methylation file (Chrom Start End Value):
contig_A 132 133 0.92
contig_A 255 256 0.27
contig_A 954 955 0.52
...
```
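Before running `modbin`, a methylation table can be sanity-checked against the four-column layout shown above. The sketch below assumes whitespace-separated columns, integer coordinates with Start < End, and a Value between 0 and 1 as in the example rows; none of these constraints is confirmed by Giraffe itself, and the function name `check_methyl_table` is hypothetical.

```shell
# Hypothetical helper: report lines of a methylation table that do not
# look like "Chrom Start End Value".
# Assumes Start < End and 0 <= Value <= 1.
check_methyl_table() {
    awk '
        NF != 4 { print "line " NR ": expected 4 columns"; bad = 1; next }
        $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
            print "line " NR ": Start/End not integers"; bad = 1; next
        }
        $2 + 0 >= $3 + 0 { print "line " NR ": Start >= End"; bad = 1; next }
        $4 !~ /^[0-9]*\.?[0-9]+$/ || $4 + 0 > 1 {
            print "line " NR ": Value outside 0-1"; bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example: check_methyl_table blood_methylation.txt && echo "table looks fine"
```

The function prints nothing and returns success for a clean table, and returns a non-zero status after listing any offending lines.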
# Example
We provide demo datasets for testing **Giraffe**. The following command downloads them and runs the demo.
```shell
giraffe_run_demo
```
The demo includes three E. coli datasets: a 4.2 MB reference, 79 MB of R10.4.1 reads, and 121 MB of R9.4.1 reads. It also includes two zebrafish methylation files, from blood (23 MB) and kidney (19 KB). The demo takes about 7 minutes and 20 seconds and uses at most 391 MB of memory. It tests both the one-command pattern and the four individual functions.
# Tool showcase
The one-command pattern generates a summary in [HTML](https://lxd98.github.io/giraffe.github.io) format. If the X- or Y-axis scale of a figure is not suitable, the `giraffe_plot` script can be used to replot it.
# Documentation
For more details about the usage of Giraffe and the interpretation of its results, please refer to the [documentation](https://giraffe-documentation.readthedocs.io/en/latest).
# Raw data
{
"_id": null,
"home_page": "https://github.com/lxd98/Giraffe_View",
"name": "Giraffe-View",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3",
"maintainer_email": null,
"keywords": null,
"author": "Xudong Liu",
"author_email": "xudongliu98@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/1c/6e/042ed663e613e2bb037dad1a0d7d7b2613986f4574db7e8163ae3d89bd1a/giraffe_view-0.2.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Giraffe_View is specially designed to provide a comprehensive assessment of the accuracy of long-read sequencing datasets obtained from both the PacBio and Nanopore platforms.",
"version": "0.2.3",
"project_urls": {
"Homepage": "https://github.com/lxd98/Giraffe_View"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "1bc894ba7724f2ecfb295a4a3dd42fe3d2bea4c66e3830471201393ef62e569c",
"md5": "ed86ef8aa1abcf5d38f97f005a94420f",
"sha256": "f28a9cfad79e06a9ed0ca7c7d20f4fe284cb974c8adc1673255d79d3ef8bba81"
},
"downloads": -1,
"filename": "Giraffe_View-0.2.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ed86ef8aa1abcf5d38f97f005a94420f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3",
"size": 29045,
"upload_time": "2024-08-08T04:53:54",
"upload_time_iso_8601": "2024-08-08T04:53:54.998514Z",
"url": "https://files.pythonhosted.org/packages/1b/c8/94ba7724f2ecfb295a4a3dd42fe3d2bea4c66e3830471201393ef62e569c/Giraffe_View-0.2.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "1c6e042ed663e613e2bb037dad1a0d7d7b2613986f4574db7e8163ae3d89bd1a",
"md5": "41819ffad67f788558f9292169519ca2",
"sha256": "015090a4b1e889658c206e4f27623e4a91b3601bcb0e88fe817d80d7e2bf252c"
},
"downloads": -1,
"filename": "giraffe_view-0.2.3.tar.gz",
"has_sig": false,
"md5_digest": "41819ffad67f788558f9292169519ca2",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3",
"size": 23902,
"upload_time": "2024-08-08T04:53:56",
"upload_time_iso_8601": "2024-08-08T04:53:56.847359Z",
"url": "https://files.pythonhosted.org/packages/1c/6e/042ed663e613e2bb037dad1a0d7d7b2613986f4574db7e8163ae3d89bd1a/giraffe_view-0.2.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-08 04:53:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "lxd98",
"github_project": "Giraffe_View",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "giraffe-view"
}