graphbin


Name: graphbin
Version: 1.7.1
Summary: graphbin: Refined binning of metagenomic contigs using assembly graphs.
Upload time: 2023-07-25 02:17:56
Requires Python: >=3.7
Keywords: genomics, bioinformatics

<p align="center">
  <img src="https://raw.githubusercontent.com/metagentools/GraphBin/master/GraphBin_logo.png" width="400" title="GraphBin logo" alt="GraphBin logo">
</p>

# GraphBin: Refined Binning of Metagenomic Contigs using Assembly Graphs

[![DOI](https://img.shields.io/badge/DOI-10.1093/bioinformatics/btaa180-informational)](https://doi.org/10.1093/bioinformatics/btaa180)
[![Anaconda-Server Badge](https://anaconda.org/bioconda/graphbin/badges/version.svg)](https://anaconda.org/bioconda/graphbin)
[![Anaconda-Server Badge](https://anaconda.org/bioconda/graphbin/badges/downloads.svg)](https://anaconda.org/bioconda/graphbin)
[![PyPI version](https://badge.fury.io/py/graphbin.svg)](https://badge.fury.io/py/graphbin)
[![Downloads](https://static.pepy.tech/badge/graphbin)](https://pepy.tech/project/graphbin)

[![CI](https://github.com/metagentools/GraphBin/actions/workflows/testing_python_app.yml/badge.svg)](https://github.com/metagentools/GraphBin/actions/workflows/testing_python_app.yml)
[![codecov](https://codecov.io/gh/metagentools/GraphBin/branch/develop/graph/badge.svg?token=0S310F6QXJ)](https://codecov.io/gh/metagentools/GraphBin)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![CodeQL](https://github.com/metagentools/GraphBin/actions/workflows/codeql.yml/badge.svg)](https://github.com/metagentools/GraphBin/actions/workflows/codeql.yml)
[![Documentation Status](https://readthedocs.org/projects/graphbin/badge/?version=latest)](https://graphbin.readthedocs.io/en/latest/?badge=latest)

**GraphBin** is a bin refinement tool for metagenomic contigs assembled from NGS data. It uses the contig connectivity information in the assembly graph to bin contigs: starting from the binning result of an existing binning tool, it applies a label propagation algorithm to correct mis-binned contigs and to predict labels for contigs that the initial tool discarded because they were too short.
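
To give a rough feel for what label propagation over a graph means, here is a minimal sketch in Python. It is an illustration only, not GraphBin's actual algorithm: the `propagate_labels` helper, the toy adjacency list and the contig and bin names are all hypothetical, and the sketch only labels unbinned contigs by majority vote among already-labelled neighbours (it does not attempt to correct mis-binned contigs).

```python
from collections import Counter


def propagate_labels(adjacency, labels, max_rounds=10):
    """Give each unlabelled node the most common label among its labelled neighbours.

    adjacency: dict mapping node -> list of neighbouring nodes (undirected graph)
    labels:    dict mapping node -> bin label for the initially binned nodes
    """
    labels = dict(labels)  # work on a copy so the input is not modified
    for _ in range(max_rounds):
        updates = {}
        for node, neighbours in adjacency.items():
            if node in labels:
                continue  # keep existing labels untouched in this sketch
            votes = Counter(labels[n] for n in neighbours if n in labels)
            if votes:
                # Sort first so that ties are broken deterministically
                updates[node] = max(sorted(votes), key=lambda lab: votes[lab])
        if not updates:
            break  # nothing changed, so propagation has converged
        labels.update(updates)
    return labels


# Hypothetical toy graph: two connected components, one seed label in each
adjacency = {
    "c1": ["c2"],
    "c2": ["c1", "c3"],
    "c3": ["c2"],
    "c4": ["c5"],
    "c5": ["c4"],
}
initial_bins = {"c1": "bin_1", "c4": "bin_2"}

refined = propagate_labels(adjacency, initial_bins)
for contig in sorted(refined):
    print(contig, refined[contig])
```

In this toy case the label of `c1` reaches `c3` through `c2` over two rounds, which mirrors how connectivity in the assembly graph can carry a bin label to short contigs that were left unbinned.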

**For detailed instructions on installation, usage and visualisation, please refer to the [documentation hosted at Read the Docs](https://graphbin.readthedocs.io/).**

## Dependencies

GraphBin requires Python 3 (version 3.7 or later). The following dependencies are required to run GraphBin and its related support scripts.
* [python-igraph](https://igraph.org/python/)
* [cogent3](https://cogent3.org/)
* [cairocffi](https://pypi.org/project/cairocffi/)
* [click](https://click.palletsprojects.com/)
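
If you are unsure whether these packages are available in your environment, a quick check along the following lines may help. This is a minimal sketch rather than part of GraphBin: `check_dependencies` is a hypothetical helper, and it only tests whether each module can be found, not whether the installed version is compatible.

```python
import importlib.util

# Import names of the dependencies listed above
# (the python-igraph package is imported as "igraph").
DEPENDENCY_MODULES = ["igraph", "cogent3", "cairocffi", "click"]


def check_dependencies(modules=DEPENDENCY_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


status = check_dependencies()
for name, available in status.items():
    print(f"{name}: {'found' if available else 'MISSING'}")
```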

## Installing GraphBin

### Using Conda

You can install GraphBin from the [bioconda](https://anaconda.org/bioconda/graphbin) distribution. This requires `conda`, which is included in both
[Anaconda](https://www.anaconda.com/distribution/) and [Miniconda](https://docs.conda.io/en/latest/miniconda.html).

```
# add channels
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

# create conda environment
conda create -n graphbin

# activate conda environment
conda activate graphbin

# install graphbin
conda install -c bioconda graphbin

# check graphbin installation
graphbin -h
```

### Using pip

You can install GraphBin using `pip` from the [PyPI](https://pypi.org/project/graphbin/) distribution.

```
pip install graphbin
```

For ***development*** purposes, please clone the repository and install via [flit](https://pypi.org/project/flit/).

```
# clone repository to your local machine
git clone https://github.com/metagentools/GraphBin.git

# go to repo directory
cd GraphBin

# install flit
pip install flit

# install graphbin via flit
flit install -s --python `which python`
```

## Example Usage

```
# SPAdes version
graphbin --assembler spades --graph /path/to/graph_file.gfa --contigs /path/to/contigs.fasta --paths /path/to/paths_file.paths --binned /path/to/binning_result.csv --output /path/to/output_folder

# SGA version
graphbin --assembler sga --graph /path/to/graph_file.asqg --contigs /path/to/contigs.fa --binned /path/to/binning_result.csv --output /path/to/output_folder

# MEGAHIT version
graphbin --assembler megahit --graph /path/to/graph_file.gfa --contigs /path/to/contigs.fa --binned /path/to/binning_result.csv --output /path/to/output_folder
```
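
When GraphBin is called from a Python pipeline, the commands above can be assembled programmatically. The following is a minimal sketch that uses only the options shown in the examples; `build_graphbin_command` is a hypothetical helper, and treating `--paths` as optional simply mirrors the fact that only the SPAdes example passes it.

```python
def build_graphbin_command(assembler, graph, contigs, binned, output, paths=None):
    """Build the graphbin argument list from the options used in the examples."""
    command = [
        "graphbin",
        "--assembler", assembler,
        "--graph", graph,
        "--contigs", contigs,
    ]
    if paths is not None:
        # Only the SPAdes example above supplies a paths file
        command += ["--paths", paths]
    command += ["--binned", binned, "--output", output]
    return command


spades_command = build_graphbin_command(
    assembler="spades",
    graph="/path/to/graph_file.gfa",
    contigs="/path/to/contigs.fasta",
    paths="/path/to/paths_file.paths",
    binned="/path/to/binning_result.csv",
    output="/path/to/output_folder",
)
megahit_command = build_graphbin_command(
    assembler="megahit",
    graph="/path/to/graph_file.gfa",
    contigs="/path/to/contigs.fa",
    binned="/path/to/binning_result.csv",
    output="/path/to/output_folder",
)
print(" ".join(spades_command))
print(" ".join(megahit_command))

# To run a command (requires graphbin to be installed and real input files):
# import subprocess
# subprocess.run(spades_command, check=True)
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems when paths contain spaces.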

## Visualization of the Assembly Graph of ESC+metaSPAdes Test Dataset

### Initial Assembly Graph
<p align="center">
  <img src="https://raw.githubusercontent.com/metagentools/GraphBin/master/images/3G_SPAdes_graph_plot.png" width="400" title="Initial assembly graph" alt="Initial assembly graph">
</p>

### TAXAassign Labelling
<p align="center">
  <img src="https://raw.githubusercontent.com/metagentools/GraphBin/master/images/3G_SPAdes_taxaassign_graph_plot.png" width="400" title="TAXAassign Labelling" alt="TAXAassign Labelling">
</p>

### Original MaxBin Labelling with 2 Mis-binned Contigs
<p align="center">
  <img src="https://raw.githubusercontent.com/metagentools/GraphBin/master/images/3G_SPAdes_maxbin_graph_plot_edit.png" width="400" title="MaxBin Labelling" alt="MaxBin Labelling">
</p>

### Refined Labels
<p align="center">
  <img src="https://raw.githubusercontent.com/metagentools/GraphBin/master/images/3G_SPAdes_maxbin_graph_plot_correct.png" width="400" title="Refined Labels" alt="Refined Labels">
</p>

### Final Labelling of GraphBin
<p align="center">
  <img src="https://raw.githubusercontent.com/metagentools/GraphBin/master/images/3G_SPAdes_after_label_prop_graph_plot.png" width="400" title="Final Labelling" alt="Final Labelling">
</p>


## Citation
If you use GraphBin in your work, please cite it as:

Vijini Mallawaarachchi, Anuradha Wickramarachchi, Yu Lin. GraphBin: Refined binning of metagenomic contigs using assembly graphs. Bioinformatics, Volume 36, Issue 11, June 2020, Pages 3307–3313, DOI: [10.1093/bioinformatics/btaa180](http://dx.doi.org/10.1093/bioinformatics/btaa180)

```bibtex
@article{10.1093/bioinformatics/btaa180,
    author = {Mallawaarachchi, Vijini and Wickramarachchi, Anuradha and Lin, Yu},
    title = "{GraphBin: refined binning of metagenomic contigs using assembly graphs}",
    journal = {Bioinformatics},
    volume = {36},
    number = {11},
    pages = {3307-3313},
    year = {2020},
    month = {03},
    abstract = "{The field of metagenomics has provided valuable insights into the structure, diversity and ecology within microbial communities. One key step in metagenomics analysis is to assemble reads into longer contigs which are then binned into groups of contigs that belong to different species present in the metagenomic sample. Binning of contigs plays an important role in metagenomics and most available binning algorithms bin contigs using genomic features such as oligonucleotide/k-mer composition and contig coverage. As metagenomic contigs are derived from the assembly process, they are output from the underlying assembly graph which contains valuable connectivity information between contigs that can be used for binning. We propose GraphBin, a new binning method that makes use of the assembly graph and applies a label propagation algorithm to refine the binning result of existing tools. We show that GraphBin can make use of the assembly graphs constructed from both the de Bruijn graph and the overlap-layout-consensus approach. Moreover, we demonstrate improved experimental results from GraphBin in terms of identifying mis-binned contigs and binning of contigs discarded by existing binning tools. To the best of our knowledge, this is the first time that the information from the assembly graph has been used in a tool for the binning of metagenomic contigs. The source code of GraphBin is available at https://github.com/Vini2/GraphBin. Contact: vijini.mallawaarachchi@anu.edu.au or yu.lin@anu.edu.au. Supplementary data are available at Bioinformatics online.}",
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btaa180},
    url = {https://doi.org/10.1093/bioinformatics/btaa180},
    eprint = {https://academic.oup.com/bioinformatics/article-pdf/36/11/3307/33329097/btaa180.pdf},
}
```

## Funding

GraphBin is funded by an [Essential Open Source Software for Science Grant](https://chanzuckerberg.com/eoss/proposals/cogent3-python-apis-for-iq-tree-and-graphbin-via-a-plug-in-architecture/) from the Chan Zuckerberg Initiative.

<p align="left">
  <img src="https://chanzuckerberg.com/wp-content/themes/czi/img/logo.svg" width="300">
</p>

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "graphbin",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "genomics,bioinformatics",
    "author": null,
    "author_email": "Vijini Mallawaarachchi <Vijini.Mallawaarachchi@anu.edu.au>, Anuradha Wickramarachchi <Anuradha.Wickramarachchi@anu.edu.au>, Yu Lin <yu.lin@anu.edu.au>",
    "download_url": "https://files.pythonhosted.org/packages/af/66/5178762875c6ec06f15e7e166ef7d5ed82b4c8c5f872e18cf1ec49ca5935/graphbin-1.7.1.tar.gz",
    "platform": null,
    "description": "<p align=\"center\">\n  <img src=\"https://raw.githubusercontent.com/metagentools/GraphBin/master/GraphBin_logo.png\" width=\"400\" title=\"GraphBin logo\" alt=\"GraphBin logo\">\n</p>\n\n# GraphBin: Refined Binning of Metagenomic Contigs using Assembly Graphs\n\n[![DOI](https://img.shields.io/badge/DOI-10.1093/bioinformatics/btaa180-informational)](https://doi.org/10.1093/bioinformatics/btaa180)\n[![Anaconda-Server Badge](https://anaconda.org/bioconda/graphbin/badges/version.svg)](https://anaconda.org/bioconda/graphbin)\n[![Anaconda-Server Badge](https://anaconda.org/bioconda/graphbin/badges/downloads.svg)](https://anaconda.org/bioconda/graphbin)\n[![PyPI version](https://badge.fury.io/py/graphbin.svg)](https://badge.fury.io/py/graphbin)\n[![Downloads](https://static.pepy.tech/badge/graphbin)](https://pepy.tech/project/graphbin)\n\n[![CI](https://github.com/metagentools/GraphBin/actions/workflows/testing_python_app.yml/badge.svg)](https://github.com/metagentools/GraphBin/actions/workflows/testing_python_app.yml)\n[![codecov](https://codecov.io/gh/metagentools/GraphBin/branch/develop/graph/badge.svg?token=0S310F6QXJ)](https://codecov.io/gh/metagentools/GraphBin)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n[![CodeQL](https://github.com/metagentools/GraphBin/actions/workflows/codeql.yml/badge.svg)](https://github.com/metagentools/GraphBin/actions/workflows/codeql.yml)\n[![Documentation Status](https://readthedocs.org/projects/graphbin/badge/?version=latest)](https://graphbin.readthedocs.io/en/latest/?badge=latest)\n\n**GraphBin** is an NGS data-based metagenomic contig bin refinement tool that makes use of the contig connectivity information from the assembly graph to bin contigs. 
It utilizes the binning result of an existing binning tool and a label propagation algorithm to correct mis-binned contigs and predict the labels of contigs which are discarded due to short length.\n\n**For detailed instructions on installation, usage and visualisation, please refer to the [documentation hosted at Read the Docs](https://graphbin.readthedocs.io/).**\n\n## Dependencies\n\nGraphBin installation requires python 3 to run. The following dependencies are required to run GraphBin and related support scripts.\n* [python-igraph](https://igraph.org/python/)\n* [cogent3](https://cogent3.org/)\n* [cairocffi](https://pypi.org/project/cairocffi/)\n* [click](https://click.palletsprojects.com/)\n\n## Installing GraphBin\n\n### Using Conda\n\nYou can install GraphBin using the [bioconda](https://anaconda.org/bioconda/graphbin) distribution. You can download \n[Anaconda](https://www.anaconda.com/distribution/) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html) which contains `conda`.\n\n```\n# add channels\nconda config --add channels defaults\nconda config --add channels bioconda\nconda config --add channels conda-forge\n\n# create conda environment\nconda create -n graphbin\n\n# activate conda environment\nconda activate graphbin\n\n# install graphbin\nconda install -c bioconda graphbin\n\n# check graphbin installation\ngraphbin -h\n```\n\n### Using pip\n\nYou can install GraphBin using `pip` from the [PyPI](https://pypi.org/project/graphbin/) distribution.\n\n```\npip install graphbin\n```\n\nFor ***development*** purposes, please clone the repository and install via [flit](https://pypi.org/project/flit/).\n\n```\n# clone repository to your local machine\ngit clone https://github.com/metagentools/GraphBin.git\n\n# go to repo directory\ncd GraphBin\n\n# install flit\npip install flit\n\n# install graphbin via flit\nflit install -s --python `which python`\n```\n\n## Example Usage\n\n```\n# SPAdes version\ngraphbin --assembler spades --graph 
/path/to/graph_file.gfa --contigs /path/to/contigs.fasta --paths /path/to/paths_file.paths --binned /path/to/binning_result.csv --output /path/to/output_folder\n\n# SGA version\ngraphbin --assembler sga --graph /path/to/graph_file.asqg --contigs /path/to/contigs.fa --binned /path/to/binning_result.csv --output /path/to/output_folder\n\n# MEGAHIT version\ngraphbin --assembler megahit --graph /path/to/graph_file.gfa --contigs /path/to/contigs.fa --binned /path/to/binning_result.csv --output /path/to/output_folder\n```\n\n## Visualization of the Assembly Graph of ESC+metaSPAdes Test Dataset\n\n### Initial Assembly Graph\n<p align=\"center\">\n  <img src=\"https://raw.githubusercontent.com/metagentools/GraphBin/master/images/3G_SPAdes_graph_plot.png\" width=\"400\" title=\"Initial assembly graph\" alt=\"Initial assembly graph\">\n</p>\n\n### TAXAassign Labelling\n<p align=\"center\">\n  <img src=\"https://raw.githubusercontent.com/metagentools/GraphBin/master/images/3G_SPAdes_taxaassign_graph_plot.png\" width=\"400\" title=\"TAXAassign Labelling\" alt=\"TAXAassign Labelling\">\n</p>\n\n### Original MaxBin Labelling with 2 Mis-binned Contigs\n<p align=\"center\">\n  <img src=\"https://raw.githubusercontent.com/metagentools/GraphBin/master/images/3G_SPAdes_maxbin_graph_plot_edit.png\" width=\"400\" title=\"MaxBin Labelling\" alt=\"MaxBin Labelling\">\n</p>\n\n### Refined Labels\n<p align=\"center\">\n  <img src=\"https://raw.githubusercontent.com/metagentools/GraphBin/master/images/3G_SPAdes_maxbin_graph_plot_correct.png\" width=\"400\" title=\"Refined Labels\" alt=\"Refined Labels\">\n</p>\n\n### Final Labelling of GraphBin\n<p align=\"center\">\n  <img src=\"https://raw.githubusercontent.com/metagentools/GraphBin/master/images/3G_SPAdes_after_label_prop_graph_plot.png\" width=\"400\" title=\"Final Labelling\" alt=\"Final Labelling\">\n</p>\n\n\n## Citation\nIf you use GraphBin in your work, please cite GraphBin as,\n\nVijini Mallawaarachchi, Anuradha Wickramarachchi, 
Yu Lin. GraphBin: Refined binning of metagenomic contigs using assembly graphs. Bioinformatics, Volume 36, Issue 11, June 2020, Pages 3307\u20133313, DOI: [10.1093/bioinformatics/btaa180](http://dx.doi.org/10.1093/bioinformatics/btaa180)\n\n```bibtex\n@article{10.1093/bioinformatics/btaa180,\n    author = {Mallawaarachchi, Vijini and Wickramarachchi, Anuradha and Lin, Yu},\n    title = \"{GraphBin: refined binning of metagenomic contigs using assembly graphs}\",\n    journal = {Bioinformatics},\n    volume = {36},\n    number = {11},\n    pages = {3307-3313},\n    year = {2020},\n    month = {03},\n    abstract = \"{The field of metagenomics has provided valuable insights into the structure, diversity and ecology within microbial communities. One key step in metagenomics analysis is to assemble reads into longer contigs which are then binned into groups of contigs that belong to different species present in the metagenomic sample. Binning of contigs plays an important role in metagenomics and most available binning algorithms bin contigs using genomic features such as oligonucleotide/k-mer composition and contig coverage. As metagenomic contigs are derived from the assembly process, they are output from the underlying assembly graph which contains valuable connectivity information between contigs that can be used for binning. We propose GraphBin, a new binning method that makes use of the assembly graph and applies a label propagation algorithm to refine the binning result of existing tools. We show that GraphBin can make use of the assembly graphs constructed from both the de Bruijn graph and the overlap-layout-consensus approach. Moreover, we demonstrate improved experimental results from GraphBin in terms of identifying mis-binned contigs and binning of contigs discarded by existing binning tools. To the best of our knowledge, this is the first time that the information from the assembly graph has been used in a tool for the binning of metagenomic contigs. 
The source code of GraphBin is available at https://github.com/Vini2/GraphBin.vijini.mallawaarachchi@anu.edu.au or yu.lin@anu.edu.auSupplementary data are available at Bioinformatics online.}\",\n    issn = {1367-4803},\n    doi = {10.1093/bioinformatics/btaa180},\n    url = {https://doi.org/10.1093/bioinformatics/btaa180},\n    eprint = {https://academic.oup.com/bioinformatics/article-pdf/36/11/3307/33329097/btaa180.pdf},\n}\n```\n\n## Funding\n\nGraphBin is funded by an [Essential Open Source Software for Science Grant](https://chanzuckerberg.com/eoss/proposals/cogent3-python-apis-for-iq-tree-and-graphbin-via-a-plug-in-architecture/) from the Chan Zuckerberg Initiative.\n\n<p align=\"left\">\n  <img src=\"https://chanzuckerberg.com/wp-content/themes/czi/img/logo.svg\" width=\"300\">\n</p>\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "graphbin: Refined binning of metagenomic contigs using assembly graphs.",
    "version": "1.7.1",
    "project_urls": {
        "Bug Tracker": "https://github.com/metagentools/GraphBin/issues",
        "Documentation": "https://graphbin.readthedocs.io/en/latest/",
        "Source Code": "https://github.com/metagentools/GraphBin/"
    },
    "split_keywords": [
        "genomics",
        "bioinformatics"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "147280e659b6cc7c5827f6b0d674465f1f683d2d3bddb1d1cf3d76bfd5ee0883",
                "md5": "5f7e2424eb70ea1ce61f30ad5ccef255",
                "sha256": "c75d53f55518229a67e66b260232fe3040d6ab9f2ee53d199ab553058592b8b4"
            },
            "downloads": -1,
            "filename": "graphbin-1.7.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5f7e2424eb70ea1ce61f30ad5ccef255",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 55071,
            "upload_time": "2023-07-25T02:17:48",
            "upload_time_iso_8601": "2023-07-25T02:17:48.649274Z",
            "url": "https://files.pythonhosted.org/packages/14/72/80e659b6cc7c5827f6b0d674465f1f683d2d3bddb1d1cf3d76bfd5ee0883/graphbin-1.7.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "af665178762875c6ec06f15e7e166ef7d5ed82b4c8c5f872e18cf1ec49ca5935",
                "md5": "88e796315d7f9393b4f6be002f11e4e8",
                "sha256": "ed4b986f7215183b08ae7acd1e81d8d2f653facc5a902f6f5af10d2c91b7165a"
            },
            "downloads": -1,
            "filename": "graphbin-1.7.1.tar.gz",
            "has_sig": false,
            "md5_digest": "88e796315d7f9393b4f6be002f11e4e8",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 54537569,
            "upload_time": "2023-07-25T02:17:56",
            "upload_time_iso_8601": "2023-07-25T02:17:56.578296Z",
            "url": "https://files.pythonhosted.org/packages/af/66/5178762875c6ec06f15e7e166ef7d5ed82b4c8c5f872e18cf1ec49ca5935/graphbin-1.7.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-25 02:17:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "metagentools",
    "github_project": "GraphBin",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "graphbin"
}
        