# spacegraphcats
![Test](https://github.com/spacegraphcats/spacegraphcats/workflows/Test/badge.svg) [![codecov](https://codecov.io/gh/spacegraphcats/spacegraphcats/branch/latest/graph/badge.svg)](https://codecov.io/gh/spacegraphcats/spacegraphcats) [![DOI](https://zenodo.org/badge/58208221.svg)](https://zenodo.org/badge/latestdoi/58208221) <a href="https://pypi.org/project/spacegraphcats/"><img alt="PyPI" src="https://badge.fury.io/py/spacegraphcats.svg"></a>
Explore large, annoying graphs using hierarchies of dominating sets - because
in space, no one can hear you miao!
This is a collaboration between the
[Theory In Practice](https://github.com/TheoryInPractice/) lab at the University of Utah, the
[Lab for Data Intensive Biology](https://github.com/dib-lab/) at UC Davis, and
[Dr. Felix Reidl](https://www.dcs.bbk.ac.uk/about/people/academic-staff/felix/) at Birkbeck, University of London.
Initial development of spacegraphcats was generously supported by the Moore Foundation's
[Data Driven Discovery Initiative](https://www.moore.org/initiative-strategy-detail?initiativeId=data-driven-discovery).
![spacegraphcats graph](https://github.com/spacegraphcats/spacegraphcats/raw/latest/pics/logo.png)
## Documentation
This README file contains quickstart information.
For use cases and other information, please see the spacegraphcats documentation at https://spacegraphcats.github.io/spacegraphcats.
## Installation and execution quickstart
See [installation instructions](https://github.com/spacegraphcats/spacegraphcats/blob/latest/doc/00-installing-spacegraphcats.md) and [the run guide](https://github.com/spacegraphcats/spacegraphcats/blob/latest/doc/01-running-spacegraphcats.md).
For help or support with this software, please
[file an issue on GitHub](https://github.com/spacegraphcats/spacegraphcats/issues). Thank
you!
### Quickstart
There are two quickstart examples available! Please see
[dory-example](https://github.com/spacegraphcats/spacegraphcats-dory-example)
and
[twofoo-example](https://github.com/spacegraphcats/spacegraphcats-twofoo-example). The
latter example includes
[a snakemake Snakefile](https://snakemake.readthedocs.io/en/stable/).
### Notable dependencies
spacegraphcats uses code from
[BBHash](https://github.com/rizkg/BBHash), a C++ library for building
minimal perfect hash functions (Guillaume Rizk, Antoine Limasset,
Rayan Chikhi; see
[Limasset et al., 2017, arXiv](https://arxiv.org/abs/1702.03154)), as
wrapped by [pybbhash](https://github.com/dib-lab/pybbhash).
spacegraphcats also uses functionality from
[khmer](https://github.com/dib-lab/khmer/) and
[sourmash](https://github.com/dib-lab/sourmash).
## Citation information
See the Genome Biology publication [Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02066-4), Brown et al., 2020, doi: https://doi.org/10.1186/s13059-020-02066-4.
## Pointers to interesting code
### Interesting algorithms
The `rdomset` code for efficiently calculating a dominating set of a graph
at a given radius R is in [spacegraphcats/catlas/rdomset.py](https://github.com/spacegraphcats/spacegraphcats/blob/latest/spacegraphcats/catlas/rdomset.py).
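
To make the idea of a radius-R dominating set concrete, here is a minimal sketch: a set of nodes is R-dominating when every node of the graph lies within R hops of some member of the set. The code below is a naive greedy construction written for illustration only; it is not the algorithm implemented in `rdomset.py`, and the names `ball` and `greedy_rdomset` are hypothetical.

```python
from collections import deque


def ball(adj, source, radius):
    """Return every node within `radius` hops of `source`, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand past the requested radius
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return set(dist)


def greedy_rdomset(adj, radius):
    """Pick nodes in sorted order until every node is dominated.

    Naive illustration only (assumption: NOT the spacegraphcats algorithm).
    """
    dominated = set()
    domset = []
    for node in sorted(adj):
        if node not in dominated:
            domset.append(node)
            dominated |= ball(adj, node, radius)
    return domset


# A path graph 0-1-2-3-4-5-6 as an adjacency dict.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}

print(greedy_rdomset(path, 1))
print(greedy_rdomset(path, 2))
```

A larger radius yields a smaller dominating set, which is what makes a hierarchy of such sets useful for summarizing a large graph at several resolutions.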
The graph denoising code for removing low-abundance pendants from
BCALM cDBGs is in function `contract_degree_two` in
[cdbg/bcalm_to_gxt.py](https://github.com/spacegraphcats/spacegraphcats/blob/latest/spacegraphcats/cdbg/bcalm_to_gxt.py).
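
As a rough illustration of pendant removal, the sketch below drops nodes that have exactly one neighbor and an abundance below a threshold. It is a single-pass simplification under assumed data structures (an adjacency dict and an abundance dict); the function name and parameters are hypothetical and do not reproduce `contract_degree_two`.

```python
def remove_low_abundance_pendants(adj, abundance, min_abundance):
    """Return a copy of the graph without low-abundance degree-1 nodes.

    adj:           dict mapping node -> set of neighboring nodes
    abundance:     dict mapping node -> abundance value
    min_abundance: pendants strictly below this value are removed
    """
    pendants = {
        node
        for node, nbrs in adj.items()
        if len(nbrs) == 1 and abundance[node] < min_abundance
    }
    # Rebuild the adjacency dict, skipping pendants and edges to them.
    return {
        node: {nbr for nbr in nbrs if nbr not in pendants}
        for node, nbrs in adj.items()
        if node not in pendants
    }


# Chain 1-2-3 with a low-abundance tip (node 4) hanging off node 2.
adj = {1: {2}, 2: {1, 3, 4}, 3: {2}, 4: {2}}
abundance = {1: 10, 2: 12, 3: 11, 4: 1}

cleaned = remove_low_abundance_pendants(adj, abundance, min_abundance=2)
print(cleaned)
```

Nodes 1 and 3 are also pendants in this toy graph, but they are kept because their abundance is above the threshold.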
Part of the `indexPieces` code for indexing cDBG nodes by dominating
nodes is in
[cdbg/index_cdbg_by_kmer.py](https://github.com/spacegraphcats/spacegraphcats/blob/latest/spacegraphcats/cdbg/index_cdbg_by_kmer.py). The
remainder is implemented in `search`, below.
The `search` code for extracting query neighborhoods is in
[search/query_by_sequence.py](https://github.com/spacegraphcats/spacegraphcats/blob/latest/spacegraphcats/search/query_by_sequence.py);
see especially the call to `kmer_idx.count_cdbg_matches(...)`.
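
The core idea behind matching a query against cDBG nodes can be sketched as a k-mer lookup: index each unitig's k-mers by node ID, then count how many query k-mers land on each node. This is a simplified illustration with hypothetical function names; it ignores reverse complements and hashing, and it is not the implementation of `count_cdbg_matches`.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_kmer_index(unitigs, k):
    """Map each k-mer to the ID of the unitig (cDBG node) containing it."""
    index = {}
    for node_id, seq in unitigs.items():
        for kmer in kmers(seq, k):
            index[kmer] = node_id
    return index


def count_node_matches(index, query, k):
    """Count, per node, how many k-mers of `query` are found in the index."""
    return Counter(index[km] for km in kmers(query, k) if km in index)


# Toy unitigs keyed by node ID (made-up sequences).
unitigs = {0: "ACGTAC", 1: "GGCTTA"}
index = build_kmer_index(unitigs, k=4)

matches = count_node_matches(index, "CGTACCGGCT", k=4)
print(matches)
```

Nodes with at least one matching k-mer would then serve as the starting points from which a neighborhood is collected.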
### Interesting library functionality
Code for indexing large FASTQ/FASTA read files by cDBG unitig, and
extracting the reads corresponding to individual unitigs from BGZF
files, is available in
[cdbg/index_reads.py](https://github.com/spacegraphcats/spacegraphcats/blob/latest/spacegraphcats/cdbg/index_reads.py)
and
[search/search_utils.py](https://github.com/spacegraphcats/spacegraphcats/blob/latest/spacegraphcats/search/search_utils.py),
`get_reads_by_cdbg`, respectively.
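
The general pattern of indexing reads by file offset and then fetching them on demand can be sketched as follows. This sketch uses an uncompressed in-memory FASTA file and plain byte offsets as a stand-in for BGZF, and the unitig-to-read mapping is made up for the example; none of these names come from spacegraphcats.

```python
import io


def index_fasta_offsets(handle):
    """Map each record name to the byte offset where its header starts."""
    offsets = {}
    while True:
        pos = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            name = line[1:].split()[0].decode()
            offsets[name] = pos
    return offsets


def read_record_at(handle, offset):
    """Seek to `offset` and return (name, sequence) for the record there."""
    handle.seek(offset)
    name = handle.readline()[1:].split()[0].decode()
    seq = []
    for line in handle:
        if line.startswith(b">"):
            break  # reached the next record
        seq.append(line.strip().decode())
    return name, "".join(seq)


# Toy read file held in memory (assumption: uncompressed FASTA, not BGZF).
reads = io.BytesIO(b">read1\nACGT\n>read2\nGGCC\nTTAA\n>read3\nTTTT\n")
offsets = index_fasta_offsets(reads)

# Hypothetical mapping from unitig ID to the names of reads assigned to it.
reads_by_unitig = {0: ["read1", "read3"], 1: ["read2"]}

for name in reads_by_unitig[1]:
    print(read_record_at(reads, offsets[name]))
```

Storing offsets rather than sequences keeps the index small, since only the reads for the unitigs of interest are ever read back from disk.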
Raw data
{
"_id": null,
"home_page": "https://github.com/spacegraphcats/spacegraphcats",
"name": "spacegraphcats",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "C. Titus Brown, Dominik Moritz, Michael P. O'Brien, Felix Reidl, Taylor Reiter, Yosuke Mizutani, and Blair D. Sullivan",
"author_email": "titus@idyll.org,blair.d.sullivan@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/59/9b/08f3a7776f4e4e7dacace546578a10d36c4475b3cddc6796780fd38a4576/spacegraphcats-2.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "BSD 3-clause",
"summary": "tools for biological assembly graph neighborhood analysis",
"version": "2.1.2",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "1e7bb607d1137b8a5cd698d91a890cb9",
"sha256": "de5ecde0c39ea9e3e7316bc7c97a7795ab59b32436c018440843e097e6269359"
},
"downloads": -1,
"filename": "spacegraphcats-2.1.2.tar.gz",
"has_sig": false,
"md5_digest": "1e7bb607d1137b8a5cd698d91a890cb9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 10960454,
"upload_time": "2022-12-04T14:14:56",
"upload_time_iso_8601": "2022-12-04T14:14:56.608954Z",
"url": "https://files.pythonhosted.org/packages/59/9b/08f3a7776f4e4e7dacace546578a10d36c4475b3cddc6796780fd38a4576/spacegraphcats-2.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-12-04 14:14:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "spacegraphcats",
"github_project": "spacegraphcats",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"requirements": [
{
"name": "screed",
"specs": [
[
">=",
"1.1"
],
[
"<",
"2"
]
]
},
{
"name": "pytest",
"specs": [
[
">=",
"5.1.2"
]
]
},
{
"name": "pytest-dependency",
"specs": []
},
{
"name": "numpy",
"specs": []
},
{
"name": "pandas",
"specs": []
},
{
"name": "snakemake",
"specs": [
[
"==",
"7.18.2"
]
]
},
{
"name": "click",
"specs": [
[
">=",
"8.1.2"
],
[
"<",
"9"
]
]
},
{
"name": "sortedcontainers",
"specs": []
},
{
"name": "bbhash",
"specs": [
[
">=",
"0.5"
]
]
},
{
"name": "khmer",
"specs": []
},
{
"name": "sourmash",
"specs": [
[
"<",
"5"
],
[
">=",
"4.6.1"
]
]
},
{
"name": "flake8",
"specs": []
},
{
"name": "black",
"specs": []
},
{
"name": "pre-commit",
"specs": []
}
],
"lcname": "spacegraphcats"
}