# GPatch
## Assemble contigs into a chromosome-scale pseudo-assembly using alignments to a reference sequence.
Starting with alignments of contigs to a reference genome, produce a chromosome-scale pseudoassembly by patching gaps between mapped contigs with sequences from the reference.
## Dependencies
* Python >= v3.7
* samtools (https://github.com/samtools/samtools)
* biopython (https://biopython.org/)
* pysam (https://github.com/pysam-developers/pysam)
* minimap2 (https://github.com/lh3/minimap2)
We recommend aligning contigs to the reference with minimap2, using the -a option to generate SAM output.
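
As a minimal sketch of that alignment step, the Python snippet below builds a minimap2 command line. The helper name `minimap2_command` and the file names are placeholders invented for illustration; `-a` (SAM output) and `-o` (output file) are standard minimap2 options. Any preset or tuning flags are left out because this README does not specify them.

```python
import shutil
import subprocess
from pathlib import Path


def minimap2_command(reference_fasta, contigs_fasta, out_sam):
    """Build a minimap2 command that writes SAM alignments to out_sam.

    Hypothetical helper: it only assembles the argument list.
    """
    return [
        "minimap2",
        "-a",                   # emit SAM instead of the default PAF
        "-o", str(out_sam),     # write alignments to a file, not stdout
        str(reference_fasta),   # target: the reference genome
        str(contigs_fasta),     # query: the contigs to be placed
    ]


# Placeholder file names; substitute your own paths.
cmd = minimap2_command("reference.fa", "contigs.fa", "contigs.sam")
print(" ".join(cmd))

# Run the aligner only if it is installed and the inputs are present.
inputs_exist = Path("reference.fa").exists() and Path("contigs.fa").exists()
if shutil.which("minimap2") and inputs_exist:
    subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems when paths contain spaces.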
## Installation
We recommend installing with conda, into a new environment:
```
conda create -n GPatch -c conda-forge -c bioconda Bio pysam minimap2 samtools GPatch
```
Alternatively, install with pip:
```
pip install GPatch
```
Installation from the GitHub repository is not recommended. However, if you must, follow the steps below:
1) git clone https://github.com/adadiehl/GPatch
2) cd GPatch/
3) python3 -m pip install -e .
## Usage
```
usage: GPatch [-h] -q SAM/BAM -r FASTA [-x BED] [-b FILENAME] [-m N]
[-d N] [-f FLOAT] [-e FLOAT]
```
GPatch takes alignments of contigs to a reference genome and patches the gaps between mapped contigs with sequence from the reference. Reference chromosomes with no mapped contigs are written to the output unchanged.
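
The patching idea can be illustrated with a toy model. This is a sketch of the concept, not GPatch's actual implementation: the function name `patch_chromosome`, the tuple layout, and the choice to fill the chromosome flanks from the reference are all assumptions made for the example.

```python
def patch_chromosome(ref_seq, placements):
    """Toy model of reference-guided patching for one chromosome.

    placements: (ref_start, ref_end, name, contig_seq) tuples giving the
    0-based, half-open reference interval each contig maps to. Intervals
    must not overlap. Hypothetical layout, chosen for this sketch only.

    Returns (patched_seq, contig_rows, patch_rows), where contig_rows are
    in patched coordinates and patch_rows are in reference coordinates.
    """
    pieces = []
    contig_rows = []  # (start, end, name) in the patched sequence
    patch_rows = []   # (start, end) in the reference sequence
    cursor = 0        # next reference position not yet consumed
    length = 0        # length of the patched sequence so far

    for ref_start, ref_end, name, contig_seq in sorted(placements):
        if ref_start < cursor:
            raise ValueError(f"{name} overlaps the previous contig")
        if ref_start > cursor:
            # Fill the gap before this contig with reference sequence.
            pieces.append(ref_seq[cursor:ref_start])
            patch_rows.append((cursor, ref_start))
            length += ref_start - cursor
        pieces.append(contig_seq)
        contig_rows.append((length, length + len(contig_seq), name))
        length += len(contig_seq)
        cursor = ref_end

    if cursor < len(ref_seq):
        # Assumption: the trailing flank is also taken from the reference.
        pieces.append(ref_seq[cursor:])
        patch_rows.append((cursor, len(ref_seq)))

    return "".join(pieces), contig_rows, patch_rows


# Reference in upper case, contigs in lower case so the pieces are visible.
patched, contigs, patches = patch_chromosome(
    "AAAACCCCGGGGTTTT",
    [(2, 6, "ctg1", "acgta"), (10, 13, "ctg2", "tt")],
)
print(patched)
print(contigs)
print(patches)
```

Note that a contig need not be the same length as the reference interval it replaces, which is why contig positions are tracked in the patched coordinate frame and patch positions in the reference frame.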
#### Required Arguments
| Argument | Description |
|---|---|
| __-q SAM/BAM, --query_bam SAM/BAM__ | Path to SAM/BAM file containing non-overlapping contig mappings to the reference genome. |
| __-r FASTA, --reference_fasta FASTA__ | Path to the reference genome FASTA file. |
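
A minimal invocation needs only the two required arguments. The sketch below assembles such a command from Python; the helper name `gpatch_command` and the file names are placeholders, and the command is only run if a `GPatch` executable is found on the PATH and both inputs exist.

```python
import shutil
import subprocess
from pathlib import Path


def gpatch_command(alignments, reference_fasta):
    """Return the minimal GPatch invocation as an argument list.

    Hypothetical helper: uses only the documented required flags.
    """
    return ["GPatch", "-q", str(alignments), "-r", str(reference_fasta)]


# Placeholder file names; substitute your own paths.
gpatch_cmd = gpatch_command("contigs.sam", "reference.fa")
print(" ".join(gpatch_cmd))

inputs_exist = Path("contigs.sam").exists() and Path("reference.fa").exists()
if shutil.which("GPatch") and inputs_exist:
    subprocess.run(gpatch_cmd, check=True)
```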
#### Optional Arguments
| Argument | Description |
|---|---|
| __-h, --help__ | Show this help message and exit. |
| __-x STR, --prefix STR__ | Prefix to add to output file names. Default=None |
| __-b FILENAME, --store_final_bam FILENAME__ | Store the final set of primary contig alignments to the given file name. Default: Do not store the final BAM. |
| __-m N, --min_qual_score N__ | Minimum mapping quality score to retain an alignment. Default=30 |
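
To make the effect of a mapping quality threshold concrete, here is a hedged sketch of such a filter. It is not taken from the GPatch source. It assumes that unmapped, secondary, and supplementary records are dropped and that an alignment is kept when its MAPQ is at least the threshold. The attribute names match those of `pysam.AlignedSegment`, but the example uses plain stand-in objects so that it runs without any input file.

```python
from types import SimpleNamespace


def keep_alignment(aln, min_qual_score=30):
    """Return True for mapped, primary alignments with MAPQ >= threshold.

    Illustrative filter only; the exact rules GPatch applies may differ.
    """
    if aln.is_unmapped or aln.is_secondary or aln.is_supplementary:
        return False
    return aln.mapping_quality >= min_qual_score


def record(name, mapq, unmapped=False, secondary=False, supplementary=False):
    """Build a stand-in object with pysam-style attribute names."""
    return SimpleNamespace(
        query_name=name,
        mapping_quality=mapq,
        is_unmapped=unmapped,
        is_secondary=secondary,
        is_supplementary=supplementary,
    )


alignments = [
    record("ctg1", 60),
    record("ctg2", 12),                  # below the default threshold
    record("ctg3", 60, secondary=True),  # not a primary alignment
    record("ctg4", 30),                  # exactly at the threshold
]

kept = [a.query_name for a in alignments if keep_alignment(a)]
print(kept)
```

With real data, the same function could be applied to records read through `pysam.AlignmentFile`.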
## Output
GPatch produces three output files:
| File | Description |
|---|---|
| __patched.fasta__ | The final patched genome. |
| __contigs.bed__ | Location of contigs in the coordinate frame of the patched genome. |
| __patches.bed__ | Location of patches in the coordinate frame of the reference genome. |
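
The BED outputs can be summarized with a few lines of Python. The sketch below totals the bases covered per chromosome. It assumes only the standard BED convention that the first three columns are chromosome, 0-based start, and end; any further columns in GPatch's files are not documented here and are ignored. The function name and the example rows are made up for illustration.

```python
from collections import defaultdict


def bases_per_chromosome(bed_lines):
    """Sum interval lengths per chromosome from BED-formatted lines."""
    totals = defaultdict(int)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        totals[chrom] += int(end) - int(start)
    return dict(totals)


# Made-up rows standing in for the contents of a patches BED file.
example_rows = [
    "chr1\t0\t2",
    "chr1\t6\t10",
    "chr2\t100\t250",
]
print(bases_per_chromosome(example_rows))
```

To use it on a real file, pass an open file handle, for example `bases_per_chromosome(open(path))`, where `path` points at the BED file GPatch wrote.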
## Citing GPatch
Please use the following citation if you use this software in your work:
CITATION_HERE
Raw data
{
"_id": null,
"home_page": "https://github.com/adadiehl/GPatch",
"name": "GPatch",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "genomics, genome assembly",
"author": "Adam Diehl",
"author_email": "adadiehl@umich.edu",
"download_url": "https://files.pythonhosted.org/packages/58/4c/dd1546595fc2970eee77782d5e1c057875ed7d451d953333963f66d03a0d/GPatch-0.3.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Assemble contigs into a chromosome-scalse pseudo-assembly using alignments to a reference sequence.",
"version": "0.3.5",
"project_urls": {
"Homepage": "https://github.com/adadiehl/GPatch"
},
"split_keywords": [
"genomics",
" genome assembly"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5e59e3850f5ce476b1f6124033fad208b5ad6f37e90e70780e7847380c977d7b",
"md5": "075c9b8bda12f850ce8f6d0b99e40059",
"sha256": "e7d0e809f8809341a1372acd5d635ff431f612e501142a02ea1b9b048421efa4"
},
"downloads": -1,
"filename": "GPatch-0.3.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "075c9b8bda12f850ce8f6d0b99e40059",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 8161,
"upload_time": "2024-12-12T16:42:57",
"upload_time_iso_8601": "2024-12-12T16:42:57.639362Z",
"url": "https://files.pythonhosted.org/packages/5e/59/e3850f5ce476b1f6124033fad208b5ad6f37e90e70780e7847380c977d7b/GPatch-0.3.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "584cdd1546595fc2970eee77782d5e1c057875ed7d451d953333963f66d03a0d",
"md5": "a2ff0bdba4674f9cfafc53ed787faae9",
"sha256": "1443daef746825d0749b7870f6b9e49d874ac4d057486d934f702abd194c52c1"
},
"downloads": -1,
"filename": "GPatch-0.3.5.tar.gz",
"has_sig": false,
"md5_digest": "a2ff0bdba4674f9cfafc53ed787faae9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 10683,
"upload_time": "2024-12-12T16:43:02",
"upload_time_iso_8601": "2024-12-12T16:43:02.452960Z",
"url": "https://files.pythonhosted.org/packages/58/4c/dd1546595fc2970eee77782d5e1c057875ed7d451d953333963f66d03a0d/GPatch-0.3.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-12 16:43:02",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "adadiehl",
"github_project": "GPatch",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "Bio",
"specs": []
},
{
"name": "pysam",
"specs": []
}
],
"lcname": "gpatch"
}