# SvPhaser
> **Haplotype‑aware structural‑variant genotyper for long‑read data**
---
`SvPhaser` phases **pre‑called structural variants (SVs)** using *HP‑tagged* long‑read alignments (PacBio HiFi, ONT Q20+, …). Think of it as *WhatsHap* for insertions/deletions/duplications: we do **not** discover SVs; we assign each variant a haplotype genotype (`0|1`, `1|0`, `1|1`, or `./.`) together with a **Genotype Quality (GQ)** score – all in a single, embarrassingly‑parallel pass over the genome.
## Key highlights
* **Fast, per‑chromosome multiprocessing** – linear scale‑out on 32‑core workstations.
* **Deterministic Δ‑based decision tree** – no MCMC or hidden state machines.
* **Friendly CLI** (`svphaser phase …`) and importable Python API.
* **Seamless VCF injection** – adds `HP_GT`, `HP_GQ`, `HP_GQBIN` INFO tags while copying the original header verbatim.
* **Configurable confidence bins** and publication‑ready plots (see `result_images/`).
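
To make the confidence bins concrete, here is a minimal sketch of how a spec string such as `"30:High,10:Moderate"` (the format passed to `--gq-bins`) might be parsed and applied. The function names `parse_gq_bins` and `label_gq`, the "GQ at or above the threshold" rule, and the `"Low"` fallback label are assumptions made for illustration, not SvPhaser's actual implementation.

```python
def parse_gq_bins(spec: str) -> list[tuple[int, str]]:
    """Parse 'threshold:label' pairs into a list sorted by descending threshold."""
    bins = []
    for item in spec.split(","):
        threshold, label = item.split(":", 1)
        bins.append((int(threshold), label.strip()))
    return sorted(bins, reverse=True)


def label_gq(gq: int, bins: list[tuple[int, str]], default: str = "Low") -> str:
    """Return the label of the first bin whose threshold the GQ reaches.

    The fallback label "Low" is an assumption, not a documented default.
    """
    for threshold, label in bins:
        if gq >= threshold:
            return label
    return default


bins = parse_gq_bins("30:High,10:Moderate")
for gq in (45, 12, 3):
    print(gq, label_gq(gq, bins))
```

Sorting by descending threshold means the order of the pairs in the spec string does not matter.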
---
## Installation
```bash
# Requires Python ≥3.9
pip install svphaser # PyPI (coming soon)
# or
pip install git+https://github.com/your-org/SvPhaser.git@v0.2.0
```
`cyvcf2`, `pysam`, `typer[all]`, and `pandas` are pulled in automatically.
## Quick‑start
```bash
svphaser phase \
sample_unphased.vcf.gz \
sample.sorted_phased.bam \
--out-dir results/ \
--min-support 10 \
--major-delta 0.70 \
--equal-delta 0.25 \
--gq-bins "30:High,10:Moderate" \
--threads 32
```
Outputs (written inside **`results/`**):
```
sample_unphased_phased.vcf # original VCF + HP_* INFO fields
sample_unphased_phased.csv # tidy table for plotting / downstream R
```
See [`docs/Methodology.md`](docs/Methodology.md) and the flow-chart for algorithmic details.
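
For intuition only, a Δ-based decision tree driven by the three thresholds from the quick-start could look like the sketch below. Everything here is an assumption: the definition of Δ as the normalised difference between HP1 and HP2 supporting-read counts, applying `min_support` to the combined count, the mapping of HP1-dominant support to `1|0`, and returning `./.` for intermediate Δ values. The methodology document is the authoritative description.

```python
def call_genotype(
    n_hp1: int,
    n_hp2: int,
    *,
    min_support: int = 10,
    major_delta: float = 0.70,
    equal_delta: float = 0.25,
) -> str:
    """Hypothetical genotype call from per-haplotype supporting-read counts."""
    total = n_hp1 + n_hp2
    if total < min_support:
        return "./."  # too few supporting reads to decide
    delta = abs(n_hp1 - n_hp2) / total  # 0 = balanced, 1 = one haplotype only
    if delta >= major_delta:
        return "1|0" if n_hp1 > n_hp2 else "0|1"  # one haplotype dominates
    if delta <= equal_delta:
        return "1|1"  # both haplotypes carry the variant about equally
    return "./."  # ambiguous middle ground


print(call_genotype(18, 1))  # 1|0
print(call_genotype(6, 5))   # 1|1
print(call_genotype(2, 3))   # ./.
```

Because every branch is a fixed threshold comparison, the same counts always yield the same call.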

## Folder layout
```
SvPhaser/
├─ src/svphaser/ # importable package
│ ├─ cli.py # Typer entry‑point
│ ├─ logging.py # unified log setup
│ └─ phasing/
│ ├─ algorithms.py # core maths
│ ├─ io.py # driver & I/O
│ ├─ _workers.py # per‑chrom processes
│ └─ types.py # thin dataclasses
├─ tests/ # pytest suite + mini data
├─ docs/ # extra documentation
├─ result_images/ # generated plots & diagrams
└─ CHANGELOG.md
```
## Python usage
```python
from pathlib import Path
from svphaser.phasing.io import phase_vcf
phase_vcf(
Path("sample.vcf.gz"),
Path("sample.bam"),
out_dir=Path("results"),
min_support=10,
major_delta=0.70,
equal_delta=0.25,
gq_bins="30:High,10:Moderate",
threads=8,
)
```
The CSV written to `out_dir` can be loaded into a pandas `DataFrame` for custom analytics.
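
A small sketch of such downstream analytics, using only the standard library so it runs without extra dependencies. The column names (`chrom`, `pos`, `HP_GT`, `HP_GQ`, `HP_GQBIN`) and the sample rows are assumptions made for this example; check the header of your own CSV before relying on them.

```python
import csv
import io
from collections import Counter


def count_by_column(handle, column: str) -> Counter:
    """Count how many rows carry each distinct value in the given CSV column."""
    return Counter(row[column] for row in csv.DictReader(handle))


# In-memory stand-in for the phased CSV; the columns and values are hypothetical.
sample = io.StringIO(
    "chrom,pos,HP_GT,HP_GQ,HP_GQBIN\n"
    "chr1,1000,1|0,42,High\n"
    "chr1,5000,1|1,15,Moderate\n"
    "chr2,800,0|1,37,High\n"
)
print(count_by_column(sample, "HP_GQBIN"))
```

To run it on real output, pass an open file handle for the CSV in `out_dir` instead of the `io.StringIO` stand-in.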
## Development & contributing
1. Clone and create a virtual env:
```bash
git clone https://github.com/your-org/SvPhaser.git && cd SvPhaser
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"  # quoted so shells such as zsh do not expand the brackets
```
2. Run the test‑suite & type checks:
```bash
pytest -q
mypy src/svphaser
black --check src tests
```
3. Send a PR targeting the **`dev`** branch; one topic per PR.
Please read `CONTRIBUTING.md` (forthcoming) for style guides and the DCO sign-off.
## Citing SvPhaser
If SvPhaser contributed to your research, please cite:
```bibtex
@software{svphaser2024,
author = {Pranjul Mishra and Sachin Gadakh and {CeNT Lab}},
title = {SvPhaser: haplotype-aware SV genotyping},
version = {0.2.0},
date = {2024-06-18},
url = {https://github.com/your-org/SvPhaser}
}
```
## License
`SvPhaser` is released under the MIT License – see [`LICENSE`](LICENSE).
## 📬 Contact
Developed by **Team5** (*BioAI Hackathon*) – Sachin Gadakh & Pranjul Mishra.
Lead contacts:

* [pranjul.mishra@proton.me](mailto:pranjul.mishra@proton.me)
* [s.gadakh@cent.uw.edu.pl](mailto:s.gadakh@cent.uw.edu.pl)

Feedback, feature requests and bug reports are all appreciated — feel free to open a GitHub issue or reach out by e‑mail.
---
*Happy phasing!*