svphaser


| Field | Value |
| --- | --- |
| Name | svphaser |
| Version | 2.0.1 |
| Summary | Structural-variant phasing from HP-tagged long-read BAMs |
| Upload time | 2025-08-31 17:28:45 |
| Requires Python | >=3.9 |
| License | MIT |
| Keywords | bam, ont, vcf, genomics, long-reads, phasing, structural-variants |
| Requirements | none recorded |

# SvPhaser

> **Haplotype‑aware structural‑variant genotyper for long‑read data**

[![PyPI version](https://img.shields.io/pypi/v/svphaser.svg?logo=pypi)](https://pypi.org/project/svphaser)
[![Tests](https://img.shields.io/github/actions/workflow/status/your-org/SvPhaser/ci.yml?label=ci)](https://github.com/your-org/SvPhaser/actions)
[![License](https://img.shields.io/github/license/your-org/SvPhaser.svg)](LICENSE)

---

`SvPhaser` phases **pre‑called structural variants (SVs)** using *HP‑tagged* long‑read alignments (PacBio HiFi, ONT Q20+, …).  Think of it as *WhatsHap* for insertions/deletions/duplications: we do **not** discover SVs; we assign each variant a haplotype genotype (`0|1`, `1|0`, `1|1`, or `./.`) together with a **Genotype Quality (GQ)** score – all in a single, embarrassingly‑parallel pass over the genome.

## Key highlights

* **Fast, per‑chromosome multiprocessing** – linear scale‑out on 32‑core workstations.
* **Deterministic Δ‑based decision tree** – no MCMC or hidden state machines.
* **Friendly CLI** (`svphaser phase …`) and importable Python API.
* **Seamless VCF injection** – adds `HP_GT`, `HP_GQ`, `HP_GQBIN` INFO tags while copying the original header verbatim.
* **Configurable confidence bins** and publication‑ready plots (see `result_images/`).
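
To make the idea of a deterministic Δ-based decision concrete, here is a minimal sketch. It is not SvPhaser's actual implementation: the function name `classify_sv`, the definition of Δ as the normalised difference between haplotype read counts, and the rule that `min_support` applies to the combined count are all assumptions made for illustration. Only the threshold names and default-looking values (`min_support`, `major_delta`, `equal_delta`) come from the CLI options shown in the quick-start.

```python
from typing import Tuple


def classify_sv(
    n_hp1: int,
    n_hp2: int,
    *,
    min_support: int = 10,
    major_delta: float = 0.70,
    equal_delta: float = 0.25,
) -> Tuple[str, float]:
    """Hypothetical decision rule: map per-haplotype supporting-read counts
    to a phased genotype string and the delta that drove the decision."""
    total = n_hp1 + n_hp2
    # Assumption: too few tagged reads overall means no call.
    if total < min_support:
        return "./.", 0.0

    # Assumption: delta is the absolute count difference, normalised to [0, 1].
    delta = abs(n_hp1 - n_hp2) / total

    if delta >= major_delta:
        # One haplotype clearly dominates: heterozygous, phased.
        return ("1|0" if n_hp1 > n_hp2 else "0|1"), delta
    if delta <= equal_delta:
        # Both haplotypes carry similar support: homozygous alternate.
        return "1|1", delta
    # Neither clearly skewed nor clearly balanced: leave uncalled.
    return "./.", delta


print(classify_sv(18, 2)[0])   # skewed towards HP1
print(classify_sv(10, 9)[0])   # balanced support
print(classify_sv(14, 6)[0])   # ambiguous middle zone
```

Because the rule is a pure function of two counts and three thresholds, the same input always yields the same call, which is what makes this style of decision tree easy to test and to run independently per chromosome.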

---

## Installation

```bash
# Requires Python ≥3.9
pip install svphaser            # from PyPI
# or
pip install git+https://github.com/your-org/SvPhaser.git@v0.2.0
```

`cyvcf2`, `pysam`, `typer[all]`, and `pandas` are pulled in automatically.

## Quick‑start

```bash
svphaser phase \
    sample_unphased.vcf.gz \
    sample.sorted_phased.bam \
    --out-dir results/ \
    --min-support 10 \
    --major-delta 0.70 \
    --equal-delta 0.25 \
    --gq-bins "30:High,10:Moderate" \
    --threads 32
```
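
The `--gq-bins` value is a comma-separated list of `threshold:label` pairs. The sketch below shows one plausible way such a string could be parsed and applied to a GQ score. The helper names `parse_gq_bins` and `label_gq`, the "highest matching threshold wins" rule, and the fallback label `"Low"` are assumptions for illustration and are not taken from the SvPhaser source.

```python
from typing import List, Tuple


def parse_gq_bins(spec: str) -> List[Tuple[int, str]]:
    """Parse a spec such as "30:High,10:Moderate" into (threshold, label)
    pairs, sorted from the highest threshold to the lowest."""
    bins = []
    for item in spec.split(","):
        threshold, label = item.split(":", 1)
        bins.append((int(threshold), label.strip()))
    return sorted(bins, reverse=True)


def label_gq(gq: int, bins: List[Tuple[int, str]], default: str = "Low") -> str:
    """Return the label of the first bin whose threshold the score reaches.
    The default label for scores below every threshold is hypothetical."""
    for threshold, label in bins:
        if gq >= threshold:
            return label
    return default


bins = parse_gq_bins("30:High,10:Moderate")
for score in (42, 12, 3):
    print(score, label_gq(score, bins))
```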

Outputs (written inside **`results/`**):

```
sample_unphased_phased.vcf   # original VCF + HP_* INFO fields
sample_unphased_phased.csv   # tidy table for plotting / downstream R
```
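
For a quick look at the injected tags without a VCF library, the `HP_*` entries can be pulled out of the INFO column with plain string handling. This is only a sketch: the helper name `hp_tags` is hypothetical, and the example record is invented for illustration rather than copied from real SvPhaser output. For production use, a proper parser such as `cyvcf2` or `pysam` is the safer choice.

```python
from typing import Dict


def hp_tags(vcf_line: str) -> Dict[str, str]:
    """Extract HP_* key/value pairs from the INFO column (8th field)
    of a single tab-separated VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = fields[7]
    tags = {}
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        # Keep only key=value entries whose key starts with HP_.
        if sep and key.startswith("HP_"):
            tags[key] = value
    return tags


# Made-up record for illustration only.
record = (
    "chr1\t10500\tsv1\tN\t<DEL>\t.\tPASS\t"
    "SVTYPE=DEL;SVLEN=-320;HP_GT=1|0;HP_GQ=42;HP_GQBIN=High"
)
print(hp_tags(record))
```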

See [`docs/Methodology.md`](docs/Methodology.md) and the flow‑chart below for algorithmic details.

![SvPhaser methodology](docs/result_images/methodology_diagram.png)

## Folder layout

```
SvPhaser/
├─ src/svphaser/        # importable package
│  ├─ cli.py            # Typer entry‑point
│  ├─ logging.py        # unified log setup
│  └─ phasing/
│     ├─ algorithms.py  # core maths
│     ├─ io.py          # driver & I/O
│     ├─ _workers.py    # per‑chrom processes
│     └─ types.py       # thin dataclasses
├─ tests/               # pytest suite + mini data
├─ docs/                # extra documentation
├─ result_images/       # generated plots & diagrams
└─ CHANGELOG.md
```

## Python usage

```python
from pathlib import Path
from svphaser.phasing.io import phase_vcf

phase_vcf(
    Path("sample.vcf.gz"),
    Path("sample.bam"),
    out_dir=Path("results"),
    min_support=10,
    major_delta=0.70,
    equal_delta=0.25,
    gq_bins="30:High,10:Moderate",
    threads=8,
)
```

The resulting CSV can be loaded into a pandas `DataFrame` for custom analytics.
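
As a small example of downstream analysis, the sketch below tallies genotype calls from the CSV using only the standard library. The column name `hp_gt` and the demo rows are assumptions; check the header of the real output file and adjust the column name accordingly.

```python
import csv
import io
from collections import Counter


def genotype_counts(csv_text: str, column: str = "hp_gt") -> Counter:
    """Count how often each genotype string occurs in one CSV column.
    The default column name is hypothetical."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Invented rows standing in for the contents of the output CSV.
demo = (
    "chrom,pos,hp_gt,hp_gq\n"
    "chr1,10500,1|0,42\n"
    "chr1,20750,1|1,18\n"
    "chr2,3100,1|0,35\n"
)
print(genotype_counts(demo))
```

With pandas installed, `pandas.read_csv(path)[column].value_counts()` gives the same tally in one line.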




## Development & contributing

1. Clone and create a virtual env:

   ```bash
   git clone https://github.com/your-org/SvPhaser.git && cd SvPhaser
   python -m venv .venv && source .venv/bin/activate
   pip install -e .[dev]
   ```
2. Run the test‑suite & type checks:

   ```bash
   pytest -q
   mypy src/svphaser
   black --check src tests
   ```
3. Send a PR targeting the **`dev`** branch; one topic per PR.

Please read `CONTRIBUTING.md` (to come) for style‑guides and the DCO sign‑off.

## Citing SvPhaser

If SvPhaser contributed to your research, please cite:

```bibtex
@software{svphaser2024,
  author       = {Pranjul Mishra and Sachin Gadakh and {CeNT Lab}},
  title        = {SvPhaser: haplotype-aware SV genotyping},
  version      = {0.2.0},
  date         = {2024-06-18},
  url          = {https://github.com/your-org/SvPhaser}
}
```




## License
`SvPhaser` is released under the MIT License – see [`LICENSE`](LICENSE).





## 📬 Contact

Developed by **Team5** (*BioAI Hackathon*) – Sachin Gadakh & Pranjul Mishra.

Lead contacts:

* [pranjul.mishra@proton.me](mailto:pranjul.mishra@proton.me)
* [s.gadakh@cent.uw.edu.pl](mailto:s.gadakh@cent.uw.edu.pl)

Feedback, feature requests and bug reports are all appreciated — feel free to open a GitHub issue or reach out by e‑mail.

---

*Happy phasing!*

            

        