rustbam


| Field | Value |
| --- | --- |
| Name | rustbam |
| Version | 0.1.8 |
| Summary | A Rust-based BAM depth calculator for Python. |
| Upload time | 2025-02-05 16:41:58 |
| Requires Python | >=3.7 |
| License | MIT |
| Keywords | bam, bioinformatics, genomics, rust, parallel |
| Requirements | No requirements were recorded. |

# 🦀 `rustbam` - Rust-powered fast BAM depth extraction with Python bindings

**rustbam** is a high-performance BAM depth calculator written in **Rust**, with **Python bindings** for fast and efficient genomic data analysis.

## 📦 Installation

### **Install from PyPI (No Conda Required)** 

You can install `rustbam` directly with `pip`:

```
pip install rustbam
```

## 🛠️ Usage

### **Python API**

After installation, you can use `rustbam` in Python:

```python
import rustbam

positions, depths = rustbam.get_depths(
    bam_path,         # path to bam file
    chromosome,       # chromosome/contig name
    start,            # 1-based inclusive start coordinate
    end,              # 1-based inclusive end coordinate
    step=10,          # step as in range(start, end, step) - default: 1
    min_mapq=0,       # minimum mapping quality - default 0
    min_bq=13,        # minimum base quality - default 13 (as in samtools mpileup)
    max_depth=8000,   # maximum depth to return per base position
    num_threads=12,   # number of threads for parallelization
)

print(positions[:5])  # e.g. [100000, 100010, 100020, 100030, 100040]
print(depths[:5])     # e.g. [12, 15, 10, 8, 20]
```
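
The two returned lists can be summarized with plain Python. The sketch below assumes `positions` and `depths` are equal-length sequences of integers, as the example output above suggests; `summarize_depths` and its `min_depth` parameter are hypothetical helper names and not part of `rustbam`.

```python
from statistics import mean


def summarize_depths(positions, depths, min_depth=10):
    """Return the mean depth and the positions whose depth is below min_depth.

    Hypothetical helper (not part of rustbam); it only assumes two
    equal-length sequences like those returned by rustbam.get_depths().
    """
    if len(positions) != len(depths):
        raise ValueError("positions and depths must have the same length")
    low_coverage = [pos for pos, depth in zip(positions, depths) if depth < min_depth]
    return mean(depths), low_coverage


# Sample values copied from the example comments above, not real output.
positions = [100000, 100010, 100020, 100030, 100040]
depths = [12, 15, 10, 8, 20]

average, low_coverage = summarize_depths(positions, depths)
print(f"mean depth: {average}")
print(f"positions below 10x: {low_coverage}")
```

In real use, `positions` and `depths` would come straight from `rustbam.get_depths(...)` instead of the hard-coded sample lists.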


### **CLI (Command Line Interface)**

After installation, you can use `rustbam` in your shell (note that coordinates are 1-based and inclusive, as in `samtools mpileup`):

```bash
$ rustbam --help
usage: rustbam [-h] [-t STEP] [-Q MIN_MAPQ] [-q MIN_BQ] [-d MAX_DEPTH] [-n NUM_THREADS] [-j] bam chromosome start end

Compute sequencing depth from a BAM file.

positional arguments:
  bam                   Path to the indexed BAM file
  chromosome            Chromosome name (e.g., 'chr1')
  start                 Start position (1-based)
  end                   End position (1-based)

options:
  -h, --help            show this help message and exit
  -t STEP, --step STEP  Step size for sampling positions (default: 1)
  -Q MIN_MAPQ, --min_mapq MIN_MAPQ
                        Minimum mapping quality (default: 0)
  -q MIN_BQ, --min_bq MIN_BQ
                        Minimum base quality (default: 13)
  -d MAX_DEPTH, --max_depth MAX_DEPTH
                        Maximum depth allowed (default: 8000)
  -n NUM_THREADS, --num_threads NUM_THREADS
                        Number of threads (default: 12)
  -j, --json            Output results in JSON format
```
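
If you prefer to drive the CLI from a script, the flags listed in the help text can be assembled programmatically. This is a minimal sketch: `build_rustbam_command` and `run_rustbam` are hypothetical helpers, the defaults are copied from the help output above, and the structure of the `-j` JSON output is not documented here, so the sketch returns the raw text without parsing it.

```python
import subprocess


def build_rustbam_command(bam, chromosome, start, end, step=1, min_mapq=0,
                          min_bq=13, max_depth=8000, num_threads=12,
                          json_output=False):
    """Build the argument list for the rustbam CLI (hypothetical helper).

    Flag names and default values mirror the --help text shown above.
    """
    cmd = [
        "rustbam",
        "-t", str(step),
        "-Q", str(min_mapq),     # minimum mapping quality
        "-q", str(min_bq),       # minimum base quality
        "-d", str(max_depth),
        "-n", str(num_threads),
    ]
    if json_output:
        cmd.append("-j")
    # Positional arguments: bam chromosome start end (1-based coordinates)
    cmd += [str(bam), str(chromosome), str(start), str(end)]
    return cmd


def run_rustbam(*args, **kwargs):
    """Run rustbam and return its standard output as text.

    Raises subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(build_rustbam_command(*args, **kwargs),
                            capture_output=True, text=True, check=True)
    return result.stdout


print(" ".join(build_rustbam_command("tests/example.bam", "chr1", 1000000, 1000005)))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with unusual file paths.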

An example usage of the CLI:

```bash
$ rustbam tests/example.bam chr1 1000000 1000005
1000000 51
1000001 52
1000002 44
1000003 52
1000004 53
1000005 47
```
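
The plain-text output is easy to load back into Python. The sketch below assumes each line holds a position and a depth separated by whitespace, exactly as in the example above; `parse_depth_output` is a hypothetical helper, not something shipped with `rustbam`.

```python
def parse_depth_output(text):
    """Parse 'position depth' lines into two parallel lists of integers.

    Hypothetical helper; assumes the whitespace-separated two-column
    layout shown in the CLI example above.
    """
    positions, depths = [], []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        positions.append(int(fields[0]))
        depths.append(int(fields[1]))
    return positions, depths


# Output copied from the CLI example above.
sample = """\
1000000 51
1000001 52
1000002 44
1000003 52
1000004 53
1000005 47
"""

positions, depths = parse_depth_output(sample)
print(len(positions), min(depths), max(depths))
```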

With multithreading enabled (`-n`), `rustbam` computes depths much faster than `samtools mpileup`; in the run below it takes about 19 s of wall-clock time versus about 53 s:

```bash
$ time samtools mpileup /path/to/a/large/bam -r chr1:1-30000000 > /dev/null
[mpileup] 1 samples in 1 input files

real    0m52.897s
user    0m52.270s
sys     0m0.436s

$ time rustbam /path/to/a/large/bam chr1 1 30000000 -n 12 > /dev/null

real    0m18.725s
user    0m50.806s
sys     0m6.303s
```
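
To make the comparison concrete, the timings above can be converted to seconds and compared directly. This is only arithmetic on the numbers printed by `time`; `parse_time` is a hypothetical helper written for this illustration.

```python
import re


def parse_time(value):
    """Convert a `time`-style string such as '0m52.897s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized time format: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Values copied from the benchmark output above.
samtools_real = parse_time("0m52.897s")
rustbam_real = parse_time("0m18.725s")
samtools_cpu = parse_time("0m52.270s") + parse_time("0m0.436s")
rustbam_cpu = parse_time("0m50.806s") + parse_time("0m6.303s")

speedup = samtools_real / rustbam_real
print(f"wall-clock speedup: {speedup:.2f}x")
print(f"CPU time (user + sys): samtools {samtools_cpu:.1f}s, rustbam {rustbam_cpu:.1f}s")
```

In this run the wall-clock speedup is roughly 2.8x, while the total CPU time is slightly higher for `rustbam` (about 57.1 s versus 52.7 s), which suggests the gain comes from spreading the work across threads.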

Don't even get me started on `pysam`: `rustbam` is about 16x faster with `-n 12`, which is the default. 😠

---

## 🔥 Features

✅ **Fast**: Uses the `rust-htslib` crate for BAM processing and supports multithreading.  
✅ **Python bindings**: Seamless integration with Python via `pyo3`.  
✅ **Custom filtering**: Supports minimum mapping quality (`-Q`), minimum base quality (`-q`), and maximum depth (`-d`).  
✅ **Supports large BAM files**: Uses `IndexedReader` for efficient region querying.

---

## 📜 License

`rustbam` is released under the **MIT License**. See LICENSE for details.

---

## 🤝 Contributing

1. Fork the repo on GitHub.
2. Create a new branch: `git checkout -b feature-new`
3. Commit your changes: `git commit -m "Add new feature"`
4. Push to your branch: `git push origin feature-new`
5. Open a **Pull Request** 🎉

---

## 🌍 Acknowledgments

Built using **[rust-htslib](https://github.com/rust-bio/rust-htslib)** and **[pyo3](https://github.com/PyO3/pyo3)**.



            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "rustbam",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "bam, bioinformatics, genomics, rust, parallel",
    "author": null,
    "author_email": "Seongmin Choi <soymintc@gmail.com>",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Rust-based BAM depth calculator for Python.",
    "version": "0.1.8",
    "project_urls": null,
    "split_keywords": [
        "bam",
        " bioinformatics",
        " genomics",
        " rust",
        " parallel"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fdbfea2b82995443559190f85d32ca3337ce822ed21b96d015680ed402bda004",
                "md5": "6c28acbd51e5a986bb96e0b1d1be53a0",
                "sha256": "9ac6101c39159d219c31c0446d92af973b9b29cf4b7b483a7a20db92b6026922"
            },
            "downloads": -1,
            "filename": "rustbam-0.1.8-cp37-abi3-manylinux_2_28_x86_64.whl",
            "has_sig": false,
            "md5_digest": "6c28acbd51e5a986bb96e0b1d1be53a0",
            "packagetype": "bdist_wheel",
            "python_version": "cp37",
            "requires_python": ">=3.7",
            "size": 3093433,
            "upload_time": "2025-02-05T16:41:58",
            "upload_time_iso_8601": "2025-02-05T16:41:58.466852Z",
            "url": "https://files.pythonhosted.org/packages/fd/bf/ea2b82995443559190f85d32ca3337ce822ed21b96d015680ed402bda004/rustbam-0.1.8-cp37-abi3-manylinux_2_28_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-05 16:41:58",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "rustbam"
}
        