rocco

| Field | Value |
| --- | --- |
| Name | rocco |
| Version | 1.6.0.post1 |
| home_page | https://github.com/nolan-h-hamilton/rocco |
| Summary | Robust ATAC-seq Peak Calling for Many Samples via Convex Optimization |
| upload_time | 2025-02-11 01:49:32 |
| maintainer | None |
| docs_url | None |
| author | Nolan Holt Hamilton |
| requires_python | <4,>=3.8 |
| license | MIT |
| keywords | genomics, functional genomics, epigenomics, epigenetics, peak calling, atac-seq |

# ROCCO: [R]obust [O]pen [C]hromatin Detection via [C]onvex [O]ptimization

[![API](https://github.com/nolan-h-hamilton/ROCCO/actions/workflows/docs.yml/badge.svg)](https://github.com/nolan-h-hamilton/ROCCO/actions/workflows/docs.yml)
[![Tests](https://github.com/nolan-h-hamilton/ROCCO/actions/workflows/tests.yml/badge.svg)](https://github.com/nolan-h-hamilton/ROCCO/actions/workflows/tests.yml)
![PyPI - Version](https://img.shields.io/pypi/v/rocco?logo=Python&logoColor=%23FFFFFF&color=%233776AB&link=https%3A%2F%2Fpypi.org%2Fproject%2Frocco%2F)

<p align="center">
<img width="400" alt="logo" src="docs/logo.png">
</p>

## What

ROCCO is an efficient algorithm for detecting "consensus peaks" in large datasets comprising multiple high-throughput sequencing (HTS) samples (namely, ATAC-seq), where an enrichment in read counts/densities is observed in a nontrivial subset of samples.

### Input/Output

* *Input*: Samples' BAM alignments or BigWig tracks
* *Output*: BED file of consensus peak regions (Default format is BED3: `chrom,start,end`)

* Note: if BigWig input is used, alignment-level preprocessing options cannot be applied and narrowPeak output cannot be generated.
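
As a rough illustration of working with the default BED3 output, the sketch below reads `chrom`, `start`, `end` records and sums the peak widths. It is not part of ROCCO; the helper names `parse_bed3` and `total_width` and the example records are made up for this illustration.

```python
from typing import Iterable, Iterator, Tuple

Peak = Tuple[str, int, int]  # (chrom, start, end), half-open as in BED


def parse_bed3(lines: Iterable[str]) -> Iterator[Peak]:
    """Yield (chrom, start, end) from BED lines, skipping blanks and headers."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        yield chrom, int(start), int(end)


def total_width(peaks: Iterable[Peak]) -> int:
    """Total number of base pairs covered, assuming non-overlapping peaks."""
    return sum(end - start for _, start, end in peaks)


# Made-up records standing in for the contents of an output BED file.
example_lines = ["chr1\t100\t250", "chr1\t900\t1000", "chr2\t50\t75"]
peaks = list(parse_bed3(example_lines))
print(peaks)
print(total_width(peaks))
```

To read a real output file, pass an open file handle instead of the example list, e.g. `parse_bed3(open("rocco_output.bed"))`, where the file name is whatever was given to `-o`.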

## How

ROCCO models consensus peak calling as a constrained optimization problem with an upper bound on the total proportion of the genome selected as open/accessible and a fragmentation penalty that promotes spatial consistency in active regions and sparsity elsewhere.
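
To make the two ingredients concrete, here is a toy sketch that evaluates a budget constraint and a fragmentation penalty for candidate selections over a handful of loci. It is only an illustration under simplifying assumptions, not ROCCO's implementation: the names `scores`, `gamma`, and `budget`, the exact form of the objective, and the numbers are all assumed for this example, and nothing is being optimized here.

```python
from typing import Sequence


def objective(scores: Sequence[float], selection: Sequence[int], gamma: float) -> float:
    """Lower is better: reward selected loci, penalize on/off switches."""
    reward = sum(s * x for s, x in zip(scores, selection))
    # Fragmentation: number of changes between adjacent loci.
    fragmentation = sum(abs(a - b) for a, b in zip(selection, selection[1:]))
    return -reward + gamma * fragmentation


def within_budget(selection: Sequence[int], budget: float) -> bool:
    """True if at most a `budget` fraction of loci is selected."""
    return sum(selection) <= budget * len(selection)


# Hypothetical per-locus enrichment scores (already aggregated across samples).
scores = [0.1, 0.9, 0.8, 0.9, 0.1, 0.0, 0.7, 0.0]
contiguous = [0, 1, 1, 1, 0, 0, 0, 0]  # one unbroken region
fragmented = [0, 1, 0, 1, 0, 0, 1, 0]  # same number of loci, scattered

for name, sel in [("contiguous", contiguous), ("fragmented", fragmented)]:
    print(name, within_budget(sel, budget=0.4), objective(scores, sel, gamma=0.5))
```

Both selections use the same share of loci, but the scattered one pays for six on/off switches instead of two, so the contiguous region scores better under this toy objective.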

## Why

ROCCO offers several attractive features:

1. **Consideration of enrichment and spatial characteristics** of open chromatin signals
2. **Scaling to large sample sizes (100+)** with an asymptotic time complexity independent of sample size
3. **No required training data** or heuristically determined set of initial candidate peak regions
4. **No rigid thresholds** on the minimum number/width of supporting samples/replicates
5. **Mathematically tractable model** permitting worst-case analysis of runtime and performance

## Example Behavior

### Input

* ENCODE lymphoblastoid data (BEST5, WORST5): 10 real ATAC-seq alignments of varying TSS enrichment (SNR-like quality measure for ATAC-seq)
* Synthetic noisy data (NOISY5)

We run ROCCO twice, once *without* the noisy samples and once *with* them, and compare the results (blue):

  ```shell
  rocco -i *.BEST5.bam *.WORST5.bam -g hg38 -o rocco_output_without_noise.bed
  rocco -i *.BEST5.bam *.WORST5.bam *.NOISY5.bam -g hg38 -o rocco_output_with_noise.bed
  ```
Note: users may run ROCCO with the [`--narrowPeak`](https://genome.ucsc.edu/FAQ/FAQformat.html#format12) flag to generate 10-column output with various statistics for comparing peaks.
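
For readers unfamiliar with the 10-column layout, the following sketch parses one narrowPeak line into named fields, following the column order in the UCSC format description linked above. The `NarrowPeak` class, the `parse_narrowpeak` helper, and the example row are hypothetical and not taken from ROCCO or its output.

```python
from typing import NamedTuple


class NarrowPeak(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str
    score: int            # 0-1000 display score
    strand: str           # "+", "-", or "."
    signal_value: float
    p_value: float        # -log10 p-value, or -1 if not assigned
    q_value: float        # -log10 q-value, or -1 if not assigned
    peak: int             # summit offset from start, or -1 if not called


def parse_narrowpeak(line: str) -> NarrowPeak:
    """Parse one tab-separated narrowPeak record."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 10:
        raise ValueError(f"expected 10 columns, got {len(f)}")
    return NarrowPeak(
        f[0], int(f[1]), int(f[2]), f[3], int(float(f[4])), f[5],
        float(f[6]), float(f[7]), float(f[8]), int(f[9]),
    )


# Made-up record for illustration only.
record = parse_narrowpeak("chr1\t100\t250\tpeak_1\t640\t.\t5.2\t3.1\t2.4\t60")
print(record.chrom, record.end - record.start, record.signal_value)
```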

### Output

Comparing each output file:

* *ROCCO is unaffected by the NOISY5 samples and effectively identifies true signal across multiple samples*
* *ROCCO simultaneously detects both wide and narrow consensus peaks*

<p align="center">
<img width="800" height="400" alt="example" src="docs/example_behavior.png">
</p>
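
One simple way to put a number on how closely two peak sets agree, such as the with-noise and without-noise outputs, is a base-pair Jaccard index (shared bases divided by bases covered by either set). The sketch below is an assumption-laden illustration rather than anything ROCCO provides: the function names and the two small peak lists are invented, and real files would first need to be read into `(chrom, start, end)` tuples.

```python
from typing import Dict, Iterable, List, Tuple

Peak = Tuple[str, int, int]


def covered(intervals: Iterable[Tuple[int, int]]) -> int:
    """Base pairs covered by half-open intervals, counting overlaps once."""
    total, current_end = 0, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            total += end - start
            current_end = end
        elif end > current_end:
            total += end - current_end
            current_end = end
    return total


def by_chrom(peaks: Iterable[Peak]) -> Dict[str, List[Tuple[int, int]]]:
    grouped: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, start, end in peaks:
        grouped.setdefault(chrom, []).append((start, end))
    return grouped


def jaccard(peaks_a: Iterable[Peak], peaks_b: Iterable[Peak]) -> float:
    """Base-pair Jaccard index; defined as 1.0 when both sets are empty."""
    a, b = by_chrom(peaks_a), by_chrom(peaks_b)
    union = shared = 0
    for chrom in set(a) | set(b):
        in_a, in_b = a.get(chrom, []), b.get(chrom, [])
        either = covered(in_a + in_b)
        union += either
        # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
        shared += covered(in_a) + covered(in_b) - either
    return shared / union if union else 1.0


# Invented peak sets standing in for the two output files.
without_noise = [("chr1", 100, 200), ("chr1", 300, 400)]
with_noise = [("chr1", 100, 200), ("chr1", 300, 380), ("chr2", 10, 20)]
print(round(jaccard(without_noise, with_noise), 3))
```

A value close to 1 would indicate that adding the noisy samples changed the called regions very little.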

## Paper/Citation

If using ROCCO in your research, please cite the [original paper](https://doi.org/10.1093/bioinformatics/btad725) in *Bioinformatics* (DOI: `10.1093/bioinformatics/btad725`).

   ```plaintext
    Nolan H Hamilton, Terrence S Furey, ROCCO: a robust method for detection of open chromatin via convex optimization,
    Bioinformatics, Volume 39, Issue 12, December 2023
   ```

## Documentation

For additional details, usage examples, etc., please see ROCCO's documentation: <https://nolan-h-hamilton.github.io/ROCCO/>

## Installation

### PyPI (`pip`)

   ```shell
   python -m pip install rocco --upgrade
   ```

If you lack administrative privileges, you may need to append `--user` to the above.
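
After installing, a quick sanity check is to ask Python which version of the package it sees and whether the interpreter falls in the supported range (`>=3.8,<4` per the package metadata). This is a minimal sketch using only the standard library; the helper names `installed_version` and `python_supported` are made up for this example.

```python
import sys
from importlib import metadata
from typing import Optional, Sequence


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_supported(version_info: Sequence[int] = sys.version_info) -> bool:
    """Check the interpreter against the declared range >=3.8,<4."""
    major, minor = version_info[0], version_info[1]
    return (3, 8) <= (major, minor) and major < 4


print("rocco version:", installed_version("rocco"))
print("Python supported:", python_supported())
```

If the first line prints `None`, the package is not visible to the interpreter that ran the script, which often means `pip` installed it for a different Python.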


### Build from Source

If preferred, ROCCO can easily be built from source:

* Clone or download this repository, then build and install it

  ```shell
  git clone https://github.com/nolan-h-hamilton/ROCCO.git
  cd ROCCO
  python setup.py sdist bdist_wheel
  pip install -e .
  ```

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/nolan-h-hamilton/rocco",
    "name": "rocco",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4,>=3.8",
    "maintainer_email": null,
    "keywords": "genomics, functional genomics, epigenomics, epigenetics, peak calling, ATAC-seq",
    "author": "Nolan Holt Hamilton",
    "author_email": "nolan.hamilton@unc.edu",
    "download_url": "https://files.pythonhosted.org/packages/df/80/b1b9bcfb4c9dae790640bc70f2dcbb5224a937b4ff5907edf6b22a32f95a/rocco-1.6.0.post1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Robust ATAC-seq Peak Calling for Many Samples via Convex Optimization",
    "version": "1.6.0.post1",
    "project_urls": {
        "Homepage": "https://github.com/nolan-h-hamilton/rocco"
    },
    "split_keywords": [
        "genomics",
        " functional genomics",
        " epigenomics",
        " epigenetics",
        " peak calling",
        " atac-seq"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "3f1072d36d21f99b8a8d16e68e3b5fb15f758eb7fe96d3bea37fb676b10a1288",
                "md5": "16168f43b524875237b1ef3c08abc9ee",
                "sha256": "85373a19b334de9e3df99d377961b74f06ae2aca9df63b0ed6c51a62f7354946"
            },
            "downloads": -1,
            "filename": "rocco-1.6.0.post1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "16168f43b524875237b1ef3c08abc9ee",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4,>=3.8",
            "size": 40683,
            "upload_time": "2025-02-11T01:49:31",
            "upload_time_iso_8601": "2025-02-11T01:49:31.103615Z",
            "url": "https://files.pythonhosted.org/packages/3f/10/72d36d21f99b8a8d16e68e3b5fb15f758eb7fe96d3bea37fb676b10a1288/rocco-1.6.0.post1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "df80b1b9bcfb4c9dae790640bc70f2dcbb5224a937b4ff5907edf6b22a32f95a",
                "md5": "1bb8f0a52bf7130800ddd60ab5555dde",
                "sha256": "1b57950f1467c74dd427098c5e6cca5cd4ec3dea25f47855bfb3700d1906d63b"
            },
            "downloads": -1,
            "filename": "rocco-1.6.0.post1.tar.gz",
            "has_sig": false,
            "md5_digest": "1bb8f0a52bf7130800ddd60ab5555dde",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4,>=3.8",
            "size": 706001,
            "upload_time": "2025-02-11T01:49:32",
            "upload_time_iso_8601": "2025-02-11T01:49:32.811333Z",
            "url": "https://files.pythonhosted.org/packages/df/80/b1b9bcfb4c9dae790640bc70f2dcbb5224a937b4ff5907edf6b22a32f95a/rocco-1.6.0.post1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-11 01:49:32",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "nolan-h-hamilton",
    "github_project": "rocco",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "rocco"
}
        