stripepy-hic

- Name: stripepy-hic
- Version: 0.0.2
- Summary: StripePy recognizes architectural stripes in 3C and Hi-C contact maps using geometric reasoning
- Upload time: 2024-12-20 14:25:56
- Requires Python: >=3.9
- Keywords: architectural stripe, contact map, cooler, hi-c, hic, stripe, stripe recognition, stripes

<!--
Copyright (C) 2024 Roberto Rossini <roberros@uio.no>

SPDX-License-Identifier: MIT
-->

# StripePy

[![License](https://img.shields.io/badge/license-MIT-green)](https://github.com/paulsengroup/StripePy/blob/main/LICENCE)
[![CI](https://github.com/paulsengroup/StripePy/actions/workflows/ci.yml/badge.svg)](https://github.com/paulsengroup/StripePy/actions/workflows/ci.yml)
[![Build Dockerfile](https://github.com/paulsengroup/StripePy/actions/workflows/build-dockerfile.yml/badge.svg)](https://github.com/paulsengroup/StripePy/actions/workflows/build-dockerfile.yml)
[![Zenodo DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.14394042.svg)](https://doi.org/10.5281/zenodo.14394041)

<!--
[![Download from Bioconda](https://img.shields.io/conda/vn/bioconda/StripePy?label=bioconda&logo=Anaconda)](https://anaconda.org/bioconda/StripePy)
[![docs](https://readthedocs.org/projects/stripepy/badge/?version=stable)](https://stripepy.readthedocs.io/en/latest/?badge=stable)

-->

---

StripePy is a CLI application written in Python that recognizes architectural stripes in the interaction matrix files generated by Chromosome Conformation Capture experiments, such as Hi-C and Micro-C.
Matrix files in `.cool`, `.mcool`, and `.hic` format (including `.hic` v9 files) are supported.
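
As a quick pre-flight check, a script can verify that an input path carries one of the extensions listed above before handing it to StripePy. The sketch below is illustrative only: `has_supported_extension` is a hypothetical helper, not part of StripePy, and it inspects the file name only, not the file contents.

```python
from pathlib import Path

# Extensions listed in this README; the helper itself is hypothetical.
SUPPORTED_EXTENSIONS = {".cool", ".mcool", ".hic"}


def has_supported_extension(path):
    """Return True if the file name ends with a supported matrix extension."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("4DNFI9GMP2J8.mcool", "matrix.hic", "matrix.txt"):
        print(name, has_supported_extension(name))
```

A check like this catches simple mistakes early, such as passing a text dump instead of a binary matrix file, but it cannot tell whether the file is actually valid.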

StripePy is developed on Linux and macOS and is also tested on Windows.

## Installing StripePy

### Installing with pip

```bash
pip install stripepy-hic
```

<!--

### Installing with conda

```bash
conda create -n stripepy -c conda-forge -c bioconda stripepy-hic
```

-->

### Installing from source

Instructions for Linux and macOS:

```bash
# create and activate a venv (optional)
python3 -m venv venv
. venv/bin/activate

# get StripePy source code
git clone https://github.com/paulsengroup/StripePy.git
cd StripePy

# optional: check out a specific version
# git checkout v0.0.2

# install StripePy
pip install .

# verify that stripepy is in your PATH
stripepy --help
```

<details>
<summary>Instructions for Windows</summary>

```bash
# create and activate a venv (optional)
python3 -m venv venv
venv\Scripts\activate

# get StripePy source code
git clone https://github.com/paulsengroup/StripePy.git
cd StripePy

# optional: check out a specific version
# git checkout v0.0.2

# install StripePy
pip install .

# verify that stripepy is in your PATH
stripepy --help
```

</details>

## Running StripePy

StripePy is organized into a few subcommands:

- `stripepy call`: run the stripe detection algorithm and store the identified stripes in a `.hdf5` file.
- `stripepy view`: take the `result.hdf5` file generated by `stripepy call` and extract stripes in BEDPE format.
- `stripepy plot`: generate various kinds of plots to inspect the stripes identified by `stripepy call`.
- `stripepy download`: download a minified sample dataset suitable to quickly test StripePy.
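
The subcommands can also be driven from a script. The sketch below only assembles the `download` and `call` command lines as they appear in the walkthrough of this README and prints them. The helper names and the dry-run default are assumptions made for illustration, and no StripePy options beyond those shown in the walkthrough are implied.

```python
import shlex
import subprocess


def download_cmd(name):
    """Command line for fetching a sample dataset by name."""
    return ["stripepy", "download", "--name", name]


def call_cmd(contact_map, resolution, output_folder):
    """Command line for running stripe detection at the given resolution."""
    return ["stripepy", "call", contact_map, str(resolution), "-o", output_folder]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(download_cmd("4DNFI9GMP2J8"))
    run(call_cmd("4DNFI9GMP2J8.mcool", 10000, "stripepy/"))
```

Passing `dry_run=False` executes the commands and requires `stripepy` to be available in your PATH.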

### Walkthrough

The following is an example of a typical StripePy run.
The steps outlined in this section assume that StripePy is running on a UNIX system.
Some commands may need minor adjustments to run on Windows.

#### 1) Download a sample dataset

This step is optional.
Feel free to use your own interaction matrix (make sure the matrix is in `.cool`, `.mcool`, or `.hic` format).

```console
# This may take a while on slow internet connections
user@dev:/tmp$ stripepy download --name 4DNFI9GMP2J8

[2024-12-11 15:25:56,101] INFO: downloading dataset "4DNFI9GMP2J8" (assembly=hg38)...
[2024-12-11 15:25:56,296] INFO: downloaded 0.00/106.84 MB (0.00%)
[2024-12-11 15:26:11,309] INFO: downloaded 57.53/106.84 MB (53.85%)
[2024-12-11 15:26:26,312] INFO: downloaded 86.59/106.84 MB (81.05%)
[2024-12-11 15:26:35,156] INFO: DONE! Downloading dataset "4DNFI9GMP2J8" took 39.06s.
[2024-12-11 15:26:35,156] INFO: computing MD5 digest for file "/tmp/4DNFI9GMP2J8.zf9qbdmi"...
[2024-12-11 15:26:35,304] INFO: MD5 checksum match!
[2024-12-11 15:26:35,304] INFO: successfully downloaded dataset "https://zenodo.org/records/14283922/files/4DNFI9GMP2J8.stripepy.mcool?download=1" to file "4DNFI9GMP2J8.mcool"
[2024-12-11 15:26:35,305] INFO: file size: 106.84MB. Elapsed time: 39.20s
```
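
The log above shows that `stripepy download` verifies the MD5 digest of the downloaded file. If you obtain a matrix by other means, a similar check can be done by hand. The following is a minimal sketch using only the Python standard library. `md5_digest` is a hypothetical helper, not part of StripePy, and the expected digest has to come from the data provider (no real checksum is shown here).

```python
import hashlib


def md5_digest(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `md5_digest("4DNFI9GMP2J8.mcool")` returns the digest as a hexadecimal string that can be compared with the published checksum. Reading in chunks keeps memory usage flat even for large matrix files.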

#### 2) Detect architectural stripes

This is the core of the analysis and may take several minutes when processing large files.

```console
user@dev:/tmp$ stripepy call 4DNFI9GMP2J8.mcool 10000 -o stripepy/


Arguments:
--contact-map: 4DNFI9GMP2J8.mcool
--resolution: 10000
--normalization: NONE
--genomic-belt: 5000000
--roi: None
--max-width: 100000
--glob-pers-min: 0.05
--constrain-heights: False
--loc-pers-min: 0.33
--loc-trend-min: 0.25
--output-folder: stripepy
--force: False
--nproc: 1

CHROMOSOME chr1
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.446779727935791 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 1353 to 1304
Number of upper-triangular seed sites is reduced from 1217 to 1180
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.3155953884124756 seconds ---
Step 3: Shape analysis
...
3.6) Bar plots of widths and heights...
Execution time of step 3: 0.3037407398223877 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.059529781341552734 seconds ---
This chromosome has taken 0.450955867767334 seconds


The code has run for 2.022001071770986 minutes
```
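
The log is verbose, so it can help to summarize it. The sketch below parses the `CHROMOSOME ...` and `This chromosome has taken ... seconds` lines and reports the time spent on each chromosome. It assumes the log format printed by the version used in this walkthrough and that the output of `stripepy call` was saved to a text file. `parse_chromosome_timings` is a hypothetical helper, not part of StripePy.

```python
import re

CHROM_RE = re.compile(r"^CHROMOSOME (\S+)")
TOTAL_RE = re.compile(r"^This chromosome has taken ([0-9.]+) seconds")


def parse_chromosome_timings(lines):
    """Map each chromosome name to the number of seconds reported in the log."""
    timings = {}
    current = None
    for line in lines:
        line = line.strip()
        m = CHROM_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = TOTAL_RE.match(line)
        if m and current is not None:
            timings[current] = float(m.group(1))
            current = None
    return timings


if __name__ == "__main__":
    # Lines copied from the complete log of this walkthrough
    sample = """\
CHROMOSOME chr1
RoI is: None
Execution time of step 1: 0.446779727935791 seconds ---
This chromosome has taken 9.699488639831543 seconds

CHROMOSOME chr2
This chromosome has taken 10.67337441444397 seconds
""".splitlines()
    for chrom, seconds in parse_chromosome_timings(sample).items():
        print(f"{chrom}\t{seconds:.2f} s")
```

To process a saved log, pass the open file object directly, for example `parse_chromosome_timings(open("stripepy.log"))`, where `stripepy.log` is a placeholder file name.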

<details>
<summary>Complete log</summary>

```

Arguments:
--contact-map: 4DNFI9GMP2J8.mcool
--resolution: 10000
--normalization: NONE
--genomic-belt: 5000000
--roi: None
--max-width: 100000
--glob-pers-min: 0.05
--constrain-heights: False
--loc-pers-min: 0.33
--loc-trend-min: 0.25
--output-folder: stripepy
--force: False
--nproc: 1

CHROMOSOME chr1
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.446779727935791 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 1353 to 1304
Number of upper-triangular seed sites is reduced from 1217 to 1180
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.3155953884124756 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 3.5695345401763916 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 3.528017997741699 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 7.102360963821411 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.9378914833068848 seconds ---
This chromosome has taken 9.699488639831543 seconds

CHROMOSOME chr2
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.47316646575927734 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 1504 to 1473
Number of upper-triangular seed sites is reduced from 1402 to 1367
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.3005075454711914 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 3.9707601070404053 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 3.9840147495269775 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 7.959795951843262 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 1.0786504745483398 seconds ---
This chromosome has taken 10.67337441444397 seconds

CHROMOSOME chr3
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.3940746784210205 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 1173 to 1168
Number of upper-triangular seed sites is reduced from 1305 to 1297
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.2587897777557373 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 2.884906768798828 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 3.896686553955078 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 6.78642725944519 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 1.012127161026001 seconds ---
This chromosome has taken 9.16284704208374 seconds

CHROMOSOME chr4
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.3355872631072998 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 1152 to 1144
Number of upper-triangular seed sites is reduced from 993 to 985
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.27007126808166504 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 2.4963622093200684 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 3.3167238235473633 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 5.8175835609436035 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.8550224304199219 seconds ---
This chromosome has taken 7.937693357467651 seconds

CHROMOSOME chr5
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.2822716236114502 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 978 to 967
Number of upper-triangular seed sites is reduced from 1365 to 1353
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.2545452117919922 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 2.7692947387695312 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 3.6233880519866943 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 6.397336721420288 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.9528613090515137 seconds ---
This chromosome has taken 8.481026411056519 seconds

CHROMOSOME chr6
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.26506567001342773 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 1102 to 1087
Number of upper-triangular seed sites is reduced from 986 to 975
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.25983405113220215 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 2.29978084564209 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 3.2286500930786133 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 5.532674789428711 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.8356080055236816 seconds ---
This chromosome has taken 7.442054748535156 seconds

CHROMOSOME chr7
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.21904230117797852 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 770 to 735
Number of upper-triangular seed sites is reduced from 1114 to 1083
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.22430944442749023 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 1.8583557605743408 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 2.8366761207580566 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 4.698970079421997 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.7353849411010742 seconds ---
This chromosome has taken 6.372534275054932 seconds

CHROMOSOME chr8
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.19896483421325684 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 950 to 927
Number of upper-triangular seed sites is reduced from 789 to 770
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.20422124862670898 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 1.6131336688995361 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 2.68200421333313 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 4.298713445663452 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.7141485214233398 seconds ---
This chromosome has taken 5.876908302307129 seconds

CHROMOSOME chr9
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.13652896881103516 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 602 to 587
Number of upper-triangular seed sites is reduced from 690 to 675
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.18230485916137695 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 1.1340200901031494 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 2.0012364387512207 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 3.138490915298462 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.5209352970123291 seconds ---
This chromosome has taken 4.350054979324341 seconds

CHROMOSOME chr10
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.1949782371520996 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 855 to 830
Number of upper-triangular seed sites is reduced from 891 to 867
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.18785762786865234 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 1.5026066303253174 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 2.675481081008911 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 4.1818461418151855 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.7034149169921875 seconds ---
This chromosome has taken 5.699317932128906 seconds

CHROMOSOME chr11
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.1772167682647705 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 909 to 897
Number of upper-triangular seed sites is reduced from 1056 to 1044
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.18897652626037598 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 1.6935629844665527 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 3.049086093902588 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 4.746764183044434 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.7810971736907959 seconds ---
This chromosome has taken 6.334598541259766 seconds

CHROMOSOME chr12
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.16396474838256836 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 633 to 629
Number of upper-triangular seed sites is reduced from 883 to 878
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.2068781852722168 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 1.2998018264770508 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 2.355454206466675 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 3.6587283611297607 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.6075148582458496 seconds ---
This chromosome has taken 5.056737422943115 seconds

CHROMOSOME chr13
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.11529994010925293 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 686 to 675
Number of upper-triangular seed sites is reduced from 653 to 643
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.15171098709106445 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 0.9743459224700928 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 2.0503764152526855 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 3.027937650680542 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.5435254573822021 seconds ---
This chromosome has taken 4.14011549949646 seconds

CHROMOSOME chr14
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.10641026496887207 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 567 to 561
Number of upper-triangular seed sites is reduced from 523 to 521
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.14020061492919922 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 0.7857282161712646 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 1.7303550243377686 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 2.519143581390381 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.4577367305755615 seconds ---
This chromosome has taken 3.510401725769043 seconds

CHROMOSOME chr15
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.11155319213867188 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 637 to 602
Number of upper-triangular seed sites is reduced from 635 to 607
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.12435340881347656 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 0.7276151180267334 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 1.8632171154022217 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 2.593841552734375 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.4837307929992676 seconds ---
This chromosome has taken 3.5654518604278564 seconds

CHROMOSOME chr16
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.08726286888122559 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 330 to 299
Number of upper-triangular seed sites is reduced from 494 to 454
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.12026333808898926 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 0.4373819828033447 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 1.1490724086761475 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 1.5888426303863525 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.3165297508239746 seconds ---
This chromosome has taken 2.3602585792541504 seconds

CHROMOSOME chr17
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.09188055992126465 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 586 to 561
Number of upper-triangular seed sites is reduced from 548 to 518
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.11394453048706055 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 0.5639193058013916 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 1.671928882598877 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 2.238788366317749 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.4385089874267578 seconds ---
This chromosome has taken 3.1356685161590576 seconds

CHROMOSOME chr18
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.08964848518371582 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 485 to 480
Number of upper-triangular seed sites is reduced from 524 to 517
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.11319613456726074 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 0.535980224609375 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 1.5569026470184326 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 2.0957298278808594 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.4030795097351074 seconds ---
This chromosome has taken 2.9409568309783936 seconds

CHROMOSOME chr19
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.0656130313873291 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 547 to 543
Number of upper-triangular seed sites is reduced from 558 to 555
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.08183503150939941 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 0.42258787155151367 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 1.7407689094543457 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 2.166306495666504 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.4351332187652588 seconds ---
This chromosome has taken 2.9370124340057373 seconds

CHROMOSOME chr20
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.07147622108459473 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 501 to 481
Number of upper-triangular seed sites is reduced from 411 to 393
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.09188032150268555 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 0.36221981048583984 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 1.3752944469451904 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 1.7400612831115723 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.3595857620239258 seconds ---
This chromosome has taken 2.4734344482421875 seconds

CHROMOSOME chr21
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.026866912841796875 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 232 to 210
Number of upper-triangular seed sites is reduced from 220 to 206
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.06076812744140625 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 0.12738490104675293 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 0.6408932209014893 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 0.769859790802002 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.167100191116333 seconds ---
This chromosome has taken 1.1280708312988281 seconds

CHROMOSOME chr22
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.03299999237060547 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 310 to 285
Number of upper-triangular seed sites is reduced from 286 to 272
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.06340456008911133 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 0.1843411922454834 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 0.8908343315124512 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 1.0772550106048584 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.23577547073364258 seconds ---
This chromosome has taken 1.5294694900512695 seconds

CHROMOSOME chrX
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.13507843017578125 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 1183 to 1097
Number of upper-triangular seed sites is reduced from 1142 to 1054
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.21353983879089355 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 1.9477896690368652 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 2.7261645793914795 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 4.678062200546265 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.6798150539398193 seconds ---
This chromosome has taken 6.042936563491821 seconds

CHROMOSOME chrY
RoI is: None
Step 1: pre-processing step
1.1) Log-transformation...
1.2) Focusing on a neighborhood of the main diagonal...
1.3) Projection onto [0, 1]...
Execution time of step 1: 0.00628972053527832 seconds ---
Step 2: Topological Data Analysis
2.1) Global 1D pseudo-distributions...
2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...
2.2.0) All maxima and their persistence
2.2.1) Lower triangular part
2.2.2) Upper triangular part
2.2.3) Filter out seeds in sparse regions
Number of lower-triangular seed sites is reduced from 130 to 97
Number of upper-triangular seed sites is reduced from 148 to 112
2.3) Storing into a list of Stripe objects...
Execution time of step 2: 0.05667591094970703 seconds ---
Step 3: Shape analysis
3.1) Width estimation
3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...
3.1.2) Updating list of Stripe objects with HIoIs...
Execution time: 0.07381796836853027 seconds ---
3.2) Height estimation
3.2.1) Estimating heights (equiv. VIoIs, where VIoI stands for Vertical Interval of Interest)...
3.2.2) Updating list of Stripe objects with VIoIs...
Execution time: 0.2287464141845703 seconds ---
3.5) Saving geometric descriptors...
3.6) Bar plots of widths and heights...
Execution time of step 3: 0.3037407398223877 seconds ---
Step 4: Statistical analysis and post-processing
4.1) Computing and saving biological descriptors
Execution time of step 4: 0.059529781341552734 seconds ---
This chromosome has taken 0.450955867767334 seconds


The code has run for 2.022001071770986 minutes
```

</details>

Running the above command creates the following directory structure:

```
/tmp/stripepy
└── 4DNFI9GMP2J8
    └── 10000
        └── results.hdf5

3 directories, 1 file
```

When processing larger Hi-C matrices, StripePy can take advantage of multicore processors.

The maximum number of CPU cores used by StripePy can be changed through the `--nproc` option (set to 1 core by default).
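As a rough illustration, the Python sketch below assembles a `stripepy call` command line with an explicit core count. It only reuses options that appear in the log above (`-o` and `--nproc`). The helper name `build_call_command` is hypothetical and not part of StripePy, and the command is printed rather than executed.

```python
import os
import shlex


def build_call_command(matrix, resolution, output_dir, nproc=None):
    """Assemble (but do not run) a `stripepy call` command line.

    Hypothetical helper for illustration only: it is not part of StripePy.
    """
    if nproc is None:
        # os.cpu_count() can return None on some platforms: fall back to 1 core
        nproc = os.cpu_count() or 1
    return [
        "stripepy", "call", str(matrix), str(resolution),
        "-o", str(output_dir),
        "--nproc", str(nproc),
    ]


cmd = build_call_command("4DNFI9GMP2J8.mcool", 10000, "stripepy/", nproc=4)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once StripePy is installed and available in your `PATH`.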

#### 4) Fetch stripes in BEDPE format

The `.hdf5` file produced by `stripepy call` contains various kinds of information, including stripe coordinates, various descriptive statistics, persistence vectors, and more.

While having access to all this information can be useful, usually we are mostly interested in the stripe coordinates, which can be fetched using `stripepy view`.

```console
# Fetch the first 10 stripes in BEDPE format
user@dev:/tmp$ stripepy view stripepy/4DNFI9GMP2J8/10000/results.hdf5 | head

chr1	910000	960000	chr1	930000	3590000
chr1	1060000	1110000	chr1	1080000	3540000
chr1	1400000	1490000	chr1	1430000	3540000
chr1	1600000	1670000	chr1	880000	1620000
chr1	1670000	1700000	chr1	1680000	2610000
chr1	1730000	1780000	chr1	1750000	2570000
chr1	1890000	1940000	chr1	1920000	3540000
chr1	2020000	2060000	chr1	2020000	3550000
chr1	2070000	2120000	chr1	2090000	3540000
chr1	2170000	2230000	chr1	2190000	3500000

# Redirect stdout to a file
user@dev:/tmp$ stripepy view stripepy/4DNFI9GMP2J8/10000/results.hdf5 > stripes.bedpe

# Compress stripes on the fly before writing to a file
user@dev:/tmp$ stripepy view stripepy/4DNFI9GMP2J8/10000/results.hdf5 | gzip -9 > stripes.bedpe.gz
```
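For downstream analysis it can be handy to load the BEDPE records into Python. The following is a minimal sketch using only the standard library. The names `BedpeRecord` and `parse_bedpe` are hypothetical, and interpreting the first interval as the stripe width and the second as the stripe height is an assumption based on the sample output above, not something guaranteed by StripePy.

```python
from typing import NamedTuple


class BedpeRecord(NamedTuple):
    """One line of BEDPE output (first six columns only)."""

    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int

    @property
    def width(self) -> int:
        # Assumption: the first interval is the horizontal extent of the stripe
        return self.end1 - self.start1

    @property
    def height(self) -> int:
        # Assumption: the second interval is the vertical extent of the stripe
        return self.end2 - self.start2


def parse_bedpe(lines):
    """Yield a BedpeRecord for each non-empty, non-comment line."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom1, start1, end1, chrom2, start2, end2 = line.split("\t")[:6]
        yield BedpeRecord(chrom1, int(start1), int(end1), chrom2, int(start2), int(end2))


# Two records copied from the sample output above
sample = [
    "chr1\t910000\t960000\tchr1\t930000\t3590000",
    "chr1\t1600000\t1670000\tchr1\t880000\t1620000",
]
records = list(parse_bedpe(sample))
for record in records:
    print(record.chrom1, record.width, record.height)
```

To parse a file written by `stripepy view`, pass an open file handle instead of a list, e.g. `with open("stripes.bedpe") as f: records = list(parse_bedpe(f))`.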

#### 5) Quickly visualize architectural stripes

It is often a good idea to visually inspect at least some of the stripes to make sure that the parameters passed to `stripepy call` are suitable for the dataset.

We provide a Jupyter notebook ([visualize_stripes_with_highlass.ipynb](utils/visualize_stripes_with_highlass.ipynb)) to facilitate this visual inspection.
The notebook expects the input file to be in `.mcool` format.

If your matrix is in `.hic` format you can easily convert it to `.mcool` format using hictk by running `hictk convert matrix.hic matrix.mcool`.
HiGlass cannot visualize single-resolution Cooler files. If you are working with `.cool` files you can use hictk to generate `.mcool` files by running `hictk zoomify matrix.cool matrix.mcool`.

For more details, please refer to hictk's documentation: [hictk.readthedocs.io](https://hictk.readthedocs.io/en/stable/quickstart_cli.html).
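The choice between the two hictk commands mentioned above depends only on the file extension, which can be sketched in Python as follows. The function name `mcool_conversion_command` is hypothetical, and the sketch only builds the command without running hictk.

```python
from pathlib import Path


def mcool_conversion_command(matrix):
    """Return the hictk command that produces a .mcool file, or None.

    Hypothetical helper for illustration; it mirrors the two commands
    quoted in the text and does not execute anything.
    """
    path = Path(matrix)
    target = str(path.with_suffix(".mcool"))
    suffix = path.suffix.lower()
    if suffix == ".mcool":
        return None  # already a multi-resolution Cooler file
    if suffix == ".hic":
        return ["hictk", "convert", str(path), target]
    if suffix == ".cool":
        return ["hictk", "zoomify", str(path), target]
    raise ValueError(f"unsupported matrix format: {path.name}")


print(mcool_conversion_command("matrix.hic"))
print(mcool_conversion_command("matrix.cool"))
```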

We recommend running the notebook using [JupyterLab](https://jupyter.org/install).

Furthermore, the notebook depends on a few Python packages that can be installed with `pip`.
Please make sure that the following packages are installed in a virtual environment that is accessible from Jupyter. Refer to the [IPython](https://ipython.readthedocs.io/en/stable/install/kernel_install.html) documentation for instructions on how to add a virtual environment to Jupyter.

```bash
pip install 'clodius>=0.20,<1' 'hictkpy>=1,<2' 'higlass-python>=1.2,<2'
```

Next, launch JupyterLab and open the notebook [visualize_stripes_with_highlass.ipynb](utils/visualize_stripes_with_highlass.ipynb).

```bash
jupyter lab
```

Before running the notebook, scroll down to the following cell

```jupyter
mcool = ensure_file_exists("CHANGEME.mcool")
bedpe = ensure_file_exists("CHANGEME.bedpe")
```

and set the `mcool` and `bedpe` variables to the path to the `.mcool` file used to call stripes and the path to the stripe coordinates extracted with `stripepy view`, respectively.

```jupyter
mcool = ensure_file_exists("4DNFI9GMP2J8.mcool")
bedpe = ensure_file_exists("stripes.bedpe")
```

Now you are ready to run all cells.

Running the last cell will display a HiGlass window embedded in the Jupyter notebook (note that the interface may take a while to load).

![HiGlass window](https://github.com/paulsengroup/StripePy/blob/main/docs/assets/4DNFI9GMP2J8_chr2_156mbp_higlass_view.png?raw=true)

## Generating plots

StripePy comes with a `plot` subcommand that can be used to generate various kinds of plots.

`stripepy plot` supports the following subcommands:

- `contact-map` (`cm`): plot stripes and other features over the Hi-C matrix
- `pseudodistribution` (`pd`): plot the pseudo-distribution over the given region of interest
- `stripe-hist` (`hist`): generate and plot the histograms showing the distribution of the stripe heights and widths

`stripepy plot cm` takes as input a Hi-C matrix in `.cool`, `.mcool`, or `.hic` format, and optionally the `.hdf5` file generated by `stripepy call` (this parameter is mandatory when highlighting stripes or stripe seeds).

`stripepy plot pd` and `stripepy plot hist` do not require the Hi-C matrix file, and require the `.hdf5` file generated by `stripepy call` instead.

All three subcommands support specifying a region of interest through the `--region` option.
When the commands are run without specifying a region of interest, `stripepy plot cm` and `stripepy plot pd` generate plots for a random 2.5 Mbp region, while `stripepy plot hist` generates histograms using data from the entire genome.

Example usage:

```bash
# Plot the pseudo-distribution over a region of interest
stripepy plot pd results.hdf5 /tmp/pseudodistribution.png --region chr2:120100000-122100000

# Plot the histograms using genome-wide data
stripepy plot hist results.hdf5 /tmp/stripe_hist_gw.png

# Plot the Hi-C matrix
stripepy plot cm 4DNFI9GMP2J8.mcool 10000 /tmp/matrix.png

# Plot the Hi-C matrix highlighting the stripe seeds
stripepy plot cm 4DNFI9GMP2J8.mcool 10000 /tmp/matrix_with_seeds.png --stripepy-hdf5 results.hdf5 --highlight-seeds

# Plot the Hi-C matrix highlighting the architectural stripes
stripepy plot cm 4DNFI9GMP2J8.mcool 10000 /tmp/matrix_with_stripes.png --stripepy-hdf5 results.hdf5 --highlight-stripes
```
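The `--region` value in the first example follows a `chrom:start-end` pattern. When generating such strings programmatically, a quick sanity check like the sketch below can catch malformed regions before invoking `stripepy plot`. The function `parse_region` is hypothetical, and StripePy's own parser may accept other forms that this sketch rejects.

```python
import re

# Assumption: regions look like "chr2:120100000-122100000", as in the example
_REGION = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(region):
    """Split a chrom:start-end string into (chrom, start, end)."""
    match = _REGION.match(region)
    if match is None:
        raise ValueError(f"malformed region: {region!r}")
    start, end = int(match["start"]), int(match["end"])
    if start >= end:
        raise ValueError(f"region start must be smaller than end: {region!r}")
    return match["chrom"], start, end


chrom, start, end = parse_region("chr2:120100000-122100000")
print(chrom, start, end, f"{(end - start) / 1e6:.1f} Mbp")
```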

Some example plots generated with `stripepy plot` can be found in the file `stripepy-plot-test-images.tar.xz` from [doi.org/10.5281/zenodo.14283921](https://doi.org/10.5281/zenodo.14283921).

## Getting help

For any issues regarding StripePy installation, walkthrough, and output interpretation please open a [discussion](https://github.com/paulsengroup/StripePy/discussions) on GitHub.

If you've found a bug or would like to suggest a new feature, please open a new [issue](https://github.com/paulsengroup/StripePy/issues) instead.

<!--
## Citing

If you use StripePy in your research, please cite the following publication:

TODO

<details>
<summary>BibTex</summary>

```bibtex
@article{stripepy,
  TODO
}
```

</details>
-->

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "stripepy-hic",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "Andrea Raffo <andrea.raffo@ibv.uio.no>, Roberto Rossini <roberros@uio.no>",
    "keywords": "architectural stripe, contact map, cooler, hi-c, hic, stripe, stripe recognition, stripes",
    "author": null,
    "author_email": "Andrea Raffo <andrea.raffo@ibv.uio.no>, Bendik Berg <bendber@ifi.uio.no>, Roberto Rossini <roberros@uio.no>",
    "download_url": "https://files.pythonhosted.org/packages/d4/52/a8dcaa652d0a2cd33e8066bb6984f6e61f6987cbf28cd30716feb9e89017/stripepy_hic-0.0.2.tar.gz",
    "platform": null,
    "description": "<!--\nCopyright (C) 2024 Roberto Rossini <roberros@uio.no>\n\nSPDX-License-Identifier: MIT\n-->\n\n# StripePy\n\n[![License](https://img.shields.io/badge/license-MIT-green)](https://github.com/paulsengroup/StripePy/blob/main/LICENCE)\n[![CI](https://github.com/paulsengroup/StripePy/actions/workflows/ci.yml/badge.svg)](https://github.com/paulsengroup/StripePy/actions/workflows/ci.yml)\n[![Build Dockerfile](https://github.com/paulsengroup/StripePy/actions/workflows/build-dockerfile.yml/badge.svg)](https://github.com/paulsengroup/StripePy/actions/workflows/build-dockerfile.yml)\n[![Zenodo DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.14394042.svg)](https://doi.org/10.5281/zenodo.14394041)\n\n<!--\n[![Download from Bioconda](https://img.shields.io/conda/vn/bioconda/StripePy?label=bioconda&logo=Anaconda)](https://anaconda.org/bioconda/StripePy)\n[![docs](https://readthedocs.org/projects/stripepy/badge/?version=stable)](https://stripepy.readthedocs.io/en/latest/?badge=stable)\n\n-->\n\n---\n\nStripePy is a CLI application written in Python that recognizes architectural stripes found in the interaction matrix files generated by Chromosome Conformation Capture experiments, such as Hi-C and Micro-C.\nMatrix files in `.cool`, `.mcool`, and `.hic` (including `.hic` v9 files) are supported.\n\nStripePy is developed on Linux and macOS and is also tested on Windows.\n\n## Installing StripePy\n\n### Installing with pip\n\n```bash\npip install stripepy-hic\n```\n\n<!--\n\n### Installing with conda\n\n```bash\nconda create -n stripepy -c conda-forge -c bioconda stripepy-hic\n```\n\n-->\n\n### Installing from source\n\nInstructions for Linux and macOS:\n\n```bash\n# create and activate a venv (optional)\npython3 -m venv venv\n. 
venv/bin/activate\n\n# get StripePy source code\ngit clone https://github.com/paulsengroup/StripePy.git\n\n# optional, checkout a specific version\n# git checkout v0.0.2\n\n# install StripePy\ncd StripePy\npip install .\n\n# ensure StripePy is in your PATH\nstripepy --help\n```\n\n<details>\n<summary>Instructions for Windows</summary>\n\n```bash\n# create and activate a venv (optional)\npython3 -m venv venv\nvenv\\Scripts\\activate\n\n# get StripePy source code\ngit clone https://github.com/paulsengroup/StripePy.git\n\n# optional, checkout a specific version\n# git checkout v0.0.2\n\n# install StripePy\ncd StripePy\npip install .\n\n# ensure StripePy is in your PATH\nstripepy --help\n```\n\n</details>\n\n## Running StripePy\n\nStripePy is organized into a few subcommands:\n\n- `stripepy call`: run the stripe detection algorithm and store the identified stripes in a `.hdf5` file.\n- `stripepy view`: take the `result.hdf5` file generated by `stripepy call` and extract stripes in BEDPE format.\n- `stripepy plot`: generate various kinds of plots to inspect the stripes identified by `stripepy call`.\n- `stripepy download`: download a minified sample dataset suitable to quickly test StripePy.\n\n### Walkthrough\n\nThe following is an example of a typical run of StripePy.\nThe steps outlined in this section assume that StripePy is running on a UNIX system.\nSome commands may need some tweaking to run on Windows.\n\n#### 1) Download a sample dataset\n\nThis step is optional.\nFeel free to use your own interaction matrix (make sure the matrix is in `.cool`, `.mcool`, or `.hic` format).\n\n```console\n# This may take a while on slow internet connections\nuser@dev:/tmp$ stripepy download --name 4DNFI9GMP2J8\n\n[2024-12-11 15:25:56,101] INFO: downloading dataset \"4DNFI9GMP2J8\" (assembly=hg38)...\n[2024-12-11 15:25:56,296] INFO: downloaded 0.00/106.84 MB (0.00%)\n[2024-12-11 15:26:11,309] INFO: downloaded 57.53/106.84 MB (53.85%)\n[2024-12-11 15:26:26,312] INFO: downloaded 
86.59/106.84 MB (81.05%)\n[2024-12-11 15:26:35,156] INFO: DONE! Downloading dataset \"4DNFI9GMP2J8\" took 39.06s.\n[2024-12-11 15:26:35,156] INFO: computing MD5 digest for file \"/tmp/4DNFI9GMP2J8.zf9qbdmi\"...\n[2024-12-11 15:26:35,304] INFO: MD5 checksum match!\n[2024-12-11 15:26:35,304] INFO: successfully downloaded dataset \"https://zenodo.org/records/14283922/files/4DNFI9GMP2J8.stripepy.mcool?download=1\" to file \"4DNFI9GMP2J8.mcool\"\n[2024-12-11 15:26:35,305] INFO: file size: 106.84MB. Elapsed time: 39.20s\n```\n\n#### 2) Detect architectural stripes\n\nThis is the core of the analysis and may take several minutes when processing large files.\n\n```console\nuser@dev:/tmp$ stripepy call 4DNFI9GMP2J8.mcool 10000 -o stripepy/\n\n\nArguments:\n--contact-map: 4DNFI9GMP2J8.mcool\n--resolution: 10000\n--normalization: NONE\n--genomic-belt: 5000000\n--roi: None\n--max-width: 100000\n--glob-pers-min: 0.05\n--constrain-heights: False\n--loc-pers-min: 0.33\n--loc-trend-min: 0.25\n--output-folder: stripepy\n--force: False\n--nproc: 1\n\nCHROMOSOME chr1\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.446779727935791 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 1353 to 1304\nNumber of upper-triangular seed sites is reduced from 1217 to 1180\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.3155953884124756 seconds ---\nStep 3: Shape analysis\n...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 0.3037407398223877 seconds ---\nStep 4: Statistical 
analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.059529781341552734 seconds ---\nThis chromosome has taken 0.450955867767334 seconds\n\n\nThe code has run for 2.022001071770986 minutes\n```\n\n<details>\n<summary>Complete log</summary>\n\n```\n\nArguments:\n--contact-map: 4DNFI9GMP2J8.mcool\n--resolution: 10000\n--normalization: NONE\n--genomic-belt: 5000000\n--roi: None\n--max-width: 100000\n--glob-pers-min: 0.05\n--constrain-heights: False\n--loc-pers-min: 0.33\n--loc-trend-min: 0.25\n--output-folder: stripepy\n--force: False\n--nproc: 1\n\nCHROMOSOME chr1\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.446779727935791 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 1353 to 1304\nNumber of upper-triangular seed sites is reduced from 1217 to 1180\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.3155953884124756 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 3.5695345401763916 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 3.528017997741699 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 7.102360963821411 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.9378914833068848 seconds ---\nThis chromosome has taken 9.699488639831543 seconds\n\nCHROMOSOME chr2\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.47316646575927734 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 1504 to 1473\nNumber of upper-triangular seed sites is reduced from 1402 to 1367\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.3005075454711914 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 3.9707601070404053 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 3.9840147495269775 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 7.959795951843262 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 1.0786504745483398 seconds ---\nThis chromosome has taken 10.67337441444397 seconds\n\nCHROMOSOME chr3\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.3940746784210205 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 1173 to 1168\nNumber of upper-triangular seed sites is reduced from 1305 to 1297\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.2587897777557373 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 2.884906768798828 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 3.896686553955078 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 6.78642725944519 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 1.012127161026001 seconds ---\nThis chromosome has taken 9.16284704208374 seconds\n\nCHROMOSOME chr4\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.3355872631072998 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 1152 to 1144\nNumber of upper-triangular seed sites is reduced from 993 to 985\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.27007126808166504 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 2.4963622093200684 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 3.3167238235473633 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 5.8175835609436035 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.8550224304199219 seconds ---\nThis chromosome has taken 7.937693357467651 seconds\n\nCHROMOSOME chr5\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.2822716236114502 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 978 to 967\nNumber of upper-triangular seed sites is reduced from 1365 to 1353\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.2545452117919922 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 2.7692947387695312 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 3.6233880519866943 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 6.397336721420288 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.9528613090515137 seconds ---\nThis chromosome has taken 8.481026411056519 seconds\n\nCHROMOSOME chr6\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.26506567001342773 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 1102 to 1087\nNumber of upper-triangular seed sites is reduced from 986 to 975\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.25983405113220215 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 2.29978084564209 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 3.2286500930786133 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 5.532674789428711 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.8356080055236816 seconds ---\nThis chromosome has taken 7.442054748535156 seconds\n\nCHROMOSOME chr7\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.21904230117797852 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 770 to 735\nNumber of upper-triangular seed sites is reduced from 1114 to 1083\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.22430944442749023 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 1.8583557605743408 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 2.8366761207580566 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 4.698970079421997 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.7353849411010742 seconds ---\nThis chromosome has taken 6.372534275054932 seconds\n\nCHROMOSOME chr8\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.19896483421325684 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 950 to 927\nNumber of upper-triangular seed sites is reduced from 789 to 770\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.20422124862670898 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 1.6131336688995361 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 2.68200421333313 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 4.298713445663452 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.7141485214233398 seconds ---\nThis chromosome has taken 5.876908302307129 seconds\n\nCHROMOSOME chr9\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.13652896881103516 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 602 to 587\nNumber of upper-triangular seed sites is reduced from 690 to 675\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.18230485916137695 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 1.1340200901031494 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 2.0012364387512207 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 3.138490915298462 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.5209352970123291 seconds ---\nThis chromosome has taken 4.350054979324341 seconds\n\nCHROMOSOME chr10\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.1949782371520996 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 855 to 830\nNumber of upper-triangular seed sites is reduced from 891 to 867\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.18785762786865234 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 1.5026066303253174 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 2.675481081008911 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 4.1818461418151855 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.7034149169921875 seconds ---\nThis chromosome has taken 5.699317932128906 seconds\n\nCHROMOSOME chr11\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.1772167682647705 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 909 to 897\nNumber of upper-triangular seed sites is reduced from 1056 to 1044\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.18897652626037598 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 1.6935629844665527 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 3.049086093902588 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 4.746764183044434 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.7810971736907959 seconds ---\nThis chromosome has taken 6.334598541259766 seconds\n\nCHROMOSOME chr12\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.16396474838256836 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 633 to 629\nNumber of upper-triangular seed sites is reduced from 883 to 878\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.2068781852722168 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 1.2998018264770508 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 2.355454206466675 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 3.6587283611297607 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.6075148582458496 seconds ---\nThis chromosome has taken 5.056737422943115 seconds\n\nCHROMOSOME chr13\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.11529994010925293 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 686 to 675\nNumber of upper-triangular seed sites is reduced from 653 to 643\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.15171098709106445 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 0.9743459224700928 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 2.0503764152526855 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 3.027937650680542 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.5435254573822021 seconds ---\nThis chromosome has taken 4.14011549949646 seconds\n\nCHROMOSOME chr14\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.10641026496887207 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 567 to 561\nNumber of upper-triangular seed sites is reduced from 523 to 521\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.14020061492919922 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 0.7857282161712646 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 1.7303550243377686 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 2.519143581390381 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.4577367305755615 seconds ---\nThis chromosome has taken 3.510401725769043 seconds\n\nCHROMOSOME chr15\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.11155319213867188 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 637 to 602\nNumber of upper-triangular seed sites is reduced from 635 to 607\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.12435340881347656 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 0.7276151180267334 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 1.8632171154022217 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 2.593841552734375 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.4837307929992676 seconds ---\nThis chromosome has taken 3.5654518604278564 seconds\n\nCHROMOSOME chr16\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.08726286888122559 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 330 to 299\nNumber of upper-triangular seed sites is reduced from 494 to 454\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.12026333808898926 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 0.4373819828033447 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 1.1490724086761475 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 1.5888426303863525 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.3165297508239746 seconds ---\nThis chromosome has taken 2.3602585792541504 seconds\n\nCHROMOSOME chr17\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.09188055992126465 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 586 to 561\nNumber of upper-triangular seed sites is reduced from 548 to 518\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.11394453048706055 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 0.5639193058013916 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 1.671928882598877 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 2.238788366317749 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.4385089874267578 seconds ---\nThis chromosome has taken 3.1356685161590576 seconds\n\nCHROMOSOME chr18\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.08964848518371582 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 485 to 480\nNumber of upper-triangular seed sites is reduced from 524 to 517\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.11319613456726074 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 0.535980224609375 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 1.5569026470184326 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 2.0957298278808594 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.4030795097351074 seconds ---\nThis chromosome has taken 2.9409568309783936 seconds\n\nCHROMOSOME chr19\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.0656130313873291 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 547 to 543\nNumber of upper-triangular seed sites is reduced from 558 to 555\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.08183503150939941 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 0.42258787155151367 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 1.7407689094543457 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 2.166306495666504 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.4351332187652588 seconds ---\nThis chromosome has taken 2.9370124340057373 seconds\n\nCHROMOSOME chr20\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.07147622108459473 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 501 to 481\nNumber of upper-triangular seed sites is reduced from 411 to 393\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.09188032150268555 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 0.36221981048583984 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 1.3752944469451904 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 1.7400612831115723 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.3595857620239258 seconds ---\nThis chromosome has taken 2.4734344482421875 seconds\n\nCHROMOSOME chr21\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.026866912841796875 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 232 to 210\nNumber of upper-triangular seed sites is reduced from 220 to 206\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.06076812744140625 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 0.12738490104675293 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 0.6408932209014893 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 0.769859790802002 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.167100191116333 seconds ---\nThis chromosome has taken 1.1280708312988281 seconds\n\nCHROMOSOME chr22\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.03299999237060547 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 310 to 285\nNumber of upper-triangular seed sites is reduced from 286 to 272\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.06340456008911133 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 0.1843411922454834 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 0.8908343315124512 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 1.0772550106048584 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.23577547073364258 seconds ---\nThis chromosome has taken 1.5294694900512695 seconds\n\nCHROMOSOME chrX\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.13507843017578125 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 1183 to 1097\nNumber of upper-triangular seed sites is reduced from 1142 to 1054\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.21353983879089355 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 1.9477896690368652 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 2.7261645793914795 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 4.678062200546265 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.6798150539398193 seconds ---\nThis chromosome has taken 6.042936563491821 seconds\n\nCHROMOSOME chrY\nRoI is: None\nStep 1: pre-processing step\n1.1) Log-transformation...\n1.2) Focusing on a neighborhood of the main diagonal...\n1.3) Projection onto [0, 1]...\nExecution time of step 1: 0.00628972053527832 seconds ---\nStep 2: Topological Data Analysis\n2.1) Global 1D pseudo-distributions...\n2.2) Detection of persistent maxima and corresponding minima for lower- and upper-triangular matrices...\n2.2.0) All maxima and their persistence\n2.2.1) Lower triangular part\n2.2.2) Upper triangular part\n2.2.3) Filter out seeds in sparse regions\nNumber of lower-triangular seed sites is reduced from 130 to 97\nNumber of upper-triangular seed sites is reduced from 148 to 112\n2.3) Storing into a list of Stripe objects...\nExecution time of step 2: 0.05667591094970703 seconds ---\nStep 3: Shape analysis\n3.1) Width estimation\n3.1.1) Estimating widths (equiv. HIoIs, where HIoI stands for Horizontal Interval of Interest)...\n3.1.2) Updating list of Stripe objects with HIoIs...\nExecution time: 0.07381796836853027 seconds ---\n3.2) Height estimation\n3.2.1) Estimating heights (equiv. 
VIoIs, where VIoI stands for Vertical Interval of Interest)...\n3.2.2) Updating list of Stripe objects with VIoIs...\nExecution time: 0.2287464141845703 seconds ---\n3.5) Saving geometric descriptors...\n3.6) Bar plots of widths and heights...\nExecution time of step 3: 0.3037407398223877 seconds ---\nStep 4: Statistical analysis and post-processing\n4.1) Computing and saving biological descriptors\nExecution time of step 4: 0.059529781341552734 seconds ---\nThis chromosome has taken 0.450955867767334 seconds\n\n\nThe code has run for 2.022001071770986 minutes\n```\n\n</details>\n\nRunning the above command produces the following output:\n\n```\n/tmp/stripepy\n\u2514\u2500\u2500 4DNFI9GMP2J8\n    \u2514\u2500\u2500 10000\n        \u2514\u2500\u2500 results.hdf5\n\n3 directories, 1 file\n```\n\nWhen processing larger Hi-C matrices, StripePy can take advantage of multicore processors.\n\nThe maximum number of CPU cores used by StripePy can be changed through the `--nproc` option (set to 1 core by default).\n\n#### 4) Fetch stripes in BEDPE format\n\nThe `.hdf5` file produced by `stripepy call` contains various kinds of information, including stripe coordinates, various descriptive statistics, persistence vectors, and more.\n\nWhile having access to all this information can be useful, usually we are mostly interested in the stripe coordinates, which can be fetched using `stripepy view`.\n\n```console\n# Fetch the first 10 stripes in BEDPE format\nuser@dev:/tmp$ stripepy view stripepy/4DNFI9GMP2J8/10000/results.hdf5 | 
head\n\nchr1\t910000\t960000\tchr1\t930000\t3590000\nchr1\t1060000\t1110000\tchr1\t1080000\t3540000\nchr1\t1400000\t1490000\tchr1\t1430000\t3540000\nchr1\t1600000\t1670000\tchr1\t880000\t1620000\nchr1\t1670000\t1700000\tchr1\t1680000\t2610000\nchr1\t1730000\t1780000\tchr1\t1750000\t2570000\nchr1\t1890000\t1940000\tchr1\t1920000\t3540000\nchr1\t2020000\t2060000\tchr1\t2020000\t3550000\nchr1\t2070000\t2120000\tchr1\t2090000\t3540000\nchr1\t2170000\t2230000\tchr1\t2190000\t3500000\n\n# Redirect stdout to a file\nuser@dev:/tmp$ stripepy view stripepy/4DNFI9GMP2J8/10000/results.hdf5 > stripes.bedpe\n\n# Compress stripes on the fly before writing to a file\nuser@dev:/tmp$ stripepy view stripepy/4DNFI9GMP2J8/10000/results.hdf5 | gzip -9 > stripes.bedpe.gz\n```\n\n#### 5) Quickly visualize architectural stripes\n\nIt is often a good idea to visually inspect at least some of the stripes to make sure that the used parameters are suitable for the dataset that was given to `stripepy call`.\n\nWe provide a Jupyter notebook ([visualize_stripes_with_highlass.ipynb](utils/visualize_stripes_with_highlass.ipynb)) to facilitate this visual inspection.\nThe notebook expects the input file to be in `.mcool` format.\n\nIf your matrix is in `.hic` format you can easily convert it to `.mcool` format using hictk by running `hictk convert matrix.hic matrix.mcool`.\nHiGlass cannot visualize single-resolution Cooler files. If you are working with `.cool` files you can use hictk to generate `.mcool` files by running `hictk zoomify matrix.cool matrix.mcool`.\n\nFor more details, please refer to hictk's documentation: [hictk.readthedocs.io](https://hictk.readthedocs.io/en/stable/quickstart_cli.html).\n\nWe recommend running the notebook using [JupyterLab](https://jupyter.org/install).\n\nFurthermore, the notebook depends on a few Python packages that can be installed with `pip`.\nPlease make sure that the following packages are installed in a virtual environment that is accessible from Jupyter. 
Refer to [IPython](https://ipython.readthedocs.io/en/stable/install/kernel_install.html) documentation for instructions on how to add a virtual environment to Jupyter.\n\n```bash\npip install 'clodius>=0.20,<1' 'hictkpy>=1,<2' 'higlass-python>=1.2,<2'\n```\n\nNext, launch JupyterLab and open notebook [visualize_stripes_with_highlass.ipynb](utils/visualize_stripes_with_highlass.ipynb).\n\n```bash\njupyter lab\n```\n\nBefore running the notebook, scroll down to the following cell\n\n```jupyter\nmcool = ensure_file_exists(\"CHANGEME.mcool\")\nbedpe = ensure_file_exists(\"CHANGEME.bedpe\")\n```\n\nand set the `mcool` and `bedpe` variables to the path to the `.mcool` file used to call stripes and the path to the stripe coordinates extracted with `stripepy view`, respectively.\n\n```jupyter\nmcool = ensure_file_exists(\"4DNFI9GMP2J8.mcool\")\nbedpe = ensure_file_exists(\"stripes.bedpe\")\n```\n\nNow you are ready to run all cells.\n\nRunning the last cell will display a HiGlass window embedded in the Jupyter notebook (note that the interface may take a while to load).\n\n![HiGlass window](https://github.com/paulsengroup/StripePy/blob/main/docs/assets/4DNFI9GMP2J8_chr2_156mbp_higlass_view.png?raw=true)\n\n## Generating plots\n\nStripePy comes with a `plot` subcommand that can be used to generate various kinds of plots.\n\n`stripepy plot` supports the following subcommands:\n\n- `contact-map` (`cm`): plot stripes and other features over the Hi-C matrix\n- `pseudodistribution` (`pd`): plot the pseudo-distribution over the given region of interest\n- `stripe-hist` (`hist`): generate and plot the histograms showing the distribution of the stripe heights and widths\n\n`stripepy cm` takes as input a Hi-C matrix in `.cool`, `.mcool`, or `.hic` format, and optionally the `.hdf5` file generated by `stripepy call` (this parameter is mandatory when highlighting stripes or stripe seeds).\n\n`stripepy pd` and `stripepy hist` do not require the Hi-C matrix file, and require the `.hdf5` 
file generated by `stripepy call` instead.\n\nAll three subcommands support specifying a region of interest through the `--region` option.\nWhen the commands are run without specifying the region of interest, `stripepy cm` and `stripepy pd` will generate plots for a random 2.5 Mbp region, while `stripepy hist` will generate histograms using data from the entire genome.\n\nExample usage:\n\n```bash\n# Plot the pseudo-distribution over a region of interest\nstripepy plot pd results.hdf5 /tmp/pseudodistribution.png --region chr2:120100000-122100000\n\n# Plot the histograms using genome-wide data\nstripepy plot hist results.hdf5 /tmp/stripe_hist_gw.png\n\n# Plot the Hi-C matrix\nstripepy plot cm 4DNFI9GMP2J8.mcool 10000 /tmp/matrix.png\n\n# Plot the Hi-C matrix highlighting the stripe seeds\nstripepy plot cm 4DNFI9GMP2J8.mcool 10000 /tmp/matrix_with_seeds.png --stripepy-hdf5 results.hdf5 --highlight-seeds\n\n# Plot the Hi-C matrix highlighting the architectural stripes\nstripepy plot cm 4DNFI9GMP2J8.mcool 10000 /tmp/matrix_with_stripes.png --stripepy-hdf5 results.hdf5 --highlight-stripes\n```\n\nSome example plots generated with `stripepy plot` can be found in the file `stripepy-plot-test-images.tar.xz`, available from [doi.org/10.5281/zenodo.14283921](https://doi.org/10.5281/zenodo.14283921).\n\n## Getting help\n\nFor any issues regarding StripePy installation, walkthrough, and output interpretation, please open a [discussion](https://github.com/paulsengroup/StripePy/discussions) on GitHub.\n\nIf you've found a bug or would like to suggest a new feature, please open a new [issue](https://github.com/paulsengroup/StripePy/issues) instead.\n\n<!--\n## Citing\n\nIf you use StripePy in your research, please cite the following publication:\n\nTODO\n\n<details>\n<summary>BibTex</summary>\n\n```bibtex\n@article{stripepy,\n  TODO\n}\n```\n\n</details>\n-->\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "StripePy recognizes architectural stripes in 3C and Hi-C contact maps using geometric reasoning",
    "version": "0.0.2",
    "project_urls": {
        "Homepage": "https://github.com/paulsengroup/StripePy",
        "Issues": "https://github.com/paulsengroup/StripePy/issues",
        "Repository": "https://github.com/paulsengroup/StripePy.git",
        "Source": "https://github.com/paulsengroup/StripePy"
    },
    "split_keywords": [
        "architectural stripe",
        " contact map",
        " cooler",
        " hi-c",
        " hic",
        " stripe",
        " stripe recognition",
        " stripes"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4d1668bf107b2b12d3f5751f39bd1f1b0882290cecfd6832dc0a5c94cdc85943",
                "md5": "8fc75599098e52eab4621ce366dcc263",
                "sha256": "bdbd7fbbe181822d9b5055a8a6f38fe64d20355f9030f76444a7f9b14fdb8d98"
            },
            "downloads": -1,
            "filename": "stripepy_hic-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "8fc75599098e52eab4621ce366dcc263",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 60427,
            "upload_time": "2024-12-20T14:25:52",
            "upload_time_iso_8601": "2024-12-20T14:25:52.430815Z",
            "url": "https://files.pythonhosted.org/packages/4d/16/68bf107b2b12d3f5751f39bd1f1b0882290cecfd6832dc0a5c94cdc85943/stripepy_hic-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d452a8dcaa652d0a2cd33e8066bb6984f6e61f6987cbf28cd30716feb9e89017",
                "md5": "73249f2256d8440f62d361171d44d5dc",
                "sha256": "4e0e14f365334095b9af04874b01bae25fb385560add5f6f0bc91b35f757e556"
            },
            "downloads": -1,
            "filename": "stripepy_hic-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "73249f2256d8440f62d361171d44d5dc",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 477426,
            "upload_time": "2024-12-20T14:25:56",
            "upload_time_iso_8601": "2024-12-20T14:25:56.204013Z",
            "url": "https://files.pythonhosted.org/packages/d4/52/a8dcaa652d0a2cd33e8066bb6984f6e61f6987cbf28cd30716feb9e89017/stripepy_hic-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-20 14:25:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "paulsengroup",
    "github_project": "StripePy",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "stripepy-hic"
}
        