pathopatch


Name: pathopatch
Version: 1.0.4b0
Home page: https://github.com/TIO-IKIM/PathoPatcher
Summary: PathoPatch - Accelerating Artificial Intelligence Based Whole Slide Image Analysis with an Optimized Preprocessing Pipeline
Upload time: 2024-04-05 14:35:06
Maintainer: None
Docs URL: None
Author: Fabian Hörst
Requires Python: >=3.9
License: None
Keywords: python, pathopatch
Requirements: Pillow, PyYAML, Shapely, colorama, future, geojson, matplotlib, natsort, numpy, opencv_python_headless, openslide_python, pandas, pydantic, rasterio, requests, scikit-image, setuptools, tqdm, torchvision, torch, wsidicom, wsidicomizer

[![Python 3.10](https://img.shields.io/badge/python-3.10-blue.svg)](https://www.python.org/downloads/release/python-360/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
![Test-Results](https://github.com/TIO-IKIM/PathoPatcher/actions/workflows/test_build.yml/badge.svg)
<img src="https://img.shields.io/badge/PyTorch-EE4C2C?style=flat-square&logo=Pytorch&logoColor=white"/>

___

# PathoPatch:
## Accelerating Artificial Intelligence Based Whole Slide Image Analysis with an Optimized Preprocessing Pipeline
---

## Installation

1. Clone the repository: `git clone https://github.com/TIO-IKIM/PathoPatcher.git`
2. Create a conda environment with Python 3.10.12 and install the conda requirements: `conda env create -f environment.yaml -vv`. You can change the environment name by editing the `name` tag in the `environment.yaml` file.
This step is necessary because `OpenSlide` must be installed together with its binary files, which is easier with conda. Otherwise, OpenSlide needs to be installed from [source](https://openslide.org/api/python/) and the remaining packages installed with pip.
3. Activate environment: `conda activate pathopatch_env`
4. **Optional: cuCIM**
Run `conda install -c rapidsai cucim` inside your conda environment. This process is time-consuming, so please be patient. If any problems occur, follow their [official guideline](https://github.com/rapidsai/cucim).

### Development
1. Install pre-commit with `pre-commit install`


## Usage
Our pre-processing pipeline extracts square patches from detected tissue areas, loads annotation files (`.json`) and applies color normalizations. We make use of the popular [OpenSlide](https://openslide.org/) library, and extend it with the [RAPIDS cuCIM](https://github.com/rapidsai/cucim) framework to speed up patch extraction.


The CLI of the main script for patch extraction ([main_extraction](preprocessing/main_extraction.py)) is as follows:

```bash
python3 main_extraction.py [-h]
                          [--wsi_paths WSI_PATHS]
                          [--wsi_filelist WSI_FILELIST]
                          [--output_path OUTPUT_PATH]
                          [--wsi_extension {svs}]
                          [--config CONFIG]
                          [--patch_size PATCH_SIZE]
                          [--patch_overlap PATCH_OVERLAP]
                          [--target_mpp TARGET_MPP]
                          [--target_mag TARGET_MAG]
                          [--downsample DOWNSAMPLE]
                          [--level LEVEL]
                          [--context_scales [CONTEXT_SCALES ...]]
                          [--check_resolution CHECK_RESOLUTION]
                          [--processes PROCESSES]
                          [--overwrite]
                          [--annotation_paths ANNOTATION_PATHS]
                          [--annotation_extension {json,xml}]
                          [--incomplete_annotations]
                          [--label_map_file LABEL_MAP_FILE]
                          [--save_only_annotated_patches]
                          [--save_context_without_mask]
                          [--exclude_classes EXCLUDE_CLASSES]
                          [--store_masks]
                          [--overlapping_labels]
                          [--normalize_stains]
                          [--normalization_vector_json NORMALIZATION_VECTOR_JSON]
                          [--min_intersection_ratio MIN_INTERSECTION_RATIO]
                          [--tissue_annotation TISSUE_ANNOTATION]
                          [--tissue_annotation_intersection_ratio TISSUE_ANNOTATION_INTERSECTION_RATIO]
                          [--masked_otsu]
                          [--otsu_annotation OTSU_ANNOTATION]
                          [--filter_patches FILTER_PATCHES]
                          [--apply_prefilter APPLY_PREFILTER]
                          [--log_path LOG_PATH]
                          [--log_level {critical,error,warning,info,debug}]
                          [--hardware_selection {cucim,openslide}]
                          [--wsi_properties DICT]

optional arguments:
  -h, --help            show this help message and exit
  --wsi_paths WSI_PATHS
                        Path to the folder where all WSI are stored or path to a single WSI-file. (default: None)
  --wsi_filelist WSI_FILELIST
                        Path to a csv-filelist with WSI files (separator: `,`), if provided just these files are
                        used. Must include full paths to WSIs, including suffixes. Can be used as a replacement for
                        the wsi_paths option. If both are provided, yields an error. (default: None)
  --output_path OUTPUT_PATH
                        Path to the folder where the resulting dataset should be stored. (default: None)
  --wsi_extension {svs}
                        The extension types used for the WSI files, the options are: ['svs'] (default: None)
  --config CONFIG       Path to a config file. The config file can hold the same parameters as the CLI. Parameters
                        provided with the CLI always take precedence over the parameters in the config file.
                        (default: None)
  --patch_size PATCH_SIZE
                        The size of the patches in pixel that will be retrieved from the WSI, e.g. 256 for 256px
                        (default: None)
  --patch_overlap PATCH_OVERLAP
                        The percentage of pixels that should overlap between two different patches. Please
                        provide as integer between 0 and 100, indicating overlap in percentage. (default: None)
  --downsample DOWNSAMPLE
                        Each WSI level is downsampled by a factor of 2, downsample expresses which kind of
                        downsampling should be used with respect to the highest possible resolution. Medium
                        priority, gets overwritten by target_mag if provided, but overwrites level. (default:
                        None)
  --target_mpp TARGET_MPP
                        If this parameter is provided, the output level of the WSI corresponds to the level that
                        is at the target microns per pixel of the WSI. Alternative to target_mag, downsample and level.
                        Highest priority,
                        overwrites target_mag, downsample and level if provided. (default: None)
  --target_mag TARGET_MAG
                        If this parameter is provided, the output level of the WSI corresponds to the level that
                        is at the target magnification of the WSI. Alternative to target_mpp, downsample and level.
                        High priority, just target_mpp has a higher priority,
                        overwrites downsample and level if provided. (default: None)
  --level LEVEL         The tile level for sampling, alternative to downsample. Lowest priority, gets overwritten
                        by target_mag and downsample if they are provided. (default: None)
  --context_scales [CONTEXT_SCALES ...]
                        Define context scales for context patches. Context patches are centered around a central
                        patch. The context-patch size is equal to the patch-size, but downsampling is different
                        (default: None)
  --check_resolution CHECK_RESOLUTION
                        If a float value is supplied, the program checks whether the resolution of all images
                        corresponds to the given value (default: None)
  --processes PROCESSES
                        The number of processes to use. (default: None)
  --overwrite           Overwrite the patches that have already been created in case they already exist. Removes
                        dataset. Handle with care! (default: None)
  --annotation_paths ANNOTATION_PATHS
                        Path to the subfolder where the XML/JSON annotations are stored or path to a file
                        (default: None)
  --annotation_extension {json,xml}
                        The extension types used for the annotation files, the options are: ['json', 'xml']
                        (default: None)
  --incomplete_annotations
                        Set to allow WSI without annotation file (default: None)
  --label_map_file LABEL_MAP_FILE
                        The path to a json file that contains the mapping between the annotation labels and some
                        integers; an example can be found in examples (default: None)
  --save_only_annotated_patches
                        If true only patches containing annotations will be stored (default: None)
  --save_context_without_mask
                        This is helpful for extracting patches, that are not within a mask, but needed for the
                        Valuing Vicinity Segmentation Algorithms. This flag is specifically helpful if only fully annotated
                        patches should be extracted from a region of interest and their masks are stored,
                        but also surrounding neighbourhood patches are needed (default: False)
  --exclude_classes EXCLUDE_CLASSES
                        Can be used to exclude annotation classes (default: None)
  --store_masks         Set to store masks per patch. Defaults to false (default: None)
  --overlapping_labels  Per default, labels (annotations) are mutually exclusive. If labels overlap, they are
                        overwritten according to the label_map.json ordering (highest number = highest priority)
                        (default: None)
  --normalize_stains    Uses Macenko normalization on a portion of the whole slide image (default: None)
  --normalization_vector_json NORMALIZATION_VECTOR_JSON
                        The path to a JSON file where the normalization vectors are stored (default: None)
  --adjust_brightness   Normalize brightness in a batch by clipping to 90 percent. Not recommended, but kept for legacy reasons (default: None)
  --min_intersection_ratio MIN_INTERSECTION_RATIO
                        The minimum intersection between the tissue mask and the patch. Must be between 0 and 1. 0
                        means that all patches are extracted. (default: None)
  --tissue_annotation TISSUE_ANNOTATION
                        Can be used to name a polygon annotation to determine the tissue area. If a tissue
                        annotation is provided, no Otsu-thresholding is performed (default: None)
  --tissue_annotation_intersection_ratio TISSUE_ANNOTATION_INTERSECTION_RATIO
                        Intersection ratio with tissue annotation. Helpful, if ROI annotation is passed,
                        which should not interfere with background ratio. If not provided,
                        the default min_intersection_ratio with the background is used. (default: None)
  --masked_otsu         Use annotation to mask the thumbnail before otsu-thresholding is used (default: None)
  --otsu_annotation OTSU_ANNOTATION
                        Can be used to name a polygon annotation to determine the area for masked otsu
                        thresholding. Separate multiple labels with ' ' (whitespace) (default: None)
  --filter_patches FILTER_PATCHES
                        Post-extraction patch filtering to sort out artefacts, marker and other non-tissue patches with a DL model. Time consuming.
                        (default: False)
  --apply_prefilter APPLY_PREFILTER
                        Pre-extraction mask filtering to remove marker from mask before applying otsu
                        (default: False)
  --log_path LOG_PATH   Path where log files should be stored. Otherwise, log files are stored in the output
                        folder (default: None)
  --log_level {critical,error,warning,info,debug}
                        Set the logging level. Options are ['critical', 'error', 'warning', 'info', 'debug']
                        (default: None)
  --hardware_selection {cucim,openslide}
                        Select hardware device (just if available, otherwise always cucim). Defaults to cucim.
  --wsi_properties WSI_PROPERTIES
                        Can be used to pass the wsi properties manually, but just applies if metadata cannot be derived from OpenSlide (e.g., for .tiff files). Supported keys are slide_mpp and magnification
                        (default: None)
```
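
The `--patch_overlap` and `--downsample` options in the help text above can be related to pixel strides and pyramid levels. The following is a minimal sketch, not the PathoPatch implementation: the function names are hypothetical, and the rounding of the overlap and the restriction to power-of-two downsample factors are assumptions based only on the statements that the overlap is a percentage between 0 and 100 and that each WSI level is downsampled by a factor of 2.

```python
import math


def stride_from_overlap(patch_size: int, patch_overlap: int) -> int:
    """Step in pixels between neighbouring patches for an overlap given in percent.

    Assumption: the overlapping pixel count is rounded down.
    """
    if not 0 <= patch_overlap < 100:
        # 100 percent overlap would give a stride of zero.
        raise ValueError("patch_overlap must be at least 0 and below 100")
    overlap_px = math.floor(patch_size * patch_overlap / 100)
    return patch_size - overlap_px


def level_from_downsample(downsample: int) -> int:
    """Pyramid level for a downsample factor, assuming each level halves the resolution."""
    if downsample < 1 or downsample & (downsample - 1) != 0:
        raise ValueError("downsample must be a positive power of two")
    # For a power of two, bit_length() - 1 equals log2(downsample).
    return downsample.bit_length() - 1
```

With these assumptions, a 256 px patch with 50 percent overlap advances by 128 px, and a downsample factor of 32 corresponds to level 5.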

**Label-Map**:

An exemplary `label_map.json` file is shown below. The background label must always be assigned the integer value 0.

Example:
```json
{
    "Background": 0,
    "Tissue-Annotation": 1,
    "Tumor": 2,
    "Stroma": 3,
    "Necrosis": 4
}
```
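
A label map like the one above can be checked before starting an extraction. This is a minimal sketch under stated assumptions: `load_label_map` and `paint_order` are hypothetical helper names that are not part of PathoPatch, and the background label is assumed to be recognisable by the name "Background" (compared case-insensitively), as in the example.

```python
import json
from typing import Dict, List


def load_label_map(text: str) -> Dict[str, int]:
    """Parse a label map and check the rule that background maps to 0."""
    label_map = json.loads(text)
    for name, value in label_map.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"label {name!r} must map to an integer, got {value!r}")
        if name.lower() == "background" and value != 0:
            raise ValueError("the background label must be assigned 0")
    return label_map


def paint_order(label_map: Dict[str, int]) -> List[str]:
    """Labels sorted ascending, so higher numbers come last and overwrite earlier ones."""
    return sorted(label_map, key=label_map.get)
```

`paint_order` mirrors the rule stated for `--overlapping_labels` (highest number = highest priority): drawing the labels in this order lets the label with the highest integer win where annotations overlap.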
**Precedence of Target-MPP, Target-Magnification, Downsampling and Level**

`target_mpp` has the highest priority: if all four options are passed, the target microns per pixel determines the output resolution. `level` has the lowest priority.
Sorted by priority:

- Target microns per pixel: Overwrites all other selections
- Target magnification: Overwrites downsampling and level
- Downsampling: Overwrites level
- Level: Lowest priority, used only when none of target microns per pixel, target magnification or downsampling is passed
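
The priority list above can be sketched as a simple first-match lookup. The function name `select_resolution_option` is hypothetical and not part of PathoPatch; the sketch only assumes that an option that was not passed is `None`, which matches the defaults shown in the help text.

```python
from typing import Optional, Tuple, Union

Number = Union[int, float]


def select_resolution_option(
    target_mpp: Optional[float] = None,
    target_mag: Optional[float] = None,
    downsample: Optional[int] = None,
    level: Optional[int] = None,
) -> Optional[Tuple[str, Number]]:
    """Return the (name, value) of the option that wins, or None if none was passed."""
    # Ordered from highest to lowest priority.
    candidates = (
        ("target_mpp", target_mpp),
        ("target_mag", target_mag),
        ("downsample", downsample),
        ("level", level),
    )
    for name, value in candidates:
        if value is not None:
            return name, value
    return None
```

For example, passing both `downsample=4` and `level=2` selects the downsample factor, while adding `target_mpp=0.25` overrides both.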

### CLI

A CLI is used to start the preprocessing. The entry point is the [main_extraction.py](main_extraction.py) file. In addition to the CLI options, a configuration file can be passed via:
```bash
python3 pathopatcher/main_extraction.py --config path/to/config.yaml
```
Exemplary configuration file: [patch_extraction.yaml](examples/patch_extraction.yaml)
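
According to the help text for `--config`, parameters given on the command line take precedence over those in the configuration file. A minimal sketch of such a merge follows; `merge_config` is a hypothetical helper, not the PathoPatch implementation, and it assumes that a CLI value of `None` means the option was not passed (the default shown for most options in the help text).

```python
from typing import Any, Dict


def merge_config(config: Dict[str, Any], cli_args: Dict[str, Any]) -> Dict[str, Any]:
    """Combine config-file values with CLI values; CLI values win when they were passed."""
    merged = dict(config)  # copy so the caller's config is left untouched
    for key, value in cli_args.items():
        if value is not None:
            merged[key] = value
    return merged
```

Options whose documented default is `False` rather than `None` (for example `--filter_patches`) would need separate handling, since this sketch cannot tell an explicit `False` from an omitted flag.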



### Resulting Dataset Structure
The aim of pre-processing is to create one dataset per WSI. In general, the folder structure of a preprocessed dataset looks like this:
```bash
WSI_Name
├── annotation_masks      # thumbnails of extracted annotation masks
│   ├── all_overlaid.png  # all with same dimension as the thumbnail
│   ├── tumor.png
│   └── ...  
├── context               # context patches, if extracted
│   ├── 2                 # subfolder for each scale
│   │   ├── WSI_Name_row1_col1_context_2.png
│   │   ├── WSI_Name_row2_col1_context_2.png
│   │   └── ...
│   └── 4
│       ├── WSI_Name_row1_col1_context_4.png
│       ├── WSI_Name_row2_col1_context_4.png
│       └── ...
├── masks                 # Mask (numpy) files for each patch -> optional folder for segmentation
│   ├── WSI_Name_row1_col1.npy
│   ├── WSI_Name_row2_col1.npy
│   └── ...
├── metadata              # Metadata files for each patch
│   ├── WSI_Name_row1_col1.yaml
│   ├── WSI_Name_row2_col1.yaml
│   └── ...
├── patches               # Patches as .png files
│   ├── WSI_Name_row1_col1.png
│   ├── WSI_Name_row2_col1.png
│   └── ...
├── thumbnails            # Different kind of thumbnails
│   ├── thumbnail_mpp_5.png
│   ├── thumbnail_downsample_32.png
│   └── ...
├── tissue_masks          # Tissue mask images for checking
│   ├── mask.png          # all with same dimension as the thumbnail
│   ├── mask_nogrid.png
│   └── tissue_grid.png
├── mask.png              # tissue mask with green grid
├── metadata.yaml         # WSI metadata for patch extraction
├── patch_metadata.json   # Patch metadata of WSI merged in one file
└── thumbnail.png         # WSI thumbnail
```
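
Because patches, per-patch metadata and masks share the same file stem in the structure above, they can be paired by name. The sketch below uses only the standard library; `iter_patches` is a hypothetical helper that is not part of PathoPatch, and it assumes the folder and file naming shown in the tree.

```python
from pathlib import Path
from typing import Iterator, Optional, Tuple


def iter_patches(wsi_dir: Path) -> Iterator[Tuple[Path, Path, Optional[Path]]]:
    """Yield (patch, metadata, mask) paths for one preprocessed WSI folder.

    The mask entry is None when no mask file exists for a patch, since the
    masks folder is optional.
    """
    for patch in sorted((wsi_dir / "patches").glob("*.png")):
        metadata = wsi_dir / "metadata" / f"{patch.stem}.yaml"
        mask = wsi_dir / "masks" / f"{patch.stem}.npy"
        yield patch, metadata, (mask if mask.exists() else None)
```

A caller would then load each file with a library of choice, for example an image reader for the `.png` patch and a YAML parser for the metadata file.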

## License
<p xmlns:cc="http://creativecommons.org/ns#" xmlns:dct="http://purl.org/dc/terms/"><a property="dct:title" rel="cc:attributionURL" href="https://github.com/TIO-IKIM/PathoPatcher">PathoPatcher</a> by <a rel="cc:attributionURL dct:creator" property="cc:attributionName" href="https://github.com/FabianHoerst">Fabian Hörst, University Hospital Essen,</a> is licensed under <a href="http://creativecommons.org/licenses/by-nc-sa/4.0/?ref=chooser-v1" target="_blank" rel="license noopener noreferrer" style="display:inline-block;">CC BY-NC-SA 4.0<img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/cc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/by.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/nc.svg?ref=chooser-v1"><img style="height:22px!important;margin-left:3px;vertical-align:text-bottom;" src="https://mirrors.creativecommons.org/presskit/icons/sa.svg?ref=chooser-v1"></a></p>

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/TIO-IKIM/PathoPatcher",
    "name": "pathopatch",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "python, pathopatch",
    "author": "Fabian H\u00f6rst",
    "author_email": "fabian.hoerst@uk-essen.de",
    "download_url": "https://files.pythonhosted.org/packages/d8/8f/246eeb006e8628ab8cc4371428f55f0fed60bb4f655b127b66711d5b1ef2/pathopatch-1.0.4b0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "PathoPatch - Accelerating Artificial Intelligence Based Whole Slide Image Analysis with an Optimized Preprocessing Pipeline",
    "version": "1.0.4b0",
    "project_urls": {
        "Homepage": "https://github.com/TIO-IKIM/PathoPatcher"
    },
    "split_keywords": [
        "python",
        " pathopatch"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "aedf05b100d7bc121ff91f3c56b39bc840a02077926a13d74ee6c6b7f74728c5",
                "md5": "f5dfd648ee16a461a71c276b5be9da13",
                "sha256": "c98a30d882d738f8e748fd92556619afce5836ed3da501c0a7953b06c5e80857"
            },
            "downloads": -1,
            "filename": "pathopatch-1.0.4b0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f5dfd648ee16a461a71c276b5be9da13",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 5819644,
            "upload_time": "2024-04-05T14:35:04",
            "upload_time_iso_8601": "2024-04-05T14:35:04.155593Z",
            "url": "https://files.pythonhosted.org/packages/ae/df/05b100d7bc121ff91f3c56b39bc840a02077926a13d74ee6c6b7f74728c5/pathopatch-1.0.4b0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d88f246eeb006e8628ab8cc4371428f55f0fed60bb4f655b127b66711d5b1ef2",
                "md5": "1d24a1b705f0eb082e1474d662c07d51",
                "sha256": "8ae58be3a2e0e0557915d4dd2b587d3c39f110a95bc846f02a284ee953aad8b0"
            },
            "downloads": -1,
            "filename": "pathopatch-1.0.4b0.tar.gz",
            "has_sig": false,
            "md5_digest": "1d24a1b705f0eb082e1474d662c07d51",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 5814871,
            "upload_time": "2024-04-05T14:35:06",
            "upload_time_iso_8601": "2024-04-05T14:35:06.696950Z",
            "url": "https://files.pythonhosted.org/packages/d8/8f/246eeb006e8628ab8cc4371428f55f0fed60bb4f655b127b66711d5b1ef2/pathopatch-1.0.4b0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-05 14:35:06",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "TIO-IKIM",
    "github_project": "PathoPatcher",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "Pillow",
            "specs": [
                [
                    ">=",
                    "9.5.0"
                ]
            ]
        },
        {
            "name": "PyYAML",
            "specs": []
        },
        {
            "name": "Shapely",
            "specs": [
                [
                    "==",
                    "1.8.5.post1"
                ]
            ]
        },
        {
            "name": "colorama",
            "specs": []
        },
        {
            "name": "future",
            "specs": []
        },
        {
            "name": "geojson",
            "specs": [
                [
                    ">=",
                    "3.0.0"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": []
        },
        {
            "name": "natsort",
            "specs": []
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "<",
                    "1.24"
                ],
                [
                    ">",
                    "1.22"
                ]
            ]
        },
        {
            "name": "opencv_python_headless",
            "specs": []
        },
        {
            "name": "openslide_python",
            "specs": []
        },
        {
            "name": "pandas",
            "specs": []
        },
        {
            "name": "pydantic",
            "specs": [
                [
                    "==",
                    "1.10.4"
                ]
            ]
        },
        {
            "name": "rasterio",
            "specs": [
                [
                    "==",
                    "1.3.5.post1"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": []
        },
        {
            "name": "scikit-image",
            "specs": []
        },
        {
            "name": "setuptools",
            "specs": [
                [
                    "<=",
                    "65.6.3"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": []
        },
        {
            "name": "torchvision",
            "specs": []
        },
        {
            "name": "torch",
            "specs": []
        },
        {
            "name": "wsidicom",
            "specs": [
                [
                    "==",
                    "0.20.4"
                ]
            ]
        },
        {
            "name": "wsidicomizer",
            "specs": [
                [
                    "==",
                    "0.13.2"
                ]
            ]
        }
    ],
    "lcname": "pathopatch"
}
        