histoslice

Name: histoslice
Version: 0.1.0
Summary: Read and process histological slide images with python!
Upload time: 2025-08-20 09:13:54
Requires Python: >=3.10
Keywords: wsi, whole slide imaging, histology, image processing, slide

<div align="center">

# HistoSlice

[![PyPI - Version](https://img.shields.io/pypi/v/histoslice)](https://pypi.org/project/histoslice/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/histoslice)](https://pypi.org/project/histoslice/)
[![GitHub License](https://img.shields.io/github/license/rmuraix/HistoSlice)](./LICENSE)
[![Check](https://github.com/rmuraix/HistoSlice/actions/workflows/check.yaml/badge.svg)](https://github.com/rmuraix/HistoSlice/actions/workflows/check.yaml)
[![Docs](https://github.com/rmuraix/HistoSlice/actions/workflows/docs.yaml/badge.svg)](https://github.com/rmuraix/HistoSlice/actions/workflows/docs.yaml)
[![codecov](https://codecov.io/github/rmuraix/HistoSlice/graph/badge.svg?token=NDSf4tDhzF)](https://codecov.io/github/rmuraix/HistoSlice)

Preprocessing large medical images for machine learning made easy!

<p align="center">
  <a href="https://lab.rmurai.com/HistoSlice/">Documentation</a> •
  <a href="https://pypi.org/project/histoslice/">PyPI</a>
</p>

</div>

## Description

`HistoSlice` makes it easy to prepare your histological slide images for deep
learning models. You can cut large slide images into smaller tiles and then
preprocess those tiles (remove tiles with poor-quality tissue, finger marks, etc.).

> [!NOTE]
> This project was forked from [HistoPrep](https://github.com/jopo666/HistoPrep), and further modified for additional features and improvements.

## Installation

```bash
uv add histoslice
# or
pip install histoslice
```

## Usage

A typical workflow for training deep learning models with histological images is as
follows:

1. Cut each slide image into smaller tile images.
2. Preprocess the tile images by removing tiles with bad tissue or staining artifacts.

```bash
HistoSlice --input './train_images/*.tiff' --output ./tiles --width 512 --overlap 0.5 --max-background 0.5 --metrics --thumbnail
```

Or you can use the `HistoSlice` Python API to do the same thing!

```python
from histoslice import SlideReader

# Read slide image.
reader = SlideReader("./slides/slide_with_ink.jpeg")
# Detect tissue.
threshold, tissue_mask = reader.get_tissue_mask(level=-1)
# Extract overlapping tile coordinates with less than 50% background.
tile_coordinates = reader.get_tile_coordinates(
    tissue_mask, width=512, overlap=0.5, max_background=0.5
)
# Save tile images with image metrics for preprocessing.
tile_metadata = reader.save_regions(
    "./train_tiles/",
    tile_coordinates,
    threshold=threshold,
    save_metrics=True,
    save_thumbnail=True
)
```
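To make the `width`, `overlap` and `max_background` parameters more concrete, here is a minimal, library-independent sketch of how an overlapping tile grid and a background filter could work. It is not HistoSlice's implementation: the helper names (`tile_stride`, `tile_grid`, `background_fraction`, `filter_tiles`) are hypothetical, and it assumes that `overlap` is the fraction of the tile width shared by neighbouring tiles, that tiles reaching past the image edge are dropped, and that a tile is kept when its background fraction is at most `max_background`.

```python
def tile_stride(width: int, overlap: float) -> int:
    """Distance in pixels between the origins of neighbouring tiles."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in the range [0, 1)")
    return max(1, round(width * (1 - overlap)))


def tile_grid(dimensions, width, overlap):
    """Return (x, y, w, h) tuples for square tiles that fit inside the image.

    `dimensions` is (height, width) in pixels. Dropping tiles that would
    cross the image edge is a simplification made for this sketch.
    """
    height_px, width_px = dimensions
    stride = tile_stride(width, overlap)
    return [
        (x, y, width, width)
        for y in range(0, height_px - width + 1, stride)
        for x in range(0, width_px - width + 1, stride)
    ]


def background_fraction(mask, tile):
    """Fraction of background pixels in a tile; mask holds 1 for tissue, 0 otherwise."""
    x, y, w, h = tile
    tissue = sum(sum(row[x : x + w]) for row in mask[y : y + h])
    return 1 - tissue / (w * h)


def filter_tiles(mask, tiles, max_background):
    """Keep tiles whose background fraction does not exceed max_background."""
    return [t for t in tiles if background_fraction(mask, t) <= max_background]


# Toy 4x4 tissue mask: the left half is tissue, the right half is background.
mask = [[1, 1, 0, 0] for _ in range(4)]
tiles = tile_grid((4, 4), width=2, overlap=0.5)
kept = filter_tiles(mask, tiles, max_background=0.5)

print(tile_stride(512, 0.5))  # 256
print(len(tiles), len(kept))  # 9 6
```

Under these assumptions, `width=512` with `overlap=0.5` moves the tile window 256 pixels at a time, so each tile shares half of its area with its horizontal neighbour.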

Let's take a look at the output and visualise the thumbnails.

```bash
train_tiles
└── slide_with_ink
    ├── metadata.parquet       # tile metadata
    ├── properties.json        # tile properties
    ├── thumbnail.jpeg         # thumbnail image
    ├── thumbnail_tiles.jpeg   # thumbnail with tiles
    ├── thumbnail_tissue.jpeg  # thumbnail of the tissue mask
    └── tiles [390 entries exceeds filelimit, not opening dir]
```

![Prostate biopsy sample](images/thumbnail.jpeg)
![Tissue mask](images/thumbnail_tissue.jpeg)
![Thumbnail with tiles](images/thumbnail_tiles.jpeg)

As we can see from the above images, histological slide images often contain areas that
we would not like to include in our training data. Removing them might seem like a
daunting task, but let's try it out!

```python
from histoslice.utils import OutlierDetector

# Let's wrap the tile metadata with a helper class.
detector = OutlierDetector(tile_metadata)
# Cluster tiles based on image metrics.
clusters = detector.cluster_kmeans(num_clusters=4, random_state=666)
# Visualise first cluster.
reader.get_annotated_thumbnail(
    image=reader.read_level(-1), coordinates=detector.coordinates[clusters == 0]
)
```

![Tiles in cluster 0](images/thumbnail_blue.jpeg)

Now we can mark tiles in cluster `0` as outliers!
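One way to act on that is to drop every tile whose cluster label is in a set of outlier clusters. The sketch below uses NumPy arrays with made-up coordinates and labels. It assumes the coordinates are rows of `(x, y, w, h)` aligned one-to-one with the cluster labels, which the boolean indexing `detector.coordinates[clusters == 0]` above suggests. `drop_clusters` is a hypothetical helper, not part of the `histoslice` API.

```python
import numpy as np


def drop_clusters(coordinates, clusters, outlier_clusters):
    """Return the coordinate rows whose cluster label is not an outlier label."""
    coordinates = np.asarray(coordinates)
    clusters = np.asarray(clusters)
    if len(coordinates) != len(clusters):
        raise ValueError("coordinates and clusters must have the same length")
    # Boolean mask that is True for every tile in an outlier cluster.
    is_outlier = np.isin(clusters, list(outlier_clusters))
    return coordinates[~is_outlier]


# Made-up example data: four tiles and their cluster labels.
coordinates = np.array(
    [[0, 0, 512, 512], [256, 0, 512, 512], [512, 0, 512, 512], [768, 0, 512, 512]]
)
clusters = np.array([0, 2, 0, 1])

kept = drop_clusters(coordinates, clusters, outlier_clusters={0})
print(kept.tolist())  # [[256, 0, 512, 512], [768, 0, 512, 512]]
```

Passing several labels, for example `outlier_clusters={0, 3}`, removes more than one cluster in a single call.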

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "histoslice",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "WSI, Whole Slide Imaging, histology, image processing, slide",
    "author": null,
    "author_email": "Ryota Murai <opensource@rmurai.com>",
    "download_url": "https://files.pythonhosted.org/packages/e3/2a/9f95ba57bf9aac6335b035e68b7ce1c628c1e14ed4ce71660033f2886c19/histoslice-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Read and process histological slide images with python!",
    "version": "0.1.0",
    "project_urls": {
        "Changelog": "https://github.com/rmuraix/HistoSlice/releases",
        "Documentation": "https://lab.rmurai.com/HistoSlice/",
        "Homepage": "https://github.com/rmuraix/HistoSlice",
        "Issues": "https://github.com/rmuraix/HistoSlice/issues",
        "Repository": "https://github.com/rmuraix/HistoSlice.git"
    },
    "split_keywords": [
        "wsi",
        " whole slide imaging",
        " histology",
        " image processing",
        " slide"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "56428b8a3fbf56d101d30ee6fecab6ef7a67b8d4ad58ed2c231d4d658c045472",
                "md5": "002fd08ad710c29f53d4fd4378c34f4c",
                "sha256": "088798b63cefb35b1c481de62c1022a451d7c6b6b5fb68e94be710bbcd958d81"
            },
            "downloads": -1,
            "filename": "histoslice-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "002fd08ad710c29f53d4fd4378c34f4c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 45077,
            "upload_time": "2025-08-20T09:13:52",
            "upload_time_iso_8601": "2025-08-20T09:13:52.706796Z",
            "url": "https://files.pythonhosted.org/packages/56/42/8b8a3fbf56d101d30ee6fecab6ef7a67b8d4ad58ed2c231d4d658c045472/histoslice-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "e32a9f95ba57bf9aac6335b035e68b7ce1c628c1e14ed4ce71660033f2886c19",
                "md5": "cbdeb04778d8448b8435dfb0b49ed389",
                "sha256": "588c6a18ab330be52188ae9d1b435f28aebde62b8839d48217738aa626c4c9f6"
            },
            "downloads": -1,
            "filename": "histoslice-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "cbdeb04778d8448b8435dfb0b49ed389",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 4414883,
            "upload_time": "2025-08-20T09:13:54",
            "upload_time_iso_8601": "2025-08-20T09:13:54.460269Z",
            "url": "https://files.pythonhosted.org/packages/e3/2a/9f95ba57bf9aac6335b035e68b7ce1c628c1e14ed4ce71660033f2886c19/histoslice-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-20 09:13:54",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "rmuraix",
    "github_project": "HistoSlice",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "histoslice"
}
        