# *wsidicomizer*
*wsidicomizer* is a Python library for opening WSIs in proprietary formats and optionally converting them to DICOM. The aims of the project are:
- Provide read support for various proprietary formats.
- Provide lossless conversion for files supported by opentile.
- Provide 'as good as possible' conversion for other formats.
- Simplify the encoding of WSI metadata into DICOM.
## Supported formats
*wsidicomizer* currently supports the following formats:
- Aperio svs (lossless)
- Hamamatsu ndpi (lossless)
- Philips tiff (lossless)
- Zeiss czi (lossy, only base level)
- Optional: Formats supported by Bioformats (lossy)
With the `openslide` extra the following formats are also supported:
- Mirax mrxs (lossy)
- Leica scn (lossy)
- Sakura svslide (lossy)
- Trestle tif (lossy)
- Ventana bif, tif (lossy)
- Hamamatsu vms, vmu (lossy)
By default, the `bioformats` extra enables lossy support for the [BSD-licensed Bioformats formats](https://docs.openmicroscopy.org/bio-formats/6.12.0/supported-formats.html).
## Installation
***Install wsidicomizer from PyPI***
```console
pip install wsidicomizer
```
See [Openslide support](#openslide-support) and [Bioformats support](#bioformats-support) for how to install optional extras.
***Install libjpeg-turbo***
Install libjpeg-turbo either as a binary from <https://libjpeg-turbo.org/> or using your package manager.
On Windows, you also need to add libjpeg-turbo's bin folder to the `Path` environment variable.
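A quick way to check the Windows `Path` setup is to search the listed folders for the library file. The following is a minimal sketch, not part of *wsidicomizer*: the helper `find_turbojpeg` is hypothetical, and the DLL file names are assumptions that may differ between libjpeg-turbo builds.

```python
import ctypes.util
import os
from pathlib import Path
from typing import Optional

# Assumed DLL file names; adjust to match your libjpeg-turbo build.
DLL_NAMES = ("turbojpeg.dll", "libturbojpeg.dll")


def find_turbojpeg(search_path: Optional[str] = None) -> Optional[Path]:
    """Return the first turbojpeg DLL found in the folders of search_path.

    search_path uses the platform's PATH separator and defaults to the
    current PATH environment variable.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for folder in search_path.split(os.pathsep):
        if not folder:
            continue  # skip empty entries such as a trailing separator
        for name in DLL_NAMES:
            candidate = Path(folder) / name
            if candidate.is_file():
                return candidate
    return None


if __name__ == "__main__":
    found = find_turbojpeg()
    # Fall back to the standard library lookup on non-Windows platforms.
    print(found if found else ctypes.util.find_library("turbojpeg"))
```

If the script prints `None`, the bin folder is most likely missing from `Path`.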
## Important note
Please note that this is an early release and the API is not frozen yet. Function names and functionality are subject to change.
## Requirements
*wsidicomizer* requires Python >=3.10 and uses numpy, pydicom, highdicom, imagecodecs, PyTurboJPEG, opentile, and wsidicom.
## Basic cli-usage
***Convert a wsi-file into DICOM using cli-interface***
```console
wsidicomizer -i 'path_to_wsi_file' -o 'path_to_output_folder'
```
### Arguments
~~~~
-i, --input, path to input wsi file
-o, --output, path to output folder
-t, --tile-size, required depending on input format
-m, --metadata, optional path to json file defining metadata
-d, --default-metadata, optional path to json file defining default metadata
-l, --levels, optional levels to include
-w, --workers, number of threads to use
--label, optional label image to use instead of label found in file
--no-label, do not include label image
--no-overview, do not include overview image
--no-confidential, do not include confidential metadata
--chunk-size, number of tiles to give each worker at a time
--format, encoding format to use if re-encoding. 'jpeg' or 'jpeg2000'
--quality, quality to use if re-encoding.
--subsampling, subsampling option to use if re-encoding.
--offset-table, offset table to use, 'bot', 'eot', or 'None'
~~~~
### Flags
~~~~
--no-label, do not include label(s)
--no-overview, do not include overview(s)
--no-confidential, do not include confidential metadata from image
~~~~
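When the command is driven from a script, the documented options can be assembled into an argument list. This is a minimal sketch under stated assumptions: `build_command` and `run_conversion` are hypothetical helpers, and only a subset of the options listed above is covered.

```python
import subprocess
from typing import List, Optional


def build_command(
    input_path: str,
    output_path: str,
    tile_size: Optional[int] = None,
    workers: Optional[int] = None,
    no_label: bool = False,
    no_overview: bool = False,
    no_confidential: bool = False,
) -> List[str]:
    """Build the argument list for the wsidicomizer command."""
    command = ["wsidicomizer", "-i", input_path, "-o", output_path]
    if tile_size is not None:
        command += ["--tile-size", str(tile_size)]
    if workers is not None:
        command += ["--workers", str(workers)]
    # Flags take no value and are appended only when requested.
    if no_label:
        command.append("--no-label")
    if no_overview:
        command.append("--no-overview")
    if no_confidential:
        command.append("--no-confidential")
    return command


def run_conversion(input_path: str, output_path: str, **options) -> None:
    """Run the conversion and raise CalledProcessError if the command fails."""
    subprocess.run(build_command(input_path, output_path, **options), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.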
With the `--no-confidential` flag, properties that the [DICOM Basic Confidentiality Profile](https://dicom.nema.org/medical/dicom/current/output/html/part15.html#table_E.1-1) classifies as confidential are not included in the output file. The properties that are otherwise included are currently:
- Acquisition DateTime
- Device Serial Number
## Basic usage
***Create metadata (Optional)***
```python
from wsidicom.conceptcode import (
    AnatomicPathologySpecimenTypesCode,
    ContainerTypeCode,
    SpecimenCollectionProcedureCode,
    SpecimenEmbeddingMediaCode,
    SpecimenFixativesCode,
    SpecimenSamplingProcedureCode,
    SpecimenStainsCode,
)
from wsidicom.metadata import (
    Collection,
    Embedding,
    Equipment,
    Fixation,
    Label,
    Patient,
    Sample,
    Series,
    Slide,
    SlideSample,
    Specimen,
    Staining,
    Study,
)
from wsidicomizer.metadata import WsiDicomizerMetadata

study = Study(identifier="Study identifier")
series = Series(number=1)
patient = Patient(name="FamilyName^GivenName")
label = Label(text="Label text")
equipment = Equipment(
    manufacturer="Scanner manufacturer",
    model_name="Scanner model name",
    device_serial_number="Scanner serial number",
    software_versions=["Scanner software versions"],
)

specimen = Specimen(
    identifier="Specimen",
    extraction_step=Collection(method=SpecimenCollectionProcedureCode("Excision")),
    type=AnatomicPathologySpecimenTypesCode("Gross specimen"),
    container=ContainerTypeCode("Specimen container"),
    steps=[Fixation(fixative=SpecimenFixativesCode("Neutral Buffered Formalin"))],
)

block = Sample(
    identifier="Block",
    sampled_from=[specimen.sample(method=SpecimenSamplingProcedureCode("Dissection"))],
    type=AnatomicPathologySpecimenTypesCode("tissue specimen"),
    container=ContainerTypeCode("Tissue cassette"),
    steps=[Embedding(medium=SpecimenEmbeddingMediaCode("Paraffin wax"))],
)

slide_sample = SlideSample(
    identifier="Slide sample",
    sampled_from=block.sample(method=SpecimenSamplingProcedureCode("Block sectioning")),
)

slide = Slide(
    identifier="Slide",
    stainings=[
        Staining(
            substances=[
                SpecimenStainsCode("hematoxylin stain"),
                SpecimenStainsCode("water soluble eosin stain"),
            ]
        )
    ],
    samples=[slide_sample],
)
metadata = WsiDicomizerMetadata(
    study=study,
    series=series,
    patient=patient,
    equipment=equipment,
    slide=slide,
    label=label,
)
```
***Convert a wsi-file into DICOM using python-interface***
```python
from wsidicomizer import WsiDicomizer

created_files = WsiDicomizer.convert(
    filepath=path_to_wsi_file,
    output_path=path_to_output_folder,
    metadata=metadata,
    tile_size=tile_size,
)
```
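The same call can be repeated over a folder of slides. The sketch below is an illustration rather than part of the library: `find_slides` and `convert_folder` are hypothetical helpers, the extension list and the default tile size of 512 are assumptions, and only the `filepath`, `output_path`, and `tile_size` keywords shown above are used.

```python
from pathlib import Path
from typing import Iterable, List

# Illustrative choice: extensions of the formats listed as lossless.
EXTENSIONS = (".svs", ".ndpi", ".tiff")


def find_slides(folder: Path, extensions: Iterable[str] = EXTENSIONS) -> List[Path]:
    """Return the files in folder whose extension matches, sorted by name."""
    wanted = {extension.lower() for extension in extensions}
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in wanted
    )


def convert_folder(input_folder: Path, output_root: Path, tile_size: int = 512) -> None:
    """Convert every slide in input_folder into its own subfolder of output_root."""
    # Imported here so that find_slides can be used without wsidicomizer installed.
    from wsidicomizer import WsiDicomizer

    for slide in find_slides(input_folder):
        output_path = output_root / slide.stem
        output_path.mkdir(parents=True, exist_ok=True)
        WsiDicomizer.convert(
            filepath=str(slide),
            output_path=str(output_path),
            tile_size=tile_size,
        )
```

Writing each slide to a separate subfolder keeps the DICOM files of different slides apart.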
***Import a wsi file as a WsiDicom object.***
```python
from wsidicomizer import WsiDicomizer
wsi = WsiDicomizer.open(path_to_wsi_file)
region = wsi.read_region((1000, 1000), 6, (200, 200))
wsi.close()
```
## Openslide support
### Installation
Support for reading images using the OpenSlide C library can optionally be enabled by installing *wsidicomizer* with the `openslide` extra:
```console
pip install wsidicomizer[openslide]
```
The `openslide` extra requires the OpenSlide library to be installed separately. Instructions for installing OpenSlide are available at <https://openslide.org/download/>.
On Windows, you also need to add OpenSlide's bin folder to the `Path` environment variable.
## Bioformats support
### Installation
Support for reading images using the Bioformats Java library can optionally be enabled by installing *wsidicomizer* with the `bioformats` extra:
```console
pip install wsidicomizer[bioformats]
```
The `bioformats` extra enables usage of the `bioformats` module and the `bioformats_wsidicomizer` CLI command. The required Bioformats Java library (jar file) is downloaded automatically using [scyjava](https://github.com/scijava/scyjava) when the module is imported.
### Using
As the Bioformats library is a Java library, it needs to run in a Java virtual machine (JVM). A JVM is started automatically when the `bioformats` module is imported. The JVM can't be restarted in the same Python interpreter, and is therefore left running once started. If you want to shut down the JVM (without closing the Python interpreter), you can call the `shutdown_jvm()` function:
```python
import scyjava
scyjava.shutdown_jvm()
```
Due to the need to start a JVM, the `bioformats` module is not imported when using the default `WsiDicomizer` class; the `BioformatsDicomizer` class should be used instead. Similarly, Bioformats support is only available in the `bioformats_wsidicomizer` CLI command.
### Bioformats version
The Bioformats Java library is available in two versions, one with a BSD license and one with a GPL2 license, and can read several [WSI formats](https://bio-formats.readthedocs.io/en/v6.12.0/supported-formats.html). However, most formats are only available in the GPL2 version. Due to the licensing incompatibility between Apache 2.0 and GPL2, *wsidicomizer* is distributed with a default setting of using the BSD-licensed library. The loaded Bioformats version can be changed by the user by setting the `BIOFORMATS_VERSION` environment variable from the default value `bsd:6.12.0`.
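The variable can be set from Python before the `bioformats` module is imported. This is a minimal sketch: `select_bioformats` is a hypothetical helper, the `<license>:<version>` layout is inferred from the default value `bsd:6.12.0`, and it is an assumption that the variable is read when the module is imported.

```python
import os


def select_bioformats(license_name: str = "bsd", version: str = "6.12.0") -> str:
    """Set BIOFORMATS_VERSION to '<license>:<version>' and return the value.

    The value layout is inferred from the documented default 'bsd:6.12.0';
    any other license prefix is an assumption to verify before use.
    """
    value = f"{license_name}:{version}"
    os.environ["BIOFORMATS_VERSION"] = value
    return value


# Set the variable first, then import the bioformats module, since the JVM
# is started on import and cannot be restarted in the same interpreter.
select_bioformats("bsd", "6.12.0")
```

Setting the variable in the shell before starting Python has the same effect and avoids any dependence on import order.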
## Limitations
Files with z-stacks or multiple focal paths are currently not supported.
## Other DICOM python tools
- [pydicom](https://pydicom.github.io/)
- [highdicom](https://github.com/MGHComputationalPathology/highdicom)
- [wsidicom](https://github.com/imi-bigpicture/wsidicom)
## Contributing
We welcome any contributions to help improve this tool for the WSI DICOM community!
We recommend first creating an issue before creating potential contributions to check that the contribution is in line with the goals of the project. To submit your contribution, please issue a pull request on the imi-bigpicture/wsidicomizer repository with your changes for review.
Our aim is to provide constructive and positive code reviews for all submissions. The project relies on gradual typing and roughly follows PEP8. However, we are not dogmatic. Most important is that the code is easy to read and understand.
## Acknowledgement
*wsidicomizer*: Copyright 2021 Sectra AB, licensed under Apache 2.0.
This project is part of a project that has received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No 945358. This Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA. IMI website: <www.imi.europa.eu>
Raw data
{
"_id": null,
"home_page": "https://github.com/imi-bigpicture/wsidicomizer",
"name": "wsidicomizer",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.10",
"maintainer_email": null,
"keywords": "whole slide image, digital pathology, dicom, converter",
"author": "Erik O Gabrielsson",
"author_email": "erik.o.gabrielsson@sectra.com",
"download_url": "https://files.pythonhosted.org/packages/a9/48/2210ce80b2bfebd1497f77f3edc2abe0446b6d4c7c21c6b12fd77181926a/wsidicomizer-0.15.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Tool for reading WSI files from proprietary formats and optionally convert them to to DICOM",
"version": "0.15.0",
"project_urls": {
"Homepage": "https://github.com/imi-bigpicture/wsidicomizer",
"Repository": "https://github.com/imi-bigpicture/wsidicomizer"
},
"split_keywords": [
"whole slide image",
" digital pathology",
" dicom",
" converter"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "0ca84bedccf6f39a5ec1a15365bb6a1c77758db48c733528cf8aaf30e937f965",
"md5": "9707f0421b54fdfcb8654d6bbf0b9a35",
"sha256": "b4914950e4fa8c2ecd6e1356a0f163e803bb4740e2cdc69235055036709f5ad9"
},
"downloads": -1,
"filename": "wsidicomizer-0.15.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9707f0421b54fdfcb8654d6bbf0b9a35",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.10",
"size": 57802,
"upload_time": "2024-10-21T19:32:31",
"upload_time_iso_8601": "2024-10-21T19:32:31.224212Z",
"url": "https://files.pythonhosted.org/packages/0c/a8/4bedccf6f39a5ec1a15365bb6a1c77758db48c733528cf8aaf30e937f965/wsidicomizer-0.15.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "a9482210ce80b2bfebd1497f77f3edc2abe0446b6d4c7c21c6b12fd77181926a",
"md5": "17fe4c36f4a8a74c28c3aca7bdcbe999",
"sha256": "8610ca7abdeae840a4c9a8b064b3ea209256896835dbf361db87f264ad13466f"
},
"downloads": -1,
"filename": "wsidicomizer-0.15.0.tar.gz",
"has_sig": false,
"md5_digest": "17fe4c36f4a8a74c28c3aca7bdcbe999",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.10",
"size": 37640,
"upload_time": "2024-10-21T19:32:32",
"upload_time_iso_8601": "2024-10-21T19:32:32.656784Z",
"url": "https://files.pythonhosted.org/packages/a9/48/2210ce80b2bfebd1497f77f3edc2abe0446b6d4c7c21c6b12fd77181926a/wsidicomizer-0.15.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-21 19:32:32",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "imi-bigpicture",
"github_project": "wsidicomizer",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "wsidicomizer"
}