image-dataset-converter


- Name: image-dataset-converter
- Version: 0.0.13
- Home page: https://github.com/waikato-datamining/image-dataset-converter
- Summary: Python3 library for converting between various image annotation dataset formats.
- Upload time: 2025-07-15 03:25:57
- Author: Peter Reutemann
- License: MIT License

The **image-dataset-converter** library converts image annotation datasets
between various dataset formats.
Filters can be applied during the conversion as well, e.g., for cleaning up the data.

Dataset formats:

- depth data: CSV (r/w), grayscale (r/w), numpy (r/w), PFM (r/w)
- image classification: ADAMS (r/w), sub-dir (r/w)
- image segmentation: blue-channel (r/w), grayscale (r/w), indexed PNG (r/w), layer segments (r/w)
- object detection: ADAMS (r/w), COCO (r/w), OPEX (r/w), ROI (r/w), VOC (r/w), YOLO (r/w)
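
To illustrate what a conversion between object detection formats involves, the sketch below turns a COCO-style bounding box (absolute `[x, y, width, height]`, measured from the top-left corner) into the normalized, center-based form used in YOLO label files. The function name `coco_to_yolo_bbox` is hypothetical and not part of this library's API; the library's own readers and writers take care of such details internally.

```python
def coco_to_yolo_bbox(bbox, img_width, img_height):
    """Convert [x, y, w, h] (absolute, top-left) to (cx, cy, w, h) normalized to 0..1."""
    x, y, w, h = bbox
    cx = (x + w / 2) / img_width   # box center, relative to image width
    cy = (y + h / 2) / img_height  # box center, relative to image height
    return cx, cy, w / img_width, h / img_height


# a 200x100 box centered in a 400x200 image
print(coco_to_yolo_bbox([100, 50, 200, 100], 400, 200))  # (0.5, 0.5, 0.5, 0.5)
```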

Examples can be found here:

https://github.com/waikato-datamining/image-dataset-converter-examples


Changelog
=========

0.0.13 (2025-07-15)
-------------------

- requiring seppl>=0.2.20 now for improved help requests in `idc-convert` tool


0.0.12 (2025-07-11)
-------------------

- dropped numpy<2.0.0 restriction
- added `grayscale-to-binary` filter
- fix: `sort-pixels`, `rgb-to-grayscale` filters
- added `ensure_grayscale` and `grayscale_required_info` convenience methods (package: `idc.api`)
- added `ensure_binary` and `binary_required_info` convenience methods (package: `idc.api`)
- added `--dump_pipeline` option to `idc-convert` for saving the pipeline command
- the `rename` filter now supports lower/upper case placeholders of name and extension as well
- requiring seppl>=0.2.17 now for skippable plugin support and avoiding deprecated use of pkg_resources
- added `any-to-rgb` filter for turning binary/grayscale images back into RGB ones
- using `wai_common` instead of `wai.common` now
- requiring `fast_opex>=0.0.4` now
- added `label-to-metadata` filter for transferring labels into meta-data
- added `metadata-to-placeholder` filter for transferring meta-data into placeholders
- added basic support for images with associated depth information: `DepthData`, `DepthInformation`
- added `depth-to-grayscale` filter for converting depth information to grayscale image
- prefixed image segmentation methods like `from_bluechannel` and `to_bluechannel` with `imgseg_`
- added depth information readers `from-grayscale-dp`, `from-numpy-dp`, `from-csv-dp` and `from-pfm-dp`
- added depth information writers `to-grayscale-dp`, `to-numpy-dp`, `to-csv-dp` and `to-pfm-dp`
- added `apply-ext-mask` filter for applying external PNG masks to image containers (image and/or annotations)
- added `apply-label-mask` filter for applying image segmentation label masks to their base images
- added `label-present-ic` and `label-present-is` filters that ensure that certain label(s) are present and otherwise discard the image
- filter `label-present` was renamed to `label-present-od` but keeping `label-present` as alias for the time being
- fix: `imgseg_to_bluechannel`, `imgseg_to_indexedpng` and `imgseg_to_grayscale` now handle overlapping pixels correctly,
  no longer adding them up and introducing additional labels
- `discard-by-name` filter can use names of files in specified paths now as well
- fixed the construction of the error messages in the pyfunc reader/filter/writer classes
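
The overlapping-pixels fix above can be pictured with a minimal sketch: when several binary layers are merged into a single index mask, summing the label indices produces values that belong to no label, whereas assigning them does not. The function `merge_layers` and its "later layer wins" rule are assumptions for illustration only, not the library's actual implementation.

```python
def merge_layers(layers, background=0):
    """Merge binary layer masks into one index mask.

    layers: list of (label_index, mask) pairs, each mask being rows of 0/1 values.
    Where layers overlap, the later layer overwrites the earlier one (assumption).
    """
    height = len(layers[0][1])
    width = len(layers[0][1][0])
    merged = [[background] * width for _ in range(height)]
    for index, mask in layers:
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    merged[y][x] = index  # assign instead of adding
    return merged


layer_a = (1, [[1, 1, 0]])
layer_b = (2, [[0, 1, 1]])
# the middle pixel belongs to both layers: it becomes 2, not 1 + 2 = 3
print(merge_layers([layer_a, layer_b]))  # [[1, 2, 2]]
```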


0.0.11 (2025-04-03)
-------------------

- fix: `idc-registry --list writers` now returns writer plugins instead of reader ones


0.0.10 (2025-04-03)
-------------------

- added `set-placeholder` filter for dynamically setting (temporary) placeholders at runtime
- added `--resume_from` option to relevant readers that allows resuming the data processing
  from the first file that matches this glob expression (e.g., `*/012345.png`)
- requiring seppl>=0.2.14 now for resume support
- using underscores now instead of dashes in dependencies (`setup.py`)
- the `array_to_image` method no longer performs unnecessary conversions of Image objects
- the `dirs` generator can limit directories now to ones that have files matching a specific
  regexp (`--file_regexp`), to avoid the `Failed to locate any files using: ...` error message
  when a reader doesn't find any matching files
- requiring seppl>=0.2.15 now for new `--split_group` support
- added the `from-multi` meta-reader that combines multiple base readers and returns their output
- added the `to-multi` meta-writer that forwards the data to multiple base writers
- added the `use-mask` filter for using the image segmentation annotations (= mask) as the new base image
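
The idea behind `--resume_from` can be sketched as follows: skip all files until the first one matches the glob expression, then process everything from there on. The helper `resume_from` is hypothetical, and returning an empty list when nothing matches is an assumption rather than documented behavior.

```python
from fnmatch import fnmatch


def resume_from(files, pattern):
    """Return the files starting at the first one that matches the glob pattern."""
    for i, name in enumerate(files):
        if fnmatch(name, pattern):
            return files[i:]
    return []  # assumption: no match means nothing is left to process


files = ["a/012344.png", "a/012345.png", "a/012346.png"]
print(resume_from(files, "*/012345.png"))  # ['a/012345.png', 'a/012346.png']
```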


0.0.9 (2025-03-14)
------------------

- using `wai_logging` instead of `wai.logging` as dependency now


0.0.8 (2025-03-14)
------------------

- requiring seppl>=0.2.13 now for placeholder support
- added placeholder support to tools: `idc-convert`, `idc-exec`
- added placeholder support to readers: `from-adams-ic`, `from-subdir-ic`, `from-blue-channel-is`, `from-grayscale-is`,
  `from-indexed-png-is`, `from-layer-segments-is`, `from-adams-od`, `from-coco-od`, `from-opex-od`, `from-roicsv-od`,
  `from-voc-od`, `from-yolo-od`, `from-data`, `from-pyfunc`, `poll-dir`
- added placeholder support to filters: `write-labels`
- added placeholder support to writers: `to-adams-ic`, `to-subdir-ic`, `to-blue-channel-is`, `to-grayscale-is`,
  `to-indexed-png-is`, `to-layer-segments-is`, `to-adams-od`, `to-coco-od`, `to-opex-od`, `to-roicsv-od`,
  `to-voc-od`, `to-yolo-od`, `to-data`


0.0.7 (2025-03-12)
------------------

- added `safe_deepcopy` method to `idc.api._utils` which creates a deep copy of an object if not None
- added `rgb-to-grayscale` filter to convert color images into grayscale ones
- added `sort-pixels` filter for grayscale images
- the following filters can operate on lists of records now as well: `inspec`, `metadata`, `metadata-from-name`
- added `metadata-od` filter for filtering object-detection annotations based on their meta-data
  (e.g., scores from model predictions)
- the filters `discard-negatives` and `discard-invalid-images` now output how many were discarded/kept
  when processing finishes
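
Filtering annotations by their meta-data, as `metadata-od` does for prediction scores, might look roughly like the sketch below. The dictionary layout, the key names `metadata` and `score`, and the function `keep_by_score` are all hypothetical; the filter's real options and data structures are not shown here.

```python
def keep_by_score(annotations, min_score):
    """Keep annotations whose meta-data score is at least min_score."""
    kept = []
    for ann in annotations:
        score = ann.get("metadata", {}).get("score")
        # assumption: annotations without a score are discarded
        if score is not None and float(score) >= min_score:
            kept.append(ann)
    return kept


predictions = [
    {"label": "cat", "metadata": {"score": 0.92}},
    {"label": "dog", "metadata": {"score": 0.41}},
    {"label": "cat", "metadata": {}},
]
print([a["label"] for a in keep_by_score(predictions, 0.5)])  # ['cat']
```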


0.0.6 (2025-02-26)
------------------

- `LayerSegmentsImageSegmentationReader` now suggests using the `--lenient` flag in its exception when an image is not binary
- added the `discard-by-name` filter that allows the user to discard images based on name, either by exact match or regexp
  (matching sense can be inverted)
- requiring seppl>=0.2.10 now
- added support for aliases
- added `to_bluechannel`, `to_grayscale` and `to_indexedpng` image segmentation methods to `idc.api`
- added the `generate_palette_list` method to `idc.api` which turns a predefined palette name or comma-separated
  list of RGB values into a flat list of int values, e.g., used for indexed PNG files
- exposed method `save_image` through `idc.api`
- `filter-labels` now works when only a regexp and no labels are specified
- `write-labels` filter now allows specification of custom separator
- `write-labels`: fixed retrieval of image-segmentation labels
- using `simple_palette_utils` dependency now
- `idc-convert` tool now flags aliases on the help screen with `*`
- the `from-voc-od` reader now has the `-r/--image_rel_path` option which gets injected before the `folder` property
  from the XML file
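
For the comma-separated case, turning RGB values into a flat list of ints could be sketched as below. The function `parse_palette` is a hypothetical stand-in, not `generate_palette_list` itself, and it does not cover predefined palette names. A flat list of this shape is what Pillow's `Image.putpalette` accepts for indexed PNG files.

```python
def parse_palette(rgb_values):
    """Turn a string like '255,0,0,0,255,0' into [255, 0, 0, 0, 255, 0]."""
    values = [int(v) for v in rgb_values.split(",") if v.strip()]
    if len(values) % 3 != 0:
        raise ValueError("expected a multiple of 3 values (R,G,B triplets)")
    if any(v < 0 or v > 255 for v in values):
        raise ValueError("RGB values must be in the range 0..255")
    return values


print(parse_palette("255,0,0, 0,255,0"))  # [255, 0, 0, 0, 255, 0]
```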


0.0.5 (2025-01-13)
------------------

- added `setuptools` as dependency
- switched to underscores in project name
- using 90% as default quality for JPEG images now, can be overridden with environment variable `IDC_JPEG_QUALITY`
- added methods to `idc.api` module: `jpeg_quality()`, `array_to_image(...)`, `empty_image(...)`
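
Reading the JPEG quality from the `IDC_JPEG_QUALITY` environment variable with a default of 90 can be sketched like this. The function `read_jpeg_quality` is a hypothetical illustration, not the library's `jpeg_quality()` method, and falling back to the default on unparsable values is an assumption.

```python
import os


def read_jpeg_quality(default=90):
    """Return the JPEG quality from IDC_JPEG_QUALITY, or the default if unset."""
    value = os.getenv("IDC_JPEG_QUALITY")
    if value is None:
        return default
    try:
        return int(value)
    except ValueError:
        return default  # assumption: ignore values that are not integers


print(read_jpeg_quality())
```

Setting the variable in the shell before running a conversion (e.g., `IDC_JPEG_QUALITY=75`) would then change the value that is returned.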


0.0.4 (2024-07-16)
------------------

- limiting numpy to <2.0.0 due to problems with imgaug library


0.0.3 (2024-07-02)
------------------

- switched to the `fast-opex` library
- helper method `from_indexedpng` was using incorrect label index (off by 1)
- `Data.save_image` method now ensures that source/target files exist before calling `os.path.samefile`
- requiring seppl>=0.2.6 now
- readers now support default globs, allowing the user to just specify directories as input
  (and the default glob gets appended)
- the `to-yolo-od` writer now has an option for predefined labels (for enforcing label order)
- the `to-yolo-od` writer now stores the labels/labels_csv files in the respective output folders
  rather than using an absolute file name
- the bluechannel/grayscale/indexed-png image segmentation readers/writers can use a value other
  than 0 now for the background
- `split` filter has been renamed to `split-records`
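
The default-glob behavior of the readers can be pictured as follows: if the input is a directory, a default glob is appended; anything else is passed through unchanged. The function `apply_default_glob` and the `*.jpg` default are assumptions for illustration; each reader defines its own default.

```python
import os


def apply_default_glob(path, default_glob="*.jpg"):
    """Append the default glob if path is an existing directory."""
    if os.path.isdir(path):
        return os.path.join(path, default_glob)
    return path  # already a file name or glob expression


print(apply_default_glob("."))
print(apply_default_glob("images/*.png"))
```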


0.0.2 (2024-06-13)
------------------

- added generic plugins that take user Python functions: `from-pyfunc`, `pyfunc-filter`, `to-pyfunc`
- added `idc-exec` tool that uses generator to produce variable/value pairs that are used to expand
  the provided pipeline template which then gets executed
- added `polygon-simplifier` filter for reducing number of points in polygons
- moved several geometry/image related functions from imgaug library into core library to avoid duplication
- added python-image-complete as dependency
- the `ImageData` class now uses the python-image-complete library to determine the file format rather than
  loading the image into memory in order to determine that
- the `convert-image-format` filter now correctly creates a new container with the converted image data
- the `to-coco-od` writer only allows sorting of categories when using predefined categories now
- the `from-opex-od` reader now handles absent meta-data correctly
- added the `AnnotationsOnlyWriter` mixin for writers that can skip the base image and just output the annotations
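
The `idc-exec` approach of expanding a pipeline template with generated variable/value pairs can be sketched with Python's `string.Template`. The `${...}` syntax, the variable names and the `reader`/`writer` commands in the template are placeholders chosen for this sketch; they are not the tool's actual syntax or plugin names, and the sketch only prints the expanded pipelines rather than executing them.

```python
from string import Template


def expand_pipeline(template, variables):
    """Yield one expanded pipeline string per variable/value mapping."""
    for values in variables:
        yield Template(template).substitute(values)


template = "reader --input ${SRC} writer --output ${DST}"
variables = [
    {"SRC": "in/a", "DST": "out/a"},
    {"SRC": "in/b", "DST": "out/b"},
]
for pipeline in expand_pipeline(template, variables):
    print(pipeline)
```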


0.0.1 (2024-05-06)
------------------

- initial release


            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/waikato-datamining/image-dataset-converter",
    "name": "image-dataset-converter",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": null,
    "author": "Peter Reutemann",
    "author_email": "fracpete@waikato.ac.nz",
    "download_url": "https://files.pythonhosted.org/packages/d5/87/f64e8979bb51424c27551f2319f657a9da6ce4f35b36ad89bb9a9737a9eb/image_dataset_converter-0.0.13.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "Python3 library for converting between various image annotation dataset formats.",
    "version": "0.0.13",
    "project_urls": {
        "Homepage": "https://github.com/waikato-datamining/image-dataset-converter"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d587f64e8979bb51424c27551f2319f657a9da6ce4f35b36ad89bb9a9737a9eb",
                "md5": "e0234ae109f55397252536c672ab5335",
                "sha256": "521b6ec605afa933b2bb85cfca6685fd86b7c9d23305a610ef7e30acdb62c348"
            },
            "downloads": -1,
            "filename": "image_dataset_converter-0.0.13.tar.gz",
            "has_sig": false,
            "md5_digest": "e0234ae109f55397252536c672ab5335",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 103090,
            "upload_time": "2025-07-15T03:25:57",
            "upload_time_iso_8601": "2025-07-15T03:25:57.043351Z",
            "url": "https://files.pythonhosted.org/packages/d5/87/f64e8979bb51424c27551f2319f657a9da6ce4f35b36ad89bb9a9737a9eb/image_dataset_converter-0.0.13.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-15 03:25:57",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "waikato-datamining",
    "github_project": "image-dataset-converter",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "image-dataset-converter"
}