# YOLO Dataset Tiling
This module cuts the images and corresponding labels of a YOLO dataset into tiles of a specified size and creates a
new dataset from these tiles. It supports both object detection and instance segmentation. Credit for the original
repository goes to [slanj](https://github.com/slanj/yolo-tiling).
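To make the idea of tiling labels concrete, here is a minimal sketch of how one normalized YOLO detection box could be re-expressed relative to a single tile. The function name `box_to_tile` and the clipping rule are illustrative assumptions, not the package's actual implementation, which may filter or clip boxes differently.

```python
from typing import Optional, Tuple

# (x_center, y_center, width, height), all normalized to the 0-1 range
Box = Tuple[float, float, float, float]


def box_to_tile(box: Box, image_wh: Tuple[int, int],
                tile_xywh: Tuple[int, int, int, int]) -> Optional[Box]:
    """Re-express a normalized YOLO box relative to one tile (hypothetical helper).

    Returns None when the box does not overlap the tile.
    """
    img_w, img_h = image_wh
    tile_x, tile_y, tile_w, tile_h = tile_xywh

    # Convert the normalized box to absolute pixel corners in the full image.
    xc, yc, w, h = box
    x1, y1 = (xc - w / 2) * img_w, (yc - h / 2) * img_h
    x2, y2 = (xc + w / 2) * img_w, (yc + h / 2) * img_h

    # Clip the box to the tile rectangle.
    cx1, cy1 = max(x1, tile_x), max(y1, tile_y)
    cx2, cy2 = min(x2, tile_x + tile_w), min(y2, tile_y + tile_h)
    if cx2 <= cx1 or cy2 <= cy1:
        return None

    # Shift into tile coordinates and normalize by the tile size.
    return (
        ((cx1 + cx2) / 2 - tile_x) / tile_w,
        ((cy1 + cy2) / 2 - tile_y) / tile_h,
        (cx2 - cx1) / tile_w,
        (cy2 - cy1) / tile_h,
    )


# A box centered in a 1024x1024 image, seen from the 256x256 tile at (512, 384).
tiled_box = box_to_tile((0.5, 0.5, 0.25, 0.125), (1024, 1024), (512, 384, 256, 256))
print(tiled_box)
```

Only the part of the box that falls inside the tile is kept, so a box straddling a tile border yields a smaller box in each neighboring tile.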
## Installation
To install the package, use pip:
```bash
pip install yolo-tiling
```
## Usage
```python
from yolo_tiler import YoloTiler, TileConfig

src = "path/to/dataset"  # Source YOLO dataset directory
dst = "path/to/tiled_dataset"  # Output directory for tiled dataset

config = TileConfig(
    # Size of each tile (width, height). Can be:
    # - Single integer for square tiles: slice_wh=640
    # - Tuple for rectangular tiles: slice_wh=(640, 480)
    slice_wh=(640, 480),

    # Overlap between adjacent tiles. Can be:
    # - Single float (0-1) for uniform overlap percentage: overlap_wh=0.1
    # - Tuple of floats for different overlap in each dimension: overlap_wh=(0.1, 0.1)
    # - Single integer for pixel overlap: overlap_wh=64
    # - Tuple of integers for different pixel overlaps: overlap_wh=(64, 48)
    overlap_wh=(0.1, 0.1),

    # Input image file extension to process
    input_ext=".png",

    # Output image file extension to save (default: same as input_ext)
    output_ext=None,

    # Type of YOLO annotations to process:
    # - "object_detection": Standard YOLO format (class, x, y, width, height)
    # - "instance_segmentation": YOLO segmentation format (class, x1, y1, x2, y2, ...)
    annotation_type="instance_segmentation",

    # For segmentation only: Controls point density along polygon edges
    # Lower values = more points, higher quality but larger files
    densify_factor=0.01,

    # For segmentation only: Controls polygon smoothing
    # Lower values = more details preserved, higher values = smoother shapes
    smoothing_tolerance=0.99,

    # Dataset split ratios (must sum to 1.0)
    train_ratio=0.7,  # Proportion of data for training
    valid_ratio=0.2,  # Proportion of data for validation
    test_ratio=0.1,  # Proportion of data for testing

    # Optional margins to exclude from input images. Can be:
    # - Single float (0-1) for uniform margin percentage: margins=0.1
    # - Tuple of floats for different margins: margins=(0.1, 0.1, 0.1, 0.1)
    # - Single integer for pixel margins: margins=64
    # - Tuple of integers for different pixel margins: margins=(64, 64, 64, 64)
    margins=0.0,

    # Include negative samples (tiles without any instances)
    include_negative_samples=True,
)

tiler = YoloTiler(
    source=src,
    target=dst,
    config=config,
    num_viz_samples=15,  # Number of samples to visualize
    show_processing_status=True,  # Show the progress of the tiling process
    progress_callback=progress_callback,  # Optional callback function to report progress (see below)
)

tiler.run()
```
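The `slice_wh` and `overlap_wh` settings determine where tiles start along each axis. The following is a rough sketch of that arithmetic, assuming a float overlap is a fraction of the tile size, an integer overlap is a pixel count, and a final tile is placed flush with the far edge. The helpers `overlap_to_pixels`, `tile_origins`, and `tile_grid` are hypothetical, and the package's own edge handling may differ.

```python
from typing import List, Tuple, Union

Number = Union[int, float]


def overlap_to_pixels(overlap: Number, tile_size: int) -> int:
    """Treat a float as a fraction of the tile size and an int as pixels."""
    if isinstance(overlap, float):
        return int(round(overlap * tile_size))
    return int(overlap)


def tile_origins(image_size: int, tile_size: int, overlap: Number) -> List[int]:
    """Return the start offsets of tiles along one axis."""
    stride = tile_size - overlap_to_pixels(overlap, tile_size)
    if stride <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    if image_size <= tile_size:
        return [0]  # the image is no larger than one tile along this axis
    origins = list(range(0, image_size - tile_size + 1, stride))
    # Assumption: add a last tile flush with the far edge so no pixels are skipped.
    if origins[-1] + tile_size < image_size:
        origins.append(image_size - tile_size)
    return origins


def tile_grid(image_wh: Tuple[int, int], slice_wh: Tuple[int, int],
              overlap_wh: Tuple[Number, Number]) -> List[Tuple[int, int, int, int]]:
    """Return (x, y, width, height) for every tile, row by row."""
    xs = tile_origins(image_wh[0], slice_wh[0], overlap_wh[0])
    ys = tile_origins(image_wh[1], slice_wh[1], overlap_wh[1])
    return [(x, y, slice_wh[0], slice_wh[1]) for y in ys for x in xs]


grid = tile_grid((1920, 1080), (640, 480), (0.1, 0.1))
print(len(grid), "tiles; first:", grid[0], "last:", grid[-1])
```

With a 10% overlap, a 640-pixel-wide tile advances 576 pixels at a time, so neighboring tiles share a 64-pixel band in which objects near a border appear in both tiles.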
Progress is tracked with the `TileProgress` dataclass:
```python
from dataclasses import dataclass


@dataclass
class TileProgress:
    """Data class to track tiling progress"""
    current_set_name: str = ""
    current_image_name: str = ""
    current_image_idx: int = 0
    total_images: int = 0
    current_tile_idx: int = 0
    total_tiles: int = 0
```
Custom callback functions can be written against `TileProgress`. An example of an optional `progress_callback` function
is shown below:
```python
from yolo_tiler import TileProgress


def progress_callback(progress: TileProgress):
    # Determine whether to show tile or image progress
    if progress.total_tiles > 0:
        print(f"Processing {progress.current_image_name} in {progress.current_set_name} set: "
              f"Tile {progress.current_tile_idx}/{progress.total_tiles}")
    else:
        print(f"Processing {progress.current_image_name} in {progress.current_set_name} set: "
              f"Image {progress.current_image_idx}/{progress.total_images}")
```
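A callback does not have to print. As a sketch, the variant below collects formatted messages in a list, which could feed a GUI or a log file. It defines a local stand-in for `TileProgress` with the same fields so that it runs without the package installed. `percent_done` and `make_logging_callback` are hypothetical names, and whether the indices are zero- or one-based is an assumption left open here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TileProgress:
    """Stand-in mirroring the fields shown above, so the sketch is self-contained."""
    current_set_name: str = ""
    current_image_name: str = ""
    current_image_idx: int = 0
    total_images: int = 0
    current_tile_idx: int = 0
    total_tiles: int = 0


def percent_done(progress: TileProgress) -> float:
    """Return completion as a percentage, preferring tile counts when present."""
    if progress.total_tiles > 0:
        return 100.0 * progress.current_tile_idx / progress.total_tiles
    if progress.total_images > 0:
        return 100.0 * progress.current_image_idx / progress.total_images
    return 0.0


def make_logging_callback(log: List[str]) -> Callable[[TileProgress], None]:
    """Build a callback that appends one formatted line per update to `log`."""
    def callback(progress: TileProgress) -> None:
        log.append(f"[{progress.current_set_name}] {progress.current_image_name}: "
                   f"{percent_done(progress):.1f}%")
    return callback


messages: List[str] = []
callback = make_logging_callback(messages)
callback(TileProgress("train", "img_001.png", current_tile_idx=3, total_tiles=12))
print(messages[0])
```

Returning the callback from a factory keeps the destination of the messages out of global state, so several tilers can report to separate logs.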
### Notes
- The tiler **requires** a YOLO dataset structure within the source directory (see below).
- If only a `train` folder exists, the train / valid / test ratios will be used to split the tiled `train` folder.
- If train / valid / test folders already exist in the source directory, the ratios are ignored.
```bash
dataset/
├── train/
│   ├── images/
│   └── labels/
├── valid/
│   ├── images/
│   └── labels/
├── test/
│   ├── images/
│   └── labels/
└── data.yaml  # Optional
```
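Because the tiler depends on this layout, it can help to check a source directory before running it. The sketch below uses only the standard library. `check_yolo_layout` is a hypothetical helper, not part of the package, and it checks only for the folders shown in the tree above.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def check_yolo_layout(root: Path) -> Dict[str, List[str]]:
    """Map each split found under `root` to the sub-folders it is missing."""
    report: Dict[str, List[str]] = {}
    for split in ("train", "valid", "test"):
        split_dir = root / split
        if not split_dir.is_dir():
            continue  # a split may be absent, e.g. when only `train` exists
        report[split] = [
            sub for sub in ("images", "labels") if not (split_dir / sub).is_dir()
        ]
    return report


# Build a small throwaway dataset to demonstrate the check.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "train" / "images").mkdir(parents=True)
    (root / "train" / "labels").mkdir()
    (root / "valid" / "images").mkdir(parents=True)  # labels/ deliberately missing
    report = check_yolo_layout(root)

print(report)
```

An empty list for a split means both `images/` and `labels/` were found. A split that is absent from the result does not exist in the source directory.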
### Test Data
```bash
python tests/test_yolo_tiler.py
```
## Command Line Usage
In addition to using the tiler within a script, you can run the tiling process from the command line interface.
The general syntax is:
```bash
yolo_tiler --source SOURCE --target TARGET [--slice_wh SLICE_WH SLICE_WH] [--overlap_wh OVERLAP_WH OVERLAP_WH] [--input_ext INPUT_EXT] [--output_ext OUTPUT_EXT] [--annotation_type ANNOTATION_TYPE] [--densify_factor DENSIFY_FACTOR] [--smoothing_tolerance SMOOTHING_TOLERANCE] [--train_ratio TRAIN_RATIO] [--valid_ratio VALID_RATIO] [--test_ratio TEST_RATIO] [--margins MARGINS] [--include_negative_samples INCLUDE_NEGATIVE_SAMPLES]
```
### Example Commands
1. Basic usage with default parameters:
```bash
yolo_tiler --source tests/detection --target tests/detection_tiled
```
2. Custom slice size and overlap:
```bash
yolo_tiler --source tests/detection --target tests/detection_tiled --slice_wh 640 480 --overlap_wh 0.1 0.1
```
3. Custom annotation type and image extension:
```bash
yolo_tiler --source tests/segmentation --target tests/segmentation_tiled --annotation_type instance_segmentation --input_ext .jpg --output_ext .png
```
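When the command line tool is driven from another Python program, the argument list can be assembled programmatically. This is a small sketch using only flags that appear in the usage line above. `build_command` is a hypothetical helper, and the command is only printed here, not executed.

```python
import shlex
from typing import List, Mapping, Sequence, Union

Value = Union[str, int, float, Sequence[Union[int, float]]]


def build_command(source: str, target: str, options: Mapping[str, Value]) -> List[str]:
    """Assemble a yolo_tiler argument list from keyword-style options."""
    command = ["yolo_tiler", "--source", source, "--target", target]
    for name, value in options.items():
        command.append(f"--{name}")
        if isinstance(value, (list, tuple)):
            # Two-valued flags such as --slice_wh take space-separated values.
            command.extend(str(item) for item in value)
        else:
            command.append(str(value))
    return command


command = build_command(
    "tests/detection",
    "tests/detection_tiled",
    {"slice_wh": (640, 480), "overlap_wh": (0.1, 0.1)},
)
print(shlex.join(command))

# To run it (requires the package to be installed):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the list form to `subprocess.run` avoids shell quoting problems with paths that contain spaces.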
### Memory Efficiency
The `tile_image` method now uses rasterio's `Window` to read and process image tiles directly from disk instead of loading the entire image into memory. This makes the tiling process more memory efficient, especially for large images.
---
## Disclaimer
This repository is a scientific product and is not official communication of the National
Oceanic and Atmospheric Administration, or the United States Department of Commerce. All NOAA
GitHub project code is provided on an 'as is' basis and the user assumes responsibility for its
use. Any claims against the Department of Commerce or Department of Commerce bureaus stemming from
the use of this GitHub project will be governed by all applicable Federal law. Any reference to
specific commercial products, processes, or services by service mark, trademark, manufacturer, or
otherwise, does not constitute or imply their endorsement, recommendation or favoring by the
Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC
bureau, shall not be used in any manner to imply endorsement of any commercial product or activity
by DOC or the United States Government.
## License
Software code created by U.S. Government employees is not subject to copyright in the United States
(17 U.S.C. §105). The United States/Department of Commerce reserve all rights to seek and obtain
copyright protection in countries other than the United States for Software authored in its
entirety by the Department of Commerce. To this end, the Department of Commerce hereby grants to
Recipient a royalty-free, nonexclusive license to use, copy, and create derivative works of the
Software outside of the United States.
Raw data
{
"_id": null,
"home_page": null,
"name": "yolo-tiling",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "yolo-tiling",
"author": null,
"author_email": "Jordan Pierce <jordan.pierce@noaa.gov>",
"download_url": "https://files.pythonhosted.org/packages/44/e2/628c79296d2b2fe4c47cc9b6b850db193154f31c154bdb31f5a676f47e77/yolo_tiling-0.0.10.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "Tile (slice) YOLO Dataset for Small Objects Detection and Instance Segmentation",
"version": "0.0.10",
"project_urls": {
"Homepage": "https://github.com/Jordan-Pierce/yolo-tiling"
},
"split_keywords": [
"yolo-tiling"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c9f4a4819ec5800da940292e4e94e4f6197bc37672a867ea01f32c10cf8152f1",
"md5": "dad414264bd7bc980f7623926c1730a0",
"sha256": "d8fa7ecf993348f380bf90d95e110934dfe59f355f6cb1e66ce1123d7f3faf41"
},
"downloads": -1,
"filename": "yolo_tiling-0.0.10-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "dad414264bd7bc980f7623926c1730a0",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": ">=3.8",
"size": 15869,
"upload_time": "2025-01-14T18:05:13",
"upload_time_iso_8601": "2025-01-14T18:05:13.378155Z",
"url": "https://files.pythonhosted.org/packages/c9/f4/a4819ec5800da940292e4e94e4f6197bc37672a867ea01f32c10cf8152f1/yolo_tiling-0.0.10-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "44e2628c79296d2b2fe4c47cc9b6b850db193154f31c154bdb31f5a676f47e77",
"md5": "0d86f7df6c3e04a8c9d45b0f94bc8aeb",
"sha256": "8e5a0b7f3cb19743b1aae057cb3d8b8fc2dd26420304e13c5157ad730dd833e8"
},
"downloads": -1,
"filename": "yolo_tiling-0.0.10.tar.gz",
"has_sig": false,
"md5_digest": "0d86f7df6c3e04a8c9d45b0f94bc8aeb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 39144636,
"upload_time": "2025-01-14T18:05:17",
"upload_time_iso_8601": "2025-01-14T18:05:17.649038Z",
"url": "https://files.pythonhosted.org/packages/44/e2/628c79296d2b2fe4c47cc9b6b850db193154f31c154bdb31f5a676f47e77/yolo_tiling-0.0.10.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-14 18:05:17",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Jordan-Pierce",
"github_project": "yolo-tiling",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "numpy",
"specs": []
},
{
"name": "tqdm",
"specs": []
},
{
"name": "pandas",
"specs": []
},
{
"name": "Shapely",
"specs": []
},
{
"name": "rasterio",
"specs": []
},
{
"name": "opencv-python",
"specs": []
},
{
"name": "matplotlib",
"specs": []
}
],
"lcname": "yolo-tiling"
}