patched-yolo-infer

Name: patched-yolo-infer
Version: 1.1.2
Home page: https://github.com/Koldim2001/YOLO-Patch-Based-Inference
Summary: YOLO-Patch-Based-Inference for detection/segmentation of small objects in images.
Upload time: 2024-03-29 08:35:15
Author: Koldim2001
Requires Python: >=3.8
License: AGPL-3.0
Keywords: python, yolov8, yolov9, rtdetr, sam, object detection, instance segmentation, patch-based inference, small object detection, yolov8-seg, image patching, yolo visualization, slice-based inference, slicing inference, inference visualization, patchify, ultralytics
Requirements: torch, numpy, opencv-python, matplotlib, ultralytics

# YOLO-Patch-Based-Inference



This library facilitates various visualizations of inference results from ultralytics segmentation/detection models, including cropping with overlays, as well as **a patch-based inference algorithm enabling the detection/instance segmentation of small objects in images**. 



**Model Support**: The library offers support for multiple ultralytics deep learning models, such as YOLOv8, YOLOv9, SAM, and RTDETR. Users can select from pre-trained options or utilize custom-trained models to best meet their task requirements.





## Installation

You can install the library via pip:



```bash
pip install patched_yolo_infer
```



Note: If CUDA support is available, it's recommended to pre-install PyTorch with CUDA support before installing the library. Otherwise, the CPU version will be installed by default.
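
If you want to confirm which PyTorch build ended up installed, a quick check using the standard PyTorch API (not part of this library) is:

```python
import torch

# True only if PyTorch was installed with CUDA support and a compatible GPU is visible
print(torch.cuda.is_available())
```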








## Notebooks



Interactive notebooks are provided to showcase the functionality of the library. These notebooks cover batch inference procedures for detection, instance segmentation, custom visualization of inference, and more. Each notebook is paired with a tutorial on YouTube, making it easy to learn and implement features. Check the GitHub page for the current links to the videos: https://github.com/Koldim2001/YOLO-Patch-Based-Inference



__Check out these Colab examples:__

                         

YOLO-Patch-Based-Inference Example - [Open in Colab](https://colab.research.google.com/drive/1FUao91GyB-ojGRN_okUxYyfagTT9tdsP?usp=sharing)



Example of using various functions for visualizing basic YOLOv8/v9 inference results and handling overlapping crops - [Open in Colab](https://colab.research.google.com/drive/1eM4o1e0AUQrS1mLDpcgK9HKInWEvnaMn?usp=sharing)



## Usage



### 1. Patch-Based-Inference

To carry out patch-based inference with YOLO models using this library, follow a two-step procedure. First, create an instance of the MakeCropsDetectThem class, providing all desired parameters related to YOLO inference and the patch segmentation principle.<br/> Then pass that object to CombineDetections, which consolidates the predictions from all overlapping crops and intelligently suppresses duplicates. <br/>The resulting object exposes the final frame-processing results for you to extract.



The output obtained from the process includes several attributes that can be leveraged for further analysis or visualization:



1. img: This attribute contains the original image on which the inference was performed. It provides context for the detected objects.



2. confidences: This attribute holds the confidence scores associated with each detected object. These scores indicate the model's confidence level in the accuracy of its predictions.



3. boxes: Bounding boxes, represented as a list of lists where each inner list contains four values: [x_min, y_min, x_max, y_max]. These values are the coordinates of the top-left and bottom-right corners of each bounding box.



4. masks: If available, this attribute provides segmentation masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.



5. classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.



6. classes_names: These are the human-readable names corresponding to the class IDs. They provide semantic labels for the detected objects, making the results easier to interpret.



```python
import cv2
from patched_yolo_infer import MakeCropsDetectThem, CombineDetections

# Load the image
img_path = 'test_image.jpg'
img = cv2.imread(img_path)

element_crops = MakeCropsDetectThem(
    image=img,
    model_path="yolov8m.pt",
    segment=False,             # detection only (no instance segmentation)
    shape_x=640,               # crop width
    shape_y=640,               # crop height
    overlap_x=50,              # overlap between crops along x, in percent
    overlap_y=50,              # overlap between crops along y, in percent
    conf=0.5,                  # confidence threshold
    iou=0.7,                   # IoU threshold for NMS within each crop
    resize_initial_size=True,  # map results back to the original image size
)
result = CombineDetections(element_crops, nms_threshold=0.25, match_metric='IOS')

# Final results:
img = result.image
confidences = result.filtered_confidences
boxes = result.filtered_boxes
masks = result.filtered_masks
classes_ids = result.filtered_classes_id
classes_names = result.filtered_classes_names
```



#### Explanation of possible input arguments:



**MakeCropsDetectThem**

Class that crops an image and passes the crops through a neural network for detection/segmentation.\

**Args:**

- **image** (*np.ndarray*): Input image in BGR format.

- **model_path** (*str*): Path to the YOLO model.

- **model** (*ultralytics model*): Pre-initialized model object. If provided, it is used directly instead of loading from model_path.

- **imgsz** (*int*): Input image size for YOLO inference.

- **conf** (*float*): Confidence threshold for YOLO detections.

- **iou** (*float*): IoU threshold for YOLO non-maximum suppression within a single crop.

- **classes_list** (*List[int] or None*): List of classes to filter detections. If None, all classes are considered. Defaults to None.

- **segment** (*bool*): Whether to perform segmentation (YOLOv8-seg).

- **shape_x** (*int*): Crop size along the x-axis.

- **shape_y** (*int*): Crop size along the y-axis.

- **overlap_x** (*float*): Percentage of overlap along the x-axis.

- **overlap_y** (*float*): Percentage of overlap along the y-axis.

- **show_crops** (*bool*): Whether to visualize the cropping.

- **resize_initial_size** (*bool*): Whether to resize the results to the original image size (note: a slow operation).
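
For reference, here is a short sketch (parameter values are illustrative, not recommendations) showing how the optional classes_list, show_crops, and imgsz arguments can be combined with the ones used in the example above:

```python
# Illustrative sketch: MakeCropsDetectThem with optional arguments
element_crops = MakeCropsDetectThem(
    image=img,            # BGR image loaded earlier with cv2.imread
    model_path="yolov8m.pt",
    imgsz=640,            # inference size for each crop
    classes_list=[0, 2],  # keep only these class IDs; None keeps all classes
    show_crops=True,      # visualize how the image is sliced into patches
    shape_x=640,
    shape_y=640,
    overlap_x=50,
    overlap_y=50,
)
```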



**CombineDetections**

Class that combines masks/boxes from multiple crops and applies NMS (Non-Maximum Suppression).\

**Args:**

- **element_crops** (*MakeCropsDetectThem*): Object containing crop information.

- **nms_threshold** (*float*): IoU/IoS threshold for non-maximum suppression.

- **match_metric** (*str*): Matching metric, either 'IOU' or 'IOS'.

- **intelligent_sorter** (*bool*): Enable sorting by area and rounded confidence. If False, sorting is done by confidence only (standard NMS). Default is True.
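
As a minimal sketch (the threshold value is illustrative, not a recommendation), the combining step can also be run with the plain IoU metric and standard confidence-only NMS sorting:

```python
# Illustrative sketch: CombineDetections with the 'IOU' metric and plain confidence-based NMS
result = CombineDetections(
    element_crops,
    nms_threshold=0.3,         # illustrative IoU threshold
    match_metric='IOU',
    intelligent_sorter=False,  # sort candidates by confidence only
)
```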



---

### 2. Custom inference visualization

The visualize_results function visualizes custom object detection or segmentation results on an image.



**Args:**

- **img** (*numpy.ndarray*): The input image in BGR format.

- **boxes** (*list*): A list of bounding boxes in the format [x_min, y_min, x_max, y_max].

- **classes_ids** (*list*): A list of class IDs for each detection.

- **confidences** (*list*): A list of confidence scores corresponding to each bounding box. Default is an empty list.

- **classes_names** (*list*): A list of class names corresponding to the class IDs. Default is an empty list.

- **masks** (*list*): A list of masks. Default is an empty list.

- **segment** (*bool*): Whether to perform instance segmentation. Default is False.

- **show_boxes** (*bool*): Whether to show bounding boxes. Default is True.

- **show_class** (*bool*): Whether to show class labels. Default is True.

- **fill_mask** (*bool*): Whether to fill the segmented regions with color. Default is False.

- **alpha** (*float*): The transparency of filled masks. Default is 0.3.

- **color_class_background** (*tuple*): The background BGR color for class labels. Default is (0, 0, 255) (red).

- **color_class_text** (*tuple*): The text color for class labels. Default is (255, 255, 255) (white).

- **thickness** (*int*): The thickness of bounding box and text. Default is 4.

- **font**: The font type for class labels. Default is cv2.FONT_HERSHEY_SIMPLEX.

- **font_scale** (*float*): The scale factor for font size. Default is 1.5.

- **delta_colors** (*int*): The random seed offset for color variation. Default is 0.

- **dpi** (*int*): Final visualization size (plot is bigger when dpi is higher). Default is 150.

- **random_object_colors** (*bool*): If True, colors for each object are selected randomly. Default is False.

- **show_confidences** (*bool*): If True and show_class=True, confidences are shown next to the class labels. Default is False.

- **axis_off** (*bool*): If True, the axis is turned off in the final visualization. Default is True.

- **show_classes_list** (*list*): If empty, visualize all classes. Otherwise, visualize only classes in the list.

- **return_image_array** (*bool*): If True, the function returns the image (BGR np.array) instead of displaying it. Default is False.





Usage example:

```python
from patched_yolo_infer import visualize_results

# Assuming result is an instance of the CombineDetections class
result = CombineDetections(...)

# Visualizing the results using the visualize_results function
visualize_results(
    img=result.image,
    confidences=result.filtered_confidences,
    boxes=result.filtered_boxes,
    masks=result.filtered_masks,
    classes_ids=result.filtered_classes_id,
    classes_names=result.filtered_classes_names,
    segment=False,
)
```
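
The same function accepts the other options listed above; the following is an illustrative sketch (argument values are examples, not library defaults) of an instance-segmentation visualization with filled masks and confidence labels:

```python
# Illustrative sketch: visualize_results with additional documented options
visualize_results(
    img=result.image,
    confidences=result.filtered_confidences,
    boxes=result.filtered_boxes,
    masks=result.filtered_masks,
    classes_ids=result.filtered_classes_id,
    classes_names=result.filtered_classes_names,
    segment=True,           # draw instance-segmentation masks
    fill_mask=True,         # fill mask regions with color
    alpha=0.4,              # transparency of the filled masks
    show_confidences=True,  # show confidence next to each class label
    thickness=2,
    dpi=150,
)
```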

            
