imagedatasetanalyzer

Name: imagedatasetanalyzer
Version: 0.1.6
Home page: https://github.com/joortif/ImageDatasetAnalyzer
Summary: Image dataset analyzer using image embedding models and clustering methods.
Upload time: 2025-01-27 09:06:50
Maintainer: Joaquin Ortiz de Murua Ferrero
Author: Joaquin Ortiz de Murua Ferrero
License: MIT license
Keywords: instance semantic segmentation pytorch tensorflow huggingface opencv embedding image analysis machine learning deep learning active learning computer vision
Requirements: kneed, matplotlib, numpy, opencv_contrib_python, opencv_python, opencv_python_headless, Pillow, scikit-learn, scipy, scikit-image, tensorflow, tensorflow-intel, tqdm, transformers, torch, torchvision
            # ImageDatasetAnalyzer



*ImageDatasetAnalyzer* is a Python library designed to simplify and automate the analysis of a set of images and, optionally, their segmentation labels. It provides several tools and methods to perform an initial analysis of the images and their labels, obtaining useful information such as sizes, number of classes, total number of objects of each class per image, and bounding box metrics.



Additionally, it includes a wide variety of models for image feature extraction and embedding from frameworks such as HuggingFace or PyTorch. These embeddings are useful for pattern recognition in images using traditional clustering algorithms such as KMeans or AgglomerativeClustering.



It can also be used to apply these clustering methods for [Active Learning](https://en.wikipedia.org/wiki/Active_learning_(machine_learning)) in semantic segmentation and to reduce the original dataset by keeping the most representative images from each cluster. By these means, the library can be a useful tool for selecting which images to label for semantic segmentation (or any other task that benefits from selective labeling).



## πŸ”§ Key features



* **Image and label dataset analysis**: Evaluate the distribution of images and labels in a dataset to understand its structure and characteristics. This analysis can also be used to ensure that everything is correct: each image has its label, sizes are accurate, the number of classes matches expectations, and so on.

* **Embedding clustering**: Group similar images using clustering techniques based on embeddings generated by pre-trained models. The library supports KMeans, AgglomerativeClustering, DBSCAN and OPTICS from scikit-learn, and includes grid-search methods for hyperparameter tuning.

* **Support for pre-trained models**: Compatible with embedding models from [πŸ€—HuggingFaceπŸ€—](https://huggingface.co/), [PyTorch](https://pytorch.org/), [TensorFlow](https://www.tensorflow.org/) and [OpenCV](https://opencv.org/) frameworks. New frameworks can be easily added by extending the Embedding superclass (see the sketch after this list).

* **Image dataset reduction**: Reduce the number of images in the dataset by selecting the most representative ones (those closest to the centroid) or the most diverse ones (those farthest from the centroid) from each cluster.
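
As mentioned in the pre-trained models bullet above, new frameworks can be added by extending the Embedding superclass. The sketch below is only an assumption about that interface: the import path of the base class and the way a dataset is iterated are hypothetical, and only the `generate_embeddings(dataset)` method name is taken from the usage examples further down.

```python
import numpy as np

# Hypothetical import path for the Embedding superclass (not confirmed by the docs)
from imagedatasetanalyzer.src.embeddings.embedding import Embedding


class MyFrameworkEmbedding(Embedding):
    """Sketch of a wrapper around a custom framework or model."""

    def __init__(self, model):
        self.model = model  # any object exposing a per-image feature extractor

    def generate_embeddings(self, dataset):
        # Return one embedding vector per image, mirroring the interface of the
        # built-in classes such as HuggingFaceEmbedding or TensorflowEmbedding.
        # Iterating over the dataset directly is an assumption.
        return np.stack([self.model.extract_features(image) for image in dataset])
```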



## πŸš€ Getting Started



To start using this package, install it using `pip`:



For example, on Ubuntu:

```bash
pip3 install ImageDatasetAnalyzer
```



On Windows, use:

```bash
pip install ImageDatasetAnalyzer
```
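
To quickly check that the package was installed correctly, you can try importing it (a minimal sanity check; nothing package-specific is assumed beyond the import name):

```bash
python -c "import imagedatasetanalyzer"
```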



## πŸ€– Supported models



The compatibility of the following models has been tested. You can use other models and versions from these frameworks as well, although their performance and compatibility are not fully guaranteed.



| Framework     | Model names     |
|---------------|-----------------|
| Hugging Face  | ``CLIP``, ``ViT``, ``DeiT``, ``Swin Transformer``, ``DINO ViT``, ``ConvNeXt`` |
| PyTorch       | ``ResNet (50, 101)``, ``VGG (16, 19)``, ``DenseNet (121, 169, 201)``, ``InceptionV3`` |
| TensorFlow    | ``MobileNet (V2)``, ``InceptionV3``, ``VGG (16, 19)``, ``ResNet (50, 101, 152)``, ``ResNetV2 (50, 101, 152)``, ``NASNet (Large, Mobile)``, ``ConvNeXt (Tiny, Small, Base, Large, XLarge)``, ``DenseNet (121, 169, 201)`` |
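
The model names in the table map directly to the constructor arguments of the corresponding embedding classes (see the usage examples below); for instance:

```python
from imagedatasetanalyzer.src.embeddings.huggingfaceembedding import HuggingFaceEmbedding
from imagedatasetanalyzer.src.embeddings.tensorflowembedding import TensorflowEmbedding

# Hugging Face models are referenced by their hub identifier
hf_embedding = HuggingFaceEmbedding("facebook/dino-vits16")

# TensorFlow models are referenced by name (here MobileNetV2, as in the reduction example below)
tf_embedding = TensorflowEmbedding("MobileNetV2")
```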



## πŸ‘©β€πŸ’» Usage

This package includes three main modules: **Analysis**, **Embedding generation and clustering**, and **Dataset Reduction**.



### πŸ“Š Dataset analysis

You can analyze the dataset to explore its properties and obtain metrics and visualizations. This module works both for image datasets with labels and for image-only datasets.



```python
from imagedatasetanalyzer.src.datasets.imagelabeldataset import ImageLabelDataset
from imagedatasetanalyzer.src.datasets.imagedataset import ImageDataset

# Define paths to the images and labels
img_dir = r"images/path"
labels_dir = r"labels/path"

# Load the image and label dataset
dataset = ImageLabelDataset(img_dir=img_dir, label_dir=labels_dir)

# Alternatively, you can use just an image dataset without labels
image_dataset = ImageDataset(img_dir=img_dir)

# Perform dataset analysis (visualize and analyze)
dataset.analyze(plot=True, output="results/path", verbose=True)

# If you use only images (without labels), the analysis will provide less information
image_dataset.analyze()
```



### πŸ” Embedding generation and clustering

This module is used to generate embeddings for your images and then perform clustering using different algorithms (e.g., K-Means, DBSCAN). Here’s how to generate embeddings and perform clustering:



```python
from imagedatasetanalyzer.src.embeddings.huggingfaceembedding import HuggingFaceEmbedding
from imagedatasetanalyzer.src.datasets.imagedataset import ImageDataset
from imagedatasetanalyzer.src.models.kmeansclustering import KMeansClustering
import numpy as np

# Define image dataset directory
img_dir = r"image/path"

# Load the dataset
dataset = ImageDataset(img_dir)

# Choose an embedding model (e.g., HuggingFace DINO)
embedding_model = HuggingFaceEmbedding("facebook/dino-vits16")
embeddings = embedding_model.generate_embeddings(dataset)

# Perform K-Means clustering
kmeans = KMeansClustering(dataset, embeddings, random_state=123)
best_k = kmeans.find_elbow(25)  # Find the optimal number of clusters using the elbow method

# Apply K-Means clustering with the best number of clusters
labels_kmeans = kmeans.clustering(best_k)

# Display images from each cluster
for cluster in np.unique(labels_kmeans):
    kmeans.show_cluster_images(cluster, labels_kmeans)

# Visualize clusters using TSNE instead of PCA
kmeans.clustering(num_clusters=best_k, reduction='tsne', output='tsne_reduction')
```
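
The labels returned by `clustering` are plain per-image cluster assignments, so they can be inspected with ordinary NumPy operations; for example, to count how many images fall into each cluster:

```python
# Count the images assigned to each cluster (labels_kmeans comes from the snippet above)
clusters, counts = np.unique(labels_kmeans, return_counts=True)
for cluster_id, count in zip(clusters, counts):
    print(f"Cluster {cluster_id}: {count} images")
```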



### πŸ“‰ Dataset reduction 

This feature allows reducing a dataset based on various clustering methods. You can use different clustering techniques to select a smaller subset of images from the dataset: the images closest to the centroid of each cluster (`selection_type=representative`), the farthest ones (`selection_type=diverse`), or a random sample (`selection_type=random`).



```python
from imagedatasetanalyzer.src.datasets.imagedataset import ImageDataset
from imagedatasetanalyzer.src.embeddings.tensorflowembedding import TensorflowEmbedding
from imagedatasetanalyzer.src.models.kmeansclustering import KMeansClustering

# Define paths
img_dir = r"images/path"

# Load dataset
dataset = ImageDataset(img_dir)

# Choose embedding method. We are using MobileNetV2 from Tensorflow.
emb = TensorflowEmbedding("MobileNetV2")
embeddings = emb.generate_embeddings(dataset)

# Initialize KMeans clustering
kmeans = KMeansClustering(dataset, embeddings, random_state=123)

# Select the number of clusters with KMeans that maximizes the silhouette score.
best_k = kmeans.find_best_n_clusters(range(2, 25), 'silhouette', plot=False)

# Reduce the dataset using the best KMeans model according to the silhouette score.
# In this case, we are keeping 70% of the original dataset (reduction=0.7),
# taking the images closest to each cluster centroid (selection_type='representative')
# and ensuring that 20% of the selected images within each cluster are diverse (diverse_percentage=0.2).
# The reduced dataset will be saved to the specified output directory ("reduced/dataset/path").
reduced_dataset = kmeans.select_balanced_images(n_clusters=best_k,
                                                reduction=0.7,
                                                selection_type='representative',
                                                diverse_percentage=0.2,
                                                output="reduced/dataset/path")
```



## 🧰 Requirements



The dependencies and requirements to use this library are in the requirements.txt file. The following list includes all the dependencies:



* Kneed
* Matplotlib
* Numpy
* OpenCV
* Pillow
* Scikit-learn
* Scipy
* Scikit-image
* Tensorflow
* Torch
* Torchvision
* Tqdm
* Transformers
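
If you work from a clone of the [repository](https://github.com/joortif/ImageDatasetAnalyzer) rather than from PyPI, the pinned versions can be installed directly from that file:

```bash
pip install -r requirements.txt
```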



## βœ‰οΈ Contact 



πŸ“§ jortizdemuruaferrero@gmail.com


            
