image-dataset-loader

Name	image-dataset-loader
Version	0.0.1
home_page	https://github.com/soroushj/image-dataset-loader
Summary	Load image datasets as NumPy arrays
upload_time	2024-03-23 10:54:12
maintainer	None
docs_url	None
author	Soroush Javadi
requires_python	>=3.0
license	MIT
keywords	datasets image-datasets
requirements	No requirements were recorded.

# image-dataset-loader: Load image datasets as NumPy arrays

[![PyPI](https://img.shields.io/pypi/v/image-dataset-loader.svg)](https://pypi.org/project/image-dataset-loader/)
[![MIT license](https://img.shields.io/badge/license-MIT-brightgreen.svg)](https://opensource.org/licenses/MIT)

## Installation

```bash
pip install image-dataset-loader
```

## Overview

Suppose you have an image dataset in a directory which looks like this:

```
data/
  train/
    cats/
      cat0001.jpg
      cat0002.jpg
      ...
    dogs/
      dog0001.jpg
      dog0002.jpg
      ...
  test/
    cats/
      cat0001.jpg
      cat0002.jpg
      ...
    dogs/
      dog0001.jpg
      dog0002.jpg
      ...
```

You can use the `image_dataset_loader.load` function to load this dataset as NumPy arrays:

```python
import image_dataset_loader

(x_train, y_train), (x_test, y_test) = image_dataset_loader.load('path/to/data', ['train', 'test'])
```

The `x_*` arrays have shape `(instances, rows, cols, channels)` for color images and `(instances, rows, cols)` for grayscale images.
The `y_*` arrays have shape `(instances,)`.

All images in the dataset must have the same shape, and all data subsets (`train` and `test` in this example) must contain the same set of classes.
Class names are sorted alphabetically, so in this example `cats` and `dogs` are represented by `0` and `1`, respectively.
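
The class constraint and the label assignment can be checked before loading. The following is a minimal sketch using only the standard library. `class_index` is a hypothetical helper, not part of `image_dataset_loader`, and it assumes that every class is a subdirectory of each subset, as in the layout shown earlier. Python's `sorted` orders by code point, which may differ from the library's ordering for names with mixed case.

```python
from pathlib import Path


def class_index(dataset_path, set_names):
    """Map class names to integer labels in sorted order.

    Hypothetical helper (not part of image_dataset_loader). Raises
    ValueError if the subsets do not contain the same set of classes.
    """
    root = Path(dataset_path)
    classes = None
    for set_name in set_names:
        # Each class is assumed to be a subdirectory of the subset.
        found = sorted(p.name for p in (root / set_name).iterdir() if p.is_dir())
        if classes is None:
            classes = found
        elif found != classes:
            raise ValueError(
                f"{set_name!r} has classes {found}, expected {classes}"
            )
    return {name: label for label, name in enumerate(classes or [])}
```

For the directory layout above, `class_index('path/to/data', ['train', 'test'])` would return `{'cats': 0, 'dogs': 1}`.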

You can also load a single data subset. For example:

```python
(x_train, y_train), = image_dataset_loader.load('path/to/data', ['train'])
```

Note that the comma after `(x_train, y_train)` is required, because the function always returns a tuple of tuples.
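
To see why the trailing comma matters, here is a small sketch that uses a stand-in function instead of the real loader. `fake_load` is hypothetical and only mimics the documented return structure (a tuple of `(x, y)` tuples) with placeholder strings.

```python
def fake_load(set_names):
    """Stand-in for a loader: returns one (x, y) pair per subset name."""
    return tuple((f"x_{name}", f"y_{name}") for name in set_names)


# With the trailing comma, the one-element outer tuple is unpacked first,
# then the inner (x, y) pair.
(x_train, y_train), = fake_load(["train"])

# Without the comma, Python tries to unpack the outer tuple itself into
# two names. It has only one element, so this raises ValueError.
try:
    (x, y) = fake_load(["train"])
except ValueError as error:
    print("unpacking failed:", error)

# Indexing the result is an equivalent alternative to the trailing comma.
x_first, y_first = fake_load(["train"])[0]
```

Indexing with `[0]` may be easier to read for people unfamiliar with single-element tuple unpacking.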

## API

```python
load(dataset_path, set_names,
     shuffle=True, seed=None,
     x_dtype='uint8', y_dtype='uint32')
```

- **`dataset_path:`** Path to the dataset directory.
- **`set_names:`** List of the data subsets (subdirectories of the dataset directory).
- **`shuffle:`** Whether to shuffle the samples. If `False`, instances will be sorted by file name.
- **`seed:`** Random seed used for shuffling (see the [docs](https://docs.python.org/3/library/random.html#random.seed)).
- **`x_dtype:`** NumPy data type for the X arrays (see the [docs](https://numpy.org/devdocs/user/basics.types.html)).
- **`y_dtype:`** NumPy data type for the Y arrays (see the [docs](https://numpy.org/devdocs/user/basics.types.html)).
- Returns a tuple of `(x, y)` tuples corresponding to `set_names`.
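
The `shuffle` and `seed` semantics described above can be illustrated with Python's `random` module. This is only a sketch of the documented behavior: `order_files` is a hypothetical function, and it is an assumption that the library orders samples in a comparable way internally.

```python
import random


def order_files(file_names, shuffle=True, seed=None):
    """Return file names sorted, then optionally shuffled.

    Hypothetical illustration, not the library's implementation.
    A fixed seed makes the shuffled order reproducible.
    """
    ordered = sorted(file_names)
    if shuffle:
        # A dedicated Random instance avoids touching the global RNG state.
        random.Random(seed).shuffle(ordered)
    return ordered


names = ["cat0002.jpg", "cat0001.jpg", "cat0003.jpg"]
print(order_files(names, shuffle=False))
print(order_files(names, seed=42) == order_files(names, seed=42))
```

With `shuffle=False` the result is the file names in sorted order, and two calls with the same `seed` produce the same permutation.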

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/soroushj/image-dataset-loader",
    "name": "image-dataset-loader",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.0",
    "maintainer_email": null,
    "keywords": "datasets, image-datasets",
    "author": "Soroush Javadi",
    "author_email": "soroush.javadi@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/73/d4/3b899a63589c977afe0a836d823fd53359c2699c352ba8cfac6d371d962b/image-dataset-loader-0.0.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Load image datasets as NumPy arrays",
    "version": "0.0.1",
    "project_urls": {
        "Homepage": "https://github.com/soroushj/image-dataset-loader"
    },
    "split_keywords": [
        "datasets",
        " image-datasets"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9d5ccb9d9769ab2079886e53823c0bda7dad6ad8ebf21ad051703996865bb84d",
                "md5": "a21c68467cae2e147ddaec7b01ca237b",
                "sha256": "f1972bb1ba46902d3cea44f880a6b00440399d3615e22b7c8941181e07710c89"
            },
            "downloads": -1,
            "filename": "image_dataset_loader-0.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a21c68467cae2e147ddaec7b01ca237b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.0",
            "size": 4235,
            "upload_time": "2024-03-23T10:54:10",
            "upload_time_iso_8601": "2024-03-23T10:54:10.348135Z",
            "url": "https://files.pythonhosted.org/packages/9d/5c/cb9d9769ab2079886e53823c0bda7dad6ad8ebf21ad051703996865bb84d/image_dataset_loader-0.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "73d43b899a63589c977afe0a836d823fd53359c2699c352ba8cfac6d371d962b",
                "md5": "44833b30e665be7b6755390de57e9b0c",
                "sha256": "b72abf6dd44357f5a26516abad34396cf5123cd1726fc13de87bf7301afaa31b"
            },
            "downloads": -1,
            "filename": "image-dataset-loader-0.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "44833b30e665be7b6755390de57e9b0c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.0",
            "size": 3781,
            "upload_time": "2024-03-23T10:54:12",
            "upload_time_iso_8601": "2024-03-23T10:54:12.032825Z",
            "url": "https://files.pythonhosted.org/packages/73/d4/3b899a63589c977afe0a836d823fd53359c2699c352ba8cfac6d371d962b/image-dataset-loader-0.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-23 10:54:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "soroushj",
    "github_project": "image-dataset-loader",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "image-dataset-loader"
}
