cell-data-loader

Name	cell-data-loader
Version	0.0.3
home_page
Summary	Converts general images of cells into formats and labels for deep learning pipelines
upload_time	2024-02-05 22:13:58
maintainer
docs_url	None
author
requires_python	>=3.8
license	MIT License
keywords	biomedical cell cell image cellpose csz dataloader
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

Cell Data Loader
================

Cell Data Loader is a simple AI support tool in Python that takes in images of cells (or other image types) and outputs them, with minimal effort, in formats that can be read by PyTorch (Tensor) or TensorFlow (NumPy). With Cell Data Loader, users can output their cell images as whole images, as sliced images, or, with the support of [CellPose](https://github.com/MouseLand/cellpose), as images segmented by cell and output individually.

To install Cell Data Loader, type the following into a standard UNIX terminal:

    pip install cell-data-loader

The simplest way to use Cell Data Loader is to instantiate a dataloader as such:

~~~python
from cell_data_loader import CellDataloader

imfolder = '/path/to/my/images'

dataloader = CellDataloader(imfolder)

for image in dataloader:
	...
~~~

And voilà!

Lists of files are also supported:

~~~python

imfiles = ['/path/to/image1.png','/path/to/image2.png','/path/to/image3.png']

dataloader = CellDataloader(imfiles)

for image in dataloader:
	...
~~~
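
If the images are spread across subfolders, the file list can be built before it is passed in. The following is a minimal sketch that uses only the standard library; `list_image_files` and `IMAGE_SUFFIXES` are hypothetical names chosen for illustration and are not part of Cell Data Loader, and the set of suffixes is an assumption rather than a list of formats the package is known to support.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def list_image_files(folder, suffixes=IMAGE_SUFFIXES):
    """Return a sorted list of file paths under `folder` with an image suffix.

    Hypothetical helper, not part of cell_data_loader.
    """
    root = Path(folder)
    return sorted(
        str(path)
        for path in root.rglob("*")  # search subfolders as well
        if path.is_file() and path.suffix.lower() in suffixes
    )


# The resulting list can then be passed in as shown above:
# imfiles = list_image_files('/path/to/my/images')
# dataloader = CellDataloader(imfiles)
```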

Labels
------

Cell Data Loader has a few ways to support image labels. The simplest is to place whole images in different folders, with each folder representing a label:

~~~python
imfolder1 = '/path/to/my/images'
imfolder2 = '/path/to/your/images'

dataloader = CellDataloader(imfolder1,imfolder2)

for label,image in dataloader:
	...
~~~

Alternatively, if you have one folder or file list in which the file names indicate the label, a regex match is supported:

~~~python
imfiles = ['/path/to/CANCER_image1.png','/path/to/CANCER_image2.png','/path/to/CANCER_image3.png','/path/to/HEALTHY_image1.png','/path/to/HEALTHY_image2.png','/path/to/HEALTHY_image3.png']

dataloader = CellDataloader(imfiles,label_regex = ["CANCER","HEALTHY"])
for label,image in dataloader:
	...
~~~
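
To make the idea of regex labelling concrete, here is a small sketch of how a file name could be mapped to a label index with the standard `re` module. It illustrates the concept only: `label_from_regex` is a hypothetical helper, and how Cell Data Loader itself handles files that match no pattern, or more than one, is not described here.

```python
import re


def label_from_regex(path, label_regex):
    """Return the index of the first pattern that matches `path`, else None.

    Hypothetical helper for illustration, not part of cell_data_loader.
    """
    for index, pattern in enumerate(label_regex):
        if re.search(pattern, path):
            return index
    return None


imfiles = [
    '/path/to/CANCER_image1.png',
    '/path/to/HEALTHY_image1.png',
    '/path/to/unknown.png',
]

# One label index per file; files matching no pattern get None.
labels = [label_from_regex(f, ["CANCER", "HEALTHY"]) for f in imfiles]
```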


Arguments
---------

Additional arguments taken by Cell Data Loader include:

~~~python

imfolder = '/path/to/folder'

dataloader = CellDataloader(imfolder,
			dim = (64,64),
			batch_size = 32,
			dtype = "numpy", # Can also be "torch"
			label_regex = None,
			n_channels = 3, # Detected from the first image read if not provided; all images are re-sampled to this number of channels
			match_labels = False, # Outputs proportional amounts of each label in the dataset
			)
~~~
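
The `match_labels` comment above is brief, so the sketch below shows one plausible reading of it: undersampling so that every label appears equally often. This is an assumption made for illustration, not a description of what Cell Data Loader does internally, and `balanced_sample` is a hypothetical helper.

```python
import random
from collections import defaultdict


def balanced_sample(items, labels, seed=0):
    """Undersample so that every label appears equally often.

    Hypothetical helper; one possible interpretation of label matching.
    Returns a shuffled list of (label, item) pairs.
    """
    by_label = defaultdict(list)
    for item, label in zip(items, labels):
        by_label[label].append(item)

    # Every label is cut down to the size of the rarest label.
    n_per_label = min(len(group) for group in by_label.values())

    rng = random.Random(seed)  # seeded so the result is reproducible
    balanced = []
    for label, group in by_label.items():
        for item in rng.sample(group, n_per_label):
            balanced.append((label, item))
    rng.shuffle(balanced)
    return balanced


pairs = balanced_sample(['c1.png', 'c2.png', 'c3.png', 'h1.png'], [0, 0, 0, 1])
```

With three images of label 0 and one of label 1, `pairs` holds one image per label.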


Dependencies
------------

Strict dependencies:

	numpy
	torch
	torchvision
	opencv-python>=4.5.4
	slideio==2.4.1
	scipy
	scikit-image
	pillow

Soft dependencies:

	CellPose # For cell segmentation support
	Tensorflow

Note that some of the dependencies are not strict and vary depending on usage. NumPy is a hard requirement, but TensorFlow is not if the user only uses the Torch capabilities. If the user attempts to load cell images in "cell" mode without a working Cellpose installation, Cell Data Loader will throw an error. Cellpose needs to be installed separately to use "cell" mode:

	pip install cellpose

GPU integration is a separate matter and is not covered here.
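
Before relying on the soft dependencies, it can help to check whether they are importable. The sketch below does this with `importlib.util.find_spec` from the standard library. The import names `cellpose` and `tensorflow` are the usual ones for those packages; `OPTIONAL_MODULES` and `missing_optional_modules` are hypothetical names used for illustration and are not part of Cell Data Loader.

```python
from importlib.util import find_spec

# Import name -> why it might be needed (descriptions taken from the notes above).
OPTIONAL_MODULES = {
    "cellpose": 'needed for "cell" mode',
    "tensorflow": "not needed if only the Torch capabilities are used",
}


def missing_optional_modules(modules=OPTIONAL_MODULES):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


for name in missing_optional_modules():
    print(f"{name} is not installed ({OPTIONAL_MODULES[name]})")
```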

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "cell-data-loader",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Matt Leming <mleming@mgh.harvard.edu>",
    "keywords": "biomedical,cell,cell image,cellpose,csz,dataloader",
    "author": "",
    "author_email": "Matt Leming <mleming@mgh.harvard.edu>",
    "download_url": "https://files.pythonhosted.org/packages/d4/e2/45b24f0bc6fc9e852ec05031e38ff4c8aa8e342ef11b60ee2cb775be1889/cell_data_loader-0.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "Converts general images of cells into formats and labels for deep learning pipelines",
    "version": "0.0.3",
    "project_urls": {
        "Homepage": "https://github.com/mleming/CellDataLoader"
    },
    "split_keywords": [
        "biomedical",
        "cell",
        "cell image",
        "cellpose",
        "csz",
        "dataloader"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d4e245b24f0bc6fc9e852ec05031e38ff4c8aa8e342ef11b60ee2cb775be1889",
                "md5": "fc838ec4e261351fcd498449a0ccc081",
                "sha256": "342c42e134fa956697e22fed13a16347b52f0b579e785e9d06a635ef6b0694e3"
            },
            "downloads": -1,
            "filename": "cell_data_loader-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "fc838ec4e261351fcd498449a0ccc081",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 17547,
            "upload_time": "2024-02-05T22:13:58",
            "upload_time_iso_8601": "2024-02-05T22:13:58.182048Z",
            "url": "https://files.pythonhosted.org/packages/d4/e2/45b24f0bc6fc9e852ec05031e38ff4c8aa8e342ef11b60ee2cb775be1889/cell_data_loader-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-05 22:13:58",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "mleming",
    "github_project": "CellDataLoader",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "cell-data-loader"
}