aind-large-scale-cellpose

- Name: aind-large-scale-cellpose
- Version: 0.0.6
- Summary: Large-scale prediction with cellpose
- Upload time: 2024-05-17 18:15:14
- Author: Allen Institute for Neural Dynamics
- Requires Python: >=3.9
- License: MIT
- Requirements: none recorded
# aind-large-scale-cellpose

Large-scale cell segmentation using cellpose.

The approach is as follows:

1. Large-scale computation of percentiles. This uses the percentile range provided by the user as well as the minimum voxel size for your dataset, a cellpose parameter that discards all regions of interest below this value. See the [cellpose docs](https://github.com/MouseLand/cellpose/blob/main/cellpose/models.py#L347).
2. Prediction of gradients along each axis: YX, ZX, and ZY. We create a zarr dataset to store the predictions for each axis. Note that the data must be normalized before processing. A local normalization option assumes that step 1 was not computed and therefore uses only the data from the current plane; skipping global normalization this way can merge segmentation masks between distinct cells and miss cells in dimmer areas. The global normalization step uses the precomputed percentiles from step 1.
3. Local combination of gradients; no overlapping areas are needed for this step. We create another zarr dataset that stores the combined gradients.
4. Following ZYX flows, computing histograms, and generating cell centroids (seeds) in overlapped chunks. The ZYX flows and histograms are zarr datasets as well; however, they are computed in overlapping chunks, and the overlap area must be $$overlapArea = 2*MeanCellDiameter$$ to avoid cut-offs between the flows that would affect the final segmentation mask.
5. Generation of segmentation masks in overlapped chunks. These chunks must have exactly the same chunk area as in step 4. We take all of the global seeds, assign individual cell IDs, and then assign those IDs to the corresponding ZYX flow areas.
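The percentile and normalization steps above can be sketched as follows. This is a minimal illustration, not the package's actual API: the function names, signatures, and the chunk subsampling strategy are all assumptions made for the example.

```python
import numpy as np


def global_percentiles(volume, low=1.0, high=99.0, chunk=64, step=4):
    """Rough sketch of step 1: estimate global intensity percentiles
    without holding the full volume in memory at once, by pooling every
    `step`-th voxel of each chunk. Illustrative only."""
    samples = []
    for z in range(0, volume.shape[0], chunk):
        for y in range(0, volume.shape[1], chunk):
            for x in range(0, volume.shape[2], chunk):
                block = volume[z:z + chunk, y:y + chunk, x:x + chunk]
                samples.append(block[::step, ::step, ::step].ravel())
    pooled = np.concatenate(samples)
    return np.percentile(pooled, [low, high])


def normalize_global(plane, p_low, p_high):
    """Sketch of step 2's global normalization: rescale a plane using the
    precomputed percentiles instead of per-plane statistics."""
    return np.clip((plane - p_low) / (p_high - p_low), 0.0, 1.0)
```

Using the same precomputed `(p_low, p_high)` for every plane keeps normalization consistent across the volume, which is what avoids the merged masks and missed dim cells described in step 2.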

## Disadvantages of this method
1. We need enough disk storage to generate the final segmentation mask, since we must store the gradients for each axis, the combined gradients, the ZYX flows, and the histograms. We recommend using a downsampled version of your data if your microscope has a very high resolution. Only `segmentation_mask.zarr` needs to be kept; the other zarr datasets can be deleted without any problem.
2. Each chunk must have a size of:
$$chunksize = (area_z, area_y, area_x) + (2 * overlap_z, 2 * overlap_y, 2 * overlap_x)$$
where the overlap is applied on each side of each axis.
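The chunk arithmetic can be written out as a short sketch. The function name is hypothetical; the per-side overlap follows step 4 (twice the mean cell diameter), and each axis grows by two overlaps, one on each side:

```python
def padded_chunksize(area, mean_cell_diameter):
    """Illustrative chunk-size arithmetic (hypothetical helper).

    Per-side overlap = 2 * mean cell diameter (step 4), and each axis
    grows by twice that: one overlap on each side."""
    overlap = 2 * mean_cell_diameter
    return tuple(a + 2 * overlap for a in area)
```

For example, with a 128-voxel processing area per axis and a mean cell diameter of 15 voxels, each chunk actually read is 188 voxels per axis.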

## Documentation
You can access the documentation for this module [here]().

## Contributing

To develop the code, install the packages described in the Dockerfile.

### Linters and testing

There are several libraries used to run linters, check documentation, and run tests.

- Please test your changes using the **coverage** library, which will run the tests and log a coverage report:

```
coverage run -m unittest discover && coverage report
```

- Use **interrogate** to check that modules, methods, etc. have been documented thoroughly:

```
interrogate .
```

- Use **flake8** to check that code is up to standards (no unused imports, etc.):
```
flake8 . --max-line-length=100
```

- Use **black** to automatically format the code to PEP 8 standards:
```
black .
```

- Use **isort** to automatically sort import statements:
```
isort .
```

### Pull requests

For internal members, please create a branch. For external members, please fork the repo and open a pull request from the fork. We'll primarily use [Angular](https://github.com/angular/angular/blob/main/CONTRIBUTING.md#commit) style for commit messages. Roughly, they should follow the pattern:
```
<type>(<scope>): <short summary>
```

where scope (optional) describes the packages affected by the code changes and type (mandatory) is one of:

- **build**: Changes that affect the build system or external dependencies (example scopes: pyproject.toml, setup.py)
- **ci**: Changes to our CI configuration files and scripts (examples: .github/workflows/ci.yml)
- **docs**: Documentation only changes
- **feat**: A new feature
- **fix**: A bug fix
- **perf**: A code change that improves performance
- **refactor**: A code change that neither fixes a bug nor adds a feature
- **test**: Adding missing tests or correcting existing tests
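
For example, a commit message for a (hypothetical) bug fix in the seed generation step might look like:
```
fix(seeds): handle empty chunks when generating cell centroids
```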

            
