fibsem-tools


Name: fibsem-tools
Version: 7.0.3
Summary: Tools for processing FIBSEM datasets
Upload time: 2024-09-10 09:28:54
Requires Python: >=3.10
Keywords: fibsem, n5, zarr

# FIB-SEM Tools

Tools for processing FIB-SEM data and annotations generated at Janelia Research Campus.


# Installation

This package is currently distributed on PyPI and can be installed with pip. A conda package may follow later.

```bash
pip install fibsem_tools
```

# Usage

The bulk of this library is a collection of Python functions that provide a uniform interface to the file and metadata formats used for storing FIB-SEM datasets. The following file formats are supported:

| Format  | Access mode | Storage backend |
| ------------- | ------------- | ------------- |
| n5 | r/w | local, s3, gcs (via [fsspec](https://github.com/intake/filesystem_spec)) |
| zarr | r/w | local, s3, gcs (via [fsspec](https://github.com/intake/filesystem_spec)) |
| hdf5 | r | local |
| mrc | r | local |
| dat | r | local |

Because physical coordinates and metadata are extremely important for imaging data, this library uses the [`DataArray`](http://xarray.pydata.org/en/stable/generated/xarray.DataArray.html) data structure from [`xarray`](https://github.com/pydata/xarray) to represent FIB-SEM data as arrays with spatial coordinates and metadata. For example:

```python
>>> from fibsem_tools import read_xarray, read
>>> from rich import print # pretty printing
>>> creds = {'anon': True} # anonymous credentials for s3
>>> url = 's3://janelia-cosem-datasets/jrc_sum159-1/jrc_sum159-1.n5/em/fibsem-uint16' # path to a group of arrays on s3
>>> group = read(url, storage_options=creds) # this returns a zarr group, which in this case is a collection of arrays
>>> print(tuple(group.arrays())) # this shows all the arrays in the group
(
    ('s0', <zarr.core.Array '/em/fibsem-uint16/s0' (7632, 2800, 16000) uint16 read-only>),
    ('s1', <zarr.core.Array '/em/fibsem-uint16/s1' (3816, 1400, 8000) uint16 read-only>),
    ('s2', <zarr.core.Array '/em/fibsem-uint16/s2' (1908, 700, 4000) uint16 read-only>),
    ('s3', <zarr.core.Array '/em/fibsem-uint16/s3' (954, 350, 2000) uint16 read-only>),
    ('s4', <zarr.core.Array '/em/fibsem-uint16/s4' (477, 175, 1000) uint16 read-only>),
    ('s5', <zarr.core.Array '/em/fibsem-uint16/s5' (239, 88, 500) uint16 read-only>)
)
>>> tree = read_xarray(url, storage_options=creds) # read the group as a DataTree, a collection of xarray objects
>>> print(tree)
DataTree('fibsem-uint16', parent=None)
│   Dimensions:  ()
│   Data variables:
│       *empty*
│   Attributes:
│       axes:             ['x', 'y', 'z']
│       multiscales:      [{'datasets': [{'path': 's0', 'transform': {'axes': ['z...
│       pixelResolution:  {'dimensions': [4.0, 4.0, 4.56], 'unit': 'nm'}
│       scales:           [[1, 1, 1], [2, 2, 2], [4, 4, 4], [8, 8, 8], [16, 16, 1...
│       units:            ['nm', 'nm', 'nm']
├── DataTree('s0')
│       Dimensions:  (z: 7632, y: 2800, x: 16000)
│       Coordinates:
│         * z        (z) float64 0.0 4.56 9.12 13.68 ... 3.479e+04 3.479e+04 3.48e+04
│         * y        (y) float64 0.0 4.0 8.0 12.0 ... 1.119e+04 1.119e+04 1.12e+04
│         * x        (x) float64 0.0 4.0 8.0 12.0 ... 6.399e+04 6.399e+04 6.4e+04
│       Data variables:
│           data     (z, y, x) uint16 dask.array<chunksize=(384, 384, 384), meta=np.ndarray>
├── DataTree('s1')
│       Dimensions:  (z: 3816, y: 1400, x: 8000)
│       Coordinates:
│         * z        (z) float64 2.28 11.4 20.52 29.64 ... 3.478e+04 3.479e+04 3.48e+04
│         * y        (y) float64 2.0 10.0 18.0 26.0 ... 1.118e+04 1.119e+04 1.119e+04
│         * x        (x) float64 2.0 10.0 18.0 26.0 ... 6.398e+04 6.399e+04 6.399e+04
│       Data variables:
│           data     (z, y, x) uint16 dask.array<chunksize=(384, 384, 384), meta=np.ndarray>
├── DataTree('s2')
│       Dimensions:  (z: 1908, y: 700, x: 4000)
│       Coordinates:
│         * z        (z) float64 6.84 25.08 43.32 ... 3.475e+04 3.477e+04 3.479e+04
│         * y        (y) float64 6.0 22.0 38.0 54.0 ... 1.116e+04 1.117e+04 1.119e+04
│         * x        (x) float64 6.0 22.0 38.0 54.0 ... 6.396e+04 6.397e+04 6.399e+04
│       Data variables:
│           data     (z, y, x) uint16 dask.array<chunksize=(384, 384, 384), meta=np.ndarray>
├── DataTree('s3')
│       Dimensions:  (z: 954, y: 350, x: 2000)
│       Coordinates:
│         * z        (z) float64 15.96 52.44 88.92 ... 3.471e+04 3.474e+04 3.478e+04
│         * y        (y) float64 14.0 46.0 78.0 110.0 ... 1.112e+04 1.115e+04 1.118e+04
│         * x        (x) float64 14.0 46.0 78.0 110.0 ... 6.392e+04 6.395e+04 6.398e+04
│       Data variables:
│           data     (z, y, x) uint16 dask.array<chunksize=(288, 350, 576), meta=np.ndarray>
├── DataTree('s4')
│       Dimensions:  (z: 477, y: 175, x: 1000)
│       Coordinates:
│         * z        (z) float64 34.2 107.2 180.1 ... 3.462e+04 3.469e+04 3.476e+04
│         * y        (y) float64 30.0 94.0 158.0 222.0 ... 1.104e+04 1.11e+04 1.117e+04
│         * x        (x) float64 30.0 94.0 158.0 222.0 ... 6.384e+04 6.39e+04 6.397e+04
│       Data variables:
│           data     (z, y, x) uint16 dask.array<chunksize=(384, 175, 864), meta=np.ndarray>
└── DataTree('s5')
        Dimensions:  (z: 239, y: 88, x: 500)
        Coordinates:
          * z        (z) float64 70.68 216.6 362.5 ... 3.451e+04 3.465e+04 3.48e+04
          * y        (y) float64 62.0 190.0 318.0 446.0 ... 1.094e+04 1.107e+04 1.12e+04
          * x        (x) float64 62.0 190.0 318.0 ... 6.368e+04 6.381e+04 6.393e+04
        Data variables:
            data     (z, y, x) uint16 dask.array<chunksize=(239, 88, 500), meta=np.ndarray>

>>> array = read_xarray(url + '/s0', storage_options=creds) # get one of the arrays as a DataArray
>>> print(array)
<xarray.DataArray 's0' (z: 7632, y: 2800, x: 16000)>
dask.array<s0, shape=(7632, 2800, 16000), dtype=uint16, chunksize=(384, 384, 384), chunktype=numpy.ndarray>
Coordinates:
  * z        (z) float64 0.0 4.56 9.12 13.68 ... 3.479e+04 3.479e+04 3.48e+04
  * y        (y) float64 0.0 4.0 8.0 12.0 ... 1.119e+04 1.119e+04 1.12e+04
  * x        (x) float64 0.0 4.0 8.0 12.0 ... 6.399e+04 6.399e+04 6.4e+04
Attributes:
    pixelResolution:  {'dimensions': [4.0, 4.0, 4.56], 'unit': 'nm'}
    transform:        {'axes': ['z', 'y', 'x'], 'scale': [4.56, 4.0, 4.0], 't...
```
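
The coordinates printed above follow a regular pattern: each scale level doubles the grid spacing and shifts the origin by half of the extra spacing. The sketch below reproduces the `z` coordinates from the base spacing of 4.56 nm. It is an illustration inferred from the printed output, not code from this library; the helper name `scale_level_coords` is hypothetical, and the assumption is 2x downsampling per level with each sample placed at the center of the window it summarizes.

```python
def scale_level_coords(base_step, level, size):
    """Physical coordinates along one axis at a given pyramid level.

    Assumes 2x downsampling per level, with each downsampled sample
    located at the center of the window of base-level samples it covers.
    """
    factor = 2 ** level
    step = base_step * factor                # grid spacing at this level
    offset = base_step * (factor - 1) / 2    # shift of the first sample
    return [offset + i * step for i in range(size)]


# z-axis sizes of s0..s5, taken from the output above
z_sizes = [7632, 3816, 1908, 954, 477, 239]

for level, size in enumerate(z_sizes):
    z = scale_level_coords(4.56, level, size)
    print(f"s{level}: first={z[0]:.2f} nm, step={z[1] - z[0]:.2f} nm, last={z[-1]:.2f} nm")
```

Under these assumptions the first `z` values come out as 0.0, 2.28, 6.84, 15.96, 34.2 and 70.68 nm, matching the coordinates shown for `s0` through `s5`.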

To get the data as a numpy array (this will download *all* the chunks from s3, so be careful):
```python
>>> data = array.compute().data
```
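
To see why caution is warranted, it helps to estimate the uncompressed size of an array before calling `compute()`. The sketch below uses only the shapes, the `uint16` dtype (2 bytes per element) and the chunk size printed in the example above; the helper names `nbytes` and `n_chunks` are hypothetical and not part of this library.

```python
import math


def nbytes(shape, itemsize):
    """Uncompressed size in bytes of an array with this shape and element size."""
    return math.prod(shape) * itemsize


def n_chunks(shape, chunks):
    """Number of chunks needed to cover the array, counting partial edge chunks."""
    return math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))


# shapes of the scale levels, copied from the example output
levels = {
    "s0": (7632, 2800, 16000),
    "s1": (3816, 1400, 8000),
    "s2": (1908, 700, 4000),
    "s3": (954, 350, 2000),
    "s4": (477, 175, 1000),
    "s5": (239, 88, 500),
}

for name, shape in levels.items():
    # uint16 -> 2 bytes per element
    print(f"{name}: {nbytes(shape, 2) / 1e9:.3f} GB uncompressed")

print("chunks in s0:", n_chunks(levels["s0"], (384, 384, 384)))
```

At full resolution, `s0` is roughly 684 GB uncompressed and is split into 6720 chunks, while `s5` is about 21 MB, so a coarse scale level is usually the safer place to start.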


# Development

Clone the repo: 

```bash
git clone https://github.com/janelia-cellmap/fibsem-tools.git
```

Install [poetry](https://python-poetry.org/), e.g. via [pipx](https://pypa.github.io/pipx/).

Then install the dependencies:

```bash
cd fibsem-tools
poetry install
```


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "fibsem-tools",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "fibsem, n5, zarr",
    "author": null,
    "author_email": "Davis Vann Bennett <davis.v.bennett@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/7a/e5/6c9444c17c0ed21342bee59365fb90b2561b646da658137ab83700d54de4/fibsem_tools-7.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Tools for processing FIBSEM datasets",
    "version": "7.0.3",
    "project_urls": {
        "Documentation": "https://github.com/janelia-cellmap/fibsem-tools#readme",
        "Issues": "https://github.com/janelia-cellmap/fibsem-tools/issues",
        "Source": "https://github.com/janelia-cellmap/fibsem-tools"
    },
    "split_keywords": [
        "fibsem",
        " n5",
        " zarr"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2ba102d46a4456dc2bc77e6aa8ff2537e20da33966172cc26eba8d3ddcaf4fb3",
                "md5": "9f37799889d4c4bef5fdf1107e081ace",
                "sha256": "d90ebac419ef65a96be4635e9d697b80398c355d665e3feb91e5c497d4c576d9"
            },
            "downloads": -1,
            "filename": "fibsem_tools-7.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9f37799889d4c4bef5fdf1107e081ace",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 46005,
            "upload_time": "2024-09-10T09:28:56",
            "upload_time_iso_8601": "2024-09-10T09:28:56.485218Z",
            "url": "https://files.pythonhosted.org/packages/2b/a1/02d46a4456dc2bc77e6aa8ff2537e20da33966172cc26eba8d3ddcaf4fb3/fibsem_tools-7.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "7ae56c9444c17c0ed21342bee59365fb90b2561b646da658137ab83700d54de4",
                "md5": "f262d2e0d339d32124bc2fcaa70354e8",
                "sha256": "abcbf5cafa2831e0a5c3f9666badee5dfc36096396fafab9b38bb73bb52ee945"
            },
            "downloads": -1,
            "filename": "fibsem_tools-7.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "f262d2e0d339d32124bc2fcaa70354e8",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 49365,
            "upload_time": "2024-09-10T09:28:54",
            "upload_time_iso_8601": "2024-09-10T09:28:54.401349Z",
            "url": "https://files.pythonhosted.org/packages/7a/e5/6c9444c17c0ed21342bee59365fb90b2561b646da658137ab83700d54de4/fibsem_tools-7.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-10 09:28:54",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "janelia-cellmap",
    "github_project": "fibsem-tools#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "fibsem-tools"
}
        