ZCollection
===========
ZCollection is a Python library for manipulating data partitioned into a
**collection** of `Zarr <https://zarr.readthedocs.io/en/stable/>`_ groups.
A collection divides a dataset into several partitions, which makes it easier
to ingest new products or update existing data. Data can be partitioned
by **date** (hour, day, month, etc.) or by **sequence**.
A collection partitioned by date, with a monthly resolution, may look like this
on disk:

.. code-block:: text

    collection/
    ├── year=2022
    │    ├── month=01/
    │    │    ├── time/
    │    │    │    ├── 0.0
    │    │    │    ├── .zarray
    │    │    │    └── .zattrs
    │    │    ├── var1/
    │    │    │    ├── 0.0
    │    │    │    ├── .zarray
    │    │    │    └── .zattrs
    │    │    ├── .zattrs
    │    │    ├── .zgroup
    │    │    └── .zmetadata
    │    └── month=02/
    │         ├── time/
    │         │    ├── 0.0
    │         │    ├── .zarray
    │         │    └── .zattrs
    │         ├── var1/
    │         │    ├── 0.0
    │         │    ├── .zarray
    │         │    └── .zattrs
    │         ├── .zattrs
    │         ├── .zgroup
    │         └── .zmetadata
    └── .zcollection

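
The directory names in the layout above follow a ``key=value`` pattern. As a
minimal sketch of how such a relative path could be derived from a timestamp,
the snippet below uses only the standard library. The helper
``partition_path`` and the table ``RESOLUTION_FIELDS`` are hypothetical names
introduced for illustration; they are not part of the ZCollection API and may
not match how the library builds its paths internally.

```python
from datetime import datetime

# Hypothetical mapping from a resolution name to the date fields that make up
# the partition path. This is an assumption, not taken from ZCollection.
RESOLUTION_FIELDS = {
    "year": ("year",),
    "month": ("year", "month"),
    "day": ("year", "month", "day"),
}


def partition_path(timestamp: datetime, resolution: str = "month") -> str:
    """Build a relative partition path such as ``year=2022/month=01``."""
    parts = []
    for field in RESOLUTION_FIELDS[resolution]:
        value = getattr(timestamp, field)
        # Years are written with four digits, the other fields with two.
        width = 4 if field == "year" else 2
        parts.append(f"{field}={value:0{width}d}")
    return "/".join(parts)


print(partition_path(datetime(2022, 1, 15)))        # year=2022/month=01
print(partition_path(datetime(2022, 2, 3), "day"))  # year=2022/month=02/day=03
```

Every timestamp that falls in the same month maps to the same directory, which
is what lets a new product touch only the partitions it overlaps.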
Partition updates can be configured either to overwrite existing data with the
new data or to update the existing data using different **strategies**.
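
To make the difference between overwriting and updating concrete, here is a
conceptual sketch in plain Python. A partition is modelled as a dictionary
from a time index to a value, and a strategy is any function that combines the
existing partition with the new data. The names ``overwrite``,
``replace_time_span`` and ``update_partition`` are hypothetical and chosen for
this illustration only; the strategies shipped with ZCollection may behave
differently.

```python
from typing import Callable, Dict

# A partition is modelled as a mapping from a time index to a value.
Partition = Dict[int, float]
Strategy = Callable[[Partition, Partition], Partition]


def overwrite(existing: Partition, new: Partition) -> Partition:
    """Discard the existing partition and keep only the new data."""
    return dict(new)


def replace_time_span(existing: Partition, new: Partition) -> Partition:
    """Replace the time span covered by the new data and keep the rest."""
    if not new:
        return dict(existing)
    start, end = min(new), max(new)
    # Keep only the existing samples that lie outside the new time span.
    kept = {t: v for t, v in existing.items() if t < start or t > end}
    return dict(sorted({**kept, **new}.items()))


def update_partition(existing: Partition, new: Partition,
                     strategy: Strategy = overwrite) -> Partition:
    """Apply the chosen update strategy to one partition."""
    return strategy(existing, new)


existing = {1: 10.0, 2: 20.0, 3: 30.0, 4: 40.0}
new = {3: 31.0, 5: 51.0}

print(update_partition(existing, new))
# {3: 31.0, 5: 51.0}
print(update_partition(existing, new, strategy=replace_time_span))
# {1: 10.0, 2: 20.0, 3: 31.0, 5: 51.0}
```

Passing the strategy as a callable keeps the update loop unchanged when a new
merging rule is needed.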
The `Dask library <https://dask.org/>`_ is used to process the data, so that
the processing can scale.
It is also possible to create views on a reference collection. A view lets you
add and modify variables while the reference collection itself is accessed
read-only.
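
The idea of a view layered over a read-only reference can be sketched with the
standard library alone. This is an analogy under stated assumptions, not the
way ZCollection implements views: the reference is wrapped in
``types.MappingProxyType`` so it cannot be written to, and
``collections.ChainMap`` sends every write to a separate overlay. The variable
names (``reference``, ``overlay``, ``view``, ``var2``) are made up for the
example.

```python
from collections import ChainMap
from types import MappingProxyType

# The reference collection, exposed through a read-only proxy.
reference = {"time": [0, 1, 2], "var1": [1.0, 2.0, 3.0]}
read_only = MappingProxyType(reference)

# The view: lookups fall through to the reference, writes go to the overlay.
overlay = {}
view = ChainMap(overlay, read_only)

# Add a new variable and modify an existing one, in the view only.
view["var2"] = [value * 2 for value in view["var1"]]
view["var1"] = [value + 0.5 for value in view["var1"]]

print(sorted(view))       # ['time', 'var1', 'var2']
print(view["var1"])       # [1.5, 2.5, 3.5]
print(reference["var1"])  # [1.0, 2.0, 3.0]
```

The reference data stays untouched, and any attempt to assign through
``read_only`` directly raises ``TypeError``.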
This library can store data on POSIX, S3, or any other file system supported by
the Python library `fsspec
<https://filesystem-spec.readthedocs.io/en/latest/>`_. Note, however, that only
POSIX and S3 file systems have been tested.
Raw data
{
"_id": null,
"home_page": "https://github.com/CNES/zcollection",
"name": "zcollection",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "",
"keywords": "zarr,collection,xarray,dask",
"author": "CNES/CLS",
"author_email": "fbriol@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/a6/76/3668b65631509a095f09cac6827fcb97a87c89f81f89245ad4a986191003/zcollection-2024.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "BSD License",
"summary": "Zarr Collection",
"version": "2024.2.0",
"project_urls": {
"Homepage": "https://github.com/CNES/zcollection"
},
"split_keywords": [
"zarr",
"collection",
"xarray",
"dask"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "a6763668b65631509a095f09cac6827fcb97a87c89f81f89245ad4a986191003",
"md5": "5a61cfc10ad8e91e60125a215b2a6df4",
"sha256": "b2b3fa26f7d9638c75413ac6bc07d07a103bc105dddc3e37d367e8a0109c10cb"
},
"downloads": -1,
"filename": "zcollection-2024.2.0.tar.gz",
"has_sig": false,
"md5_digest": "5a61cfc10ad8e91e60125a215b2a6df4",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 160112,
"upload_time": "2024-02-10T14:20:29",
"upload_time_iso_8601": "2024-02-10T14:20:29.748490Z",
"url": "https://files.pythonhosted.org/packages/a6/76/3668b65631509a095f09cac6827fcb97a87c89f81f89245ad4a986191003/zcollection-2024.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-02-10 14:20:29",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "CNES",
"github_project": "zcollection",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"lcname": "zcollection"
}