| Name | h5yaml |
| Version | 0.1.1 |
| home_page | None |
| Summary | Use YAML configuration file to generate HDF5/netCDF4 formatted files. |
| upload_time | 2025-10-27 10:22:16 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.10 |
| license | None |
| keywords | hdf5, yaml, netcdf4 |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# H5YAML
## Description
This package lets you generate [HDF5](https://docs.h5py.org/en/stable/)/[netCDF4](https://unidata.github.io/netcdf4-python/)
formatted files as defined in a [YAML](https://yaml.org/) configuration file. This has several advantages:
* you define the layout of your HDF5/netCDF4 file using YAML, which is human-readable and has an intuitive syntax.
* you can reuse the YAML configuration file to give all your products a consistent layout.
* you can make updates by changing only the YAML configuration file.
* you can have the layout of your HDF5/netCDF4 file as a Python dictionary, without accessing any HDF5/netCDF4 file.
The `H5YAML` package has two classes to generate an HDF5/netCDF4 formatted file.
1. The class `H5Yaml` uses the [h5py](https://pypi.org/project/h5py/) package, which is a Pythonic interface to
the HDF5 binary data format.
Let 'h5_def.yaml' be your YAML configuration file, then `H5Yaml("h5_def.yaml").create("foo.h5")` will create
the HDF5 file 'foo.h5'. This file can be read by netCDF4 software, because dimension scales are attached to each dataset.
2. The class `NcYaml` uses the [netCDF4](https://pypi.org/project/netCDF4/) package, which provides an object-oriented
Python interface to the netCDF version 4 library.
Let 'nc_def.yaml' be your YAML configuration file, then `NcYaml("nc_def.yaml").create("foo.nc")` will create
the netCDF4/HDF5 file 'foo.nc'.
The class `NcYaml` must be used when strict conformance to the netCDF4 format is required.
However, the `netCDF4` package has some limitations that `h5py` does not have; for example, it does
not allow variable-length variables to have a compound data-type.
## Installation
Releases of the code, starting from version 0.1, are available via PyPI.
## Usage
The YAML file should be structured as follows:
* The top-level sections are: 'groups', 'dimensions', 'compounds' and 'variables'.
* The section 'groups' is optional, but you should list each group you want to use
in your file. The 'groups' section in the YAML file may look like this:
```
groups:
  - engineering_data
  - image_attributes
  - navigation_data
  - processing_control
  - science_data
```
* The section 'dimensions' is mandatory: you should define the dimensions used by each
variable in your file. The 'dimensions' section may look like this:
```
dimensions:
  days:
    _dtype: u4
    _size: 0
    long_name: days since 2024-01-01 00:00:00Z
  number_of_images:            # an unlimited dimension
    _dtype: u2
    _size: 0
  samples_per_image:           # a fixed dimension
    _dtype: u4
    _size: 307200
  /navigation_data/att_time:   # an unlimited dimension in a group with attributes
    _dtype: f8
    _size: 0
    _FillValue: -32767
    long_name: Attitude sample time (seconds of day)
    calendar: proleptic_gregorian
    units: seconds since %Y-%m-%d %H:%M:%S
    valid_min: 0
    valid_max: 92400
  n_viewport:                  # a fixed dimension with fixed values and attributes
    _dtype: i2
    _size: 5
    _values: [-50, -20, 0, 20, 50]
    long_name: along-track view angles at sensor
    units: degrees
```
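To illustrate how such a 'dimensions' section can be interpreted, here is a minimal Python sketch that works on the dictionary a YAML loader would return. It is not the `h5yaml` implementation: the helper `summarize_dimension` is hypothetical, and treating every key without a leading underscore as an attribute is an assumption made for this example. The only behaviour taken from the text above is that `_size: 0` marks an unlimited dimension.

```python
# Layout as the Python dictionary a YAML loader would return for the
# 'dimensions' section (abridged to three entries).
dimensions = {
    "number_of_images": {"_dtype": "u2", "_size": 0},
    "samples_per_image": {"_dtype": "u4", "_size": 307200},
    "n_viewport": {
        "_dtype": "i2",
        "_size": 5,
        "_values": [-50, -20, 0, 20, 50],
        "units": "degrees",
    },
}


def summarize_dimension(name: str, spec: dict) -> dict:
    """Split one dimension entry into its size, kind and attributes.

    Hypothetical helper, not part of h5yaml.
    """
    size = spec["_size"]
    values = spec.get("_values")
    # Fixed values only make sense if their count matches the size.
    if values is not None and len(values) != size:
        raise ValueError(f"{name}: {len(values)} values for size {size}")
    return {
        "unlimited": size == 0,  # '_size: 0' marks an unlimited dimension
        "size": size,
        # Assumption: keys without a leading underscore are attributes.
        "attrs": {k: v for k, v in spec.items() if not k.startswith("_")},
    }


summary = {
    name: summarize_dimension(name, spec) for name, spec in dimensions.items()
}
for name, info in summary.items():
    kind = "unlimited" if info["unlimited"] else f"fixed ({info['size']})"
    print(f"{name}: {kind}")
```

A check like this can catch a mismatch between `_size` and `_values` before any file is written.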
* The section 'compounds' is optional, but you should define each compound data-type
you want to use in your file. For each compound element you have to provide its
data-type and two attributes: units and long_name. The 'compounds' section may look like
this:
```
compounds:
  stats_dtype:
    time: [u8, seconds since 1970-01-01T00:00:00, timestamp]
    index: [u2, '1', index]
    tbl_id: [u1, '1', binning id]
    saa: [u1, '1', saa-flag]
    coad: [u1, '1', co-addings]
    texp: [f4, ms, exposure time]
    lat: [f4, degree, latitude]
    lon: [f4, degree, longitude]
    avg: [f4, '1', '$S - S_{ref}$']
    unc: [f4, '1', '\u03c3($S - S_{ref}$)']
    dark_offs: [f4, '1', dark-offset]
```
Alternatively, provide a list with the names of YAML files that contain the definitions
of the compounds:
```
compounds:
  - h5_nomhk_tm.yaml
  - h5_science_hk.yaml
```
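Each compound element is a three-item list: data-type, units and long_name. The sketch below shows one way to separate these parts, starting from the dictionary a YAML loader would return. The helpers `split_compound` and `packed_size` are hypothetical and only illustrate the structure; they are not taken from `h5yaml`. The resulting list of `(name, type-code)` pairs has the form that `numpy.dtype()` accepts for a structured data-type.

```python
# Compound definition as the dictionary a YAML loader would return for
# 'stats_dtype' (abridged); each value is [data-type, units, long_name].
stats_dtype = {
    "time": ["u8", "seconds since 1970-01-01T00:00:00", "timestamp"],
    "index": ["u2", "1", "index"],
    "texp": ["f4", "ms", "exposure time"],
}


def split_compound(definition: dict) -> tuple[list, dict, dict]:
    """Separate field data-types from the per-field attributes.

    Hypothetical helper, not part of h5yaml.
    """
    fields = [(name, spec[0]) for name, spec in definition.items()]
    units = {name: spec[1] for name, spec in definition.items()}
    long_names = {name: spec[2] for name, spec in definition.items()}
    return fields, units, long_names


def packed_size(fields: list) -> int:
    """Bytes per record without padding: 'u8' is 8 bytes, 'f4' is 4."""
    return sum(int(code[1:]) for _, code in fields)


fields, units, long_names = split_compound(stats_dtype)
print(fields)
print(packed_size(fields), "bytes per record")
```

Keeping units and long_name next to the data-type means a single YAML entry fully documents each field of the compound.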
* The 'variables' are defined by their data-type ('_dtype') and dimensions ('_dims'),
and optionally by chunk sizes ('_chunks'), compression ('_compression') and variable length
('_vlen'). In addition, each variable can have as many attributes as you like,
each defined by its name and value. The 'variables' section may look like this:
```
variables:
  /image_attributes/nr_coadditions:
    _dtype: u2
    _dims: [number_of_images]
    _FillValue: 0
    long_name: Number of coadditions
    units: '1'
    valid_min: 1
  /image_attributes/exposure_time:
    _dtype: f8
    _dims: [number_of_images]
    _FillValue: -32767
    long_name: Exposure time
    units: seconds
  stats_163:
    _dtype: stats_dtype
    _vlen: True
    _dims: [days]
    comment: detector map statistics (MPS=163)
```
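Because variables refer to groups, dimensions and compounds by name, a layout can be cross-checked before any file is created. The following sketch does this on a layout held as a Python dictionary. The function `check_layout` is hypothetical and not part of `h5yaml`. It assumes that 'compounds' is given as a mapping (not as a list of file names) and that basic data-types are written as a letter `i`, `u` or `f` followed by a byte count, as in the examples above.

```python
import re

# Assumption: basic data-types look like 'u2', 'i4' or 'f8'.
BASIC_DTYPE = re.compile(r"[iuf][1248]")

layout = {
    "groups": ["image_attributes"],
    "dimensions": {
        "days": {"_dtype": "u4", "_size": 0},
        "number_of_images": {"_dtype": "u2", "_size": 0},
    },
    "compounds": {
        "stats_dtype": {"time": ["u8", "seconds", "timestamp"]},
    },
    "variables": {
        "/image_attributes/nr_coadditions": {
            "_dtype": "u2",
            "_dims": ["number_of_images"],
        },
        "stats_163": {"_dtype": "stats_dtype", "_vlen": True, "_dims": ["days"]},
    },
}


def check_layout(layout: dict) -> list[str]:
    """Return a list of unresolved references found in the variables.

    Hypothetical helper, not part of h5yaml.
    """
    problems = []
    groups = set(layout.get("groups", []))
    dimensions = layout["dimensions"]
    compounds = layout.get("compounds", {})
    for name, spec in layout["variables"].items():
        # The group is everything before the last '/' in the variable name.
        group = name.rpartition("/")[0].strip("/")
        if group and group not in groups:
            problems.append(f"{name}: group '{group}' is not listed")
        for dim in spec["_dims"]:
            if dim not in dimensions:
                problems.append(f"{name}: dimension '{dim}' is not defined")
        dtype = spec["_dtype"]
        if not BASIC_DTYPE.fullmatch(dtype) and dtype not in compounds:
            problems.append(f"{name}: data-type '{dtype}' is not defined")
    return problems


problems = check_layout(layout)
print(problems or "layout is consistent")
```

Returning a list of messages instead of raising on the first error lets you fix all inconsistencies in the YAML file in one pass.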
### Notes and ToDo:
* Using older versions of h5py may result in broken netCDF4 files.
* Explain the usage of parameter '_chunks', which is currently not correctly implemented.
* Explain that the usage of variable-length datasets may break netCDF4 compatibility.
## Support [TBW]
## Roadmap
* Release v0.1: stable API to read your YAML files and generate the HDF5/netCDF4 file.
## Authors and acknowledgment
The code is developed by R.M. van Hees (SRON).
## License
* Copyright: SRON (https://www.sron.nl).
* License: BSD-3-clause
Raw data
{
"_id": null,
"home_page": null,
"name": "h5yaml",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "HDF5, YAML, netCDF4",
"author": null,
"author_email": "Richard van Hees <r.m.van.hees@sron.nl>",
"download_url": "https://files.pythonhosted.org/packages/d5/66/60776fd3b5135c4f64a25565d8b1f4a5bc64d8b75e72544469620f561034/h5yaml-0.1.1.tar.gz",
"platform": null,
"description": "# H5YAML\n[](https://github.com/rmvanhees/h5yaml/)\n[](https://github.com/rmvanhees/h5yaml/LICENSE)\n[](https://pypi.org/project/h5yaml/)\n[](https://pypi.org/project/h5yaml/)\n\n## Description\nThis package let you generate [HDF5](https://docs.h5py.org/en/stable/)/[netCDF4](https://unidata.github.io/netcdf4-python/)\nformatted files as defined in a [YAML](https://yaml.org/) configuration file. This has several advantages: \n\n * you define the layout of your HDF5/netCDF4 file using YAML which is human-readable and has intuitive syntax.\n * you can reuse the YAML configuration file to to have all your product have a consistent layout.\n * you can make updates by only changing the YAML configuration file\n * you can have the layout of your HDF5/netCDF4 file as a python dictionary, thus without accessing any HDF5/netCDF4 file\n\nThe `H5YAML` package has two classes to generate a HDF5/netCDF4 formatted file.\n\n 1. The class `H5Yaml` uses the [h5py](https://pypi.org/project/h5py/) package, which is a Pythonic interface to\n the HDF5 binary data format.\n Let 'h5_def.yaml' be your YAML configuration file then ```H5Yaml(\"h5_def.yaml\").create(\"foo.h5\")``` will create\n\tthe HDF5 file 'foo.h5'. This can be read by netCDF4 software, because it uses dimension-scales to each dataset.\n 2. 
The class `NcYaml` uses the [netCDF4](https://pypi.org/project/netCDF4/) package, which provides an object-oriented\n python interface to the netCDF version 4 library.\n Let 'nc_def.yaml' be your YAML configuration file then ```NcYaml(\"nc_def.yaml\").create(\"foo.nc\")``` will create\n\tthe netCDF4/HDF5 file 'foo.nc'\n\nThe class `NcYaml` must be used when strict conformance to the netCDF4 format is required.\nHowever, package `netCDF4` has some limitations, which `h5py` has not, for example it does\nnot allow variable-length variables to have a compound data-type.\n\n## Installation\nReleases of the code, starting from version 0.1, will be made available via PyPI.\n\n## Usage\n\nThe YAML file should be structured as follows:\n\n * The top level are: 'groups', 'dimensions', 'compounds' and 'variables'\n * The section 'groups' are optional, but you should provide each group you want to use\n in your file. The 'groups' section in the YAML file may look like this:\n\n ```\n groups:\n - engineering_data\n - image_attributes\n - navigation_data\n - processing_control\n - science_data\n ```\n\n * The section 'dimensions' is obligatory, you should define the dimensions for each\n variable in your file. 
The 'dimensions' section may look like this:\n\n ```\n dimensions:\n days:\n _dtype: u4\n _size: 0\n long_name: days since 2024-01-01 00:00:00Z\n number_of_images: # an unlimited dimension\n _dtype: u2\n _size: 0\n samples_per_image: # a fixed dimension\n _dtype: u4\n _size: 307200\n /navigation_data/att_time: # an unlimited dimension in a group with attributes\n _dtype: f8\n _size: 0\n _FillValue: -32767\n long_name: Attitude sample time (seconds of day)\n calendar: proleptic_gregorian\n units: seconds since %Y-%m-%d %H:%M:%S\n valid_min: 0\n valid_max: 92400\n n_viewport: # a fixed dimension with fixed values and attributes\n _dtype: i2\n _size: 5\n _values: [-50, -20, 0, 20, 50]\n long_name: along-track view angles at sensor\n units: degrees\n ```\n\n * The 'compounds' are optional, but you should provide each compound data-type which\n you want to use in your file. For each compound element you have to provide its\n data-type and attributes: units and long_name. The 'compound' section may look like\n this:\n\n ```\n compounds:\n stats_dtype:\n time: [u8, seconds since 1970-01-01T00:00:00, timestamp]\n index: [u2, '1', index]\n tbl_id: [u1, '1', binning id]\n saa: [u1, '1', saa-flag]\n coad: [u1, '1', co-addings]\n texp: [f4, ms, exposure time]\n lat: [f4, degree, latitude]\n lon: [f4, degree, longitude]\n avg: [f4, '1', '$S - S_{ref}$']\n unc: [f4, '1', '\\u03c3($S - S_{ref}$)']\n dark_offs: [f4, '1', dark-offset]\n ```\n\n Alternatively, provide a list with names of YAML files which contain the definitions\n of the compounds.\n\n ```\n compounds:\n - h5_nomhk_tm.yaml\n - h5_science_hk.yaml\n ```\n * The 'variables' are defined by their data-type ('_dtype') and dimensions ('_dims'),\n and optionally chunk sizes ('_chunks'), compression ('_compression'), variable length\n ('_vlen'). In addition, each variable can have as many attributes as you like,\n defined by its name and value. 
The 'variables' section may look like this:\n\n ```\n variables:\n /image_attributes/nr_coadditions:\n _dtype: u2\n _dims: [number_of_images]\n _FillValue: 0\n long_name: Number of coadditions\n units: '1'\n valid_min: 1\n /image_attributes/exposure_time:\n _dtype: f8\n _dims: [number_of_images]\n _FillValue: -32767\n long_name: Exposure time\n units: seconds\n stats_163:\n _dtype: stats_dtype\n _vlen: True\n _dims: [days]\n comment: detector map statistics (MPS=163)\n ```\n\n### Notes and ToDo:\n\n * The usage of older versions of h5py may result in broken netCDF4 files\n * Explain usage of parameter '_chunks', which is currently not correctly implemented.\n * Explain that the usage of variable length data-sets may break netCDF4 compatibility\n\n## Support [TBW]\n\n## Roadmap\n\n * Release v0.1 : stable API to read your YAML files and generate the HDF5/netCDF4 file\n\n\n## Authors and acknowledgment\nThe code is developed by R.M. van Hees (SRON)\n\n## License\n\n* Copyright: SRON (https://www.sron.nl).\n* License: BSD-3-clause\n",
"bugtrack_url": null,
"license": null,
"summary": "Use YAML configuration file to generate HDF5/netCDF4 formated files.",
"version": "0.1.1",
"project_urls": {
"Homepage": "https://github.com/rmvanhees/h5_yaml",
"Issues": "https://github.com/rmvanhees/h5_yaml/issues",
"Source": "https://github.com/rmvanhees/h5_yaml"
},
"split_keywords": [
"hdf5",
" yaml",
" netcdf4"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "0aa27a85052f7dfa2686989dda3208670222fd78bf107d7c26c0d0f454e8ff4a",
"md5": "4de47941698807d3feebd742b500e341",
"sha256": "fa5232a16c2c7a4441163a34cadb75b029d426bc0225547879784a2998380ec9"
},
"downloads": -1,
"filename": "h5yaml-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4de47941698807d3feebd742b500e341",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 13367,
"upload_time": "2025-10-27T10:22:14",
"upload_time_iso_8601": "2025-10-27T10:22:14.523459Z",
"url": "https://files.pythonhosted.org/packages/0a/a2/7a85052f7dfa2686989dda3208670222fd78bf107d7c26c0d0f454e8ff4a/h5yaml-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d56660776fd3b5135c4f64a25565d8b1f4a5bc64d8b75e72544469620f561034",
"md5": "f15c9ed0a346483768f6e486d3246fe7",
"sha256": "7459478435418bc74d6f632abbfb48379907b32828f0594f9e1a46cc042debbb"
},
"downloads": -1,
"filename": "h5yaml-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "f15c9ed0a346483768f6e486d3246fe7",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 9328,
"upload_time": "2025-10-27T10:22:16",
"upload_time_iso_8601": "2025-10-27T10:22:16.025698Z",
"url": "https://files.pythonhosted.org/packages/d5/66/60776fd3b5135c4f64a25565d8b1f4a5bc64d8b75e72544469620f561034/h5yaml-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-27 10:22:16",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "rmvanhees",
"github_project": "h5_yaml",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "h5yaml"
}