gdio

Name: gdio
Version: 0.3.3
Home page: https://github.com/rodri90y/gdio
Summary: Gridded data IO library
Upload time: 2023-07-14 17:53:12
Author: Rodrigo Yamamoto
Requires Python: >=3.9
License: MIT
Keywords: gdio, grib, netcdf, hdf5

# GDIO - Gridded Data IO

A simple and concise gridded data IO library for reading multiple grib, netcdf, and hdf5 files, with automatic spatial interpolation of all data to a single resolution.

The gdio library grew out of my own professional and personal needs as a meteorologist.
Existing libraries often fail when you need to read and handle multiple large
netcdf/grib/hdf5 files with different resolutions and time steps.

Since version 0.1.2 the output data is an object whose key-values are accessible with attribute notation, and since version 0.1.8 it uses a multilevel dictionary data structure.
Since version 0.2.5 the latitude and longitude are returned as mesh arrays of shape (ny, nx) to support irregular grids and Lambert projections.
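A quick numpy sketch (illustrative only, not gdio code) of what a (ny, nx) coordinate mesh looks like when built from regular 1-D axes:

```python
import numpy as np

# 1-D coordinate axes
lat = np.arange(-30.0, -27.0, 1.0)   # ny = 3 latitudes
lon = np.arange(300.0, 305.0, 1.0)   # nx = 5 longitudes

# mesh arrays of shape (ny, nx), the layout gdio returns since 0.2.5
lon2d, lat2d = np.meshgrid(lon, lat)

print(lat2d.shape)  # (3, 5)
```

For a truly irregular or Lambert-projected grid the mesh values are not separable into 1-D axes, which is why the library stores the full 2-D arrays.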

## Installation
```
conda config --env --add channels conda-forge
conda install -c rodri90y gdio
```

If you are installing with pip, install the requirements manually first:

```
conda create -n envname --file requirements/base.txt
pip install gdio
```

or, from the test index:

```
pip install --index-url https://test.pypi.org/simple/ --upgrade --no-cache-dir --extra-index-url=https://pypi.org/simple/ gdio
```

#### Required dependencies

All available on conda-forge (`conda config --add channels conda-forge`):

+ Python (3.9 or later)
+ netCDF4 (1.5.8 or later)
+ h5py (3.6.0 or later)
+ eccodes (2.24.2 or later)
+ python-eccodes (1.4.0 or later)
+ pyproj


#### Optional dependencies
+ scipy (1.4.1 or later)

#### Testing
```
python -m unittest 
```


## Reading files
gdio supports IO for the grib1/2, netcdf, and hdf5 file formats, and allows cutting time and spatial subdomains.

The library unifies the categories of information (variable, level, members) in a single
data structure, a multilevel dictionary/attribute object. Regardless of the format read, the
output is standardized in order to simplify access to the data.

At the first level of the dataset the following parameters are accessible, in addition to the weather variables: ref_time, time_units, and time (e.g. ds.ref_time, ds.time).
At the variable level we have level_type, param_id, long_name, parameter_units, latitude, and longitude, and at each vertical level key (isobaricInhPa, surface, etc.) the variable data is exposed as value and level.

Structure data:

    + dataset
        + ref_time
        + time_units
        + time
        + variable (u,v,2t,etc) 
            + centre
            + dataType
            + param_id
            + long_name
            + parameter_units
            + latitude
            + longitude
            + grid_type
            + projparams
            + isobaricInhPa/surface/maxWind/sigma (any level type key)
                + value
                + level
                + members

Example:
            
    ds.time
    ds.time_units
    ds.v.latitude
    ds.v.isobaricInhPa.value
    ds.v.isobaricInhPa.level
    ds.v.isobaricInhPa.members
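The attribute-style access shown above can be mimicked with a small dict subclass; this is only an illustrative sketch, not gdio's actual container class:

```python
# Illustrative sketch (not gdio's real class): a dict whose keys can also
# be read as attributes, mimicking the ds.v.isobaricInhPa.level style above.
class AttrDict(dict):
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc

ds = AttrDict(time_units='hours',
              v=AttrDict(isobaricInhPa=AttrDict(level=[200, 500, 1000])))

print(ds.time_units)             # hours
print(ds.v.isobaricInhPa.level)  # [200, 500, 1000]
```

Both notations reach the same data, so `ds['v']['isobaricInhPa']['level']` and `ds.v.isobaricInhPa.level` are interchangeable.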


### Reading multiple files
The gdio class provides high-level routines for reading multiple files of mixed types, returning the netcdf/grib data as a list of dictionary-like objects.

```
from gdio.core import gdio

ds = gdio(verbose=False)
ds.mload(['tests/data/era5_20191226-27_lev.grib', 'tests/data/era5_20191227_lev.nc'],  
        merge_files=True, uniformize_grid=True, inplace=True)

>>> ds.dataset[0].keys()
dict_keys(['ref_time', 'time_units', 'time', 'longitude', 'latitude', 't', 'u', 'v', 'r'])

>>> print(ds.dataset[0].u.isobaricInhPa.value.shape)
(1, 6, 7, 241, 281)

>>> ds.dataset[0].time
masked_array(data=[datetime.datetime(2019, 12, 26, 0, 0),
                   datetime.datetime(2019, 12, 26, 12, 0),
                   datetime.datetime(2019, 12, 27, 0, 0),
                   datetime.datetime(2019, 12, 27, 12, 0),
                   datetime.datetime(2019, 12, 27, 0, 0),
                   datetime.datetime(2019, 12, 27, 12, 0)],
             mask=False,
       fill_value='?',
            dtype=object)

```
Loading the data in the spatial subdomain between (lat -30, lon 300) and (lat 10, lon 320), selecting the times between
timesteps 12 and 24, and renaming the variables t and u to 2t and 10u:

```
from gdio.core import gdio

ds = gdio(verbose=False)
ds.mload(['tests/data/era5_20191226-27_lev.grib', 'tests/data/era5_20191227_lev.nc'],  
        merge_files=True, uniformize_grid=True, 
        cut_domain=(-30, 300, 10, 320), cut_time=(12, 24), 
        rename_vars={'t': '2t', 'u': '10u'}, inplace=True)

>>> ds.dataset[0].keys()
dict_keys(['ref_time', 'time_units', 'time', 'longitude', 'latitude', 'r', '2t', '10u', 'v'])

>>> print(ds.dataset[0]['10u'].isobaricInhPa.value.shape)
(1, 2, 7, 160, 80)

>>> ds.dataset[0].time
masked_array(data=[datetime.datetime(2019, 12, 26, 12, 0),
                   datetime.datetime(2019, 12, 27, 0, 0),
                   datetime.datetime(2019, 12, 27, 12, 0)],
             mask=False,
       fill_value='?',
            dtype=object)

```

The following parameters can be set to operate on the data during reading.

**uniformize_grid:     boolean**\
interpolate all gridded data to first grid data file resolution

**vars:                list**\
variables names

**merge_files:         boolean**\
merge the variables data of all files into a single data array per variable

**cut_time:            tuple**\
range of time to cut ex.: (0,10)/(0,None)/(None,10)

**cut_domain:          tuple**\
range of latitudes and longitudes to cut: (lat1, lon1, lat2, lon2)
ex.: (-45,-90,20,-30)/(-45,None,20,-30)/(None,-90,None,-20)

**level_type:          list**\
type of level (hybrid, isobaricInhPa, surface)

**filter_by:           dictionary**\
dict of grib parameters as key:value pairs (list or single values),
e.g. filter_by={'perturbationNumber': [0,10], 'level': [1000,500,250]} or filter_by={'gridType': 'regular_ll'}.
Note: this parameter only works on grib files

**rename_vars:         dictionary**\
rename the original variable name (key) to a new name (value). 

Eg. {'tmpmdl': 't', 'tmpprs': 't'}

**sort_before:         bool**\
Sort the fields by validityDate, validityTime, paramId, typeOfLevel, perturbationNumber, and level before processing. Warning: high
memory consumption; use only when the grib data structure is non-standard
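The filter_by matching rule can be sketched in plain Python; this is an illustration of the key:value semantics, not gdio's implementation:

```python
# Sketch of the filter_by semantics: a message is kept only if, for every
# key, its value equals the given scalar or is contained in the given list.
def matches(msg, filter_by):
    for key, wanted in filter_by.items():
        allowed = wanted if isinstance(wanted, (list, tuple)) else [wanted]
        if msg.get(key) not in allowed:
            return False
    return True

msg = {'perturbationNumber': 0, 'level': 500, 'gridType': 'regular_ll'}
print(matches(msg, {'perturbationNumber': [0, 10], 'level': [1000, 500, 250]}))  # True
print(matches(msg, {'gridType': 'reduced_gg'}))                                  # False
```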


### Selecting a sub sample in mload dataset
Select data by coordinates (date, latitude, longitude, levels and members)

```
sub_set = ds.sel(dates=[datetime(2019,12,26,12,0)], latitude=[-23.54,-22], longitude=[-46.64,-42.2], level=[2,6])

>>> print(sub_set[0].get('u').isobaricInhPa.value.shape)
(1, 1, 4, 6, 18)
```

### Showing the data structure
Prints the data structure tree.
```
>>> ds.describe

    +-- ref_time: 2019-12-26 00:00:00
    +-- time_units: hours
    +-- time: <class 'numpy.ma.core.MaskedArray'> (6,)
    +-- r 
        +-- isobaricInhPa 
            +-- value: <class 'numpy.ndarray'> (1, 6, 7, 160, 80)
            +-- level: [200, 300, 500, 700, 800, 950, 1000]
            +-- members: [0]
        +-- centre: 'ecmwf',
        +-- dataType: 'an',
        +-- param_id: 157
        +-- long_name: Relative humidity
        +-- parameter_units: %
        +-- latitude: <class 'numpy.ndarray'> (160, 80)
        +-- longitude: <class 'numpy.ndarray'> (160, 80)
        +-- level_type: ['isobaricInhPa']
        +-- grid_type: 'regular_ll'
        +-- projparams: { 'a': 6371229.0, 'b': 6371229.0, 'proj': 'regular_ll'}
        
    .
    .
    .
    
    +-- v 
    +-- isobaricInhPa 
        +-- value: <class 'numpy.ndarray'> (1, 6, 7, 160, 80)
        +-- level: [200, 300, 500, 700, 800, 950, 1000]
        +-- members: [0]
    +-- centre: 'ecmwf',
    +-- dataType: 'an',
    +-- param_id: 132
    +-- long_name: V component of wind
    +-- parameter_units: m s**-1
    +-- latitude: <class 'numpy.ndarray'> (160, 80)
    +-- longitude: <class 'numpy.ndarray'> (160, 80)
    +-- level_type: ['isobaricInhPa']
    +-- grid_type: 'regular_ll'
    +-- projparams: { 'a': 6371229.0, 'b': 6371229.0, 'proj': 'regular_ll'}
```


Setting the grib key used for ensemble member grouping:

```
ds.fields_ensemble = 'perturbationNumber'
ds.fields_ensemble_exception = [0]
```


#### Grib
The grib class encapsulates all grib functions, as well as the cutting of time and spatial domains, returning the grib data as a dictionary-like object.

Simple reading
```
from gdio.grib import grib
gr = grib(verbose=False)
ds = gr.gb_load('data/era5_20191226-27_lev.grib')

>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 't', 'u', 'v', 'r'])
>>> print(ds.u.isobaricInhPa.value.shape)
(1, 4, 7, 241, 281)
>>> print(ds.u.level_type)
['isobaricInhPa']
>>> print(ds.u.keys())
dict_keys(['centre', 'dataType', 'isobaricInhPa', 'param_id', 'long_name', 'parameter_units', 'latitude', 'longitude', 'level_type', 'grid_type', 'projparams'])
>>> print(ds.u.isobaricInhPa.level)
[200, 300, 500, 700, 800, 950, 1000]
>>> print(ds.u.parameter_units)
m s**-1
>>> print(ds.u.param_id)
131
```

Reading a subsample in time (time 12-24) and space (bbox -30,-60 and 10,-40)

```
ds = gr.gb_load('data/era5_20191226-27_lev.grib', cut_domain=(-30, -60, 10, -40), cut_time=(12, 24))
```

Setting the grib key used for ensemble member grouping:

```
gr.fields_ensemble = 'perturbationNumber'
gr.fields_ensemble_exception = [0]
```

Filtering by grib keys: a dict of grib parameters as key:value pairs (list or single values),
e.g. filter_by={'perturbationNumber': [0,10], 'level': [1000,500,250]}
or filter_by={'gridType': 'regular_ll'}

```
ds = gr.gb_load('tests/data/era5_20191226-27_lev.grib', 
                cut_domain=(-30, -60, 10, -40), 
                cut_time=(12, 24), 
                filter_by={'perturbationNumber': 0, 'level':[200,500,950]})
>>> print(ds.u.isobaricInhPa.level)
[200, 500, 950]
```
Renaming variables: a dictionary input renames the original variable names (keys) to new names (values),
e.g. {'tmpmdl': 't', 'tmpprs': 't'}

```
ds = gr.gb_load('data/era5_20191226-27_lev.grib', rename_vars={'u':'10u'})
>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 't', '10u', 'v', 'r'])
```
Sorting grib messages before processing (extra memory consumption and possibly a little slower);
fixes unstructured or non-standard grib files.
```
ds = gr.gb_load('data/era5_20191226-27_lev.grib', sort_before=True)
```
#### Writing a grib file

From a loaded dataset:
```
gr.gb_write('data/output.grib', ds)
```
With explicit packing options:
```
from gdio.grib import grib
gr = grib(verbose=False)
ds = gr.gb_load('data/era5_20191226-27_lev.grib')

gr.gb_write('output.grib', ds, least_significant_digit=3, packingType='grid_jpeg')
```

#### Netcdf
The netcdf class encapsulates all netcdf reading and writing functions, as well as the cutting of time and spatial domains, returning the netcdf data as a dictionary-like object. For each variable, the returned dictionary contains the value, param_id, level_type, level, and parameter_units properties.

Simple reading
```
from gdio.netcdf import netcdf
nc = netcdf(verbose=False)

ds = nc.nc_load('tests/data/era5_20191227_lev.nc')
>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 'r', 't', 'u', 'v'])
>>> print(ds.u.isobaricInhPa.value.shape)
(1, 2, 7, 161, 241)
>>> print(ds.u.level_type)
['isobaricInhPa']
>>> print(ds.u.keys())
dict_keys(['isobaricInhPa', 'param_id', 'long_name', 'parameter_units', 'latitude', 'longitude', 'level_type'])
>>> print(ds.u.isobaricInhPa.level)
[200, 300, 500, 700, 800, 950, 1000]
>>> print(ds.u.parameter_units)
m s**-1
>>> print(ds.u.param_id)
None
```

Reading a subsample in time (timesteps 12-24) and space (bbox -30,-60 to 10,-40). The returned multilevel dictionary/attribute object contains, for each variable, the value, param_id, level_type, level, and parameter_units properties.

```
ds = nc.nc_load('data/era5_20191227_lev.nc', cut_domain=(-30, -60, 10, -40), cut_time=(12, 24))
>>> print(ds.u.isobaricInhPa.value.shape)
(1, 1, 7, 80, 40)
```
Renaming variables: a dictionary input renames the original variable names (keys) to new names (values),
e.g. {'tmpmdl': 't', 'tmpprs': 't'}

```
ds = nc.nc_load('data/era5_20191227_lev.nc', rename_vars={'u':'10u'})
>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 't', '10u', 'v', 'r'])
```

#### Writing a netcdf file

From the loaded dataset
```
nc.nc_write('data/output.nc', ds)
```
From a dictionary
```
from datetime import datetime
import numpy as np
from gdio.netcdf import netcdf

nc = netcdf(verbose=False)

ds = {'ref_time': datetime(2019, 12, 27, 0, 0), 
      'time_units': 'hours', 
      'time': np.array([12]),
      'u': {'isobaricInhPa': {  'value': np.random.random((1, 1, 7, 80, 40)),
                                'level': [200, 300, 500, 700, 800, 950, 1000]
                              },
            'param_id': None, 
            'long_name': 'U component of wind', 
            'level_type': ['isobaricInhPa'],
            'parameter_units': 'm s**-1',
            'longitude': np.array([300. , 300.5, 301. , 301.5, 302. , 302.5, 303. , 303.5,
               304. , 304.5, 305. , 305.5, 306. , 306.5, 307. , 307.5,
               308. , 308.5, 309. , 309.5, 310. , 310.5, 311. , 311.5,
               312. , 312.5, 313. , 313.5, 314. , 314.5, 315. , 315.5,
               316. , 316.5, 317. , 317.5, 318. , 318.5, 319. , 319.5]),
            'latitude': np.array([-30. , -29.5, -29. , -28.5, -28. , -27.5, -27. , -26.5,
               -26. , -25.5, -25. , -24.5, -24. , -23.5, -23. , -22.5,
               -22. , -21.5, -21. , -20.5, -20. , -19.5, -19. , -18.5,
               -18. , -17.5, -17. , -16.5, -16. , -15.5, -15. , -14.5,
               -14. , -13.5, -13. , -12.5, -12. , -11.5, -11. , -10.5,
               -10. ,  -9.5,  -9. ,  -8.5,  -8. ,  -7.5,  -7. ,  -6.5,
                -6. ,  -5.5,  -5. ,  -4.5,  -4. ,  -3.5,  -3. ,  -2.5,
                -2. ,  -1.5,  -1. ,  -0.5,   0. ,   0.5,   1. ,   1.5,
                 2. ,   2.5,   3. ,   3.5,   4. ,   4.5,   5. ,   5.5,
                 6. ,   6.5,   7. ,   7.5,   8. ,   8.5,   9. ,   9.5]),
            }
      }

nc.nc_write('data/output.nc', ds)
```


#### HDF5
The hdf class encapsulates all hdf5 reading and writing functions, as well as the cutting of time and spatial domains, returning the hdf5 data as a dictionary-like object. For each variable, the returned dictionary contains the value, param_id, level_type, level, and parameter_units properties.

Simple reading
```
from gdio.hdf import hdf
hd = hdf(verbose=False)

ds = hd.hdf_load('tests/data/gpm_3imerg_20220101.hdf')
>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 'r', 't', 'u', 'v'])
>>> print(ds.u.isobaricInhPa.value.shape)
(1, 2, 7, 161, 241)
>>> print(ds.u.level_type)
['isobaricInhPa']
>>> print(ds.u.keys())
dict_keys(['isobaricInhPa', 'param_id', 'long_name', 'parameter_units', 'latitude', 'longitude', 'level_type'])
>>> print(ds.u.isobaricInhPa.level)
[200, 300, 500, 700, 800, 950, 1000]
>>> print(ds.u.parameter_units)
m s**-1
>>> print(ds.u.param_id)
None
```

Reading a subsample in time (timesteps 0-1) and space (bbox -30,-60 to 10,-40). The returned multilevel dictionary/attribute object contains, for each variable, the value, param_id, level_type, level, and parameter_units properties.

```
ds = hd.hdf_load('tests/data/gpm_3imerg_20220101.hdf', cut_domain=(-30, -60, 10, -40), cut_time=(0, 1))
>>> print(ds.u.isobaricInhPa.value.shape)
(1, 1, 7, 80, 40)
```
Renaming variables: a dictionary input renames the original variable names (keys) to new names (values),
e.g. {'tmpmdl': 't', 'tmpprs': 't'}

```
ds = hd.hdf_load('tests/data/gpm_3imerg_20220101.hdf', rename_vars={'precipitationCal':'prec_merge', 'IRprecipitation': 'prec_ir'})
>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 'prec_merge', 'prec_ir'])
```

#### Writing a HDF5 file

From the loaded dataset
```
hd.hdf_write('data/output.hdf', ds)
```
From a dictionary
```
from datetime import datetime
import numpy as np
from gdio.hdf import hdf

hd = hdf(verbose=False)

ds = {'ref_time': datetime(2019, 12, 27, 0, 0), 
      'time_units': 'hours', 
      'time': np.array([12]),
      'u': {'isobaricInhPa': {  'value': np.random.random((1, 1, 7, 80, 40)),
                                'level': [200, 300, 500, 700, 800, 950, 1000]
                              },
            'param_id': None, 
            'long_name': 'U component of wind', 
            'level_type': ['isobaricInhPa'],
            'parameter_units': 'm s**-1',
            'longitude': np.array([300. , 300.5, 301. , 301.5, 302. , 302.5, 303. , 303.5,
               304. , 304.5, 305. , 305.5, 306. , 306.5, 307. , 307.5,
               308. , 308.5, 309. , 309.5, 310. , 310.5, 311. , 311.5,
               312. , 312.5, 313. , 313.5, 314. , 314.5, 315. , 315.5,
               316. , 316.5, 317. , 317.5, 318. , 318.5, 319. , 319.5]),
            'latitude': np.array([-30. , -29.5, -29. , -28.5, -28. , -27.5, -27. , -26.5,
               -26. , -25.5, -25. , -24.5, -24. , -23.5, -23. , -22.5,
               -22. , -21.5, -21. , -20.5, -20. , -19.5, -19. , -18.5,
               -18. , -17.5, -17. , -16.5, -16. , -15.5, -15. , -14.5,
               -14. , -13.5, -13. , -12.5, -12. , -11.5, -11. , -10.5,
               -10. ,  -9.5,  -9. ,  -8.5,  -8. ,  -7.5,  -7. ,  -6.5,
                -6. ,  -5.5,  -5. ,  -4.5,  -4. ,  -3.5,  -3. ,  -2.5,
                -2. ,  -1.5,  -1. ,  -0.5,   0. ,   0.5,   1. ,   1.5,
                 2. ,   2.5,   3. ,   3.5,   4. ,   4.5,   5. ,   5.5,
                 6. ,   6.5,   7. ,   7.5,   8. ,   8.5,   9. ,   9.5]),
            }
      }

hd.hdf_write('data/output.hdf', ds)
```

## Routines
### gdio.mload
Load multiple files (netcdf/grib), returning the data as a list of dictionary-like objects and interpolating all data onto the same grid.

```
mload(files, vars=None, merge_files=True, cut_time=None,
      cut_domain=None, level_type=None, filter_by={},
      rename_vars={}, uniformize_grid=True, sort_before=False, inplace=False)
```          
**files:               list**\
file names

**uniformize_grid:     boolean**\
interpolate all gridded data to the first file's grid specification

**vars:                list**\
variable names

**merge_files:         boolean**\
merge the variables data of all files into a single data array per variable

**cut_time:            tuple**\
range of time to cut, e.g. (0,10)/(0,None)/(None,10)

**cut_domain:          tuple**\
range of latitudes and longitudes to cut: (lat1, lon1, lat2, lon2),
e.g. (-45,-90,20,-30)/(-45,None,20,-30)/(None,-90,None,-20)

**level_type:          list**\
type of level (hybrid, isobaricInhPa, surface)

**filter_by:           dictionary**\
dict of grib parameters as key:value pairs (list or single values),
e.g. filter_by={'perturbationNumber': [0,10], 'level': [1000,500,250]} or filter_by={'gridType': 'regular_ll'}

**rename_vars:         dictionary**\
rename variable names (keys) to new names (values), e.g. {'tmpmdl': 't', 'tmpprs': 't'}

**sort_before:         bool**\
Sort the fields by validityDate, validityTime, paramId, typeOfLevel,
perturbationNumber, and level before processing. Warning: extra memory and time consumption;
use only when the grib data structure is non-standard (default False)

**return:**                    list of dictionaries

### gdio.sel
Select data by coordinates (date, latitude, longitude, levels and members)

```
sel(data=None, latitude=None, longitude=None, 
    dates=None, level=None, member=None, date_format="%Y-%m-%d %H:%M")
```


**data:       list of dictionaries**\
raw dataset

**latitude:     list of floats**\
range of latitudes to select: [lat1, lat2]
specific latitudes (1 or >2 values): [lat1, lat2, lat3, ...]

**longitude:    list of floats**\
range of longitudes to select: [lon1, lon2]
specific longitudes (1 or >2 values): [lon1, lon2, lon3, ...]

**dates:        list of datetime/string**\
range of dates to select: [date1, date2]
specific dates (1 or >2 values): [date1, date2, date3, ...]

**level:        list of int**\
range of levels to select: [level1, level2]
specific levels (1 or >2 values): [level1, level2, level3, ...]

**member:       list of int**\
range of members to select: [member1, member2]
specific members (1 or >2 values): [member1, member2, member3, ...]

**return:**     list of dictionaries
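The range-versus-specific convention above can be sketched as follows; this illustrates the documented semantics and is not gdio's code. A 2-element list selects a range, while 1 or more than 2 elements pick individual coordinates:

```python
import numpy as np

# Sketch of the selection rule: 2 values = inclusive range,
# 1 or >2 values = nearest grid point for each requested value.
def select_indices(coords, wanted):
    coords = np.asarray(coords)
    if len(wanted) == 2:                       # range [lo, hi]
        lo, hi = sorted(wanted)
        return np.where((coords >= lo) & (coords <= hi))[0]
    # specific values: nearest grid index for each one
    return np.array([np.abs(coords - w).argmin() for w in wanted])

levels = np.array([200, 300, 500, 700, 800, 950, 1000])
print(select_indices(levels, [300, 800]))        # range -> [1 2 3 4]
print(select_indices(levels, [200, 500, 1000]))  # specific -> [0 2 6]
```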

### gdio.grib.gb_load
Load grib file
```
def gb_load(ifile, vars=None, level_type=None,
            cut_time=None, cut_domain=None, filter_by={},
            rename_vars={}, sort_before=False)
```

**ifile:       string**\
grib 1 or 2 file name

**vars:        list**\
variable short names or parameter id numbers

**cut_time:    tuple**\
range of time to cut, e.g. (0,10)/(0,None)/(None,10)

**cut_domain:  tuple**\
range of latitudes and longitudes to cut: (lat1, lon1, lat2, lon2),
e.g. (-45,290,20,330)/(-45,None,20,330)/(None,290,None,320)

**level_type:  list**\
type of level (hybrid, isobaricInhPa, surface)

**filter_by:   dictionary**\
dict of grib parameters as key:value pairs (list or single values),
e.g. filter_by={"perturbationNumber": [0,10], "level": [1000,500,250]}
or filter_by={"gridType": "regular_ll"}

**rename_vars: dictionary**\
rename variable names (keys) to new names (values), e.g. {"tmpmdl": "t", "tmpprs": "t"}

**sort_before: bool**\
Sort the fields by validityDate, validityTime, paramId, typeOfLevel, perturbationNumber, and level before processing.
Warning: high memory consumption; use only when the grib data structure is non-standard

**return: dictionary/attributes**\
multi-time data container

### gdio.grib.gb_write
Write grib2 file
```
def gb_write(ofile, data, packingType='grid_simple', least_significant_digit=3, **kwargs)
```
**ofile: string**\
file path

**data: dict**\
dataset

**packingType: string**\
type of packing:
  - grid_simple
  - spectral_simple
  - grid_simple_matrix
  - grid_jpeg
  - grid_png
  - grid_ieee
  - grid_simple_log_preprocessing
  - grid_second_order

**least_significant_digit: int (default 3)**\
power of ten of the smallest decimal place in the data that is a reliable value;
quantizing (or truncating) the data to this precision dramatically improves compression
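The quantization idea can be sketched with numpy. This mirrors the scheme the netCDF4 library documents for least_significant_digit; gdio's internal implementation may differ:

```python
import numpy as np

# Round the data so only `digits` decimal places are reliable; the rounded
# values share low-order bit patterns, which compresses dramatically better.
def quantize(data, digits):
    scale = 2.0 ** np.ceil(np.log2(10.0 ** digits))
    return np.around(scale * data) / scale

x = np.array([1.2345678, 2.7182818])
print(quantize(x, 3))  # values agree with x to ~3 decimal places
```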

### gdio.netcdf.nc_load
Load netcdf files
```
nc_load(ifile, vars=None, cut_time=None, cut_domain=None, level_type=None, rename_vars={})
```

**ifile:       string**\
netcdf file name

**vars:        list**\
variable short names

**cut_time:    tuple**\
range of time (absolute) to cut, e.g. (0,10)/(0,None)/(None,10)

**cut_domain:  tuple**\
range of latitudes and longitudes to cut: (lat1, lon1, lat2, lon2),
e.g. (-45,290,20,330)/(-45,None,20,330)/(None,290,None,320)

**level_type:  list**\
type of level (hybrid, isobaricInhPa, surface)

**rename_vars: dictionary**\
rename variable names (keys) to new names (values), e.g. {"tmpmdl": "t", "tmpprs": "t"}

**return: dictionary/attributes**\
multi-time data container

### gdio.netcdf.nc_write
Write netcdf file
```
nc_write(ifile, data, zlib=True, netcdf_format='NETCDF4', complevel=4, least_significant_digit=None)
```


**ifile:           string**\
file path

**data:            dict**\
dataset

**zlib:            bool**\
enable compression

**netcdf_format:   string**\
netcdf format: NETCDF4, NETCDF4_CLASSIC, NETCDF3_CLASSIC or NETCDF3_64BIT


**complevel:      int**\
compression level (default 4)

**least_significant_digit: int**\
power of ten of the smallest decimal place in the data that is a reliable value;
quantizing (or truncating) the data to this precision dramatically improves zlib compression (default None)


### gdio.hdf.hdf_load
Load HDF5 files
```
hdf_load(ifile, vars=None, cut_time=None, cut_domain=None, level_type=None, rename_vars={})
```

**ifile:       string**\
hdf5 file name

**vars:        list**\
variable short names

**cut_time:    tuple**\
range of time (absolute) to cut, e.g. (0,10)/(0,None)/(None,10)

**cut_domain:  tuple**\
range of latitudes and longitudes to cut: (lat1, lon1, lat2, lon2),
e.g. (-45,290,20,330)/(-45,None,20,330)/(None,290,None,320)

**level_type:  list**\
type of level (hybrid, isobaricInhPa, surface)

**rename_vars: dictionary**\
rename variable names (keys) to new names (values), e.g. {"tmpmdl": "t", "tmpprs": "t"}

**return: dictionary/attributes**\
multi-time data container


### gdio.hdf.hdf_write
Write HDF5 file
```
hdf_write(ifile, data, compress_type='gzip', netcdf_format='NETCDF4')
```


**ifile:           string**\
file path

**data:            dict**\
dataset

**compress_type:   string**\
type of compression: zlib, gzip, lzf (default gzip)

**complevel:       int**\
compression level (default 9)

**least_significant_digit: int**\
power of ten of the smallest decimal place in the data that is a reliable value;
quantizing (or truncating) the data to this precision dramatically improves compression (default None)


### gdio.remapbil
```
remapbil(data, lon, lat, lon_new, lat_new, order=1, masked=False)
```

Interpolate data to new domain resolution

**data: array**\
3D data (time, lon, lat)

**lon: array**

**lat: array**

**lon_new: array**\
new grid longitudes

**lat_new: array**\
new grid latitudes

**order:   int**\
0 - nearest-neighbor, 1 - bilinear, 2 - cubic spline

**masked: boolean**\
If True, points outside the range of the input lon and lat are masked (in a masked array). If masked is set to a number, points outside the range are set to that number.

**return: 3D array**
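For reference, the core of a bilinear remap on a regular grid can be sketched in pure numpy. This is an illustrative 2-D version of the idea, not gdio's remapbil (which handles the 3-D time dimension):

```python
import numpy as np

# Illustrative 2-D bilinear resampling on a regular grid (not gdio's code):
# for each target point, blend the four surrounding grid values by distance.
def bilinear_remap(data, lon, lat, lon_new, lat_new):
    ix = np.clip(np.searchsorted(lon, lon_new) - 1, 0, len(lon) - 2)
    iy = np.clip(np.searchsorted(lat, lat_new) - 1, 0, len(lat) - 2)
    wx = (lon_new - lon[ix]) / (lon[ix + 1] - lon[ix])
    wy = (lat_new - lat[iy]) / (lat[iy + 1] - lat[iy])
    iy, wy = iy[:, None], wy[:, None]          # broadcast rows (lat) x cols (lon)
    return ((1 - wy) * (1 - wx) * data[iy, ix]
            + (1 - wy) * wx * data[iy, ix + 1]
            + wy * (1 - wx) * data[iy + 1, ix]
            + wy * wx * data[iy + 1, ix + 1])

lat = np.array([0.0, 1.0])
lon = np.array([0.0, 1.0])
field = np.array([[0.0, 1.0],
                  [2.0, 3.0]])                 # corner values of one cell
out = bilinear_remap(field, lon, lat, np.array([0.5]), np.array([0.5]))
print(out)  # [[1.5]]
```

The midpoint of the cell averages the four corners (0, 1, 2, 3), giving 1.5.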


## Dev utils
Docker compose to support development

### Commands
 - make build
   - Build the container
 - make up
   - Start container
 - make stop
   - Stop container
 - make test
   - Run unit tests in container
 - make bash
   - Access container
 - make ipython
   - Run ipython in container
 - make fix
   - Run autopep8 to fix code formatting

## Release History

* 0.3.3
    * alpha release

## Meta
Rodrigo Yamamoto codes@rodrigoyamamoto.com

https://github.com/rodri90y/gdio

## Contributing

## License

MIT

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/rodri90y/gdio",
    "name": "gdio",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "",
    "keywords": "gdio,grib,netcdf,hdf5",
    "author": "Rodrigo Yamamoto",
    "author_email": "codes@rodrigoyamamoto.com",
    "download_url": "https://github.com/rodri90y/gdio/archive/v0.3.3.tar.gz",
    "platform": null,
    "description": "\n# GDIO - Gridded Data IO\n\nA simple and concise gridded data IO library for reading multiples grib, netcdf and hdf5 files, automatic spatial interpolation of the all data to a single resolution.\n\nThe library gdio is based on my own professionals and personal needs as a meteorologist. \nThe currents libraries always fail when you need to read handle multiples large \nnetcdf/grib/hdf5 files, with different resolutions and time steps.\n\nAfter version 0.1.2 the output data was converted to object with key-values accessible using attribute notation, and after version 0.1.8 a new multilevel dictionary data structure. \nIn the version 0.2.5 the latitude and longitude come in mesh array (ny,nx) format to support irregular or lambert projection.\n\n## Instalation\n```\nconda config --env --add channels conda-forge\nconda install -c rodri90y gdio\n\nif you are using pip install, before install manually the requirements\n\nconda create -n envname --file requirements/base.txt\npip install gdio\nor\npip install --index-url https://test.pypi.org/simple/ --upgrade --no-cache-dir --extra-index-url=https://pypi.org/simple/ gdio\n```\n\n#### Required dependencies\n\nconda config --add channels conda-forge\n\n+ Python (3.8.5=> or later)\n+ netCDF4 (1.5.8 or later)\n+ h5py (3.6.0 or later)\n+ eccodes (2.24.2 or later)\n+ python-eccodes (1.4.0 or later)\n+ pyproj\n\n\n#### Optional dependencies\n+ scipy (1.4.1 or later)\n\n#### Testing\n```\npython -m unittest \n```\n\n\n## Reading files\nThe gdio support the IO of grib1/2 and netcdf file formats, allowing the time and spatial subdomains cut.\n\nThis library unifies categories of information (variable, level, members) in a single \ndata structure as a multilevel dictionary/attribute, regardless of the format read (netcdf and grib), the \noutput format will be standardized in order to simplify access to the data.\n\nIn the dataset first level the following parameters are accessible: ref_time, time_units and 
time in addition to the weather variables.\nds.ref_time, ds.time\nAt the variable level we have: level_type, param_id, long_name, parameterUnits, latitude and longitude and at vertical level (isobaricInh, surface, etc) the variable data as value and level are exposed.\n\nStructure data:\n\n    + dataset\n        + ref_time\n        + time_units\n        + time\n        + variable (u,v,2t,etc) \n            + centre\n            + dataType\n            + param_id\n            + long_name\n            + parameter_units\n            + latitude\n            + longitude\n            + grid_type\n            + projparams\n            + isobaricInhPa/surface/maxWind/sigma (any level type key)\n                + value\n                + level\n                + members\n\nExample:\n            \n    ds.time\n    ds.time_units\n    ds.v.latitude\n    ds.v.isobaricInhPa.value\n    ds.v.isobaricInhPa.level\n    ds.v.isobaricInhPa.members\n\n\n### Reading multiple files\nThis class has high level routines for multiple files and type reading, returning the netcdf/grib data as a list of dictionary type.\n\n```\nfrom gdio.core import gdio\n\nds = gdio(verbose=False)\nds.mload(['tests/data/era5_20191226-27_lev.grib', 'tests/data/era5_20191227_lev.nc'],  \n        merge_files=True, uniformize_grid=True, inplace=True)\n\n>>> ds.dataset[0].keys()\ndict_keys(['ref_time', 'time_units', 'time', 'longitude', 'latitude', 't', 'u', 'v', 'r'])\n\n>>> print(ds.dataset[0].u.isobaricInhPa.value.shape)\n(1, 6, 7, 241, 281)\n\n>>> ds.dataset[0].time\nmasked_array(data=[datetime.datetime(2019, 12, 26, 0, 0),\n                   datetime.datetime(2019, 12, 26, 12, 0),\n                   datetime.datetime(2019, 12, 27, 0, 0),\n                   datetime.datetime(2019, 12, 27, 12, 0),\n                   datetime.datetime(2019, 12, 27, 0, 0),\n                   datetime.datetime(2019, 12, 27, 12, 0)],\n             mask=False,\n       fill_value='?',\n            dtype=object)\n\n```\nLoading the 
Loading the data into the spatial subdomain between lat -30, lon 300 and lat 10, lon 320, selecting the time between timesteps 12 and 24, and renaming the variables t and u to 2t and 10u.

```
from gdio.core import gdio

ds = gdio(verbose=False)
ds.mload(['tests/data/era5_20191226-27_lev.grib', 'tests/data/era5_20191227_lev.nc'],
        merge_files=True, uniformize_grid=True,
        cut_domain=(-30, 300, 10, 320), cut_time=(12, 24),
        rename_vars={'t': '2t', 'u': '10u'}, inplace=True)

>>> ds.dataset[0].keys()
dict_keys(['ref_time', 'time_units', 'time', 'longitude', 'latitude', 'r', '2t', '10u', 'v'])

>>> print(ds.dataset[0]['10u'].isobaricInhPa.value.shape)
(1, 2, 7, 160, 80)

>>> ds.dataset[0].time
masked_array(data=[datetime.datetime(2019, 12, 26, 12, 0),
                   datetime.datetime(2019, 12, 27, 0, 0),
                   datetime.datetime(2019, 12, 27, 12, 0)],
             mask=False,
       fill_value='?',
            dtype=object)

```

The following parameters can be set to operate on the data during reading.

**uniformize_grid:     boolean**\
interpolate all gridded data to the grid resolution of the first file

**vars:                list**\
variable names

**merge_files:         boolean**\
merge the variable data of all files into a single data array per variable

**cut_time:            tuple**\
range of time to cut, e.g.: (0,10)/(0,None)/(None,10)

**cut_domain:          tuple**\
range of latitudes and longitudes to cut: (lat1, lon1, lat2, lon2),
e.g.: (-45,-90,20,-30)/(-45,None,20,-30)/(None,-90,None,-20)

**level_type:          list**\
type of level (hybrid, isobaricInhPa, surface)

**filter_by:           dictionary**\
dict of grib parameters as key:value pairs (list or single values),
e.g.: filter_by={'perturbationNumber': [0,10],'level': [1000,500,250]} or filter_by={'gridType': 'regular_ll'}.
Note: this parameter only works on grib files

**rename_vars:         dictionary**\
rename the original variable name (key) to a new name (value),
e.g.: {'tmpmdl': 't', 'tmpprs': 't'}

**sort_before:         bool**\
sort fields by validityDate, validityTime, paramId, typeOfLevel, perturbationNumber and level before processing.
Warning: high memory consumption; use only when the grib data structure is not standard


### Selecting a subsample in the mload dataset
Select data by coordinates (date, latitude, longitude, levels and members)

```
sub_set = ds.sel(dates=[datetime(2019,12,26,12,0)], latitude=[-23.54,-22], longitude=[-46.64,-42.2], level=[2,6])

>>> print(sub_set[0].get('u').isobaricInhPa.value.shape)
(1, 1, 4, 6, 18)
```

### Showing the data structure
Prints the data structure tree.
```
>>> ds.describe

    +-- ref_time: 2019-12-26 00:00:00
    +-- time_units: hours
    +-- time: <class 'numpy.ma.core.MaskedArray'> (6,)
    +-- r
        +-- isobaricInhPa
            +-- value: <class 'numpy.ndarray'> (1, 6, 7, 160, 80)
            +-- level: [200, 300, 500, 700, 800, 950, 1000]
            +-- members: [0]
        +-- centre: 'ecmwf'
        +-- dataType: 'an'
        +-- param_id: 157
        +-- long_name: Relative humidity
        +-- parameter_units: %
        +-- latitude: <class 'numpy.ndarray'> (160, 80)
        +-- longitude: <class 'numpy.ndarray'> (160, 80)
        +-- level_type: ['isobaricInhPa']
        +-- grid_type: 'regular_ll'
        +-- projparams: {'a': 6371229.0, 'b': 6371229.0, 'proj': 'regular_ll'}

    .
    .
    .

    +-- v
        +-- isobaricInhPa
            +-- value: <class 'numpy.ndarray'> (1, 6, 7, 160, 80)
            +-- level: [200, 300, 500, 700, 800, 950, 1000]
            +-- members: [0]
        +-- centre: 'ecmwf'
        +-- dataType: 'an'
        +-- param_id: 132
        +-- long_name: V component of wind
        +-- parameter_units: m s**-1
        +-- latitude: <class 'numpy.ndarray'> (160, 80)
        +-- longitude: <class 'numpy.ndarray'> (160, 80)
        +-- level_type: ['isobaricInhPa']
        +-- grid_type: 'regular_ll'
        +-- projparams: {'a': 6371229.0, 'b': 6371229.0, 'proj': 'regular_ll'}
```


Setting the ensemble grouping grib id key

```
ds.fields_ensemble = 'perturbationNumber'
ds.fields_ensemble_exception = [0]
```


#### Grib
The class grib encapsulates all grib functions, as well as the cutting of time and spatial domains, returning the grib data as a dictionary type.

Simple reading
```
from gdio.grib import grib
gr = grib(verbose=False)
ds = gr.gb_load('data/era5_20191226-27_lev.grib')

>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 't', 'u', 'v', 'r'])
>>> print(ds.u.isobaricInhPa.value.shape)
(1, 4, 7, 241, 281)
>>> print(ds.u.level_type)
['isobaricInhPa']
>>> print(ds.u.keys())
dict_keys(['centre', 'dataType', 'isobaricInhPa', 'param_id', 'long_name', 'parameter_units', 'latitude', 'longitude', 'level_type', 'grid_type', 'projparams'])
>>> print(ds.u.isobaricInhPa.level)
[200, 300, 500, 700, 800, 950, 1000]
>>> print(ds.u.parameter_units)
m s**-1
>>> print(ds.u.param_id)
131
```

Reading a subsample in time (time 12-24) and space (bbox -30,-60 and 10,-40)

```
ds = gr.gb_load('data/era5_20191226-27_lev.grib', cut_domain=(-30, -60, 10, -40), cut_time=(12, 24))
```

Setting the ensemble grouping grib id key

```
gr.fields_ensemble = 'perturbationNumber'
gr.fields_ensemble_exception = [0]
```

Filtering by a grib key: a dict of grib parameters as key:value pairs (list or single values),
e.g.: filter_by={'perturbationNumber': [0,10],'level': [1000,500,250]}
or filter_by={'gridType': 'regular_ll'}

```
ds = gr.gb_load('tests/data/era5_20191226-27_lev.grib',
                cut_domain=(-30, -60, 10, -40),
                cut_time=(12, 24),
                filter_by={'perturbationNumber': 0, 'level':[200,500,950]})
>>> print(ds.u.isobaricInhPa.level)
[200, 500, 950]
```
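The filter_by semantics (every key must match; a list value means "any of these") can be sketched in plain Python. The `matches` helper and the message dicts below are illustrative only, not gdio internals:

```python
def matches(message, filter_by):
    # Every key in filter_by must match the message value;
    # a list/tuple value means "any of these values".
    for key, wanted in filter_by.items():
        value = message.get(key)
        if isinstance(wanted, (list, tuple)):
            if value not in wanted:
                return False
        elif value != wanted:
            return False
    return True

# Hypothetical grib message headers
msgs = [
    {'perturbationNumber': 0, 'level': 500},
    {'perturbationNumber': 0, 'level': 850},
    {'perturbationNumber': 3, 'level': 500},
]
kept = [m for m in msgs if matches(m, {'perturbationNumber': 0, 'level': [200, 500, 950]})]
print(kept)   # [{'perturbationNumber': 0, 'level': 500}]
```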
Rename variables
A dictionary input renames the original variable names (key) to new names (value),
e.g. {'tmpmdl': 't', 'tmpprs': 't'}

```
ds = gr.gb_load('data/era5_20191226-27_lev.grib', rename_vars={'u':'10u'})
>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 't', '10u', 'v', 'r'])
```
Sorting the grib parameters before processing (extra memory consumption and possibly a little slower)
fixes unstructured or non-standard grib files.
```
ds = gr.gb_load('data/era5_20191226-27_lev.grib', sort_before=True)
```
#### Writing a grib file

From the loaded dataset
```
gr.gb_write('data/output.grib', ds)
```
From a loaded dataset, with packing options
```
from gdio.grib import grib
gr = grib(verbose=False)
ds = gr.gb_load('data/era5_20191226-27_lev.grib')

gr.gb_write('output.grib', ds, least_significant_digit=3, packingType='grid_jpeg')
```

#### Netcdf
The class netcdf encapsulates all netcdf reading and writing functions, as well as the cutting of time and spatial domains, returning the netcdf data as a dictionary type. The returned dictionary contains, for each variable, the value, param_id, type_level, level and parameter_units properties.

Simple reading
```
from gdio.netcdf import netcdf
nc = netcdf(verbose=False)

ds = nc.nc_load('tests/data/era5_20191227_lev.nc')
>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 'r', 't', 'u', 'v'])
>>> print(ds.u.isobaricInhPa.value.shape)
(1, 2, 7, 161, 241)
>>> print(ds.u.level_type)
['isobaricInhPa']
>>> print(ds.u.keys())
dict_keys(['isobaricInhPa', 'param_id', 'long_name', 'parameter_units', 'latitude', 'longitude', 'level_type'])
>>> print(ds.u.isobaricInhPa.level)
[200, 300, 500, 700, 800, 950, 1000]
>>> print(ds.u.parameter_units)
m s**-1
>>> print(ds.u.param_id)
None
```

Reading a subsample in time (time 12-24) and space (bbox -30,-60 and 10,-40).

```
ds = nc.nc_load('data/era5_20191227_lev.nc', cut_domain=(-30, -60, 10, -40), cut_time=(12, 24))
>>> print(ds.u.isobaricInhPa.value.shape)
(1, 1, 7, 80, 40)
```
Rename variables
A dictionary input renames the original variable names (key) to new names (value),
e.g. {'tmpmdl': 't', 'tmpprs': 't'}

```
ds = nc.nc_load('data/era5_20191227_lev.nc', rename_vars={'u':'10u'})
>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 't', '10u', 'v', 'r'])
```

#### Writing a netcdf file

From the loaded dataset
```
nc.nc_write('data/output.nc', ds)
```
From a dictionary
```
from datetime import datetime
import numpy as np
from gdio.netcdf import netcdf

nc = netcdf(verbose=False)

ds = {'ref_time': datetime(2019, 12, 27, 0, 0),
      'time_units': 'hours',
      'time': np.array([12]),
      'u': {'isobaricInhPa': {  'value': np.random.random((1, 1, 7, 80, 40)),
                                'level': [200, 300, 500, 700, 800, 950, 1000]
                              },
            'param_id': None,
            'long_name': 'U component of wind',
            'level_type': ['isobaricInhPa'],
            'parameter_units': 'm s**-1',
            'longitude': np.array([300. , 300.5, 301. , 301.5, 302. , 302.5, 303. , 303.5,
               304. , 304.5, 305. , 305.5, 306. , 306.5, 307. , 307.5,
               308. , 308.5, 309. , 309.5, 310. , 310.5, 311. , 311.5,
               312. , 312.5, 313. , 313.5, 314. , 314.5, 315. , 315.5,
               316. , 316.5, 317. , 317.5, 318. , 318.5, 319. , 319.5]),
            'latitude': np.array([-30. , -29.5, -29. , -28.5, -28. , -27.5, -27. , -26.5,
               -26. , -25.5, -25. , -24.5, -24. , -23.5, -23. , -22.5,
               -22. , -21.5, -21. , -20.5, -20. , -19.5, -19. , -18.5,
               -18. , -17.5, -17. , -16.5, -16. , -15.5, -15. , -14.5,
               -14. , -13.5, -13. , -12.5, -12. , -11.5, -11. , -10.5,
               -10. ,  -9.5,  -9. ,  -8.5,  -8. ,  -7.5,  -7. ,  -6.5,
                -6. ,  -5.5,  -5. ,  -4.5,  -4. ,  -3.5,  -3. ,  -2.5,
                -2. ,  -1.5,  -1. ,  -0.5,   0. ,   0.5,   1. ,   1.5,
                 2. ,   2.5,   3. ,   3.5,   4. ,   4.5,   5. ,   5.5,
                 6. ,   6.5,   7. ,   7.5,   8. ,   8.5,   9. ,   9.5]),
            }
      }

nc.nc_write('data/output.nc', ds)
```


#### HDF5
The class hdf encapsulates all hdf5 reading and writing functions, as well as the cutting of time and spatial domains, returning the hdf5 data as a dictionary type. The returned dictionary contains, for each variable, the value, param_id, type_level, level and parameter_units properties.

Simple reading
```
from gdio.hdf import hdf
hd = hdf(verbose=False)

ds = hd.hdf_load('tests/data/gpm_3imerg_20220101.hdf')
>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 'r', 't', 'u', 'v'])
>>> print(ds.u.isobaricInhPa.value.shape)
(1, 2, 7, 161, 241)
>>> print(ds.u.level_type)
['isobaricInhPa']
>>> print(ds.u.keys())
dict_keys(['isobaricInhPa', 'param_id', 'long_name', 'parameter_units', 'latitude', 'longitude', 'level_type'])
>>> print(ds.u.isobaricInhPa.level)
[200, 300, 500, 700, 800, 950, 1000]
>>> print(ds.u.parameter_units)
m s**-1
>>> print(ds.u.param_id)
None
```

Reading a subsample in time (time 0-1) and space (bbox -30,-60 and 10,-40).

```
ds = hd.hdf_load('tests/data/gpm_3imerg_20220101.hdf', cut_domain=(-30, -60, 10, -40), cut_time=(0, 1))
>>> print(ds.u.isobaricInhPa.value.shape)
(1, 1, 7, 80, 40)
```
Rename variables
A dictionary input renames the original variable names (key) to new names (value),
e.g. {'tmpmdl': 't', 'tmpprs': 't'}
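The rename_vars option is just an old-name → new-name mapping applied to the dataset keys; a sketch of the idea (the helper below is illustrative, not part of gdio's API):

```python
def apply_rename(dataset, rename_vars):
    # Keys present in the mapping get the new name; all others are kept as-is.
    return {rename_vars.get(key, key): value for key, value in dataset.items()}

ds = {'ref_time': None, 'time': [0], 'tmpprs': [273.15], 'r': [80.0]}
renamed = apply_rename(ds, {'tmpmdl': 't', 'tmpprs': 't'})
print(sorted(renamed))   # ['r', 'ref_time', 't', 'time']
```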
```
ds = hd.hdf_load('tests/data/gpm_3imerg_20220101.hdf', rename_vars={'precipitationCal':'prec_merge', 'IRprecipitation': 'prec_ir'})
>>> ds.keys()
dict_keys(['ref_time', 'time_units', 'time', 'prec_merge', 'prec_ir'])
```

#### Writing a HDF5 file

From the loaded dataset
```
hd.hdf_write('data/output.h5', ds)
```
From a dictionary
```
from datetime import datetime
import numpy as np
from gdio.hdf import hdf

hd = hdf(verbose=False)

ds = {'ref_time': datetime(2019, 12, 27, 0, 0),
      'time_units': 'hours',
      'time': np.array([12]),
      'u': {'isobaricInhPa': {  'value': np.random.random((1, 1, 7, 80, 40)),
                                'level': [200, 300, 500, 700, 800, 950, 1000]
                              },
            'param_id': None,
            'long_name': 'U component of wind',
            'level_type': ['isobaricInhPa'],
            'parameter_units': 'm s**-1',
            'longitude': np.array([300. , 300.5, 301. , 301.5, 302. , 302.5, 303. , 303.5,
               304. , 304.5, 305. , 305.5, 306. , 306.5, 307. , 307.5,
               308. , 308.5, 309. , 309.5, 310. , 310.5, 311. , 311.5,
               312. , 312.5, 313. , 313.5, 314. , 314.5, 315. , 315.5,
               316. , 316.5, 317. , 317.5, 318. , 318.5, 319. , 319.5]),
            'latitude': np.array([-30. , -29.5, -29. , -28.5, -28. , -27.5, -27. , -26.5,
               -26. , -25.5, -25. , -24.5, -24. , -23.5, -23. , -22.5,
               -22. , -21.5, -21. , -20.5, -20. , -19.5, -19. , -18.5,
               -18. , -17.5, -17. , -16.5, -16. , -15.5, -15. , -14.5,
               -14. , -13.5, -13. , -12.5, -12. , -11.5, -11. , -10.5,
               -10. ,  -9.5,  -9. ,  -8.5,  -8. ,  -7.5,  -7. ,  -6.5,
                -6. ,  -5.5,  -5. ,  -4.5,  -4. ,  -3.5,  -3. ,  -2.5,
                -2. ,  -1.5,  -1. ,  -0.5,   0. ,   0.5,   1. ,   1.5,
                 2. ,   2.5,   3. ,   3.5,   4. ,   4.5,   5. ,   5.5,
                 6. ,   6.5,   7. ,   7.5,   8. ,   8.5,   9. ,   9.5]),
            }
      }

hd.hdf_write('data/output.h5', ds)
```


## Routines
### gdio.mload
Load multiple files (netcdf/grib), interpolating all data to a common grid and returning the data as a list of dictionary type

```
mload(files, vars=None, merge_files=True, cut_time=None,
      cut_domain=None, level_type=None, filter_by={},
      rename_vars={}, uniformize_grid=True, sort_before=False,
      inplace=False)
```
**files:               list**

file names

**uniformize_grid:     boolean**\
interpolate all input grids to the grid specification of the first file

**vars:                list**\
variable names

**merge_files:         boolean**\
merge files

**cut_time:            tuple**\
range of time to cut, e.g.: (0,10)/(0,None)/(None,10)

**cut_domain:          tuple**\
range of latitudes and longitudes to cut: (lat1, lon1, lat2, lon2),
e.g.: (-45,-90,20,-30)/(-45,None,20,-30)/(None,-90,None,-20)

**level_type:          list**\
type of level (hybrid, isobaricInhPa, surface)

**filter_by:           dictionary**\
dict of grib parameters as key:value pairs (list or single values),
e.g.: filter_by={'perturbationNumber': [0,10],'level': [1000,500,250]} or filter_by={'gridType': 'regular_ll'}

**rename_vars:         dictionary**\
rename variable names (key) to a new name (value), e.g. {'tmpmdl': 't', 'tmpprs': 't'}

**sort_before:         bool**\
sort fields by validityDate, validityTime, paramId, typeOfLevel,
perturbationNumber and level before processing.
Warning: extra consumption of memory and time;
use only when the grib data structure is not standard (default False)

**return:**                    list of dictionaries
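The (start, stop) tuples accepted by cut_time, with None leaving an end open, behave roughly as sketched below (an illustrative, inclusive-bound interpretation, not gdio's exact indexing):

```python
def cut_time_filter(steps, cut_time=None):
    # (start, stop) with None meaning an open end; bounds treated as inclusive here.
    if cut_time is None:
        return list(steps)
    start, stop = cut_time
    return [t for t in steps
            if (start is None or t >= start) and (stop is None or t <= stop)]

steps = [0, 6, 12, 18, 24, 30]
print(cut_time_filter(steps, (12, 24)))    # [12, 18, 24]
print(cut_time_filter(steps, (None, 10)))  # [0, 6]
print(cut_time_filter(steps, (24, None)))  # [24, 30]
```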
### gdio.sel
Select data by coordinates (date, latitude, longitude, levels and members)

```
sel(data=None, latitude=None, longitude=None,
    dates=None, level=None, member=None, date_format="%Y-%m-%d %H:%M")
```

**data:       list of dictionaries**\
raw dataset

**latitude:     list of floats**\
range of latitudes to select: [lat1, lat2];
specific latitudes (1 or >2): [lat1, lat2, lat3, ...]

**longitude:    list of floats**\
range of longitudes to select: [lon1, lon2];
specific longitudes (1 or >2): [lon1, lon2, lon3, ...]

**dates:        list of datetime/string**\
range of dates to select: [date1, date2];
specific dates (1 or >2): [date1, date2, date3, ...]

**level:        list of int**\
range of levels to select: [level1, level2];
specific levels (1 or >2): [level1, level2, level3, ...]

**member:       list of int**\
range of members to select: [member1, member2];
specific members (1 or >2): [member1, member2, member3, ...]

**return:**     list of dictionaries

### gdio.grib.gb_load
Load grib file
```
def gb_load(ifile, vars=None, level_type=None,
            cut_time=None, cut_domain=None, filter_by={},
            rename_vars={}, sort_before=False)
```

**ifile:       string**\
grib 1 or 2 file name

**vars:        list**\
variables short name or parameter id number

**cut_time:    tuple**\
range of time to cut, e.g.: (0,10)/(0,None)/(None,10)

**cut_domain:  tuple**\
range of latitudes and longitudes to cut: (lat1, lon1, lat2, lon2),
e.g.: (-45,290,20,330)/(-45,None,20,330)/(None,290,None,320)

**level_type:  list**\
type of level (hybrid, isobaricInhPa, surface)

**filter_by:   dictionary**\
dict of grib parameters as key:value pairs (list or single values),
e.g.: filter_by={"perturbationNumber": [0,10],"level": [1000,500,250]}
or filter_by={"gridType": "regular_ll"}

**rename_vars: dictionary**\
rename variable names (key) to a new name (value),
e.g. {"tmpmdl": "t", "tmpprs": "t"}

**sort_before: bool**\
sort fields by validityDate, validityTime, paramId, typeOfLevel, perturbationNumber and level before processing.
Warning: high memory consumption; use only when the grib data structure is not standard

**return: dictionary/attributes**\
multiple time data container

### gdio.grib.gb_write
Write grib2 file
```
def gb_write(ofile, data, packingType='grid_simple', least_significant_digit=3, **kwargs)
```
**ofile: string**\
file path

**data: dict**\
dataset

**packingType: string**\
type of packing:
- grid_simple
- spectral_simple
- grid_simple_matrix
- grid_jpeg
- grid_png
- grid_ieee
- grid_simple_log_preprocessing
- grid_second_order

**least_significant_digit: int (default 3)**\
specify the power of ten of the smallest decimal place in the data that is a
reliable value; this dramatically improves compression by quantizing
(or truncating) the data
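The quantization behind least_significant_digit can be sketched in plain Python (the same rounding trick documented by the netCDF4 library; the helper below is illustrative, not gdio's code):

```python
def quantize(values, least_significant_digit):
    # Round each value to the given decimal place; the truncated values
    # pack into far more repetitive bytes, so the packing/zlib step
    # compresses them much better.
    scale = 10.0 ** least_significant_digit
    return [round(v * scale) / scale for v in values]

field = [17.123456, 23.987654, 291.04999]
print(quantize(field, 3))   # [17.123, 23.988, 291.05]
```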
### gdio.netcdf.nc_load
Load netcdf files
```
nc_load(ifile, vars=None, cut_time=None, cut_domain=None, level_type=None, rename_vars={})
```

**ifile:       string**\
netcdf file name

**vars:        list**\
variables short name

**cut_time:    tuple**\
range of time (absolute) to cut, e.g.: (0,10)/(0,None)/(None,10)

**cut_domain:  tuple**\
range of latitudes and longitudes to cut: (lat1, lon1, lat2, lon2),
e.g.: (-45,290,20,330)/(-45,None,20,330)/(None,290,None,320)

**level_type:  list**\
type of level (hybrid, isobaricInhPa, surface)

**rename_vars: dictionary**\
rename variable names (key) to a new name (value),
e.g. {"tmpmdl": "t", "tmpprs": "t"}

**return: dictionary/attributes**\
multiple time data container

### gdio.netcdf.nc_write
Write netcdf file
```
nc_write(ifile, data, zlib=True, netcdf_format='NETCDF4')
```

**ifile:           string**\
file path

**data:            dict**\
dataset

**zlib:            bool**\
enable compression

**netcdf_format:   string**\
netcdf format: NETCDF4, NETCDF4_CLASSIC, NETCDF3_CLASSIC or NETCDF3_64BIT

**complevel:      int**\
compression level (default 4)

**least_significant_digit: int**\
specify the power of ten of the smallest decimal place in the data that is a
reliable value; this dramatically improves zlib compression by quantizing
(or truncating) the data (default None)


### gdio.hdf.hdf_load
Load HDF5 files
```
hdf_load(ifile, vars=None, cut_time=None, cut_domain=None, level_type=None, rename_vars={})
```

**ifile:       string**\
hdf5 file name

**vars:        list**\
variables short name

**cut_time:    tuple**\
range of time (absolute) to cut, e.g.: (0,10)/(0,None)/(None,10)

**cut_domain:  tuple**\
range of latitudes and longitudes to cut: (lat1, lon1, lat2, lon2),
e.g.: (-45,290,20,330)/(-45,None,20,330)/(None,290,None,320)

**level_type:  list**\
type of level (hybrid, isobaricInhPa, surface)

**rename_vars: dictionary**\
rename variable names (key) to a new name (value),
e.g. {"tmpmdl": "t", "tmpprs": "t"}

**return: dictionary/attributes**\
multiple time data container


### gdio.hdf.hdf_write
Write HDF5 file
```
hdf_write(ifile, data, compress_type='gzip', netcdf_format='NETCDF4')
```

**ifile:           string**\
file path

**data:            dict**\
dataset

**compress_type:   string**\
type of compression: zlib, gzip, lzf (default gzip)

**complevel:       int**\
compression level (default 9)

**least_significant_digit: int**\
specify the power of ten of the smallest decimal place in the data that is a
reliable value; this dramatically improves compression by quantizing
(or truncating) the data (default None)


### gdio.remapbil
```
remapbil(data, lon, lat, lon_new, lat_new, order=1, masked=False)
```

Interpolate data to a new domain resolution

**data: array**\
3D data (time,lon,lat)

**lon: array**

**lat: array**

**lon_new: array**\
new grid longitudes

**lat_new: array**\
new grid latitudes

**order:   int**\
0 - nearest-neighbour, 1 - bilinear, 2 - cubic spline

**masked: boolean**\
if True, points outside the range of xin and yin
are masked (in a masked array); if masked is set to a number,
points outside the range are set to that number

**return: 3D array**


## Dev utils
Docker compose to support development

### Commands
 - make build
   - Build the container
 - make up
   - Start container
 - make stop
   - Stop container
 - make test
   - Run unit tests in container
 - make bash
   - Access container
 - make ipython
   - Run ipython in container
 - make fix
   - Run autopep8 to fix code format

## Release History


## Meta
Rodrigo Yamamoto codes@rodrigoyamamoto.com

https://github.com/rodri90y/gdio

## Contributing

* 0.3.3
    * alpha release


## License

MIT
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Gridded data io library",
    "version": "0.3.3",
    "project_urls": {
        "Download": "https://github.com/rodri90y/gdio/archive/v0.3.3.tar.gz",
        "Homepage": "https://github.com/rodri90y/gdio"
    },
    "split_keywords": [
        "gdio",
        "grib",
        "netcdf",
        "hdf5"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "328ad603adf9d76917d9a8ac5a5ce7622af5aa3d37a0797e6079ea1df245cd7f",
                "md5": "468ddc0421d7870e6294a2da8e76058e",
                "sha256": "fc38cf64b56be84878b16e64b0096a077a503631e03e3f84c6c3e749a88449a1"
            },
            "downloads": -1,
            "filename": "gdio-0.3.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "468ddc0421d7870e6294a2da8e76058e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 46900,
            "upload_time": "2023-07-14T17:53:12",
            "upload_time_iso_8601": "2023-07-14T17:53:12.523598Z",
            "url": "https://files.pythonhosted.org/packages/32/8a/d603adf9d76917d9a8ac5a5ce7622af5aa3d37a0797e6079ea1df245cd7f/gdio-0.3.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-14 17:53:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "rodri90y",
    "github_project": "gdio",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "gdio"
}
        
Elapsed time: 0.08791s