hydro-opendata


| Field | Value |
| --- | --- |
| Name | hydro-opendata |
| Version | 0.0.8 |
| Home page | <https://github.com/zjf014/hydro-opendata> |
| Summary | An open-data solution for WIS |
| Upload time | 2023-12-29 08:20:09 |
| Author | Jeff Zhu |
| Requires Python | >=3.7 |
| License | MIT license |
| Keywords | hydro_opendata |
| Requirements | No requirements were recorded. |

            <!--
 * @Author: Jianfeng Zhu
 * @Date: 2023-10-13 19:48:15
 * @LastEditTime: 2023-10-13 21:27:31
 * @LastEditors: Wenyu Ouyang
 * @Description: English version
 * @FilePath: /hydro_opendata/README.md
 * Copyright (c) 2023-2024 Jianfeng Zhu. All rights reserved.
-->
# hydro-opendata

[![image](https://img.shields.io/pypi/v/hydro-opendata.svg)](https://pypi.python.org/pypi/hydro-opendata)


📜 [Chinese documentation (中文文档)](README.zh.md)

**Methods and paths for obtaining, managing, and utilizing open data for hydrological scientific computations.**

- Free software: MIT license
- Documentation: <https://hydro-opendata.readthedocs.io/en/latest/>
## Background

In the era of artificial intelligence, data-driven hydrological models have been extensively researched and applied. Advances in remote sensing and the trend toward open data sharing have made data easier to access and widened the range of available sources. Researchers still face practical questions: what data is required, what can be accessed, where to download it, and how to read and process it. This repository aims to answer these questions.

This repository focuses on external open data: it categorizes data types and compiles a list of sources. It aims to build a data flow, and the tech stack behind it, that covers the full "download-store-process-read-write-visualize" chain.

## Overall Solution

![Data Framework](images/framework.png)

## Main Data Sources

To our current understanding, external data suitable for hydrological modeling includes, but is not limited to:

| **Primary Category** | **Secondary Category** | **Update Frequency** | **Data Structure** | **Example** |
| --- | --- | --- | --- | --- |
| Basic Geography | Hydrological Elements | Static | Vector | Watershed boundary, site |
|  | Terrain | Static | Raster | [DEM](https://github.com/DahnJ/Awesome-DEM), flow direction, land use |
| Weather & Meteorology | Reanalysis | Dynamic | Raster | ERA5 |
|  | Near Real-Time | Dynamic | Raster | GPM |
|  | Forecast | Rolling | Raster | GFS |
| Imagery | Satellite Remote Sensing | Dynamic | Raster | Landsat, Sentinel, MODIS |
|  | Street View Images | Static | Multimedia |  |
|  | Surveillance Videos | Dynamic | Multimedia |  |
|  | Drone Footage | Dynamic | Multimedia |  |
| Crowdsourced Data | POI | Static | Vector | Baidu Map |
|  | Social Networks | Dynamic | Multimedia | Weibo |
| Hydrological Data | River Flow Data | Dynamic | Tabular | GRDC |

By update frequency, data can be categorized as static or dynamic; forecast products such as GFS are updated on a rolling basis.

By structure, data can be classified as vector, raster, tabular, or multimedia (unstructured data).

## Structure and Functional Framework

![Code Repository](images/repos.jpg)

### wis-stac

Maintains the data inventory and its metadata. Returns a list of available data for a given area of interest (AOI).
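
As a rough illustration of an AOI-based lookup, the sketch below filters a small in-memory inventory by bounding-box intersection. It is not the wis-stac implementation: `DatasetEntry`, `list_datasets`, and the inventory entries with their extents are hypothetical names and values chosen for the example, and the intersection test ignores boxes that cross the antimeridian.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (min_lon, min_lat, max_lon, max_lat) in degrees
BBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class DatasetEntry:
    """One hypothetical inventory record: a dataset name, category, and extent."""
    name: str
    category: str
    bbox: BBox


def bbox_intersects(a: BBox, b: BBox) -> bool:
    """Return True when two lon/lat bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def list_datasets(inventory: List[DatasetEntry], aoi: BBox) -> List[str]:
    """Return the names of datasets whose extent intersects the AOI."""
    return [entry.name for entry in inventory if bbox_intersects(entry.bbox, aoi)]


# Illustrative entries only; the extents are made up for this example.
INVENTORY = [
    DatasetEntry("global_reanalysis", "Reanalysis", (-180.0, -90.0, 180.0, 90.0)),
    DatasetEntry("regional_dem", "Terrain", (100.0, 20.0, 125.0, 45.0)),
    DatasetEntry("southern_gauges", "River Flow Data", (-60.0, -40.0, -30.0, -10.0)),
]

available = list_datasets(INVENTORY, aoi=(120.0, 38.0, 123.0, 41.0))
print(available)
```

A real catalog would more likely test against the actual AOI geometry than its bounding box; the box test is only the cheapest first filter.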

### wis-downloader

Downloads data from external sources. The download method depends on the data source, for example:

- Integration with official APIs, e.g., [bmi_era5](https://github.com/gantian127/bmi_era5)
- Retrieving data download links, e.g., [Herbie](https://github.com/blaylockbk/Herbie), [MultiEarth](https://github.com/bair-climate-initiative/multiearth), [Satpy](https://github.com/pytroll/satpy). Most cloud data platforms, such as Microsoft and AWS, organize their data as [STAC](https://github.com/radiantearth/stac-spec) catalogs.
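
For STAC-based platforms, retrieving download links usually starts with an item search. The sketch below only builds the JSON body for a STAC API `POST /search` request, using the standard `collections`, `bbox`, `datetime`, and `limit` fields; it does not send the request. The function name `build_stac_search` is hypothetical, and the collection id `sentinel-2-l2a` is an assumption about the target catalog rather than something this repository specifies.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List, Sequence


def build_stac_search(
    collections: List[str],
    bbox: Sequence[float],
    start: datetime,
    end: datetime,
    limit: int = 100,
) -> Dict[str, object]:
    """Build the JSON body for a STAC API item search (POST /search)."""
    if len(bbox) != 4:
        raise ValueError("bbox must be [min_lon, min_lat, max_lon, max_lat]")

    def rfc3339(moment: datetime) -> str:
        # STAC expects RFC 3339 timestamps; normalise to UTC with a "Z" suffix.
        return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    return {
        "collections": list(collections),
        "bbox": [float(v) for v in bbox],
        # A closed interval is written as "start/end".
        "datetime": f"{rfc3339(start)}/{rfc3339(end)}",
        "limit": limit,
    }


body = build_stac_search(
    collections=["sentinel-2-l2a"],  # assumed collection id, catalog-dependent
    bbox=[120.0, 38.0, 123.0, 41.0],
    start=datetime(2023, 6, 1, tzinfo=timezone.utc),
    end=datetime(2023, 6, 30, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```

The items returned by such a search carry an `assets` mapping whose `href` values are the download links.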

### wis-processor

Preprocesses the data, for example by watershed averaging and feature extraction.

Uses [kerchunk](https://fsspec.github.io/kerchunk/) to convert data in various formats to [zarr](https://zarr.readthedocs.io/en/stable/) format and stores it on a [MinIO](http://minio.waterism.com:9090/) server. This enables reading across files and improves read efficiency.
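
The watershed averaging mentioned above can be sketched as a weighted mean over grid cells. This is a minimal pure-Python illustration, not the wis-processor code: `watershed_mean` is a hypothetical name, the weights are assumed to be the fraction of each cell lying inside the basin, and missing cells are represented as `None`.

```python
from typing import Optional, Sequence


def watershed_mean(
    values: Sequence[Sequence[Optional[float]]],
    weights: Sequence[Sequence[float]],
) -> float:
    """Weighted mean of grid cells; weights are the cell fractions inside the basin."""
    total = 0.0
    weight_sum = 0.0
    for value_row, weight_row in zip(values, weights):
        for value, weight in zip(value_row, weight_row):
            if value is None or weight <= 0.0:
                continue  # skip missing cells and cells outside the basin
            total += value * weight
            weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("no valid cells overlap the watershed")
    return total / weight_sum


# A 2 x 3 precipitation grid (mm) with one missing cell.
precip = [
    [2.0, 4.0, None],
    [6.0, 8.0, 10.0],
]
# Fraction of each cell that falls inside the watershed.
mask = [
    [0.0, 1.0, 1.0],
    [0.5, 1.0, 0.0],
]

basin_precip = watershed_mean(precip, mask)
print(basin_precip)
```

On a latitude/longitude grid the weights would normally also account for cell area, which shrinks toward the poles; that refinement is left out here.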

### wis-s3api

Once data has been processed and stored in MinIO, it can be read across files: provide the data type, time range, and spatial range to fetch the data.

Remote sensing imagery is too voluminous to download and read file by file. Instead, [stac+stackstac](./data_api/examples/RSImages.ipynb) can read Sentinel or Landsat data directly into an xarray dataset.

### wis-gistools

Integrates commonly used GIS tools, such as Kriging interpolation and Thiessen polygons:

- Kriging interpolation: [PyKrige](https://github.com/GeoStat-Framework/PyKrige)
- Thiessen polygon: [WhiteboxTools.VoronoiDiagram](https://whiteboxgeo.com/manual/wbt_book/available_tools/gis_analysis.html?highlight=voro#voronoidiagram), [scipy.spatial.Voronoi](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.Voronoi.html)
- Watershed delineation: [Rapid Watershed Delineation using an Automatic Outlet Relocation Algorithm](https://github.com/xiejx5/watershed_delineation), [High-performance watershed delineation algorithm for GPU using CUDA and OpenMP](https://github.com/bkotyra/watershed_delineation_gpu)
- Watershed averaging: [regionmask](https://github.com/regionmask/regionmask) (plotting and creation of masks of spatial regions)
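
To show what the Thiessen polygon tool is used for, the sketch below approximates Thiessen weights on a grid: each cell is assigned to its nearest station, and a station's weight is its share of the cells. This is a discrete stand-in for true Voronoi polygons, not what WhiteboxTools or `scipy.spatial.Voronoi` compute internally. The function names, station coordinates, and rainfall values are hypothetical, and distances are planar, so projected coordinates are assumed.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def thiessen_weights(
    stations: Dict[str, Point],
    cell_centers: List[Point],
) -> Dict[str, float]:
    """Approximate Thiessen weights: the share of cells nearest to each station."""
    if not stations or not cell_centers:
        raise ValueError("stations and cell_centers must be non-empty")
    counts = {name: 0 for name in stations}
    for cx, cy in cell_centers:
        # Assign the cell to the station with the smallest planar distance.
        nearest = min(
            stations,
            key=lambda name: math.hypot(stations[name][0] - cx, stations[name][1] - cy),
        )
        counts[nearest] += 1
    return {name: count / len(cell_centers) for name, count in counts.items()}


def areal_rainfall(weights: Dict[str, float], gauge_values: Dict[str, float]) -> float:
    """Combine gauge readings into a basin-average value using the weights."""
    return sum(weights[name] * gauge_values[name] for name in weights)


# Cell centres of a 4 x 4 grid covering a square "basin".
cells = [(x + 0.5, y + 0.5) for x in range(4) for y in range(4)]
stations = {"A": (1.0, 2.0), "B": (3.5, 2.0)}

weights = thiessen_weights(stations, cells)
rain = areal_rainfall(weights, {"A": 10.0, "B": 20.0})
print(weights, rain)
```

A finer grid brings the cell-count weights closer to the exact polygon areas, at the cost of more distance evaluations.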

## Visualization

Uses [leafmap](https://github.com/giswqs/leafmap) to display geospatial data in Jupyter.

## Others

- [hydro-GIS resource directory](./resources/README.md)

            
