Name | pangeo-forge-esgf |
Version | 0.3.1 |
home_page | None |
Summary | Using queries to the ESGF API to generate URLs and keyword arguments for recipe generation in pangeo-forge |
upload_time | 2024-05-31 16:23:18 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.9 |
license | Apache-2.0 |
keywords | pangeo, data, esgf |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# pangeo-forge-esgf
Using queries to the ESGF API to generate URLs and keyword arguments for recipe generation in pangeo-forge
## Install
You can install pangeo-forge-esgf via pip:
```
pip install pangeo-forge-esgf
```
To install all the dependencies required for testing and development, run:
```
pip install "pangeo-forge-esgf[dev]"
```
## Parsing a list of instance ids using wildcards
Pangeo Forge recipes require the user to provide exact instance_ids for the datasets they want processed. Discovering these with the [web search](https://esgf-node.llnl.gov/search/cmip6/) can become cumbersome, especially when dealing with a large number of members, models, etc.
`pangeo-forge-esgf` provides some functions to query the ESGF API based on instance_id values with wildcards.
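For orientation, a CMIP6 instance_id is a dot-separated string of ten facets, and a wildcard pattern replaces some of those facets with `*`. The sketch below splits an instance_id into named facets. The facet names follow the CMIP6 data reference syntax as I understand it, and `split_instance_id` is a hypothetical helper for illustration, not part of the `pangeo-forge-esgf` API.

```python
# Facet names assumed from the CMIP6 data reference syntax (not taken from this package)
CMIP6_FACETS = (
    "mip_era",
    "activity_id",
    "institution_id",
    "source_id",
    "experiment_id",
    "member_id",
    "table_id",
    "variable_id",
    "grid_label",
    "version",
)


def split_instance_id(iid: str) -> dict[str, str]:
    """Hypothetical helper: map each facet name to its value in `iid`."""
    parts = iid.split(".")
    if len(parts) != len(CMIP6_FACETS):
        raise ValueError(
            f"expected {len(CMIP6_FACETS)} dot-separated facets, got {len(parts)}"
        )
    return dict(zip(CMIP6_FACETS, parts))


facets = split_instance_id(
    "CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.uo.gn.v20191002"
)
print(facets["experiment_id"], facets["variable_id"])
```

The same split works on a wildcard pattern, which makes it easy to see which facets are pinned and which are left open.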
For example, if you want to find all the zonal (`uo`) and meridional (`vo`) velocities available for the `lgm` experiment of PMIP, you can do:
```python
from pangeo_forge_esgf.parsing import parse_instance_ids

parse_iids = [
    "CMIP6.PMIP.*.*.lgm.*.*.[uo, vo].*.*",
]
# Comma-separated values in square brackets are expanded, so the above is equivalent to:
# parse_iids = [
#     "CMIP6.PMIP.*.*.lgm.*.*.uo.*.*",
#     "CMIP6.PMIP.*.*.lgm.*.*.vo.*.*",
# ]
iids = []
for piid in parse_iids:
    iids.extend(parse_instance_ids(piid))
iids
```
and you will get:
```
['CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.uo.gn.v20191002',
'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Odec.uo.gn.v20200212',
'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Omon.uo.gn.v20200212',
'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.uo.gr1.v20200911',
'CMIP6.PMIP.MPI-M.MPI-ESM1-2-LR.lgm.r1i1p1f1.Omon.uo.gn.v20200909',
'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Omon.vo.gn.v20200212',
'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.vo.gn.v20191002',
'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Odec.vo.gn.v20200212',
'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.vo.gr1.v20200911',
'CMIP6.PMIP.MPI-M.MPI-ESM1-2-LR.lgm.r1i1p1f1.Omon.vo.gn.v20190710']
```
Eventually I hope to leverage this functionality to handle user requests in PRs that add wildcard instance_ids, but for now it can help you manually construct lists of instance_ids to submit to a pangeo-forge feedstock.
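To make the square-bracket expansion from the example above concrete, here is a minimal offline sketch of how a pattern like `[uo, vo]` could be expanded into one pattern per value. `expand_brackets` is a hypothetical helper written for illustration; it is not the package's implementation, which may handle edge cases differently.

```python
import itertools
import re

# Matches one bracket group and captures its comma-separated content
_BRACKET = re.compile(r"\[([^\]]*)\]")


def expand_brackets(pattern: str) -> list[str]:
    """Hypothetical helper: expand every [a, b, ...] group into separate patterns."""
    # re.split with one capture group alternates literal text (even indices)
    # and bracket contents (odd indices)
    pieces = _BRACKET.split(pattern)
    choices = [
        [piece] if i % 2 == 0 else [value.strip() for value in piece.split(",")]
        for i, piece in enumerate(pieces)
    ]
    # The Cartesian product covers patterns with more than one bracket group
    return ["".join(combo) for combo in itertools.product(*choices)]


for expanded in expand_brackets("CMIP6.PMIP.*.*.lgm.*.*.[uo, vo].*.*"):
    print(expanded)
```

A pattern without brackets comes back unchanged as a single-element list, so the helper can be applied to every entry of a list like `parse_iids`.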
## Generating PGF recipe input (urls) from instance_ids
```python
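from pangeo_forge_esgf import get_urls_from_esgf

# Top-level `await` works in Jupyter/IPython; in a plain script, run the coroutine with asyncio.run()
iids = ['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
url_dict = await get_urls_from_esgf(iids)
# The result is keyed by instance_id; each value is the list of file URLs for that dataset
url_dict['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']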
```
This gives:
```
100%|██████████| 5/5 [00:01<00:00, 4.98it/s]
Processing responses
Processing responses: Expected files per iid
Processing responses: Check for missing iids
Processing responses: Flatten results
Processing responses: Group results
Find responsive urls
100%|██████████| 1/1 [00:00<00:00, 3.25it/s]
['https://esgf-data1.llnl.gov/thredds/fileServer/css03_data/CMIP6/CMIP/CSIRO-ARCCSS/ACCESS-CM2/historical/r1i1p1f1/SImon/sifb/gn/v20200817/sifb_SImon_ACCESS-CM2_historical_r1i1p1f1_gn_185001-201412.nc']
```
Or, if you want to see detailed debugging statements:
```python
from pangeo_forge_esgf import get_urls_from_esgf, setup_logging
setup_logging('DEBUG')
iids = ['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
url_dict = await get_urls_from_esgf(iids)
url_dict['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
```
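The snippets above use a bare `await`, which only works where an event loop is already running (for example Jupyter or IPython). In a plain Python script, the coroutine needs to be driven by `asyncio.run`. The sketch below shows that pattern with a stand-in coroutine, `_stub_fetch`, so that it runs offline; both `_stub_fetch` and `main` are hypothetical names, and the `example.org` URLs are placeholders, not real ESGF output.

```python
import asyncio


async def _stub_fetch(iids):
    """Hypothetical offline stand-in for get_urls_from_esgf (placeholder URLs)."""
    return {iid: [f"https://example.org/{iid}.nc"] for iid in iids}


async def main(iids, fetch=_stub_fetch):
    # With the package installed, pass fetch=get_urls_from_esgf instead of the stub
    url_dict = await fetch(iids)
    return url_dict


iids = ["CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817"]
url_dict = asyncio.run(main(iids))
for iid, urls in url_dict.items():
    print(iid, len(urls))
```

Keeping the fetch function as a parameter also makes it easy to test downstream code without querying ESGF.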
Raw data
{
"_id": null,
"home_page": null,
"name": "pangeo-forge-esgf",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "pangeo, data, esgf",
"author": null,
"author_email": "Julius Busecke <julius@ldeo.columbia.edu>",
"download_url": "https://files.pythonhosted.org/packages/86/30/8ecce769206c8783a8ec665a0776617d7663ed158507d67275665954c2e4/pangeo_forge_esgf-0.3.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Using queries to the ESGF API to generate urls and keyword arguments for receipe generation in pangeo-forge",
"version": "0.3.1",
"project_urls": {
"Homepage": "https://github.com/jbusecke/pangeo-forge-esgf",
"Tracker": "https://github.com/jbusecke/pangeo-forge-esgf/issues"
},
"split_keywords": [
"pangeo",
" data",
" esgf"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "fbd461ba41fe0b7b12de3e3ae34753db39e2773e2275beba0f2a4b4726ab79b0",
"md5": "4ce5378be7080df931eebb6f5f6649e0",
"sha256": "fd84c5c4857ce376a8d4238c9c3b1180f9f4cc52caf1cf250f4fc512c2aa91f5"
},
"downloads": -1,
"filename": "pangeo_forge_esgf-0.3.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4ce5378be7080df931eebb6f5f6649e0",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 18723,
"upload_time": "2024-05-31T16:23:16",
"upload_time_iso_8601": "2024-05-31T16:23:16.859181Z",
"url": "https://files.pythonhosted.org/packages/fb/d4/61ba41fe0b7b12de3e3ae34753db39e2773e2275beba0f2a4b4726ab79b0/pangeo_forge_esgf-0.3.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "86308ecce769206c8783a8ec665a0776617d7663ed158507d67275665954c2e4",
"md5": "3b66f87011105d67ff800156b661724b",
"sha256": "2ccd659b29e43071ee4ca5075e2cb7bf7af449fe77ceb4ce3cd28dd410ca522e"
},
"downloads": -1,
"filename": "pangeo_forge_esgf-0.3.1.tar.gz",
"has_sig": false,
"md5_digest": "3b66f87011105d67ff800156b661724b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 21449,
"upload_time": "2024-05-31T16:23:18",
"upload_time_iso_8601": "2024-05-31T16:23:18.629175Z",
"url": "https://files.pythonhosted.org/packages/86/30/8ecce769206c8783a8ec665a0776617d7663ed158507d67275665954c2e4/pangeo_forge_esgf-0.3.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-31 16:23:18",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "jbusecke",
"github_project": "pangeo-forge-esgf",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "pangeo-forge-esgf"
}