intake-esm


| Field           | Value                                                                                                                               |
| :-------------- | :---------------------------------------------------------------------------------------------------------------------------------- |
| Name            | intake-esm                                                                                                                          |
| Version         | 2024.2.6                                                                                                                            |
| Home page       | https://intake-esm.readthedocs.io                                                                                                   |
| Summary         | An intake plugin for parsing an Earth System Model (ESM) catalog and loading netCDF files and/or Zarr stores into Xarray datasets. |
| Upload time     | 2024-02-06 07:27:59                                                                                                                 |
| Maintainer      | NCAR XDev Team                                                                                                                      |
| Requires Python | >=3.10                                                                                                                              |
| License         | Apache 2.0                                                                                                                          |
| Keywords        | intake, xarray, catalog                                                                                                             |

# Intake-esm

- [Intake-esm](#intake-esm)
  - [Badges](#badges)
  - [Motivation](#motivation)
  - [Overview](#overview)
  - [Installation](#installation)

## Badges

| CI           | [![GitHub Workflow Status][github-ci-badge]][github-ci-link] [![Code Coverage Status][codecov-badge]][codecov-link] [![pre-commit.ci status][pre-commit.ci-badge]][pre-commit.ci-link] |
| :----------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
| **Docs**     |                                                                     [![Documentation Status][rtd-badge]][rtd-link]                                                                     |
| **Package**  |                                                          [![Conda][conda-badge]][conda-link] [![PyPI][pypi-badge]][pypi-link]                                                          |
| **License**  |                                                                         [![License][license-badge]][repo-link]                                                                         |
| **Citation** |                                                                         [![Zenodo][zenodo-badge]][zenodo-link]                                                                         |

## Motivation

Computer simulations of the Earth’s climate and weather generate huge amounts of data.
These data are often persisted on HPC systems or in the cloud across multiple data
assets in a variety of formats ([netCDF](https://www.unidata.ucar.edu/software/netcdf/), [zarr](https://zarr.readthedocs.io/en/stable/), etc.). Finding, investigating, and
loading these data assets into compute-ready data containers costs time and effort.
The data user needs to know which data sets are available, and which attributes describe
each data set, before loading a specific data set and analyzing it.

Finding, investigating, and loading these assets into data array containers
such as xarray datasets can be a daunting task due to the large number of files
a user may be interested in. Intake-esm aims to address these issues by
providing the necessary functionality for searching, discovery, and data access/loading.

## Overview

`intake-esm` is a data cataloging utility built on top of [intake](https://github.com/intake/intake), [pandas](https://pandas.pydata.org/), and [xarray](https://xarray.pydata.org/en/stable/), and it's pretty awesome!

- Opening an ESM catalog definition file: An Earth System Model (ESM) catalog file is a JSON file that conforms
  to the [ESM Collection Specification](./docs/source/reference/esm-catalog-spec.md). When provided a link/path to an ESM catalog file, `intake-esm` establishes
  a link to a database (CSV file) that contains data asset locations and associated metadata
  (e.g., which experiment and model they come from). The catalog JSON file can be stored on a local filesystem
  or hosted on a remote server.

  ```python

  In [1]: import intake

  In [2]: import intake_esm

  In [3]: cat_url = intake_esm.tutorial.get_url("google_cmip6")

  In [4]: cat = intake.open_esm_datastore(cat_url)

  In [5]: cat
  Out[5]: <GOOGLE-CMIP6 catalog with 4 dataset(s) from 261 asset(s)>
  ```

- Search and Discovery: `intake-esm` provides functionality to execute queries against the catalog:

  ```python
  In [5]: cat_subset = cat.search(
     ...:     experiment_id=["historical", "ssp585"],
     ...:     table_id="Oyr",
     ...:     variable_id="o2",
     ...:     grid_label="gn",
     ...: )

  In [6]: cat_subset
  Out[6]: <GOOGLE-CMIP6 catalog with 4 dataset(s) from 261 asset(s)>
  ```

- Access: when the user is satisfied with the results of their query, they can load data assets (netCDF and/or Zarr stores) into xarray datasets:

  ```python

    In [7]: dset_dict = cat_subset.to_dataset_dict()

    --> The keys in the returned dictionary of datasets are constructed as follows:
            'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'
    |███████████████████████████████████████████████████████████████| 100.00% [2/2 00:18<00:00]
  ```
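
The three steps above (a JSON definition pointing at a CSV of assets, a column-wise query, and dataset keys joined from attribute columns) can be sketched with the standard library alone. This is a conceptual illustration, not how `intake-esm` is implemented: the rows, paths, and names such as `ExampleCenter` are made up, the helper functions are hypothetical, and the JSON field names follow one reading of the ESM Collection Specification and should be checked against it.

```python
import csv
import json
import tempfile
from collections import defaultdict
from pathlib import Path

# Made-up asset rows for illustration only; none of these paths exist.
COMMON = {"institution_id": "ExampleCenter", "source_id": "ExampleModel",
          "variable_id": "o2", "grid_label": "gn"}
ROWS = [
    {**COMMON, "activity_id": "CMIP", "experiment_id": "historical",
     "table_id": "Oyr", "path": "example/historical_Oyr_o2.zarr"},
    {**COMMON, "activity_id": "ScenarioMIP", "experiment_id": "ssp585",
     "table_id": "Oyr", "path": "example/ssp585_Oyr_o2.zarr"},
    {**COMMON, "activity_id": "CMIP", "experiment_id": "historical",
     "table_id": "Omon", "path": "example/historical_Omon_o2.zarr"},
    {**COMMON, "activity_id": "CMIP", "experiment_id": "piControl",
     "table_id": "Oyr", "path": "example/piControl_Oyr_o2.zarr"},
]

# Columns joined with "." to form a dataset key, mirroring the key template
# printed by to_dataset_dict() in the example above.
KEY_COLUMNS = ["activity_id", "institution_id", "source_id",
               "experiment_id", "table_id", "grid_label"]


def write_catalog(directory: Path) -> Path:
    """Write a toy CSV of assets plus a JSON definition that points at it."""
    csv_path = directory / "example.csv"
    with csv_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(ROWS[0]))
        writer.writeheader()
        writer.writerows(ROWS)
    definition = {  # field names assumed from the spec; verify before reuse
        "esmcat_version": "0.1.0",
        "id": "example",
        "description": "Toy catalog for illustration only",
        "catalog_file": csv_path.name,
        "attributes": [{"column_name": c} for c in ROWS[0] if c != "path"],
        "assets": {"column_name": "path", "format": "zarr"},
    }
    json_path = directory / "example.json"
    json_path.write_text(json.dumps(definition, indent=2))
    return json_path


def load_rows(json_path: Path) -> list[dict]:
    """Follow the JSON definition to its CSV and read every asset row."""
    definition = json.loads(json_path.read_text())
    csv_path = json_path.parent / definition["catalog_file"]
    with csv_path.open(newline="") as f:
        return list(csv.DictReader(f))


def search(rows: list[dict], **query) -> list[dict]:
    """Keep rows matching every column; a list of values means 'any of'."""
    def matches(row: dict) -> bool:
        for column, wanted in query.items():
            allowed = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
            if row[column] not in allowed:
                return False
        return True
    return [row for row in rows if matches(row)]


def group_by_key(rows: list[dict]) -> dict[str, list[str]]:
    """Group asset paths under a dotted key built from KEY_COLUMNS."""
    groups = defaultdict(list)
    for row in rows:
        groups[".".join(row[c] for c in KEY_COLUMNS)].append(row["path"])
    return dict(groups)


with tempfile.TemporaryDirectory() as tmp:
    rows = load_rows(write_catalog(Path(tmp)))

subset = search(rows, experiment_id=["historical", "ssp585"],
                table_id="Oyr", variable_id="o2", grid_label="gn")
datasets = group_by_key(subset)
for key, paths in datasets.items():
    print(key, paths)
```

With the toy rows above, the query keeps two of the four assets, and each ends up under its own dotted key. In real use, `cat.search(...)` and `cat_subset.to_dataset_dict()` do this work and return xarray datasets rather than lists of paths.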

See [documentation](https://intake-esm.readthedocs.io/en/latest/) for more information.

## Installation

Intake-esm can be installed from PyPI with pip:

```bash
python -m pip install intake-esm
```

It is also available from `conda-forge` for conda installations:

```bash
conda install -c conda-forge intake-esm
```
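
After installing, a quick sanity check can confirm that the interpreter meets the `>=3.10` requirement listed in the package metadata and that the distribution is visible to Python. The sketch below uses only the standard library; `check_environment` is a hypothetical helper name, not part of `intake-esm`.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Minimum interpreter version, taken from the package metadata (requires_python >=3.10).
MIN_PYTHON = (3, 10)


def check_environment(package: str = "intake-esm") -> dict:
    """Report whether Python is new enough and which version of `package` is installed."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        installed = None  # the distribution is not installed in this environment
    return {
        "python_ok": sys.version_info >= MIN_PYTHON,
        "installed_version": installed,
    }


report = check_environment()
print(report)
```

An `installed_version` of `None` means the package was not found in the active environment, which usually points at a different interpreter or environment than the one used for the install.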

[github-ci-badge]: https://github.com/intake/intake-esm/actions/workflows/ci.yaml/badge.svg
[github-ci-link]: https://github.com/intake/intake-esm/actions/workflows/ci.yaml
[codecov-badge]: https://img.shields.io/codecov/c/github/intake/intake-esm.svg?logo=codecov
[codecov-link]: https://codecov.io/gh/intake/intake-esm
[rtd-badge]: https://readthedocs.org/projects/intake-esm/badge/?version=latest
[rtd-link]: https://intake-esm.readthedocs.io/en/latest/?badge=latest
[pypi-badge]: https://img.shields.io/pypi/v/intake-esm?logo=pypi
[pypi-link]: https://pypi.org/project/intake-esm
[conda-badge]: https://img.shields.io/conda/vn/conda-forge/intake-esm?logo=anaconda
[conda-link]: https://anaconda.org/conda-forge/intake-esm
[zenodo-badge]: https://img.shields.io/badge/DOI-10.5281%20%2F%20zenodo.3491062-blue.svg
[zenodo-link]: https://doi.org/10.5281/zenodo.3491062
[license-badge]: https://img.shields.io/github/license/intake/intake-esm
[repo-link]: https://github.com/intake/intake-esm
[pre-commit.ci-badge]: https://results.pre-commit.ci/badge/github/intake/intake-esm/main.svg
[pre-commit.ci-link]: https://results.pre-commit.ci/latest/github/intake/intake-esm/main

            

Raw data

{
    "_id": null,
    "home_page": "https://intake-esm.readthedocs.io",
    "name": "intake-esm",
    "maintainer": "NCAR XDev Team",
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "xdev@ucar.edu",
    "keywords": "intake,xarray,catalog",
    "author": "",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/9b/aa/595ab58d48709efc1a1c8e850a4daedd167c4aedf4cb3ae6249c1290e900/intake-esm-2024.2.6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache 2.0",
    "summary": "An intake plugin for parsing an Earth System Model (ESM) catalog and loading netCDF files and/or Zarr stores into Xarray datasets.",
    "version": "2024.2.6",
    "project_urls": {
        "Documentation": "https://intake-esm.readthedocs.io",
        "Homepage": "https://intake-esm.readthedocs.io",
        "Source": "https://github.com/intake/intake-esm",
        "Tracker": "https://github.com/intake/intake-esm/issues"
    },
    "split_keywords": [
        "intake",
        "xarray",
        "catalog"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8a8db6d3553ea70d8c876f893958a8e58883adbd14f4e9a976957a2adb7cd152",
                "md5": "e22eb2002bda56d64cd8d81527075b3c",
                "sha256": "e87b40a3bcb6e68e5f5cb6b892deb468b727569e66dafd0194b244521cfaf1de"
            },
            "downloads": -1,
            "filename": "intake_esm-2024.2.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e22eb2002bda56d64cd8d81527075b3c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 29682,
            "upload_time": "2024-02-06T07:27:57",
            "upload_time_iso_8601": "2024-02-06T07:27:57.547486Z",
            "url": "https://files.pythonhosted.org/packages/8a/8d/b6d3553ea70d8c876f893958a8e58883adbd14f4e9a976957a2adb7cd152/intake_esm-2024.2.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9baa595ab58d48709efc1a1c8e850a4daedd167c4aedf4cb3ae6249c1290e900",
                "md5": "91ab475da1811435e8e62eb8e5ab8526",
                "sha256": "b2c472418cfeafb11b0a6b1ecfb6e0f36a7ece3360d960b9e8008023493a3b7d"
            },
            "downloads": -1,
            "filename": "intake-esm-2024.2.6.tar.gz",
            "has_sig": false,
            "md5_digest": "91ab475da1811435e8e62eb8e5ab8526",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 112030,
            "upload_time": "2024-02-06T07:27:59",
            "upload_time_iso_8601": "2024-02-06T07:27:59.633079Z",
            "url": "https://files.pythonhosted.org/packages/9b/aa/595ab58d48709efc1a1c8e850a4daedd167c4aedf4cb3ae6249c1290e900/intake-esm-2024.2.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-06 07:27:59",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "intake",
    "github_project": "intake-esm",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "intake-esm"
}
        