================ ============================================
Name             xarray-ms
Version          0.2.0
home_page        None
Summary          xarray MSv4 views over MSv2 Measurement Sets
upload_time      2024-09-11 07:53:44
maintainer       None
docs_url         None
author           Simon Perkins
requires_python  <4.0,>=3.10
license          None
================ ============================================

=========
xarray-ms
=========

.. image:: https://img.shields.io/pypi/v/xarray-ms.svg
    :target: https://pypi.python.org/pypi/xarray-ms

.. image:: https://github.com/ratt-ru/xarray-ms/actions/workflows/ci.yml/badge.svg
    :target: https://github.com/ratt-ru/xarray-ms/actions/workflows/ci.yml

.. image:: https://readthedocs.org/projects/xarray-ms/badge/?version=latest
    :target: https://xarray-ms.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

====

xarray-ms presents a Measurement Set v4 view (MSv4) over
`CASA Measurement Sets <https://casa.nrao.edu/Memos/229.html>`_ (MSv2).
It provides access to MSv2 data via the xarray API, allowing MSv4 compliant applications
to be developed on well-understood MSv2 data.

.. code-block:: python

  >>> import xarray_ms
  >>> import xarray
  >>> ds = xarray.open_dataset("/data/L795830_SB001_uv.MS/",
  ...                          chunks={"time": 2000, "baseline": 1000})
  >>> ds
  <xarray.Dataset> Size: 70GB
  Dimensions:                     (time: 28760, baseline: 2775, frequency: 16,
                                   polarization: 4, uvw_label: 3)
  Coordinates:
      antenna1_name               (baseline) object 22kB dask.array<chunksize=(1000,), meta=np.ndarray>
      antenna2_name               (baseline) object 22kB dask.array<chunksize=(1000,), meta=np.ndarray>
      baseline_id                 (baseline) int64 22kB dask.array<chunksize=(1000,), meta=np.ndarray>
    * frequency                   (frequency) float64 128B 1.202e+08 ... 1.204e+08
    * polarization                (polarization) <U2 32B 'XX' 'XY' 'YX' 'YY'
    * time                        (time) float64 230kB 1.601e+09 ... 1.601e+09
  Dimensions without coordinates: baseline, uvw_label
  Data variables:
      EFFECTIVE_INTEGRATION_TIME  (time, baseline) float64 638MB dask.array<chunksize=(2000, 1000), meta=np.ndarray>
      FLAG                        (time, baseline, frequency, polarization) uint8 5GB dask.array<chunksize=(2000, 1000, 16, 4), meta=np.ndarray>
      TIME_CENTROID               (time, baseline) float64 638MB dask.array<chunksize=(2000, 1000), meta=np.ndarray>
      UVW                         (time, baseline, uvw_label) float64 2GB dask.array<chunksize=(2000, 1000, 3), meta=np.ndarray>
      VISIBILITY                  (time, baseline, frequency, polarization) complex64 41GB dask.array<chunksize=(2000, 1000, 16, 4), meta=np.ndarray>
      WEIGHT                      (time, baseline, frequency, polarization) float32 20GB dask.array<chunksize=(2000, 1000, 16, 4), meta=np.ndarray>
  Attributes:
      antenna_xds:          <xarray.Dataset> Size: 4kB\nDimensions: (...
      version:              0.0.1
      creation_date:        2024-09-10T14:29:22.587984+00:00
      data_description_id:  0

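
The sizes and chunk shapes reported in the example output follow from simple arithmetic on the dimension lengths. The sketch below is not part of xarray-ms; it only reproduces the numbers for ``VISIBILITY`` using the standard library. It assumes that dimensions absent from ``chunks`` form a single chunk, which matches the ``chunksize=(2000, 1000, 16, 4)`` shown in the output. The helper names ``chunk_grid`` and ``largest_chunk_bytes`` are hypothetical.

```python
import math

# Dimension lengths and chunk sizes copied from the example output.
dims = {"time": 28760, "baseline": 2775, "frequency": 16, "polarization": 4}
chunks = {"time": 2000, "baseline": 1000}
COMPLEX64_BYTES = 8  # two 32-bit floats per visibility


def chunk_grid(dims, chunks):
    """Number of chunks along each dimension; unchunked dims form one chunk."""
    return {d: math.ceil(n / chunks.get(d, n)) for d, n in dims.items()}


def largest_chunk_bytes(dims, chunks, itemsize):
    """Bytes held by one full-sized chunk of an array spanning all dims."""
    return itemsize * math.prod(min(chunks.get(d, n), n) for d, n in dims.items())


grid = chunk_grid(dims, chunks)
total_bytes = COMPLEX64_BYTES * math.prod(dims.values())
chunk_bytes = largest_chunk_bytes(dims, chunks, COMPLEX64_BYTES)

print(grid)
print(f"VISIBILITY total: {total_bytes / 1e9:.1f} GB")
print(f"VISIBILITY chunk: {chunk_bytes / 1e9:.2f} GB")
```

A full ``VISIBILITY`` chunk at these settings is roughly 1 GB, so smaller ``time`` or ``baseline`` chunks may be preferable on machines with limited memory.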
Measurement Set v4
------------------

NRAO_/SKAO_ are developing a new xarray-based `Measurement Set v4 specification <msv4-spec_>`_.
While there are many changes, some of the major highlights are:

* xarray_ is used to define the specification.
* MSv4 data consists of Datasets of ndarrays on a regular time-channel grid.
  MSv2 data is tabular and, while in many instances the time-channel grid is regular,
  this was not guaranteed, especially after MSv2 datasets had been transformed by various tasks.

xarray_ Datasets are self-describing and are therefore easier to reason about and work with.
Additionally, the regularity of the data will make writing MSv4-based software less complex.

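
The difference between tabular rows and a regular grid can be pictured with a small, purely illustrative sketch. It is not how xarray-ms is implemented. The rows, the ``to_grid`` helper and the choice of NaN as the fill value are all assumptions made for the example: rows keyed by time and antenna pair are scattered onto a dense (time, baseline) grid, and any missing row is padded.

```python
import math

# Hypothetical MSv2-like rows: (time, antenna1, antenna2, value).
# The row for baseline (0, 2) at time 1.0 is missing, so the data is irregular.
rows = [
    (0.0, 0, 1, 1.5), (0.0, 0, 2, 2.5), (0.0, 1, 2, 3.5),
    (1.0, 0, 1, 4.5), (1.0, 1, 2, 6.5),
]


def to_grid(rows, fill=math.nan):
    """Scatter tabular rows onto a dense (time, baseline) grid, padding gaps."""
    times = sorted({t for t, _, _, _ in rows})
    baselines = sorted({(a1, a2) for _, a1, a2, _ in rows})
    t_index = {t: i for i, t in enumerate(times)}
    b_index = {b: i for i, b in enumerate(baselines)}
    # Start from a fully padded grid, then fill in the rows that exist.
    grid = [[fill] * len(baselines) for _ in times]
    for t, a1, a2, value in rows:
        grid[t_index[t]][b_index[(a1, a2)]] = value
    return times, baselines, grid


times, baselines, grid = to_grid(rows)
for t, row in zip(times, grid):
    print(t, row)
```

Once the data sits on such a grid, every variable can be indexed by dimension name rather than by row number, which is the property the specification relies on.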
xradio
------

`casangi/xradio <xradio_>`_ provides a reference implementation that converts
CASA v2 Measurement Sets to Zarr v4 Measurement Sets using the python-casacore_
package.

Why xarray-ms?
--------------

* By developing against an MSv4 xarray view over MSv2 data,
  developers can build applications on well-understood data
  and then transition seamlessly to newer formats.
  Data can also be exported to newer formats (principally zarr_) via xarray's
  native I/O routines.
  Either way, the xarray view of both formats looks the same to the software developer.

* xarray-ms builds on xarray's
  `backend API <https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html>`_.
  Implementing a formal CASA MSv2 backend has a number of benefits:

  * xarray's internal I/O routines such as ``open_dataset`` and ``open_datatree``
    can dispatch to the backend to load data.
  * Similarly, xarray's `lazy loading mechanism <xarray_lazy_>`_ dispatches
    through the backend.
  * Automatic access to any `chunked array types <xarray_chunked_arrays_>`_
    supported by xarray including, but not limited to, dask_.
  * Arbitrary chunking along any xarray dimension.

* xarray-ms uses arcae_, a high-performance backend to CASA Tables implementing
  a subset of python-casacore_'s interface.
* Some limited support for irregular MSv2 data via padding.

Work in Progress
----------------

The Measurement Set v4 specification is under active development.
xarray-ms is also under active development and does not yet
have feature parity with xradio_.

Most measures information and many secondary sub-tables are currently missing.
However, the most important parts of the ``MAIN`` table,
as well as the ``ANTENNA``, ``POLARIZATION`` and ``SPECTRAL_WINDOW``
sub-tables, are implemented and should be sufficient
for basic algorithm development.

.. _SKAO: https://www.skao.int/
.. _NRAO: https://public.nrao.edu/
.. _msv4-spec: https://docs.google.com/spreadsheets/d/14a6qMap9M5r_vjpLnaBKxsR9TF4azN5LVdOxLacOX-s/
.. _xradio: https://github.com/casangi/xradio
.. _dask-ms: https://github.com/ratt-ru/dask-ms
.. _arcae: https://github.com/ratt-ru/arcae
.. _dask: https://www.dask.org/
.. _python-casacore: https://github.com/casacore/python-casacore/
.. _xarray: https://xarray.dev/
.. _xarray_backend: https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html
.. _xarray_lazy: https://docs.xarray.dev/en/latest/internals/internal-design.html#lazy-indexing-classes
.. _xarray_chunked_arrays: https://docs.xarray.dev/en/latest/internals/chunked-arrays.html
.. _zarr: https://zarr.dev/
Raw data
{
"_id": null,
"home_page": null,
"name": "xarray-ms",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.10",
"maintainer_email": null,
"keywords": null,
"author": "Simon Perkins",
"author_email": "simon.perkins@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/18/48/4480b9f1c8820f9b1d398bb94d100559704927a15305e5acde0f1f0942fb/xarray_ms-0.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "xarray MSv4 views over MSv2 Measurement Sets",
"version": "0.2.0",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "ae931bf6f6f7887d2f699e7fdaaeb3adf9be86c1981aa1129ea4e30e04f5254c",
"md5": "dbca43719eef8b44de9a78023d9d0057",
"sha256": "83470ae09953687d55d67a6e1c0f7a57353c60196de2ad8c96dbc484f18a4dc1"
},
"downloads": -1,
"filename": "xarray_ms-0.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "dbca43719eef8b44de9a78023d9d0057",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.10",
"size": 30803,
"upload_time": "2024-09-11T07:53:42",
"upload_time_iso_8601": "2024-09-11T07:53:42.965055Z",
"url": "https://files.pythonhosted.org/packages/ae/93/1bf6f6f7887d2f699e7fdaaeb3adf9be86c1981aa1129ea4e30e04f5254c/xarray_ms-0.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "18484480b9f1c8820f9b1d398bb94d100559704927a15305e5acde0f1f0942fb",
"md5": "d8faba39f88eb8f41da2b13d577a4302",
"sha256": "7ca09d2d901315fb3927004d0093b1b27efecf7f1cc93fe60116c9c4ea698b11"
},
"downloads": -1,
"filename": "xarray_ms-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "d8faba39f88eb8f41da2b13d577a4302",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.10",
"size": 27447,
"upload_time": "2024-09-11T07:53:44",
"upload_time_iso_8601": "2024-09-11T07:53:44.072995Z",
"url": "https://files.pythonhosted.org/packages/18/48/4480b9f1c8820f9b1d398bb94d100559704927a15305e5acde0f1f0942fb/xarray_ms-0.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-11 07:53:44",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "xarray-ms"
}