rms-pdsparser

Name	rms-pdsparser
Version	2.0.0
home_page	None
Summary	Routines for parsing PDS3 labels
upload_time	2025-08-13 18:35:57
maintainer	None
docs_url	None
author	None
requires_python	>=3.9
license	Apache-2.0
keywords	nasa pds3
requirements	coverage flake8 myst-parser pytest rms-filecache rms-julian sphinx sphinxcontrib-napoleon sphinx-rtd-theme

[![GitHub release; latest by date](https://img.shields.io/github/v/release/SETI/rms-pdsparser)](https://github.com/SETI/rms-pdsparser/releases)
[![GitHub Release Date](https://img.shields.io/github/release-date/SETI/rms-pdsparser)](https://github.com/SETI/rms-pdsparser/releases)
[![Test Status](https://img.shields.io/github/actions/workflow/status/SETI/rms-pdsparser/run-tests.yml?branch=main)](https://github.com/SETI/rms-pdsparser/actions)
[![Documentation Status](https://readthedocs.org/projects/rms-pdsparser/badge/?version=latest)](https://rms-pdsparser.readthedocs.io/en/latest/?badge=latest)
[![Code coverage](https://img.shields.io/codecov/c/github/SETI/rms-pdsparser/main?logo=codecov)](https://codecov.io/gh/SETI/rms-pdsparser)
<br />
[![PyPI - Version](https://img.shields.io/pypi/v/rms-pdsparser)](https://pypi.org/project/rms-pdsparser)
[![PyPI - Format](https://img.shields.io/pypi/format/rms-pdsparser)](https://pypi.org/project/rms-pdsparser)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/rms-pdsparser)](https://pypi.org/project/rms-pdsparser)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/rms-pdsparser)](https://pypi.org/project/rms-pdsparser)
<br />
[![GitHub commits since latest release](https://img.shields.io/github/commits-since/SETI/rms-pdsparser/latest)](https://github.com/SETI/rms-pdsparser/commits/main/)
[![GitHub commit activity](https://img.shields.io/github/commit-activity/m/SETI/rms-pdsparser)](https://github.com/SETI/rms-pdsparser/commits/main/)
[![GitHub last commit](https://img.shields.io/github/last-commit/SETI/rms-pdsparser)](https://github.com/SETI/rms-pdsparser/commits/main/)
<br />
[![Number of GitHub open issues](https://img.shields.io/github/issues-raw/SETI/rms-pdsparser)](https://github.com/SETI/rms-pdsparser/issues)
[![Number of GitHub closed issues](https://img.shields.io/github/issues-closed-raw/SETI/rms-pdsparser)](https://github.com/SETI/rms-pdsparser/issues)
[![Number of GitHub open pull requests](https://img.shields.io/github/issues-pr-raw/SETI/rms-pdsparser)](https://github.com/SETI/rms-pdsparser/pulls)
[![Number of GitHub closed pull requests](https://img.shields.io/github/issues-pr-closed-raw/SETI/rms-pdsparser)](https://github.com/SETI/rms-pdsparser/pulls)
<br />
![GitHub License](https://img.shields.io/github/license/SETI/rms-pdsparser)
[![Number of GitHub stars](https://img.shields.io/github/stars/SETI/rms-pdsparser)](https://github.com/SETI/rms-pdsparser/stargazers)
![GitHub forks](https://img.shields.io/github/forks/SETI/rms-pdsparser)

# Introduction

`pdsparser` is a Python module that reads a PDS3 label file and converts its entire
content to a Python dictionary.

It is supported by the PDS Ring-Moon Systems Node, SETI Institute.


# Installation

The `pdsparser` module is available via the `rms-pdsparser` package on PyPI and can be
installed with:

```sh
pip install rms-pdsparser
```

# Getting Started

A typical use of the module looks like this:

    from pdsparser import Pds3Label
    label = Pds3Label(label_path)

where `label_path` is the path to a PDS3 label file or a data file containing an attached
PDS3 label. The returned `label` is an instance of class
`Pds3Label`[![image](https://raw.githubusercontent.com/SETI/rms-pdsparser/main/icons/link.png)](https://rms-pdsparser.readthedocs.io/en/latest/module.html#pdsparser.Pds3Label),
which supports the Python dictionary API and provides access to the content of the label.

# Example 1

Suppose this is the content of a PDS3 label:

    PDS_VERSION_ID                  = PDS3
    RECORD_TYPE                     = FIXED_LENGTH
    RECORD_BYTES                    = 2000
    FILE_RECORDS                    = 1001
    ^VICAR_HEADER                   = ("C3450702_GEOMED.IMG", 1)
    ^IMAGE                          = ("C3450702_GEOMED.IMG", 2000 <BYTES>)

    /* Image Description  */

    INSTRUMENT_HOST_NAME            = "VOYAGER 1"
    INSTRUMENT_HOST_NAME            = VG1
    IMAGE_TIME                      = 1980-10-29T09:58:10.00
    FILTER_NAME                     = VIOLET
    EXPOSURE_DURATION               = 1.920 <SECOND>

    DESCRIPTION                     = "This image is the result of geometrically
    correcting the corresponding CALIB image (C3450702_CALIB.IMG)."

    OBJECT                          = VICAR_HEADER
      HEADER_TYPE                   = VICAR
      BYTES                         = 2000
      RECORDS                       = 1
      INTERCHANGE_FORMAT            = ASCII
      DESCRIPTION                   = "VICAR format label for the image."
    END_OBJECT                      = VICAR_HEADER

    OBJECT                          = IMAGE
      LINES                         = 1000
      LINE_SAMPLES                  = 1000
      SAMPLE_TYPE                   = LSB_INTEGER
      SAMPLE_BITS                   = 16
      BIT_MASK                      = 16#7FFF#
    END_OBJECT                      = IMAGE
    END

This will be the returned dictionary:

    {'PDS_VERSION_ID': 'PDS3',
     'RECORD_TYPE': 'FIXED_LENGTH',
     'RECORD_BYTES': 2000,
     'FILE_RECORDS': 1001,
     '^VICAR_HEADER': 'C3450702_GEOMED.IMG',
     '^VICAR_HEADER_offset': 1,
     '^VICAR_HEADER_unit': '',
     '^VICAR_HEADER_fmt': '("C3450702_GEOMED.IMG", 1)',
     '^IMAGE': 'C3450702_GEOMED.IMG',
     '^IMAGE_offset': 2000,
     '^IMAGE_unit': '<BYTES>',
     '^IMAGE_fmt': '("C3450702_GEOMED.IMG", 2000 <BYTES>)',
     'INSTRUMENT_HOST_NAME_1': 'VOYAGER 1',
     'INSTRUMENT_HOST_NAME_2': 'VG1',
     'IMAGE_TIME': datetime.datetime(1980, 10, 29, 9, 58, 10),
     'IMAGE_TIME_day': -7003,
     'IMAGE_TIME_sec': 35890.0,
     'IMAGE_TIME_fmt': '1980-10-29T09:58:10.000',
     'FILTER_NAME': 'VIOLET',
     'EXPOSURE_DURATION': 1.92,
     'EXPOSURE_DURATION_unit': '<SECOND>',
     'DESCRIPTION': 'This image is the result of geometrically\ncorrecting the corresponding CALIB image (C3450702_CALIB.IMG).',
     'DESCRIPTION_unwrap': 'This image is the result of geometrically correcting the corresponding CALIB image (C3450702_CALIB.IMG).',
     'VICAR_HEADER': {'OBJECT': 'VICAR_HEADER',
                      'HEADER_TYPE': 'VICAR',
                      'BYTES': 2000,
                      'RECORDS': 1,
                      'INTERCHANGE_FORMAT': 'ASCII',
                      'DESCRIPTION': 'VICAR format label for the image.',
                      'END_OBJECT': 'VICAR_HEADER'},
     'IMAGE': {'OBJECT': 'IMAGE',
               'LINES': 1000,
               'LINE_SAMPLES': 1000,
               'SAMPLE_TYPE': 'LSB_INTEGER',
               'SAMPLE_BITS': 16,
               'BIT_MASK': 32767,
               'BIT_MASK_radix': 16,
               'BIT_MASK_digits': '7FFF',
               'BIT_MASK_fmt': '16#7FFF#',
               'END_OBJECT': 'IMAGE'},
     'END': '',
     'objects': ['VICAR_HEADER', 'IMAGE']}

As you can see:

* Most PDS3 label keywords become keys in the dictionary without change.
* OBJECTs and GROUPs are converted to sub-dictionaries and are keyed by the value of the
  PDS3 keyword. In this example, `label['VICAR_HEADER']['HEADER_TYPE']` returns "VICAR".
* If a keyword is repeated at the top level or within an object or group, it receives a
  suffix `_1`, `_2`, `_3`, etc. to distinguish it.
* If a value has units, there is an additional keyword in the dictionary with `_unit` as
  a suffix, containing the name of the unit.
* For text values that contain a newline, trailing blanks are suppressed. In addition, a
  dictionary key with the suffix `_unwrap` contains the same text as full paragraphs
  separated by newlines.
* For a file pointer of the form `(filename, offset)` or `(filename, offset <BYTES>)`, the
  keyed value is just the filename. The offset value is provided with `_offset` appended
  to the dictionary key, and the unit is provided with `_unit` appended to the key.
* For based integers of the form `radix#digits#`, the dictionary value is converted to an
  integer. However, the radix and the digit string are provided using keys with the suffix
  `_radix` and `_digits`. Also, the key with suffix `_fmt` provides a full, PDS3-formatted
  version of the value.
* Dates and times are converted to Python datetime objects. However, additional dictionary
  keys appear with the suffix `_day` for the day number relative to January 1, 2000 and
  `_sec` for the elapsed seconds within that day.
* For items that have special formatting within a label, such as file pointers, dates, and
  integers with a radix, the key with a `_fmt` suffix provides the PDS3-formatted value
  for reference.
* Each dictionary containing OBJECTs ends with an entry keyed by "objects", which returns
  the ordered list of all the OBJECT keys in that dictionary. Similarly, each dictionary
  containing GROUPs has an entry keyed by "groups", which returns the list of all the
  GROUP keys. These provide an easy way to iterate through objects and groups in the label.
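
The relationships among the suffixed keys can be checked with plain Python. The sketch
below does not call `pdsparser`; it copies a few entries from the dictionary shown above
into an ordinary `dict` (the variable name `label` is only illustrative) and assumes that a
`Pds3Label` object would behave the same way through the dictionary API.

```python
from datetime import datetime, timedelta

# A hand-copied subset of the Example 1 dictionary (not produced by pdsparser here).
label = {
    'IMAGE_TIME': datetime(1980, 10, 29, 9, 58, 10),
    'IMAGE_TIME_day': -7003,
    'IMAGE_TIME_sec': 35890.0,
    'VICAR_HEADER': {'OBJECT': 'VICAR_HEADER', 'BYTES': 2000},
    'IMAGE': {'OBJECT': 'IMAGE',
              'BIT_MASK': 32767,
              'BIT_MASK_radix': 16,
              'BIT_MASK_digits': '7FFF'},
    'objects': ['VICAR_HEADER', 'IMAGE'],
}

# _day and _sec are relative to January 1, 2000, so together they rebuild the datetime.
epoch = datetime(2000, 1, 1)
rebuilt = epoch + timedelta(days=label['IMAGE_TIME_day'],
                            seconds=label['IMAGE_TIME_sec'])
print(rebuilt == label['IMAGE_TIME'])

# The digit string interpreted in the given radix yields the integer value.
image = label['IMAGE']
from_digits = int(image['BIT_MASK_digits'], image['BIT_MASK_radix'])
print(from_digits == image['BIT_MASK'])

# The "objects" entry lists the OBJECT keys in order, which makes iteration simple.
object_types = [label[name]['OBJECT'] for name in label['objects']]
print(object_types)
```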

# Example 2

Within `TABLE` and `SPREADSHEET` objects, the embedded `COLUMN`, `BIT_COLUMN`, `FIELD`,
and `ELEMENT_DEFINITION` objects are keyed by the value of their `NAME` keyword (rather
than by repeated keywords `COLUMN_1`, `COLUMN_2`, `COLUMN_3`, etc.). For example, suppose
this appears in a PDS3 label:

    OBJECT = TABLE
      OBJECT = COLUMN
        NAME = VOLUME_ID
        START_BYTE = 1
      END_OBJECT = COLUMN
      OBJECT = COLUMN
        NAME = FILE_SPECIFICATION_NAME
        START_BYTE = 15
      END_OBJECT = COLUMN
    END_OBJECT = TABLE

The returned section of the dictionary will look like this:

    {'TABLE': {'OBJECT': 'TABLE',
               'VOLUME_ID': {'OBJECT': 'COLUMN',
                             'NAME': 'VOLUME_ID',
                             'START_BYTE': 1,
                             'END_OBJECT': 'COLUMN'},
               'FILE_SPECIFICATION_NAME': {'OBJECT': 'COLUMN',
                                           'NAME': 'FILE_SPECIFICATION_NAME',
                                           'START_BYTE': 15,
                                           'END_OBJECT': 'COLUMN'},
               'END_OBJECT': 'TABLE'},
    }
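
Keying columns by name makes lookups direct. As a minimal sketch, the fragment above is
copied by hand into a plain `dict` (the names `table` and `start_bytes` are illustrative
and not part of `pdsparser`), and the start byte of each column is collected by name:

```python
# Hand-copied from the dictionary fragment above; not produced by pdsparser here.
table = {
    'OBJECT': 'TABLE',
    'VOLUME_ID': {'OBJECT': 'COLUMN',
                  'NAME': 'VOLUME_ID',
                  'START_BYTE': 1,
                  'END_OBJECT': 'COLUMN'},
    'FILE_SPECIFICATION_NAME': {'OBJECT': 'COLUMN',
                                'NAME': 'FILE_SPECIFICATION_NAME',
                                'START_BYTE': 15,
                                'END_OBJECT': 'COLUMN'},
    'END_OBJECT': 'TABLE',
}

# Map column name -> START_BYTE for every entry that is a COLUMN sub-dictionary.
start_bytes = {
    name: value['START_BYTE']
    for name, value in table.items()
    if isinstance(value, dict) and value.get('OBJECT') == 'COLUMN'
}
print(start_bytes)
```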

# Example 3

"Set" notation (using curly braces "{}") was sometimes misused in PDS3 labels where
"sequence" notation (using parentheses "()") was meant. For example, this might appear in
a label:

    CUTOUT_WINDOW = {1, 1, 200, 800}

which is supposed to define the four boundaries of an image region. The user might be
surprised to learn that in the dictionary, its value is the Python set `{1, 200, 800}`. To
address this situation, for every set value, the dictionary also has a key with the same
name but suffix `_list`, which contains the elements of the value as a list in their
original order and including duplicates. In this example, the dictionary contains:

    'CUTOUT_WINDOW': {1, 200, 800},
    'CUTOUT_WINDOW_list': [1, 1, 200, 800]
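
The difference comes from ordinary Python set semantics. The short sketch below uses only
built-in types (no `pdsparser` call; the variable names are illustrative) to show why the
`_list` form is the one to use when position or repetition matters:

```python
# The elements as written in the label: {1, 1, 200, 800}
elements = [1, 1, 200, 800]

as_set = set(elements)    # duplicates collapse and order carries no meaning
as_list = list(elements)  # original order and duplicates are preserved

print(len(as_set), len(as_list))

# Unpacking four boundaries only works from the list form. The names below are
# hypothetical; the label itself does not say what each position means.
first, second, third, fourth = as_list
print(first, second, third, fourth)
```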

# Options

The
`Pds3Label`[![image](https://raw.githubusercontent.com/SETI/rms-pdsparser/main/icons/link.png)](https://rms-pdsparser.readthedocs.io/en/latest/module.html#pdsparser.Pds3Label)
constructor provides a variety of additional options for how to
parse the label and present its content.

* You can provide the label to be parsed as a string containing the label's content rather
  than as a path to a file.
* Use `types=True` to include the type of each keyword's value as interpreted by the
  parser (e.g., "integer", "based_integer", "text", "date_time", or "file_offset_pointer")
  in the dictionary using the keyword plus suffix `_type`.
* Use `sources=True` to include the source text as extracted from the PDS3 label in the
  dictionary using the keyword plus suffix `_source`.
* Use `expand=True` to insert the content of any referenced `^STRUCTURE` keywords into the
  returned dictionary.
* Use `vax=True` to read attached labels from old-style Vax variable-length record files.
* Use the `repairs` option to correct any known syntax errors in the label, using regular
  expressions, before it is parsed.
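
The exact form of the `repairs` argument is not described here, so the following is only
an illustration of the idea behind it: a regular-expression substitution applied to the
label text before parsing. It uses the standard `re` module by hand rather than the
`repairs` option itself, and the label content and pattern are made up for the example.

```python
import re

# A made-up label fragment with an unquoted value containing a slash.
label_text = (
    "PDS_VERSION_ID = PDS3\n"
    "TARGET_NAME    = N/A\n"
    "END\n"
)

# Quote a bare N/A value at the end of a line; everything else is left untouched.
repaired = re.sub(r'(=[ \t]*)N/A[ \t]*$', r'\1"N/A"', label_text,
                  flags=re.MULTILINE)
print(repaired)
```

The repaired string could then be passed to the parser in place of the original text.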

Four methods of parsing the label are provided.

* `method="strict"` uses a strict implementation of the PDS3 syntax. It is sure to provide
  accurate results, but can be rather slow. This method can also be used to validate the
  syntax within a PDS3 label, because it will raise a SyntaxError if anything goes wrong.
* `method="loose"` uses a variant of the "strict" method, in which allowance is made for
  certain common syntax errors. Specifically,

  * It allows slashes in file names and in text strings that are not quoted (e.g., `N/A`).
  * It allows the value of `END_OBJECT` and `END_GROUP` to be absent, as long as they are
    still properly paired with associated `OBJECT` and `GROUP` keywords.
  * It allows time zone expressions (which were disallowed after the PDS2 standard).
  * Commas can be missing between the elements of a sequence or set.
  * The final line terminator after `END` can be missing from a detached label.

* `method="fast"` is a different and much faster (often 30x faster) parser, which takes
  various "shortcuts" during the parsing. As a result, it may fail on occasions where the
  other methods succeed, and it may not return correct results in the cases of some
  oddly-formatted labels. However, it handles all the most common aspects of the PDS3
  syntax correctly, and so may be a good choice when handling large numbers of labels.
* `method="compound"` is similar to "loose", but it parses a "compound" label, i.e., one
  that might contain more than one `END` statement.
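
Because the "strict" method raises a `SyntaxError` on invalid input, it can be wrapped in
a small validity check. The helper below is a sketch, not part of `pdsparser`: the name
`is_valid_pds3` and the `parse` parameter are hypothetical, and the default branch assumes
that `Pds3Label` accepts the label text as its first argument together with
`method="strict"`, as the options above suggest. A toy stand-in parser is used so the
example runs without the package installed.

```python
def is_valid_pds3(label_text, parse=None):
    """Return True if `parse` accepts the text, False if it raises SyntaxError."""
    if parse is None:
        # Assumed usage, based on the options described above.
        from pdsparser import Pds3Label

        def parse(text):
            return Pds3Label(text, method="strict")
    try:
        parse(label_text)
    except SyntaxError:
        return False
    return True


def toy_parse(text):
    """Stand-in parser for demonstration only: requires a trailing END statement."""
    if not text.rstrip().endswith("END"):
        raise SyntaxError("missing END statement")
    return {}


print(is_valid_pds3("A = 1\nEND\n", parse=toy_parse))
print(is_valid_pds3("A = 1\n", parse=toy_parse))
```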

# Utilities

The `pdsparser` module provides several additional utilities for handling PDS3 labels.

- `read_label`[![image](https://raw.githubusercontent.com/SETI/rms-pdsparser/main/icons/link.png)](https://rms-pdsparser.readthedocs.io/en/latest/module.html#pdsparser.utils.read_label):
  Reads a PDS3 label from a file. Supports attached labels
  within binary files.
- `read_vax_binary_label`[![image](https://raw.githubusercontent.com/SETI/rms-pdsparser/main/icons/link.png)](https://rms-pdsparser.readthedocs.io/en/latest/module.html#pdsparser.utils.read_vax_binary_label):
  Reads the attached PDS3 label from an old-style
  Vax binary file that uses variable-length records.
- `expand_structures`[![image](https://raw.githubusercontent.com/SETI/rms-pdsparser/main/icons/link.png)](https://rms-pdsparser.readthedocs.io/en/latest/module.html#pdsparser.utils.expand_structures):
  Replaces any `^STRUCTURE` keywords in a label string
  with the content of the associated ".FMT" files.

# Contributing

Information on contributing to this package can be found in the
[Contributing Guide](https://github.com/SETI/rms-pdsparser/blob/main/CONTRIBUTING.md).

# Links

- [Documentation](https://rms-pdsparser.readthedocs.io)
- [Repository](https://github.com/SETI/rms-pdsparser)
- [Issue tracker](https://github.com/SETI/rms-pdsparser/issues)
- [PyPI](https://pypi.org/project/rms-pdsparser)

# Licensing

This code is licensed under the [Apache License v2.0](https://github.com/SETI/rms-pdsparser/blob/main/LICENSE).

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "rms-pdsparser",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "\"Robert S. French\" <rfrench@seti.org>",
    "keywords": "NASA, PDS3",
    "author": null,
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/dc/47/2800461288829968c92db7690c4982c1e44695f8ec8cc67d9d200ac1cc89/rms_pdsparser-2.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Routines for parsing PDS3 labels",
    "version": "2.0.0",
    "project_urls": {
        "Homepage": "https://github.com/SETI/rms-pdsparser",
        "Issues": "https://github.com/SETI/rms-pdsparser/issues",
        "Repository": "https://github.com/SETI/rms-pdsparser",
        "Source": "https://github.com/SETI/rms-pdsparser"
    },
    "split_keywords": [
        "nasa",
        " pds3"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "950d8cb41d27b685a0cc5603e79505abe03569d916b5cd6b08b193e73d7237d2",
                "md5": "b059ae2a68761c282f256ec21c6a66b2",
                "sha256": "16fb5f69be74629939290ca12253949765365d216ae2cdf046575eae6798cb58"
            },
            "downloads": -1,
            "filename": "rms_pdsparser-2.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b059ae2a68761c282f256ec21c6a66b2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 35686,
            "upload_time": "2025-08-13T18:35:56",
            "upload_time_iso_8601": "2025-08-13T18:35:56.714872Z",
            "url": "https://files.pythonhosted.org/packages/95/0d/8cb41d27b685a0cc5603e79505abe03569d916b5cd6b08b193e73d7237d2/rms_pdsparser-2.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "dc472800461288829968c92db7690c4982c1e44695f8ec8cc67d9d200ac1cc89",
                "md5": "5ba0eeb7820a4ff9e107240a4c64b305",
                "sha256": "868e0f877521001b761a3452f860f4164d184ff1970df2088b5370b1864355a6"
            },
            "downloads": -1,
            "filename": "rms_pdsparser-2.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "5ba0eeb7820a4ff9e107240a4c64b305",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 433290,
            "upload_time": "2025-08-13T18:35:57",
            "upload_time_iso_8601": "2025-08-13T18:35:57.909507Z",
            "url": "https://files.pythonhosted.org/packages/dc/47/2800461288829968c92db7690c4982c1e44695f8ec8cc67d9d200ac1cc89/rms_pdsparser-2.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-13 18:35:57",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "SETI",
    "github_project": "rms-pdsparser",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "requirements": [
        {
            "name": "coverage",
            "specs": []
        },
        {
            "name": "flake8",
            "specs": []
        },
        {
            "name": "myst-parser",
            "specs": []
        },
        {
            "name": "pytest",
            "specs": []
        },
        {
            "name": "rms-filecache",
            "specs": []
        },
        {
            "name": "rms-julian",
            "specs": []
        },
        {
            "name": "sphinx",
            "specs": []
        },
        {
            "name": "sphinxcontrib-napoleon",
            "specs": []
        },
        {
            "name": "sphinx-rtd-theme",
            "specs": []
        }
    ],
    "lcname": "rms-pdsparser"
}
