headerparser


:Name: headerparser
:Version: 0.5.1
:Home page: https://github.com/jwodder/headerparser
:Summary: argparse for mail-style headers
:Upload time: 2023-10-04 15:37:43
:Author: John Thorvald Wodder II
:Requires Python: >=3.7
:License: MIT
:Keywords: e-mail, email, mail, rfc822, headers, rfc2822, rfc5322, parser
:Requirements: No requirements were recorded.

.. image:: http://www.repostatus.org/badges/latest/active.svg
    :target: http://www.repostatus.org/#active
    :alt: Project Status: Active — The project has reached a stable, usable
          state and is being actively developed.

.. image:: https://github.com/jwodder/headerparser/workflows/Test/badge.svg?branch=master
    :target: https://github.com/jwodder/headerparser/actions?workflow=Test
    :alt: CI Status

.. image:: https://codecov.io/gh/jwodder/headerparser/branch/master/graph/badge.svg
    :target: https://codecov.io/gh/jwodder/headerparser

.. image:: https://img.shields.io/pypi/pyversions/headerparser.svg
    :target: https://pypi.org/project/headerparser

.. image:: https://img.shields.io/github/license/jwodder/headerparser.svg
    :target: https://opensource.org/licenses/MIT
    :alt: MIT License

`GitHub <https://github.com/jwodder/headerparser>`_
| `PyPI <https://pypi.org/project/headerparser>`_
| `Documentation <https://headerparser.readthedocs.io>`_
| `Issues <https://github.com/jwodder/headerparser/issues>`_
| `Changelog <https://github.com/jwodder/headerparser/blob/master/CHANGELOG.md>`_

``headerparser`` parses key-value pairs in the style of RFC 822 (e-mail)
headers and converts them into case-insensitive dictionaries with the trailing
message body (if any) attached.  Fields can be converted to other types, marked
required, or given default values using an API based on the standard library's
``argparse`` module.  (Everyone loves ``argparse``, right?)  Low-level
functions for just scanning header fields (breaking them into sequences of
key-value pairs without any further processing) are also included.

The Format
==========
RFC 822-style headers are header fields that follow the general format of
e-mail headers as specified by RFC 822 and friends: each field is a line of the
form "``Name: Value``", with long values continued onto multiple lines
("folded") by indenting the extra lines.  A blank line marks the end of the
header section and the beginning of the message body.
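
To make the grammar concrete, here is a minimal sketch of a splitter for this format that uses only the standard library.  It is a simplified illustration, not how ``headerparser`` is implemented: the function name ``split_headers`` is hypothetical, and the sketch does not validate malformed lines.

```python
def split_headers(text):
    """Split RFC 822-style text into a list of (name, value) pairs and a body.

    Hypothetical helper for illustration only; it performs no validation.
    """
    # The first blank line separates the header section from the body.
    head, sep, body = text.partition("\n\n")
    fields = []
    for line in head.splitlines():
        if line[:1] in (" ", "\t") and fields:
            # An indented line continues ("folds") the previous field's value.
            name, value = fields[-1]
            fields[-1] = (name, value + "\n" + line)
        else:
            name, _, value = line.partition(":")
            fields.append((name.strip(), value.strip()))
    return fields, (body if sep else None)


text = (
    "Subject: a long value\n"
    "  continued on an indented line\n"
    "To: someone@example.com\n"
    "\n"
    "Body text.\n"
)
fields, body = split_headers(text)
print(fields)
print(repr(body))
```

The folded ``Subject`` value keeps its line break and indentation, which mirrors how the examples further down show folded values by default.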

This basic grammar has been used by numerous textual formats besides e-mail,
including but not limited to:

- HTTP request & response headers
- Usenet messages
- most Python packaging metadata files
- Debian packaging control files
- ``META-INF/MANIFEST.MF`` files in Java JARs
- a subset of the `YAML <http://www.yaml.org/>`_ serialization format

— all of which this package can parse.


Installation
============
``headerparser`` requires Python 3.7 or higher.  Just use `pip
<https://pip.pypa.io>`_ for Python 3 (you have pip, right?) to install
``headerparser``::

    python3 -m pip install headerparser


Examples
========

Define a parser:

>>> import headerparser
>>> parser = headerparser.HeaderParser()
>>> parser.add_field('Name', required=True)
>>> parser.add_field('Type', choices=['example', 'demonstration', 'prototype'], default='example')
>>> parser.add_field('Public', type=headerparser.BOOL, default=False)
>>> parser.add_field('Tag', multiple=True)
>>> parser.add_field('Data')

Parse some headers and inspect the results:

>>> msg = parser.parse('''\
... Name: Sample Input
... Public: yes
... tag: doctest, examples,
...   whatever
... TAG: README
...
... Wait, why am I using a body instead of the "Data" field?
... ''')
>>> sorted(msg.keys())
['Name', 'Public', 'Tag', 'Type']
>>> msg['Name']
'Sample Input'
>>> msg['Public']
True
>>> msg['Tag']
['doctest, examples,\n  whatever', 'README']
>>> msg['TYPE']
'example'
>>> msg['Data']
Traceback (most recent call last):
    ...
KeyError: 'data'
>>> msg.body
'Wait, why am I using a body instead of the "Data" field?\n'

Fail to parse headers that don't meet your requirements:

>>> parser.parse('Type: demonstration')
Traceback (most recent call last):
    ...
headerparser.errors.MissingFieldError: Required header field 'Name' is not present
>>> parser.parse('Name: Bad type\nType: other')
Traceback (most recent call last):
    ...
headerparser.errors.InvalidChoiceError: 'other' is not a valid choice for 'Type'
>>> parser.parse('Name: unknown field\nField: Value')
Traceback (most recent call last):
    ...
headerparser.errors.UnknownFieldError: Unknown header field 'Field'

Allow fields you didn't even think of:

>>> parser.add_additional()
>>> msg = parser.parse('Name: unknown field\nField: Value')
>>> msg['Field']
'Value'

Just split some headers into names & values and worry about validity later:

>>> for field in headerparser.scan('''\
... Name: Scanner Sample
... Unknown headers: no problem
... Unparsed-Boolean: yes
... CaSe-SeNsItIvE-rEsUlTs: true
... Whitespace around colons:optional
... Whitespace around colons  :  I already said it's optional.
...   That means you have the _option_ to use as much as you want!
...
... And there's a body, too, I guess.
... '''): print(field)
('Name', 'Scanner Sample')
('Unknown headers', 'no problem')
('Unparsed-Boolean', 'yes')
('CaSe-SeNsItIvE-rEsUlTs', 'true')
('Whitespace around colons', 'optional')
('Whitespace around colons', "I already said it's optional.\n  That means you have the _option_ to use as much as you want!")
(None, "And there's a body, too, I guess.\n")

            
