xmltodict


Name: xmltodict
Version: 0.13.0
Home page: https://github.com/martinblech/xmltodict
Summary: Makes working with XML feel like you are working with JSON
Upload time: 2022-05-08 07:00:04
Author: Martin Blech
Requires Python: >=3.4
License: MIT
Requirements: No requirements were recorded.

# xmltodict

`xmltodict` is a Python module that makes working with XML feel like you are working with [JSON](http://docs.python.org/library/json.html), as in this ["spec"](http://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html):

[![Build Status](https://travis-ci.com/martinblech/xmltodict.svg?branch=master)](https://travis-ci.com/martinblech/xmltodict)

```python
>>> import json
>>> import xmltodict
>>> print(json.dumps(xmltodict.parse("""
...  <mydocument has="an attribute">
...    <and>
...      <many>elements</many>
...      <many>more elements</many>
...    </and>
...    <plus a="complex">
...      element as well
...    </plus>
...  </mydocument>
...  """), indent=4))
{
    "mydocument": {
        "@has": "an attribute",
        "and": {
            "many": [
                "elements",
                "more elements"
            ]
        },
        "plus": {
            "@a": "complex",
            "#text": "element as well"
        }
    }
}
```
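Because the result is an ordinary nested mapping, individual values can be read with plain key lookups. The following is a minimal sketch that reuses the document above; the variable names (`doc`, `has_attr`, `many`, `plus_text`) are illustrative only.

```python
import xmltodict

doc = xmltodict.parse("""
<mydocument has="an attribute">
  <and>
    <many>elements</many>
    <many>more elements</many>
  </and>
  <plus a="complex">
    element as well
  </plus>
</mydocument>
""")

# Attributes are stored under keys prefixed with "@".
has_attr = doc["mydocument"]["@has"]
# Repeated sibling elements are collected into a list.
many = doc["mydocument"]["and"]["many"]
# The text of an element that also has attributes sits under "#text".
plus_text = doc["mydocument"]["plus"]["#text"]

print(has_attr)
print(many)
print(plus_text)
```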

## Namespace support

By default, `xmltodict` does no XML namespace processing (it just treats namespace declarations as regular node attributes), but passing `process_namespaces=True` will make it expand namespaces for you:

```python
>>> xml = """
... <root xmlns="http://defaultns.com/"
...       xmlns:a="http://a.com/"
...       xmlns:b="http://b.com/">
...   <x>1</x>
...   <a:y>2</a:y>
...   <b:z>3</b:z>
... </root>
... """
>>> xmltodict.parse(xml, process_namespaces=True) == {
...     'http://defaultns.com/:root': {
...         'http://defaultns.com/:x': '1',
...         'http://a.com/:y': '2',
...         'http://b.com/:z': '3',
...     }
... }
True
```

It also lets you collapse certain namespaces to shorthand prefixes, or skip them altogether:

```python
>>> namespaces = {
...     'http://defaultns.com/': None, # skip this namespace
...     'http://a.com/': 'ns_a', # collapse "http://a.com/" -> "ns_a"
... }
>>> xmltodict.parse(xml, process_namespaces=True, namespaces=namespaces) == {
...     'root': {
...         'x': '1',
...         'ns_a:y': '2',
...         'http://b.com/:z': '3',
...     },
... }
True
```
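For comparison, here is a small sketch of the default behaviour (no `process_namespaces`), using the same `xml` string as above. Namespace declarations should show up as ordinary `@`-prefixed attributes, and prefixed element names are kept verbatim; the variable name `root` is illustrative.

```python
import xmltodict

xml = """
<root xmlns="http://defaultns.com/"
      xmlns:a="http://a.com/"
      xmlns:b="http://b.com/">
  <x>1</x>
  <a:y>2</a:y>
  <b:z>3</b:z>
</root>
"""

# No namespace processing: declarations are treated like any other attribute.
root = xmltodict.parse(xml)["root"]

print(root["@xmlns"])    # default namespace declaration
print(root["@xmlns:a"])  # prefixed namespace declaration
print(root["a:y"])       # element key keeps its literal prefix
```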

## Streaming mode

`xmltodict` is very fast ([Expat](http://docs.python.org/library/pyexpat.html)-based) and has a streaming mode with a small memory footprint, suitable for big XML dumps like [Discogs](http://discogs.com/data/) or [Wikipedia](http://dumps.wikimedia.org/):

```python
>>> from gzip import GzipFile
>>> def handle_artist(_, artist):
...     print(artist['name'])
...     return True
>>> 
>>> xmltodict.parse(GzipFile('discogs_artists.xml.gz'),
...     item_depth=2, item_callback=handle_artist)
A Perfect Circle
Fantômas
King Crimson
Chris Potter
...
```
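The callback's return value controls whether parsing continues. The sketch below uses a small in-memory document instead of a real dump (the sample XML and the `collect_first_two` helper are made up for illustration) and stops after two items; returning a falsy value from the callback makes `parse()` raise `xmltodict.ParsingInterrupted`.

```python
import xmltodict

xml = """
<artists>
  <artist><name>A Perfect Circle</name></artist>
  <artist><name>King Crimson</name></artist>
  <artist><name>Chris Potter</name></artist>
</artists>
"""

names = []

def collect_first_two(path, artist):
    """Record each artist name and ask the parser to stop after two items."""
    names.append(artist["name"])
    return len(names) < 2  # a falsy return value interrupts parsing

try:
    # item_depth=2 means: call back for every element two levels deep (<artist>).
    xmltodict.parse(xml, item_depth=2, item_callback=collect_first_two)
    interrupted = False
except xmltodict.ParsingInterrupted:
    interrupted = True

print(names, interrupted)
```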

It can also be used from the command line to pipe objects to a script like this:

```python
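import sys, marshal
# Read binary data: marshal.load() needs bytes, not text-mode sys.stdin.
while True:
    try:
        _, article = marshal.load(sys.stdin.buffer)
    except EOFError:
        break  # end of input
    print(article['title'])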
```

```sh
$ bunzip2 -c enwiki-pages-articles.xml.bz2 | xmltodict.py 2 | myscript.py
AccessibleComputing
Anarchism
AfghanistanHistory
AfghanistanGeography
AfghanistanPeople
AfghanistanCommunications
Autism
...
```

Or just cache the dicts so you don't have to parse that big XML file again. You do this only once:

```sh
$ bunzip2 -c enwiki-pages-articles.xml.bz2 | xmltodict.py 2 | gzip > enwiki.dicts.gz
```

And you reuse the dicts with every script that needs them:

```sh
$ gunzip -c enwiki.dicts.gz | script1.py
$ gunzip -c enwiki.dicts.gz | script2.py
...
```
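The cached file can also be read directly from Python instead of going through `gunzip`. This is a sketch under the assumption that the cache holds a sequence of marshalled `(path, item)` tuples, as in the script above; `iter_items` is a hypothetical helper, and the in-memory buffer merely stands in for `enwiki.dicts.gz`.

```python
import gzip
import io
import marshal

def iter_items(stream):
    """Yield marshalled objects from a binary stream until it is exhausted."""
    while True:
        try:
            yield marshal.load(stream)
        except EOFError:
            return

# Build a tiny stand-in for the cache file in memory (illustrative data only).
buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
    for title in ("Anarchism", "Autism"):
        path = [("mediawiki", None), ("page", None)]
        marshal.dump((path, {"title": title}), gz)

# Read the items back, exactly as one would from gzip.open("enwiki.dicts.gz", "rb").
buffer.seek(0)
with gzip.GzipFile(fileobj=buffer, mode="rb") as gz:
    titles = [item["title"] for _, item in iter_items(gz)]

print(titles)
```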

## Roundtripping

You can also convert in the other direction, using the `unparse()` function:

```python
>>> import xmltodict
>>> mydict = {
...     'response': {
...         'status': 'good',
...         'last_updated': '2014-02-16T23:10:12Z',
...     }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<response>
	<status>good</status>
	<last_updated>2014-02-16T23:10:12Z</last_updated>
</response>
```
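A quick way to check a roundtrip is to feed the output of `unparse()` back into `parse()`. The sketch below assumes all leaf values are strings, since parsed XML text always comes back as `str`; the variable names are illustrative.

```python
import xmltodict

mydict = {
    "response": {
        "status": "good",
        "last_updated": "2014-02-16T23:10:12Z",
    }
}

# dict -> XML text -> dict
xml = xmltodict.unparse(mydict, pretty=True)
roundtripped = xmltodict.parse(xml)

print(roundtripped == mydict)
```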

Text values for nodes can be specified with the `cdata_key` key in the Python dict, while node attributes can be specified by prefixing the key name with `attr_prefix`. The default value for `attr_prefix` is `@` and the default value for `cdata_key` is `#text`.

```python
>>> import xmltodict
>>> 
>>> mydict = {
...     'text': {
...         '@color':'red',
...         '@stroke':'2',
...         '#text':'This is a test'
...     }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<text color="red" stroke="2">This is a test</text>
```
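Both defaults can be overridden by passing `attr_prefix` and `cdata_key` as keyword arguments. The following sketch uses `_` and `value` as arbitrary replacement markers (chosen only for illustration) and passes the same settings to `parse()` so the data survives the roundtrip.

```python
import xmltodict

# "_color" is an attribute and "value" is the element text under these settings.
mydict = {"text": {"_color": "red", "value": "This is a test"}}

xml = xmltodict.unparse(mydict, attr_prefix="_", cdata_key="value")
parsed = xmltodict.parse(xml, attr_prefix="_", cdata_key="value")

print(xml)
print(parsed == mydict)
```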

Lists that are specified under a key in a dictionary use the key as a tag for each item. But if a list does not have a parent key, for example if a list exists inside another list, it has no tag to use and its items are converted to a string, as shown in the example below. To give tags to nested lists, use the `expand_iter` keyword argument to provide a tag, as demonstrated below. Note that using `expand_iter` will break roundtripping.

```python
>>> mydict = {
...     "line": {
...         "points": [
...             [1, 5],
...             [2, 6],
...         ]
...     }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<line>
        <points>[1, 5]</points>
        <points>[2, 6]</points>
</line>
>>> print(xmltodict.unparse(mydict, pretty=True, expand_iter="coord"))
<?xml version="1.0" encoding="utf-8"?>
<line>
        <points>
                <coord>1</coord>
                <coord>5</coord>
        </points>
        <points>
                <coord>2</coord>
                <coord>6</coord>
        </points>
</line>
```

## Ok, how do I get it?

### Using PyPI

Install it with pip:

```sh
$ pip install xmltodict
```

### RPM-based distro (Fedora, RHEL, …)

There is an [official Fedora package for xmltodict](https://apps.fedoraproject.org/packages/python-xmltodict).

```sh
$ sudo yum install python-xmltodict
```

### Arch Linux

There is an [official Arch Linux package for xmltodict](https://www.archlinux.org/packages/community/any/python-xmltodict/).

```sh
$ sudo pacman -S python-xmltodict
```

### Debian-based distro (Debian, Ubuntu, …)

There is an [official Debian package for xmltodict](https://tracker.debian.org/pkg/python-xmltodict).

```sh
$ sudo apt install python-xmltodict
```

### FreeBSD

There is an [official FreeBSD port for xmltodict](https://svnweb.freebsd.org/ports/head/devel/py-xmltodict/).

```sh
$ pkg install py36-xmltodict
```

### openSUSE/SLE (SLE 15, Leap 15, Tumbleweed)

There is an [official openSUSE package for xmltodict](https://software.opensuse.org/package/python-xmltodict).

```sh
# Python2
$ zypper in python2-xmltodict

# Python3
$ zypper in python3-xmltodict
```



            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/martinblech/xmltodict",
    "name": "xmltodict",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.4",
    "maintainer_email": "",
    "keywords": "",
    "author": "Martin Blech",
    "author_email": "martinblech@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/39/0d/40df5be1e684bbaecdb9d1e0e40d5d482465de6b00cbb92b84ee5d243c7f/xmltodict-0.13.0.tar.gz",
    "platform": "all",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Makes working with XML feel like you are working with JSON",
    "version": "0.13.0",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "1e71055cc8b757877fe2469906d1cf45",
                "sha256": "aa89e8fd76320154a40d19a0df04a4695fb9dc5ba977cbb68ab3e4eb225e7852"
            },
            "downloads": -1,
            "filename": "xmltodict-0.13.0-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "1e71055cc8b757877fe2469906d1cf45",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": ">=3.4",
            "size": 9971,
            "upload_time": "2022-05-08T07:00:02",
            "upload_time_iso_8601": "2022-05-08T07:00:02.898847Z",
            "url": "https://files.pythonhosted.org/packages/94/db/fd0326e331726f07ff7f40675cd86aa804bfd2e5016c727fa761c934990e/xmltodict-0.13.0-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "1ece0a5bbd494bac414058405606475e",
                "sha256": "341595a488e3e01a85a9d8911d8912fd922ede5fecc4dce437eb4b6c8d037e56"
            },
            "downloads": -1,
            "filename": "xmltodict-0.13.0.tar.gz",
            "has_sig": false,
            "md5_digest": "1ece0a5bbd494bac414058405606475e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.4",
            "size": 33813,
            "upload_time": "2022-05-08T07:00:04",
            "upload_time_iso_8601": "2022-05-08T07:00:04.916866Z",
            "url": "https://files.pythonhosted.org/packages/39/0d/40df5be1e684bbaecdb9d1e0e40d5d482465de6b00cbb92b84ee5d243c7f/xmltodict-0.13.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-05-08 07:00:04",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "martinblech",
    "github_project": "xmltodict",
    "travis_ci": true,
    "coveralls": false,
    "github_actions": false,
    "tox": true,
    "lcname": "xmltodict"
}
        