# xmltodict
`xmltodict` is a Python module that makes working with XML feel like you are working with [JSON](http://docs.python.org/library/json.html), as in this ["spec"](http://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html):
[![Build Status](https://app.travis-ci.com/martinblech/xmltodict.svg?branch=master)](https://app.travis-ci.com/martinblech/xmltodict)
```python
>>> print(json.dumps(xmltodict.parse("""
... <mydocument has="an attribute">
... <and>
... <many>elements</many>
... <many>more elements</many>
... </and>
... <plus a="complex">
... element as well
... </plus>
... </mydocument>
... """), indent=4))
{
"mydocument": {
"@has": "an attribute",
"and": {
"many": [
"elements",
"more elements"
]
},
"plus": {
"@a": "complex",
"#text": "element as well"
}
}
}
```
## Namespace support
By default, `xmltodict` does no XML namespace processing (it just treats namespace declarations as regular node attributes), but passing `process_namespaces=True` will make it expand namespaces for you:
```python
>>> xml = """
... <root xmlns="http://defaultns.com/"
... xmlns:a="http://a.com/"
... xmlns:b="http://b.com/">
... <x>1</x>
... <a:y>2</a:y>
... <b:z>3</b:z>
... </root>
... """
>>> xmltodict.parse(xml, process_namespaces=True) == {
... 'http://defaultns.com/:root': {
... 'http://defaultns.com/:x': '1',
... 'http://a.com/:y': '2',
... 'http://b.com/:z': '3',
... }
... }
True
```
It also lets you collapse certain namespaces to shorthand prefixes, or skip them altogether:
```python
>>> namespaces = {
... 'http://defaultns.com/': None, # skip this namespace
... 'http://a.com/': 'ns_a', # collapse "http://a.com/" -> "ns_a"
... }
>>> xmltodict.parse(xml, process_namespaces=True, namespaces=namespaces) == {
... 'root': {
... 'x': '1',
... 'ns_a:y': '2',
... 'http://b.com/:z': '3',
... },
... }
True
```
## Streaming mode
`xmltodict` is very fast ([Expat](http://docs.python.org/library/pyexpat.html)-based) and has a streaming mode with a small memory footprint, suitable for big XML dumps like [Discogs](http://discogs.com/data/) or [Wikipedia](http://dumps.wikimedia.org/):
```python
>>> def handle_artist(_, artist):
... print(artist['name'])
... return True
>>>
>>> xmltodict.parse(GzipFile('discogs_artists.xml.gz'),
... item_depth=2, item_callback=handle_artist)
A Perfect Circle
Fantômas
King Crimson
Chris Potter
...
```
It can also be used from the command line to pipe objects to a script like this:
```python
import sys, marshal
while True:
_, article = marshal.load(sys.stdin)
print(article['title'])
```
```sh
$ bunzip2 enwiki-pages-articles.xml.bz2 | xmltodict.py 2 | myscript.py
AccessibleComputing
Anarchism
AfghanistanHistory
AfghanistanGeography
AfghanistanPeople
AfghanistanCommunications
Autism
...
```
Or just cache the dicts so you don't have to parse that big XML file again. You do this only once:
```sh
$ bunzip2 enwiki-pages-articles.xml.bz2 | xmltodict.py 2 | gzip > enwiki.dicts.gz
```
And you reuse the dicts with every script that needs them:
```sh
$ gunzip enwiki.dicts.gz | script1.py
$ gunzip enwiki.dicts.gz | script2.py
...
```
## Roundtripping
You can also convert in the other direction, using the `unparse()` method:
```python
>>> mydict = {
... 'response': {
... 'status': 'good',
... 'last_updated': '2014-02-16T23:10:12Z',
... }
... }
>>> print(unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<response>
<status>good</status>
<last_updated>2014-02-16T23:10:12Z</last_updated>
</response>
```
Text values for nodes can be specified with the `cdata_key` key in the python dict, while node properties can be specified with the `attr_prefix` prefixed to the key name in the python dict. The default value for `attr_prefix` is `@` and the default value for `cdata_key` is `#text`.
```python
>>> import xmltodict
>>>
>>> mydict = {
... 'text': {
... '@color':'red',
... '@stroke':'2',
... '#text':'This is a test'
... }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<text stroke="2" color="red">This is a test</text>
```
Lists that are specified under a key in a dictionary use the key as a tag for each item. But if a list does have a parent key, for example if a list exists inside another list, it does not have a tag to use and the items are converted to a string as shown in the example below. To give tags to nested lists, use the `expand_iter` keyword argument to provide a tag as demonstrated below. Note that using `expand_iter` will break roundtripping.
```python
>>> mydict = {
... "line": {
... "points": [
... [1, 5],
... [2, 6],
... ]
... }
... }
>>> print(xmltodict.unparse(mydict, pretty=True))
<?xml version="1.0" encoding="utf-8"?>
<line>
<points>[1, 5]</points>
<points>[2, 6]</points>
</line>
>>> print(xmltodict.unparse(mydict, pretty=True, expand_iter="coord"))
<?xml version="1.0" encoding="utf-8"?>
<line>
<points>
<coord>1</coord>
<coord>5</coord>
</points>
<points>
<coord>2</coord>
<coord>6</coord>
</points>
</line>
```
## Ok, how do I get it?
### Using pypi
You just need to
```sh
$ pip install xmltodict
```
### Using conda
For installing `xmltodict` using Anaconda/Miniconda (*conda*) from the
[conda-forge channel][#xmltodict-conda] all you need to do is:
[#xmltodict-conda]: https://anaconda.org/conda-forge/xmltodict
```sh
$ conda install -c conda-forge xmltodict
```
### RPM-based distro (Fedora, RHEL, …)
There is an [official Fedora package for xmltodict](https://apps.fedoraproject.org/packages/python-xmltodict).
```sh
$ sudo yum install python-xmltodict
```
### Arch Linux
There is an [official Arch Linux package for xmltodict](https://www.archlinux.org/packages/community/any/python-xmltodict/).
```sh
$ sudo pacman -S python-xmltodict
```
### Debian-based distro (Debian, Ubuntu, …)
There is an [official Debian package for xmltodict](https://tracker.debian.org/pkg/python-xmltodict).
```sh
$ sudo apt install python-xmltodict
```
### FreeBSD
There is an [official FreeBSD port for xmltodict](https://svnweb.freebsd.org/ports/head/devel/py-xmltodict/).
```sh
$ pkg install py36-xmltodict
```
### openSUSE/SLE (SLE 15, Leap 15, Tumbleweed)
There is an [official openSUSE package for xmltodict](https://software.opensuse.org/package/python-xmltodict).
```sh
# Python2
$ zypper in python2-xmltodict
# Python3
$ zypper in python3-xmltodict
```
Raw data
{
"_id": null,
"home_page": "https://github.com/martinblech/xmltodict",
"name": "xmltodict",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": null,
"author": "Martin Blech",
"author_email": "martinblech@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/50/05/51dcca9a9bf5e1bce52582683ce50980bcadbc4fa5143b9f2b19ab99958f/xmltodict-0.14.2.tar.gz",
"platform": "all",
"description": "# xmltodict\n\n`xmltodict` is a Python module that makes working with XML feel like you are working with [JSON](http://docs.python.org/library/json.html), as in this [\"spec\"](http://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html):\n\n[![Build Status](https://app.travis-ci.com/martinblech/xmltodict.svg?branch=master)](https://app.travis-ci.com/martinblech/xmltodict)\n\n```python\n>>> print(json.dumps(xmltodict.parse(\"\"\"\n... <mydocument has=\"an attribute\">\n... <and>\n... <many>elements</many>\n... <many>more elements</many>\n... </and>\n... <plus a=\"complex\">\n... element as well\n... </plus>\n... </mydocument>\n... \"\"\"), indent=4))\n{\n \"mydocument\": {\n \"@has\": \"an attribute\", \n \"and\": {\n \"many\": [\n \"elements\", \n \"more elements\"\n ]\n }, \n \"plus\": {\n \"@a\": \"complex\", \n \"#text\": \"element as well\"\n }\n }\n}\n```\n\n## Namespace support\n\nBy default, `xmltodict` does no XML namespace processing (it just treats namespace declarations as regular node attributes), but passing `process_namespaces=True` will make it expand namespaces for you:\n\n```python\n>>> xml = \"\"\"\n... <root xmlns=\"http://defaultns.com/\"\n... xmlns:a=\"http://a.com/\"\n... xmlns:b=\"http://b.com/\">\n... <x>1</x>\n... <a:y>2</a:y>\n... <b:z>3</b:z>\n... </root>\n... \"\"\"\n>>> xmltodict.parse(xml, process_namespaces=True) == {\n... 'http://defaultns.com/:root': {\n... 'http://defaultns.com/:x': '1',\n... 'http://a.com/:y': '2',\n... 'http://b.com/:z': '3',\n... }\n... }\nTrue\n```\n\nIt also lets you collapse certain namespaces to shorthand prefixes, or skip them altogether:\n\n```python\n>>> namespaces = {\n... 'http://defaultns.com/': None, # skip this namespace\n... 'http://a.com/': 'ns_a', # collapse \"http://a.com/\" -> \"ns_a\"\n... }\n>>> xmltodict.parse(xml, process_namespaces=True, namespaces=namespaces) == {\n... 'root': {\n... 'x': '1',\n... 'ns_a:y': '2',\n... 'http://b.com/:z': '3',\n... },\n... }\nTrue\n```\n\n## Streaming mode\n\n`xmltodict` is very fast ([Expat](http://docs.python.org/library/pyexpat.html)-based) and has a streaming mode with a small memory footprint, suitable for big XML dumps like [Discogs](http://discogs.com/data/) or [Wikipedia](http://dumps.wikimedia.org/):\n\n```python\n>>> def handle_artist(_, artist):\n... print(artist['name'])\n... return True\n>>> \n>>> xmltodict.parse(GzipFile('discogs_artists.xml.gz'),\n... item_depth=2, item_callback=handle_artist)\nA Perfect Circle\nFant\u00f4mas\nKing Crimson\nChris Potter\n...\n```\n\nIt can also be used from the command line to pipe objects to a script like this:\n\n```python\nimport sys, marshal\nwhile True:\n _, article = marshal.load(sys.stdin)\n print(article['title'])\n```\n\n```sh\n$ bunzip2 enwiki-pages-articles.xml.bz2 | xmltodict.py 2 | myscript.py\nAccessibleComputing\nAnarchism\nAfghanistanHistory\nAfghanistanGeography\nAfghanistanPeople\nAfghanistanCommunications\nAutism\n...\n```\n\nOr just cache the dicts so you don't have to parse that big XML file again. You do this only once:\n\n```sh\n$ bunzip2 enwiki-pages-articles.xml.bz2 | xmltodict.py 2 | gzip > enwiki.dicts.gz\n```\n\nAnd you reuse the dicts with every script that needs them:\n\n```sh\n$ gunzip enwiki.dicts.gz | script1.py\n$ gunzip enwiki.dicts.gz | script2.py\n...\n```\n\n## Roundtripping\n\nYou can also convert in the other direction, using the `unparse()` method:\n\n```python\n>>> mydict = {\n... 'response': {\n... 'status': 'good',\n... 'last_updated': '2014-02-16T23:10:12Z',\n... }\n... }\n>>> print(unparse(mydict, pretty=True))\n<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<response>\n\t<status>good</status>\n\t<last_updated>2014-02-16T23:10:12Z</last_updated>\n</response>\n```\n\nText values for nodes can be specified with the `cdata_key` key in the python dict, while node properties can be specified with the `attr_prefix` prefixed to the key name in the python dict. The default value for `attr_prefix` is `@` and the default value for `cdata_key` is `#text`.\n\n```python\n>>> import xmltodict\n>>> \n>>> mydict = {\n... 'text': {\n... '@color':'red',\n... '@stroke':'2',\n... '#text':'This is a test'\n... }\n... }\n>>> print(xmltodict.unparse(mydict, pretty=True))\n<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<text stroke=\"2\" color=\"red\">This is a test</text>\n```\n\nLists that are specified under a key in a dictionary use the key as a tag for each item. But if a list does have a parent key, for example if a list exists inside another list, it does not have a tag to use and the items are converted to a string as shown in the example below. To give tags to nested lists, use the `expand_iter` keyword argument to provide a tag as demonstrated below. Note that using `expand_iter` will break roundtripping.\n\n```python\n>>> mydict = {\n... \"line\": {\n... \"points\": [\n... [1, 5],\n... [2, 6],\n... ]\n... }\n... }\n>>> print(xmltodict.unparse(mydict, pretty=True))\n<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<line>\n <points>[1, 5]</points>\n <points>[2, 6]</points>\n</line>\n>>> print(xmltodict.unparse(mydict, pretty=True, expand_iter=\"coord\"))\n<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<line>\n <points>\n <coord>1</coord>\n <coord>5</coord>\n </points>\n <points>\n <coord>2</coord>\n <coord>6</coord>\n </points>\n</line>\n```\n\n## Ok, how do I get it?\n\n### Using pypi\n\nYou just need to\n\n```sh\n$ pip install xmltodict\n```\n\n### Using conda\n\nFor installing `xmltodict` using Anaconda/Miniconda (*conda*) from the \n[conda-forge channel][#xmltodict-conda] all you need to do is:\n\n[#xmltodict-conda]: https://anaconda.org/conda-forge/xmltodict\n\n```sh\n$ conda install -c conda-forge xmltodict\n```\n\n### RPM-based distro (Fedora, RHEL, \u2026)\n\nThere is an [official Fedora package for xmltodict](https://apps.fedoraproject.org/packages/python-xmltodict).\n\n```sh\n$ sudo yum install python-xmltodict\n```\n\n### Arch Linux\n\nThere is an [official Arch Linux package for xmltodict](https://www.archlinux.org/packages/community/any/python-xmltodict/).\n\n```sh\n$ sudo pacman -S python-xmltodict\n```\n\n### Debian-based distro (Debian, Ubuntu, \u2026)\n\nThere is an [official Debian package for xmltodict](https://tracker.debian.org/pkg/python-xmltodict).\n\n```sh\n$ sudo apt install python-xmltodict\n```\n\n### FreeBSD\n\nThere is an [official FreeBSD port for xmltodict](https://svnweb.freebsd.org/ports/head/devel/py-xmltodict/).\n\n```sh\n$ pkg install py36-xmltodict\n```\n\n### openSUSE/SLE (SLE 15, Leap 15, Tumbleweed)\n\nThere is an [official openSUSE package for xmltodict](https://software.opensuse.org/package/python-xmltodict).\n\n```sh\n# Python2\n$ zypper in python2-xmltodict\n\n# Python3\n$ zypper in python3-xmltodict\n```\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Makes working with XML feel like you are working with JSON",
"version": "0.14.2",
"project_urls": {
"Homepage": "https://github.com/martinblech/xmltodict"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "d645fc303eb433e8a2a271739c98e953728422fa61a3c1f36077a49e395c972e",
"md5": "f745d92f448a40001945254dc818118e",
"sha256": "20cc7d723ed729276e808f26fb6b3599f786cbc37e06c65e192ba77c40f20aac"
},
"downloads": -1,
"filename": "xmltodict-0.14.2-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "f745d92f448a40001945254dc818118e",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": ">=3.6",
"size": 9981,
"upload_time": "2024-10-16T06:10:27",
"upload_time_iso_8601": "2024-10-16T06:10:27.649111Z",
"url": "https://files.pythonhosted.org/packages/d6/45/fc303eb433e8a2a271739c98e953728422fa61a3c1f36077a49e395c972e/xmltodict-0.14.2-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "500551dcca9a9bf5e1bce52582683ce50980bcadbc4fa5143b9f2b19ab99958f",
"md5": "6e0d94bf858b3c2ff3daeed487eedc2a",
"sha256": "201e7c28bb210e374999d1dde6382923ab0ed1a8a5faeece48ab525b7810a553"
},
"downloads": -1,
"filename": "xmltodict-0.14.2.tar.gz",
"has_sig": false,
"md5_digest": "6e0d94bf858b3c2ff3daeed487eedc2a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 51942,
"upload_time": "2024-10-16T06:10:29",
"upload_time_iso_8601": "2024-10-16T06:10:29.683651Z",
"url": "https://files.pythonhosted.org/packages/50/05/51dcca9a9bf5e1bce52582683ce50980bcadbc4fa5143b9f2b19ab99958f/xmltodict-0.14.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-16 06:10:29",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "martinblech",
"github_project": "xmltodict",
"travis_ci": true,
"coveralls": false,
"github_actions": true,
"tox": true,
"lcname": "xmltodict"
}