:Name: sdmx
:Version: 0.2.10
:Home page: http://github.com/mwilliamson/sdmx.py
:Summary: Read SDMX XML files
:Upload time: 2015-05-05 20:45:52
:Author: Michael Williamson
:License: UNKNOWN
:Keywords: sdmx
:Requirements: No requirements were recorded.

SDMX
====

Read SDMX XML files. I've only added the features I've needed, so this
is far from being a thorough implementation. Contributions welcome.

Installation
------------

``pip install sdmx``

Usage
-----

``sdmx.generic_data_message_reader(fileobj, dsd_fileobj=None, lazy=None)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Given a file-like object representing the XML of a generic data message,
return a data message reader.

``sdmx.compact_data_message_reader(fileobj, dsd_fileobj=None, lazy=None)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Given a file-like object representing the XML of a compact data message,
return a data message reader.
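
If it is not known in advance which of the two readers applies, one option
is to look at the root element of the file first. The sketch below assumes
SDMX-ML 2.0, where the root element of a generic data message is
``GenericData`` and that of a compact data message is ``CompactData``. The
helper ``detect_message_kind`` is hypothetical and not part of ``sdmx``; it
only uses the standard library.

.. code-block:: python

    import io
    import xml.etree.ElementTree as ET

    # Local names of the root elements, assuming SDMX-ML 2.0 messages.
    KINDS = {"GenericData": "generic", "CompactData": "compact"}


    def detect_message_kind(fileobj):
        """Return "generic", "compact" or None, based on the root element."""
        # The first "start" event is always the root element, so only the
        # beginning of the file needs to be parsed.
        for _event, element in ET.iterparse(fileobj, events=("start",)):
            local_name = element.tag.rsplit("}", 1)[-1]  # drop "{namespace}"
            fileobj.seek(0)  # rewind so the file can be passed to a reader
            return KINDS.get(local_name)


    sample = io.BytesIO(
        b'<CompactData xmlns="http://www.SDMX.org/resources/SDMXML/schemas/v2_0/message"/>'
    )
    print(detect_message_kind(sample))  # prints: compact

The result can then be used to choose between
``sdmx.generic_data_message_reader`` and
``sdmx.compact_data_message_reader``.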

Optional arguments for data message readers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* ``dsd_fileobj``: the file-like object representing the XML of the
  relevant DSD. Only used if the data message does not contain a URL to
  the relevant DSD.

* ``lazy``: set to ``True`` to read observations lazily, so that
  datasets can be read without loading the entire dataset into memory.
  Use with caution: lazy reading makes some assumptions about the
  structure of the XML (for instance, that series keys always appear
  before any observations in that series). These assumptions seem to be
  safe on files that I've tested, but that doesn't mean they're
  universally true.
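
To illustrate the kind of assumption that lazy reading relies on, here is a
minimal sketch of streaming observations with the standard library's
``iterparse``. It is not how ``sdmx`` is implemented. The function
``iter_observations`` is hypothetical, and the sample XML is a simplified,
namespace-free stand-in for a generic data message. The sketch fails loudly
if an observation appears before the key of its series.

.. code-block:: python

    import io
    import xml.etree.ElementTree as ET

    # Simplified stand-in for a generic data message (no namespaces).
    SAMPLE = b"""
    <DataSet>
      <Series>
        <SeriesKey><Value concept="COUNTRY" value="UK"/></SeriesKey>
        <Obs><Time>2010</Time><ObsValue value="1.5"/></Obs>
        <Obs><Time>2011</Time><ObsValue value="1.7"/></Obs>
      </Series>
    </DataSet>
    """


    def iter_observations(fileobj):
        """Yield (series_key, time, value) one observation at a time."""
        series_key = None
        for event, element in ET.iterparse(fileobj, events=("start", "end")):
            if event == "start" and element.tag == "Series":
                series_key = None  # a new series begins; its key is not known yet
            elif event == "end" and element.tag == "SeriesKey":
                series_key = dict(
                    (value.get("concept"), value.get("value"))
                    for value in element.iter("Value")
                )
            elif event == "end" and element.tag == "Obs":
                if series_key is None:
                    # The assumption behind streaming does not hold for this file.
                    raise ValueError("observation appeared before its series key")
                time = element.findtext("Time")
                value = element.find("ObsValue").get("value")
                yield series_key, time, value
                element.clear()  # discard the observation's contents once used


    for row in iter_observations(io.BytesIO(SAMPLE)):
        print(row)

A file that places ``<Obs>`` before ``<SeriesKey>`` makes the sketch raise
``ValueError`` rather than silently attach observations to the wrong key.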

Data message readers
~~~~~~~~~~~~~~~~~~~~

Each data message reader has the following attributes:

* ``datasets()``: returns an iterable of ``DatasetReader`` instances.
  Each instance corresponds to a ``<DataSet>`` element.

``DatasetReader``
~~~~~~~~~~~~~~~~~

A ``DatasetReader`` has the following attributes:

* ``key_family()``: returns the ``KeyFamily`` for the dataset. This
  corresponds to the ``<KeyFamilyRef>`` element.

* ``series()``: returns an iterable of ``Series`` instances. Each
  instance corresponds to a ``<Series>`` element.

``KeyFamily``
~~~~~~~~~~~~~

A ``KeyFamily`` has the following attributes:

* ``name(lang)``: the name of the key family in the language ``lang``.

* ``describe_dimensions(lang)``: for each dimension of the key family,
  find the referenced concept and use its name in the language
  ``lang``. Returns a list of strings in the same order as in the
  source file.

``Series``
~~~~~~~~~~

A ``Series`` has the following attributes:

* ``describe_key(lang)``: the key of a series is a mapping from each
  dimension of the dataset to a value. For instance, if the dataset has
  a dimension named ``Country``, the value for the series might be
  ``United Kingdom``. Returns an ordered dictionary mapping strings to
  lists of strings. The items in the dictionary are in the same order
  as the dimensions returned from ``describe_dimensions()``. For
  instance, if the dataset has a single dimension called ``Country``,
  the returned value would be ``{"Country": ["United Kingdom"]}``. All
  ancestors of a value are also described, with ancestors appearing
  before descendants. For instance, if the value ``United Kingdom`` has
  the parent value ``Europe``, which has the parent value ``World``,
  the returned value would be
  ``{"Country": ["World", "Europe", "United Kingdom"]}``.

* ``observations()``: returns an iterable of ``Observation`` instances.
  Each instance corresponds to an ``<Obs>`` element.
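
The ancestor ordering described for ``describe_key(lang)`` can be sketched
as follows. This is only an illustration of the ordering, not the library's
code: the function ``describe_with_ancestors`` and the ``parents`` and
``names`` mappings are hypothetical, standing in for the parent links and
names that a real code list would provide.

.. code-block:: python

    def describe_with_ancestors(code, parents, names):
        """Return the names of ``code`` and its ancestors, outermost first."""
        chain = []
        seen = set()
        while code is not None:
            if code in seen:
                raise ValueError("cycle in code hierarchy at %r" % (code,))
            seen.add(code)
            chain.append(names[code])
            code = parents.get(code)  # None once the top of the hierarchy is reached
        chain.reverse()  # collected child-to-root, reported root-to-child
        return chain


    # Hypothetical hierarchy: United Kingdom -> Europe -> World.
    parents = {"UK": "EUR", "EUR": "WLD"}
    names = {"UK": "United Kingdom", "EUR": "Europe", "WLD": "World"}

    print({"Country": describe_with_ancestors("UK", parents, names)})
    # prints: {'Country': ['World', 'Europe', 'United Kingdom']}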

``Observation``
~~~~~~~~~~~~~~~

An ``Observation`` has the following attributes:

* ``time``
* ``value``

Example
-------

The script below can be used to print out the values contained in a
generic data message. (If you have a compact data message, then using
``compact_data_message_reader`` instead of
``generic_data_message_reader`` should also work.) Assuming the script
is saved as ``read-sdmx-values.py``, it can be used like so:

.. code-block:: sh

    python read-sdmx-values.py path/to/generic-data-message.xml path/to/dsd.xml

.. code-block:: python

    import sys

    import sdmx


    def main():
        dataset_path = sys.argv[1]
        dsd_path = sys.argv[2]

        with open(dataset_path) as dataset_fileobj:
            with open(dsd_path) as dsd_fileobj:
                message_reader = sdmx.generic_data_message_reader(
                    fileobj=dataset_fileobj,
                    dsd_fileobj=dsd_fileobj,
                )
                _print_values(message_reader)


    def _print_values(message_reader):
        for dataset in message_reader.datasets():
            key_family = dataset.key_family()
            # Print the name of the dataset before its rows.
            print(key_family.name(lang="en"))

            dimension_names = key_family.describe_dimensions(lang="en") + ["Time", "Value"]

            for series in dataset.series():
                # Each key value is a list of names: ancestors first, then the value.
                key = series.describe_key(lang="en")
                row_template = list(key.values())

                for observation in series.observations(lang="en"):
                    row = row_template + [observation.time, observation.value]
                    # Pair each column name with its value.
                    print(list(zip(dimension_names, row)))


    if __name__ == "__main__":
        main()
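
As a variation on the script above, the same traversal can write CSV instead
of printing pairs. The sketch below is an assumption-laden illustration:
``write_csv`` is a hypothetical helper, it calls ``observations()`` without
arguments as listed under ``Series``, and it joins each list of ancestor
names with ``" > "``. So that it runs without any SDMX files, it is
exercised here with small stand-in objects that mimic the documented
methods rather than with a real reader.

.. code-block:: python

    import csv
    import io
    from types import SimpleNamespace


    def write_csv(message_reader, out, lang="en"):
        """Write one header row per dataset, then one row per observation."""
        writer = csv.writer(out)
        for dataset in message_reader.datasets():
            key_family = dataset.key_family()
            writer.writerow(key_family.describe_dimensions(lang=lang) + ["Time", "Value"])
            for series in dataset.series():
                key = series.describe_key(lang=lang)
                # Flatten ["World", "Europe", ...] into a single cell per dimension.
                prefix = [" > ".join(path) for path in key.values()]
                for observation in series.observations():
                    writer.writerow(prefix + [observation.time, observation.value])


    # Stand-ins that mimic the documented reader methods (not real sdmx objects).
    observation = SimpleNamespace(time="2010", value="1.5")
    series = SimpleNamespace(
        describe_key=lambda lang: {"Country": ["World", "Europe", "United Kingdom"]},
        observations=lambda: [observation],
    )
    key_family = SimpleNamespace(describe_dimensions=lambda lang: ["Country"])
    dataset = SimpleNamespace(key_family=lambda: key_family, series=lambda: [series])
    message_reader = SimpleNamespace(datasets=lambda: [dataset])

    out = io.StringIO()
    write_csv(message_reader, out)
    print(out.getvalue())

With a real file, ``message_reader`` would instead be the object returned by
``sdmx.generic_data_message_reader`` or ``sdmx.compact_data_message_reader``.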
            
