* Name: newsmlg2
* Version: 0.6
* Home page: https://github.com/iptc/python-newsmlg2
* Summary: Python implementation of the NewsML-G2 standard (https://iptc.org/standards/newsml-g2/)
* Upload time: 2024-05-24 14:30:58
* Author: Brendan Quinn
* Requires Python: >=3
* Keywords: api, media, publishing, news, syndication

# NewsML-G2 - Python implementation of the NewsML-G2 standard

NewsML-G2 is an open standard created by the International Press
Telecommunications Council to share news content. See http://www.newsml-g2.org/

This module is a partial implementation of the standard in Python. Currently it
reads itemMeta and contentMeta blocks, catalogs and metadata objects from
NewsML-G2 XML files and outputs Python objects.

Currently built for Python 3 only - please let us know if you require Python 2
support.

## Installation

Installing from PyPI:

    pip install newsmlg2

## Reading NewsML-G2 files

Example:

```
import NewsMLG2

# load NewsML-G2 from a file and print the parsed version
g2doc = NewsMLG2.NewsMLG2Document(filename="test-newsmlg2-file.xml")
print(g2doc.get_item())

# load NewsML-G2 from a string
g2doc = NewsMLG2.NewsMLG2Document(
b"""<?xml version="1.0" encoding="UTF-8"?>
<newsItem
    xmlns="http://iptc.org/std/nar/2006-10-01/"
    guid="simplest-test"
    standard="NewsML-G2"
    standardversion="2.34"
    conformance="power"
    version="1"
    xml:lang="en-GB">
    <catalogRef href="http://www.iptc.org/std/catalog/catalog.IPTC-G2-Standards_38.xml" />
    <itemMeta>
        <itemClass qcode="ninat:text" />
        <provider qcode="nprov:IPTC" />
        <versionCreated>2020-06-22T12:00:00+03:00</versionCreated>
    </itemMeta>
    <contentSet>
        <inlineXML contenttype="application/nitf+xml">
        </inlineXML>
    </contentSet>
</newsItem>
""")

# get the newsItem from the parsed object
newsitem = g2doc.get_item()
# test various elements and attributes using our shortcut dot syntax
assert newsitem.guid == 'simplest-test'
assert newsitem.standard == 'NewsML-G2'
assert newsitem.standardversion == '2.34'
assert newsitem.conformance == 'power'

itemmeta = newsitem.itemmeta
# you can choose whether to use qcodes or URIs, we do the conversion for you
# using the catalog declared in the NewsML-G2 file
assert itemmeta.itemclass.qcode == 'ninat:text'
assert NewsMLG2.qcode_to_uri(itemmeta.itemclass.qcode) == 'http://cv.iptc.org/newscodes/ninature/text'
assert itemmeta.provider.qcode == 'nprov:IPTC'
assert NewsMLG2.qcode_to_uri(itemmeta.provider.qcode) == 'http://cv.iptc.org/newscodes/newsprovider/IPTC'
# Elements that contain a simple text string can be read with str(class)
assert str(itemmeta.versioncreated) == '2020-06-22T12:00:00+03:00'

# etc...
```
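
To illustrate what the QCode-to-URI conversion in the example amounts to, here is a minimal standalone sketch using only plain Python. It is not the library's implementation: the `CATALOG` dictionary and the `expand_qcode()` / `compact_uri()` helpers are hypothetical names, and the two scheme URIs are simply the ones implied by the assertions above. In a real document the alias-to-URI mapping would come from the catalog referenced by `<catalogRef>`.

```python
# Hypothetical stand-in for a parsed catalog: scheme alias -> scheme URI.
# The two entries mirror the aliases used in the example above.
CATALOG = {
    "ninat": "http://cv.iptc.org/newscodes/ninature/",
    "nprov": "http://cv.iptc.org/newscodes/newsprovider/",
}


def expand_qcode(qcode, catalog=CATALOG):
    """Turn 'alias:code' into a full URI using the catalog."""
    alias, sep, code = qcode.partition(":")
    if not sep or alias not in catalog:
        raise ValueError(f"unknown scheme alias in qcode: {qcode!r}")
    return catalog[alias] + code


def compact_uri(uri, catalog=CATALOG):
    """Turn a full URI back into 'alias:code' if a scheme URI matches."""
    for alias, scheme_uri in catalog.items():
        if uri.startswith(scheme_uri):
            return f"{alias}:{uri[len(scheme_uri):]}"
    raise ValueError(f"no catalog scheme matches URI: {uri!r}")


print(expand_qcode("ninat:text"))
print(compact_uri("http://cv.iptc.org/newscodes/newsprovider/IPTC"))
```

The sketch raises `ValueError` for an alias or URI that the catalog does not cover; whether the library behaves the same way is not something this sketch asserts.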

## Creating NewsML-G2 files in Python

There are a few points to note when creating NewsML-G2 directly in Python code (as opposed to
parsing a string containing XML).

* Elements with multiple values (such as multiple `<broader>` elements) must be created
individually and then added to their parent through array assignment. So you should create
each object separately and assign them together as a list, as in
`contentmeta.located.broader = [broader1, broader2]` in the example.
Example:
```
import NewsMLG2

g2doc = NewsMLG2.NewsMLG2Document()
newsitem = NewsMLG2.NewsItem()
newsitem.guid = 'test-guid'
newsitem.xml_lang = 'en-GB'
itemmeta = NewsMLG2.ItemMeta()
itemmeta.itemclass.qcode = "ninat:text"
itemmeta.provider.qcode = "nprov:IPTC"
itemmeta.versioncreated = "2020-06-22T12:00:00+03:00"
newsitem.itemmeta = itemmeta
contentmeta = NewsMLG2.NewsItemContentMeta()
contentmeta.contentcreated = '2008-11-05T19:04:00-08:00'
located = NewsMLG2.Located()
located.type = 'cptype:city'
located.qcode = 'city:345678'
located.name = 'Berlin'
contentmeta.located = located
digsrctype = NewsMLG2.DigitalSourceType()
digsrctype.uri = 'http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia'
contentmeta.digitalsourcetype = digsrctype
broader1 = NewsMLG2.Broader()
broader1.type = 'cptype:statprov'
broader1.qcode = 'state:2365'
broader1.name = 'Berlin'
broader2 = NewsMLG2.Broader()
broader2.type = 'cptype:country'
broader2.qcode = 'iso3166-1a2:DE'
broader2.name = 'Germany'
contentmeta.located.broader = [broader1, broader2]
creator = NewsMLG2.Creator()
creator.qcode = 'codesource:DEZDF'
creator.name = 'Zweites Deutsches Fernsehen'
# This implements
# contentmeta.creator.organisationdetails.location.name = 'MAINZ'
# we have to make each item separately.
orgdetails = NewsMLG2.OrganisationDetails()
orglocation = NewsMLG2.OrganisationLocation()
orglocation.name = 'MAINZ'
orgdetails.location = orglocation
creator.organisationdetails = orgdetails
contentmeta.creator = creator
newsitem.contentmeta = contentmeta
g2doc.set_item(newsitem)

output_newsitem = g2doc.get_item()
assert output_newsitem.guid == 'test-guid'
assert output_newsitem.standard == 'NewsML-G2'
assert output_newsitem.standardversion == '2.34'
assert output_newsitem.conformance == 'power'
assert output_newsitem.version == '1'
assert output_newsitem.xml_lang == 'en-GB'

output_xml = g2doc.to_xml()
assert output_xml == (
    "<?xml version='1.0' encoding='utf-8'?>\n"
    '<newsItem xmlns="http://iptc.org/std/nar/2006-10-01/" xmlns:nitf="http://iptc.org/std/NITF/2006-10-18/" xml:lang="en-GB" standard="NewsML-G2" standardversion="2.34" conformance="power" guid="test-guid" version="1">\n'
    '  <itemMeta>\n'
    '    <itemClass qcode="ninat:text"/>\n'
    '    <provider qcode="nprov:IPTC"/>\n'
    '    <versionCreated>2020-06-22T12:00:00+03:00</versionCreated>\n'
    '  </itemMeta>\n'
    '</newsItem>\n')
```
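
As a rough picture of what the list assignment for multi-valued elements corresponds to in XML, the following sketch builds a `<located>` element with two `<broader>` siblings using only the standard library's `xml.etree.ElementTree`. It does not use this module at all: the `g2()` and `add_concept()` helpers are hypothetical, and the exact serialisation this library produces may differ.

```python
import xml.etree.ElementTree as ET

NS = "http://iptc.org/std/nar/2006-10-01/"  # NewsML-G2 namespace from the examples
ET.register_namespace("", NS)  # serialise it as the default namespace


def g2(tag):
    """Qualify a tag name with the NewsML-G2 namespace."""
    return f"{{{NS}}}{tag}"


def add_concept(parent, tag, name, **attrs):
    """Append one child element with a <name> sub-element (hypothetical helper)."""
    elem = ET.SubElement(parent, g2(tag), attrs)
    ET.SubElement(elem, g2("name")).text = name
    return elem


content_meta = ET.Element(g2("contentMeta"))
located = add_concept(content_meta, "located", "Berlin",
                      type="cptype:city", qcode="city:345678")
# Two <broader> siblings: the XML counterpart of a two-item list.
add_concept(located, "broader", "Berlin",
            type="cptype:statprov", qcode="state:2365")
add_concept(located, "broader", "Germany",
            type="cptype:country", qcode="iso3166-1a2:DE")

# Reading the repeated element back yields one entry per sibling.
broader_names = [b.find(g2("name")).text for b in located.findall(g2("broader"))]
print(broader_names)
print(ET.tostring(content_meta, encoding="unicode"))
```

Each repeated element is a separate sibling in the XML tree, which is why each one is created as its own object before being assigned as a list.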

## Testing

A unit test suite is included.

Run it with:

    pytest

Test coverage can be measured with the `coverage.py` package:

    pip install coverage
    coverage run --source NewsMLG2 -m pytest 
    coverage report

## Release notes

* 0.1 - First release, pinned to Python 3 only (use pip >9.0 to ensure pip's
Python version requirement works properly)
* 0.2 - Can now read and write NewsML-G2 from code - NewsMessage and PlanningItem
not yet implemented. Probably quite a few bugs.
* 0.3 - Changed from automatically converting between URIs and QCodes to providing
helper functions `uri_to_qcode()` and `qcode_to_uri()`
* 0.4 - Added catalog v37 and v38. Added PlanningItem support. Fixed bugs. Improved
magic function support to help hasattr() and more on NewsML-G2 objects.
* 0.5 - Now has 100% unit test coverage. Fixed more bugs. Implemented changes up to
NewsML-G2 schema version v2.32.
* 0.6 - Implemented NewsMessage and Events (EventsML-G2). Adding arrays using code
(as opposed to parsing an XML string/file) now works. Almost ready to go to 1.0.
            
