# NewsML-G2 - Python implementation of the NewsML-G2 standard
NewsML-G2 is an open standard created by the International Press
Telecommunications Council to share news content. See http://www.newsml-g2.org/
This module is a partial implementation of the standard in Python. Currently it
reads itemMeta and contentMeta blocks, catalogs and metadata objects from
NewsML-G2 XML files and outputs Python objects.
Currently built for Python 3 only - please let us know if you require Python 2
support.
## Installation
Installing from PyPI:

    pip install newsmlg2
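
To confirm which release ended up in the current environment, a small check along these lines may help. It is only a sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is made up for this example and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="newsmlg2"):
    """Return the installed version of a distribution, or None if absent.

    `installed_version` is a hypothetical helper for this README, not an
    API provided by newsmlg2 itself.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None
```

Calling `installed_version()` returns a version string such as the one shown on PyPI when the package is installed, and `None` otherwise.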
## Reading NewsML-G2 files
Example:
```
import NewsMLG2
# load NewsML-G2 from a file and print the parsed version
g2doc = NewsMLG2.NewsMLG2Document(filename="test-newsmlg2-file.xml")
print(g2doc.get_item())
# load NewsML-G2 from a string
g2doc = NewsMLG2.NewsMLG2Document(
b"""<?xml version="1.0" encoding="UTF-8"?>
<newsItem
    xmlns="http://iptc.org/std/nar/2006-10-01/"
    guid="simplest-test"
    standard="NewsML-G2"
    standardversion="2.34"
    conformance="power"
    version="1"
    xml:lang="en-GB">
    <catalogRef href="http://www.iptc.org/std/catalog/catalog.IPTC-G2-Standards_38.xml" />
    <itemMeta>
        <itemClass qcode="ninat:text" />
        <provider qcode="nprov:IPTC" />
        <versionCreated>2020-06-22T12:00:00+03:00</versionCreated>
    </itemMeta>
    <contentSet>
        <inlineXML contenttype="application/nitf+xml">
        </inlineXML>
    </contentSet>
</newsItem>
""")
# get the newsItem from the parsed object
newsitem = g2doc.get_item()
# test various elements and attributes using our shortcut dot syntax
assert newsitem.guid == 'simplest-test'
assert newsitem.standard == 'NewsML-G2'
assert newsitem.standardversion == '2.34'
assert newsitem.conformance == 'power'
itemmeta = newsitem.itemmeta
# you can work with either qcodes or URIs: the qcode_to_uri() helper converts
# between them using the catalog declared in the NewsML-G2 file
assert itemmeta.itemclass.qcode == 'ninat:text'
assert NewsMLG2.qcode_to_uri(itemmeta.itemclass.qcode) == 'http://cv.iptc.org/newscodes/ninature/text'
assert itemmeta.provider.qcode == 'nprov:IPTC'
assert NewsMLG2.qcode_to_uri(itemmeta.provider.qcode) == 'http://cv.iptc.org/newscodes/newsprovider/IPTC'
# Elements that contain a simple text string can be read with str(class)
assert str(itemmeta.versioncreated) == '2020-06-22T12:00:00+03:00'
# etc...
```
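
The conversion between QCodes and URIs follows the NewsML-G2 rule that a QCode is a scheme alias plus a code, and the URI is the scheme's URI followed by that code. The sketch below illustrates the idea with a plain dictionary. It is not the library's implementation: `EXAMPLE_CATALOG`, `sketch_qcode_to_uri` and `sketch_uri_to_qcode` are names invented for this illustration, and the two scheme URIs are simply the ones implied by the assertions in the example above.

```python
# Illustrative alias table. A real catalog is loaded from the document's
# <catalogRef>; these two entries mirror the URIs asserted in the example.
EXAMPLE_CATALOG = {
    "ninat": "http://cv.iptc.org/newscodes/ninature/",
    "nprov": "http://cv.iptc.org/newscodes/newsprovider/",
}


def sketch_qcode_to_uri(qcode, catalog=EXAMPLE_CATALOG):
    """Expand 'alias:code' into the scheme URI followed by the code."""
    alias, sep, code = qcode.partition(":")
    if not sep or alias not in catalog:
        raise ValueError(f"cannot resolve qcode {qcode!r}")
    return catalog[alias] + code


def sketch_uri_to_qcode(uri, catalog=EXAMPLE_CATALOG):
    """Find the scheme whose URI prefixes `uri` and return 'alias:code'."""
    for alias, scheme_uri in catalog.items():
        if uri.startswith(scheme_uri):
            return f"{alias}:{uri[len(scheme_uri):]}"
    raise ValueError(f"no scheme in the catalog matches {uri!r}")
```

In real code, prefer the package's own `qcode_to_uri()` and `uri_to_qcode()` helpers, which work from the catalogs shipped with or referenced by the document.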
## Creating NewsML-G2 files in Python
There are a few points to note when creating NewsML-G2 directly in Python code (as opposed to
parsing a string containing XML).
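* Elements with multiple values (such as multiple `<broader>` elements) must be created
individually and then added to their parent through array assignment. So you should create
each element as its own object and then assign a list of those objects to the parent
attribute, as is done for `contentmeta.located.broader` in the example.

Example: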
```
g2doc = NewsMLG2.NewsMLG2Document()
newsitem = NewsMLG2.NewsItem()
newsitem.guid = 'test-guid'
newsitem.xml_lang = 'en-GB'
itemmeta = NewsMLG2.ItemMeta()
itemmeta.itemclass.qcode = "ninat:text"
itemmeta.provider.qcode = "nprov:IPTC"
itemmeta.versioncreated = "2020-06-22T12:00:00+03:00"
newsitem.itemmeta = itemmeta
contentmeta = NewsMLG2.NewsItemContentMeta()
contentmeta.contentcreated = '2008-11-05T19:04:00-08:00'
located = NewsMLG2.Located()
located.type = 'cptype:city'
located.qcode = 'city:345678'
located.name = 'Berlin'
contentmeta.located = located
digsrctype = NewsMLG2.DigitalSourceType()
digsrctype.uri = 'http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia'
contentmeta.digitalsourcetype = digsrctype
broader1 = NewsMLG2.Broader()
broader1.type = 'cptype:statprov'
broader1.qcode = 'state:2365'
broader1.name = 'Berlin'
broader2 = NewsMLG2.Broader()
broader2.type = 'cptype:country'
broader2.qcode = 'iso3166-1a2:DE'
broader2.name = 'Germany'
contentmeta.located.broader = [broader1, broader2]
creator = NewsMLG2.Creator()
creator.qcode = 'codesource:DEZDF'
creator.name = 'Zweites Deutsches Fernsehen'
# This implements
# contentmeta.creator.organisationdetails.location.name = 'MAINZ'
# we have to make each item separately.
orgdetails = NewsMLG2.OrganisationDetails()
orglocation = NewsMLG2.OrganisationLocation()
orglocation.name = 'MAINZ'
orgdetails.location = orglocation
creator.organisationdetails = orgdetails
contentmeta.creator = creator
newsitem.contentmeta = contentmeta
g2doc.set_item(newsitem)
# read the item back from the document and check its attributes
output_newsitem = g2doc.get_item()
assert output_newsitem.guid == 'test-guid'
assert output_newsitem.standard == 'NewsML-G2'
assert output_newsitem.standardversion == '2.34'
assert output_newsitem.conformance == 'power'
assert output_newsitem.version == '1'
assert output_newsitem.xml_lang == 'en-GB'
output_xml = g2doc.to_xml()
assert output_xml == (
"<?xml version='1.0' encoding='utf-8'?>\n"
'<newsItem xmlns="http://iptc.org/std/nar/2006-10-01/" xmlns:nitf="http://iptc.org/std/NITF/2006-10-18/" xml:lang="en-GB" standard="NewsML-G2" standardversion="2.34" conformance="power" guid="test-guid" version="1">\n'
' <itemMeta>\n'
' <itemClass qcode="ninat:text"/>\n'
' <provider qcode="nprov:IPTC"/>\n'
' <versionCreated>2020-06-22T12:00:00+03:00</versionCreated>\n'
' </itemMeta>\n'
'</newsItem>\n')
```
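
Comparing the serialised output as an exact string is sensitive to attribute order and whitespace. As an alternative, the generated XML can be parsed again and checked element by element. The sketch below does this with the standard library's `xml.etree.ElementTree` rather than with this package. `summarise_newsitem` and `EXAMPLE_XML` are names made up for the illustration, and the XML is the expected output from the assertion above.

```python
import xml.etree.ElementTree as ET

# NewsML-G2 namespace, taken from the xmlns attribute in the example output.
NS = {"nar": "http://iptc.org/std/nar/2006-10-01/"}
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

EXAMPLE_XML = (
    "<?xml version='1.0' encoding='utf-8'?>\n"
    '<newsItem xmlns="http://iptc.org/std/nar/2006-10-01/" '
    'xmlns:nitf="http://iptc.org/std/NITF/2006-10-18/" xml:lang="en-GB" '
    'standard="NewsML-G2" standardversion="2.34" conformance="power" '
    'guid="test-guid" version="1">\n'
    ' <itemMeta>\n'
    ' <itemClass qcode="ninat:text"/>\n'
    ' <provider qcode="nprov:IPTC"/>\n'
    ' <versionCreated>2020-06-22T12:00:00+03:00</versionCreated>\n'
    ' </itemMeta>\n'
    '</newsItem>\n'
)


def summarise_newsitem(xml_text):
    """Parse a newsItem and return a few attributes and itemMeta values."""
    # Encode str input so the XML declaration's encoding is always honoured.
    data = xml_text.encode("utf-8") if isinstance(xml_text, str) else xml_text
    root = ET.fromstring(data)
    item_meta = root.find("nar:itemMeta", NS)
    return {
        "guid": root.get("guid"),
        "lang": root.get(XML_LANG),
        "itemclass": item_meta.find("nar:itemClass", NS).get("qcode"),
        "provider": item_meta.find("nar:provider", NS).get("qcode"),
        "versioncreated": item_meta.find("nar:versionCreated", NS).text,
    }
```

Passing the result of `g2doc.to_xml()` to `summarise_newsitem()` should give the same dictionary as passing `EXAMPLE_XML`, assuming the output matches the assertion above.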
## Testing
A unit test library is included.

Run it with:

    pytest

Test coverage can be measured with the `coverage.py` package:

    pip install coverage
    coverage run --source NewsMLG2 -m pytest
    coverage report
## Release notes
* 0.1 - First release, pinned to Python 3 only (use pip >9.0 to ensure pip's
Python version requirement works properly)
* 0.2 - Can now read and write NewsML-G2 from code - NewsMessage and PlanningItem
not yet implemented. Probably quite a few bugs.
* 0.3 - Changed from automatically converting between URIs and QCodes to providing
helper functions `uri_to_qcode()` and `qcode_to_uri()`
* 0.4 - Added catalog v37 and v38. Added PlanningItem support. Fixed bugs. Improved
magic function support to help hasattr() and more on NewsML-G2 objects.
* 0.5 - Now has 100% unit test coverage. Fixed more bugs. Implemented changes up to
NewsML-G2 schema version v2.32.
* 0.6 - Implemented NewsMessage and Events (EventsML-G2). Adding arrays using code
(as opposed to parsing an XML string/file) now works. Almost ready to go to 1.0.
Raw data
{
"_id": null,
"home_page": "https://github.com/iptc/python-newsmlg2",
"name": "newsmlg2",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3",
"maintainer_email": null,
"keywords": "api, media, publishing, news, syndication",
"author": "Brendan Quinn",
"author_email": "office@iptc.org",
"download_url": "https://files.pythonhosted.org/packages/a1/4b/432da9d83ce63c8771e9763d5ced4f6e27f952cf31806603ffbbd99b6e8a/newsmlg2-0.6.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Python implementation of the NewsML-G2 standard (https://iptc.org/standards/newsml-g2/)",
"version": "0.6",
"project_urls": {
"Download": "https://github.com/iptc/python-newsmlg2/archive/v0.6.tar.gz",
"Homepage": "https://github.com/iptc/python-newsmlg2"
},
"split_keywords": [
"api",
" media",
" publishing",
" news",
" syndication"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "a14b432da9d83ce63c8771e9763d5ced4f6e27f952cf31806603ffbbd99b6e8a",
"md5": "d17201745c901857a2baf9faedfb2114",
"sha256": "1102de98e6076855069225ee53997b1e47ec7b97269a75e70f7e114fb6788c0f"
},
"downloads": -1,
"filename": "newsmlg2-0.6.tar.gz",
"has_sig": false,
"md5_digest": "d17201745c901857a2baf9faedfb2114",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3",
"size": 60170,
"upload_time": "2024-05-24T14:30:58",
"upload_time_iso_8601": "2024-05-24T14:30:58.165257Z",
"url": "https://files.pythonhosted.org/packages/a1/4b/432da9d83ce63c8771e9763d5ced4f6e27f952cf31806603ffbbd99b6e8a/newsmlg2-0.6.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-24 14:30:58",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "iptc",
"github_project": "python-newsmlg2",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "newsmlg2"
}