This module provides regular expressions according to `RFC 3986 "Uniform
Resource Identifier (URI): Generic Syntax"
<http://tools.ietf.org/html/rfc3986>`_ and `RFC 3987 "Internationalized
Resource Identifiers (IRIs)" <http://tools.ietf.org/html/rfc3987>`_, and
utilities for composition and relative resolution of references.
API
---
**match** (string, rule='IRI_reference')
Convenience function for checking if `string` matches a specific rule.
Returns a match object or None::
>>> assert match('%C7X', 'pct_encoded') is None
>>> assert match('%C7', 'pct_encoded')
>>> assert match('%c7', 'pct_encoded')
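As a rough illustration of what the `pct_encoded` rule checks, here is a minimal sketch using only the standard-library ``re`` module. The names `PCT_ENCODED` and `is_pct_encoded` are assumptions made for this example and are not part of this package. The pattern mirrors the RFC 3986 production ``pct-encoded = "%" HEXDIG HEXDIG`` and accepts upper and lower case hex digits.

```python
import re

# Hypothetical stand-alone pattern (not the package's own).
# \Z anchors at the very end so a trailing newline is rejected too.
PCT_ENCODED = re.compile(r'%[0-9A-Fa-f]{2}\Z')


def is_pct_encoded(s):
    """Return True if `s` is exactly one percent-encoded octet."""
    return PCT_ENCODED.match(s) is not None
```

Unlike `match`, this helper returns a plain boolean rather than a match object.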
**parse** (string, rule='IRI_reference')
Parses `string` according to `rule` into a dict of subcomponents.
If `rule` is None, parses an IRI_reference `without validation
<http://tools.ietf.org/html/rfc3986#appendix-B>`_.
If regex_ is available, any rule is supported; with re_, `rule` must be
'IRI_reference' or some special case thereof ('IRI', 'absolute_IRI',
'irelative_ref', 'irelative_part', 'URI_reference', 'URI', 'absolute_URI',
'relative_ref', 'relative_part'). ::
>>> d = parse('http://tools.ietf.org/html/rfc3986#appendix-A',
... rule='URI')
>>> assert all([ d['scheme'] == 'http',
... d['authority'] == 'tools.ietf.org',
... d['path'] == '/html/rfc3986',
... d['query'] is None,
... d['fragment'] == 'appendix-A' ])
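The non-validating mode (`rule` set to None) refers to the regular expression given in RFC 3986, Appendix B. The following is a minimal sketch of that expression with the standard-library ``re`` module, rewritten with named groups. `APPENDIX_B` and `split_reference` are hypothetical names chosen for this example, and the code is not taken from this package.

```python
import re

# RFC 3986, Appendix B, with the numbered groups replaced by named ones.
# Every component is optional, so this splits any string and validates nothing.
APPENDIX_B = re.compile(
    r'^(?:(?P<scheme>[^:/?#]+):)?'
    r'(?://(?P<authority>[^/?#]*))?'
    r'(?P<path>[^?#]*)'
    r'(?:\?(?P<query>[^#]*))?'
    r'(?:#(?P<fragment>.*))?'
)


def split_reference(ref):
    """Split `ref` into its five components; absent ones are None."""
    return APPENDIX_B.match(ref).groupdict()
```

Because nothing is validated, a string such as ``'#f#g'`` is split without complaint, whereas a validating rule would reject it.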
**compose** (\*\*parts)
Returns a URI composed_ from named parts.
.. _composed: http://tools.ietf.org/html/rfc3986#section-5.3
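The recomposition step of RFC 3986, section 5.3, is short enough to sketch directly. The function below follows the pseudocode from that section. Its name `recompose` and its keyword signature are assumptions for illustration, so it may differ from how `compose` is implemented here.

```python
def recompose(scheme=None, authority=None, path='', query=None, fragment=None):
    """Join URI components as described in RFC 3986, section 5.3."""
    result = ''
    if scheme is not None:
        result += scheme + ':'
    if authority is not None:
        result += '//' + authority
    result += path  # the path is always present, possibly empty
    if query is not None:
        result += '?' + query
    if fragment is not None:
        result += '#' + fragment
    return result
```

Testing against None rather than truthiness keeps an empty but defined component, so an empty query still produces a trailing ``?``.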
**resolve** (base, uriref, strict=True, return_parts=False)
Resolves_ a `URI reference` relative to a `base` URI.
`Test cases <http://tools.ietf.org/html/rfc3986#section-5.4>`_::
>>> base = resolve.test_cases_base
>>> for relative, resolved in resolve.test_cases.items():
... assert resolve(base, relative) == resolved
If `return_parts` is True, returns a dict of named parts instead of
a string.
Examples::
>>> assert resolve('urn:rootless', '../../name') == 'urn:name'
>>> assert resolve('urn:root/less', '../../name') == 'urn:/name'
>>> assert resolve('http://a/b', 'http:g') == 'http:g'
>>> assert resolve('http://a/b', 'http:g', strict=False) == 'http://a/g'
.. _Resolves: http://tools.ietf.org/html/rfc3986#section-5.2
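A central part of reference resolution is the "remove dot segments" routine of RFC 3986, section 5.2.4. The sketch below transcribes that routine as a stand-alone function. It is written for illustration under the assumption that the merged path is already available, and it is not the code used by `resolve`.

```python
def remove_dot_segments(path):
    """Remove '.' and '..' segments as in RFC 3986, section 5.2.4."""
    output = []
    while path:
        if path.startswith('../'):
            path = path[3:]
        elif path.startswith('./'):
            path = path[2:]
        elif path.startswith('/./'):
            path = path[2:]  # '/./' becomes '/'
        elif path == '/.':
            path = '/'
        elif path.startswith('/../'):
            path = path[3:]  # '/../' becomes '/'
            if output:
                output.pop()  # drop the previous segment
        elif path == '/..':
            path = '/'
            if output:
                output.pop()
        elif path in ('.', '..'):
            path = ''
        else:
            # Move the first segment (with its leading '/', if any) to output.
            end = path.find('/', 1)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return ''.join(output)
```

For instance, merging the base path ``root/less`` with ``../../name`` gives ``root/../../name``, which this routine reduces to ``/name``, consistent with the ``urn:/name`` example above.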
**patterns**
A dict of regular expressions with useful group names.
They compile (with regex_ only) without requiring any particular compilation
flag.
**[bmp_][u]patterns[_no_names]**
Alternative versions of `patterns`.
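The optional name parts select the variant: `u` for unicode strings,
`_no_names` for patterns without group names (for the re_ module), and
`bmp_` for patterns restricted to the BMP (for narrow builds).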
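Whether an interpreter is a narrow build can be checked with the standard library alone. The helper name `is_narrow_build` below is an assumption for this sketch, and the package may determine its own narrow-build flag differently.

```python
import sys


def is_narrow_build():
    """True when the interpreter stores non-BMP characters as surrogate pairs."""
    # Narrow builds cap code points at U+FFFF; wide builds reach U+10FFFF.
    return sys.maxunicode == 0xFFFF
```

Since Python 3.3 every build supports the full Unicode range, so this returns False there.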
**get_compiled_pattern** (rule, flags=0)
Returns a compiled pattern object for a rule name or template string.
Usage for validation::
>>> uri = get_compiled_pattern('^%(URI)s$')
>>> assert uri.match('http://tools.ietf.org/html/rfc3986#appendix-A')
>>> assert not get_compiled_pattern('^%(relative_ref)s$').match('#f#g')
>>> from unicodedata import lookup
>>> smp = 'urn:' + lookup('OLD ITALIC LETTER A') # U+00010300
>>> assert not uri.match(smp)
>>> m = get_compiled_pattern('^%(IRI)s$').match(smp)
On narrow builds, non-BMP characters are (incorrectly) excluded::
>>> assert NARROW_BUILD == (not m)
For parsing, some subcomponents are captured in named groups (*only if*
regex_ is available, otherwise see `parse`)::
>>> match = uri.match('http://tools.ietf.org/html/rfc3986#appendix-A')
>>> d = match.groupdict()
>>> if REGEX:
... assert all([ d['scheme'] == 'http',
... d['authority'] == 'tools.ietf.org',
... d['path'] == '/html/rfc3986',
... d['query'] is None,
... d['fragment'] == 'appendix-A' ])
>>> for r in patterns.keys():
... assert get_compiled_pattern(r)
**format_patterns** (\*\*names)
Returns a dict of patterns (regular expressions) keyed by
`rule names for URIs`_ and `rule names for IRIs`_.
See also the module level dicts of patterns, and `get_compiled_pattern`.
To wrap a rule in a named capture group, pass it as a keyword argument:
rule_name='group_name'. By default, the formatted patterns contain no
named groups.
Patterns are `str` instances (in both Python 2.x and 3.x) containing ASCII
characters only.
Caveats:
- with re_, named capture groups cannot occur on multiple branches of an
alternation
- with re_ before python 3.3, ``\u`` and ``\U`` escapes must be
preprocessed (see `issue3665 <http://bugs.python.org/issue3665>`_)
- on narrow builds, character ranges beyond BMP are not supported
.. _rule names for URIs: http://tools.ietf.org/html/rfc3986#appendix-A
.. _rule names for IRIs: http://tools.ietf.org/html/rfc3987#section-2.2
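The template strings and named groups described above can be illustrated with a miniature rule table. Everything in this sketch (`rules`, `with_group`, `duplicate_name_rejected`) is hypothetical and uses only the standard-library ``re`` module. It shows the general technique of building patterns by %-formatting, not the actual contents returned by `format_patterns`.

```python
import re

# Hypothetical miniature rule table; the real grammars are far larger.
rules = {'HEXDIG': '[0-9A-Fa-f]'}
# '%%' yields a literal '%', '%(HEXDIG)s' is filled in from the dict.
rules['pct_encoded'] = '%%%(HEXDIG)s{2}' % rules


def with_group(rule_name, group_name):
    """Wrap one rule's pattern in a named capture group."""
    return '(?P<%s>%s)' % (group_name, rules[rule_name])


# A template string is filled in by %-formatting before compilation.
anchored = re.compile('^%(pct_encoded)s$' % rules)
named = re.compile(with_group('pct_encoded', 'octet'))


def duplicate_name_rejected():
    """Show the re caveat: one group name on two branches fails to compile."""
    try:
        re.compile('(?P<x>a)|(?P<x>b)')
    except re.error:
        return True
    return False
```

The last helper demonstrates the first caveat: the standard ``re`` module refuses a group name that appears on more than one branch of an alternation.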
Dependencies
------------
Some features require regex_.
This package's docstrings are tested on Python 2.6, 2.7, and 3.2 to 3.6.
Note that in Python <= 3.2, characters beyond the Basic Multilingual Plane are
not supported on narrow builds (see `issue12729
<http://bugs.python.org/issue12729>`_).
Release notes
-------------
version 1.3.8:
- fixed deprecated escape sequence
version 1.3.6:
- fixed a bug in IPv6 pattern:
>>> assert match('::0:0:0:0:0.0.0.0', 'IPv6address')
version 1.3.4:
- allowed for lower case percent encoding
version 1.3.3:
- fixed a bug in `resolve` which left "../" at the beginning of some paths
version 1.3.2:
- convenience function `match`
- patterns restricted to the BMP for narrow builds
- adapted doctests for python 3.3
- compatibility with python 2.6 (thanks to Thijs Janssen)
version 1.3.1:
- some re_ compatibility: get_compiled_pattern, parse
- dropped regex_ from setup.py requirements
version 1.3.0:
- python 3.x compatibility
- format_patterns
version 1.2.1:
- compose, resolve
.. _re: http://docs.python.org/library/re
.. _regex: http://pypi.python.org/pypi/regex
Support
-------
This is free software. You may show your appreciation with a `donation`_.
.. _donation: http://danielgerber.net/¤#Thanks-for-python-package-rfc3987
Raw data
{
"_id": null,
"home_page": "http://pypi.python.org/pypi/rfc3987",
"name": "rfc3987",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "URI IRI URL rfc3986 rfc3987 validation",
"author": "Daniel Gerber",
"author_email": "daniel.g.gerber@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/14/bb/f1395c4b62f251a1cb503ff884500ebd248eed593f41b469f89caa3547bd/rfc3987-1.3.8.tar.gz",
"platform": "",
"bugtrack_url": null,
"license": "GNU GPLv3+",
"summary": "Parsing and validation of URIs (RFC 3986) and IRIs (RFC 3987)",
"version": "1.3.8",
"split_keywords": [
"uri",
"iri",
"url",
"rfc3986",
"rfc3987",
"validation"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "65d4f7407c3d15d5ac779c3dd34fbbc6ea2090f77bd7dd12f207ccf881551208",
"md5": "846284d5da753a8c07830655ca29b6e4",
"sha256": "10702b1e51e5658843460b189b185c0366d2cf4cff716f13111b0ea9fd2dce53"
},
"downloads": -1,
"filename": "rfc3987-1.3.8-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "846284d5da753a8c07830655ca29b6e4",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": null,
"size": 13377,
"upload_time": "2018-07-29T17:23:45",
"upload_time_iso_8601": "2018-07-29T17:23:45.313143Z",
"url": "https://files.pythonhosted.org/packages/65/d4/f7407c3d15d5ac779c3dd34fbbc6ea2090f77bd7dd12f207ccf881551208/rfc3987-1.3.8-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "14bbf1395c4b62f251a1cb503ff884500ebd248eed593f41b469f89caa3547bd",
"md5": "b6c4028acdc788a9ba697e1c1d6b896c",
"sha256": "d3c4d257a560d544e9826b38bc81db676890c79ab9d7ac92b39c7a253d5ca733"
},
"downloads": -1,
"filename": "rfc3987-1.3.8.tar.gz",
"has_sig": false,
"md5_digest": "b6c4028acdc788a9ba697e1c1d6b896c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20700,
"upload_time": "2018-07-29T17:23:47",
"upload_time_iso_8601": "2018-07-29T17:23:47.954044Z",
"url": "https://files.pythonhosted.org/packages/14/bb/f1395c4b62f251a1cb503ff884500ebd248eed593f41b469f89caa3547bd/rfc3987-1.3.8.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2018-07-29 17:23:47",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "rfc3987"
}