==========================
Glob-Like Pattern Matching
==========================

Converts a glob-matching pattern to a regular expression, using Apache
Cocoon-style rules (with some extensions).

TL;DR
=====

Install:

.. code:: bash

  $ pip install globre

Use:

.. code:: python

  import globre

  names = [
    '/path/to/file.txt',
    '/path/to/config.ini',
    '/path/to/subdir/base.ini',
  ]

  txt_names = [name for name in names if globre.match('/path/to/*.txt', name)]
  assert txt_names == ['/path/to/file.txt']

  ini_names = [name for name in names if globre.match('/path/to/*.ini', name)]
  assert ini_names == ['/path/to/config.ini']

  all_ini_names = [name for name in names if globre.match('/path/to/**.ini', name)]
  assert all_ini_names == ['/path/to/config.ini', '/path/to/subdir/base.ini']

Details
=======

This package allows Unix shell-like filename globbing to be used to
match strings in a Python program. Most characters in a glob pattern
match themselves; the following sequences have special meanings:


=========  ====================================================================
Sequence   Meaning
=========  ====================================================================
``?``      Matches any single character except the slash
           ('/') character.
``*``      Matches zero or more characters *excluding* the slash
           ('/') character, e.g. ``/etc/*.conf`` which will *not*
           match "/etc/foo/bar.conf".
``**``     Matches zero or more characters *including* the slash
           ('/') character, e.g. ``/lib/**.so`` which *will*
           match "/lib/foo/bar.so".
``\``      Escape character used to precede any of the other special
           characters (in order to match them literally), e.g.
           ``foo\?`` will match "foo" followed by a literal question mark.
``[...]``  Matches any character in the specified regex-style character range,
           e.g. ``foo[0-9A-F].conf``.
``{...}``  Inlines a regular expression, e.g. ``foo-{\\D{2,4\}}.txt`` which
           will match "foo-bar.txt" but not "foo-012.txt".
=========  ====================================================================

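
To make the table more concrete, here is a minimal sketch of how the
``?``, ``*``, ``**`` and ``\`` sequences could be translated into a
regular expression using only the standard `re` module. The
``glob_to_regex`` helper is hypothetical and written for illustration
only: it is not how `globre` is implemented, and it ignores the
``[...]`` and ``{...}`` sequences.

.. code:: python

  import re


  def glob_to_regex(pattern):
      """Translate a small subset of the glob syntax into a regex string.

      Hypothetical helper, not part of globre. Handles ``?``, ``*``,
      ``**`` and backslash escapes; everything else is matched literally.
      """
      out = []
      i = 0
      while i < len(pattern):
          ch = pattern[i]
          if ch == "\\" and i + 1 < len(pattern):
              # Backslash: take the next character literally.
              out.append(re.escape(pattern[i + 1]))
              i += 2
          elif pattern.startswith("**", i):
              # Double star: any characters, slashes included.
              out.append(".*")
              i += 2
          elif ch == "*":
              # Single star: any run of characters that are not a slash.
              out.append("[^/]*")
              i += 1
          elif ch == "?":
              # Question mark: exactly one character that is not a slash.
              out.append("[^/]")
              i += 1
          else:
              out.append(re.escape(ch))
              i += 1
      return "".join(out)


  # The examples from the table, checked against the whole string.
  single_star = re.fullmatch(glob_to_regex("/etc/*.conf"), "/etc/foo/bar.conf")
  double_star = re.fullmatch(glob_to_regex("/lib/**.so"), "/lib/foo/bar.so")
  escaped = re.fullmatch(glob_to_regex("foo\\?"), "foo?")

Here ``single_star`` is ``None`` because ``*`` cannot cross a slash,
while ``double_star`` and ``escaped`` are match objects.
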
The `globre` package exports the following functions:

* ``globre.match(pattern, string, sep=None, flags=0)``:

  Tests whether the glob `pattern` matches the `string`. If it does, a
  match object (as returned by `re.match`) is returned, otherwise
  ``None``. The `string` must be matched in its entirety. See
  `globre.compile` for details on the `sep` and `flags` parameters.
  Example:

  .. code:: python

    globre.match('/etc/**.conf', '/etc/rsyslog.conf')
    # => truthy

* ``globre.search(pattern, string, sep=None, flags=0)``:

  Similar to `globre.match`, but the pattern does not need to match
  the entire string. Example:

  .. code:: python

    globre.search('lib/**.so', '/var/lib/python/readline.so.6.2')
    # => truthy

* ``globre.compile(pattern, sep=None, flags=0, split_prefix=False)``:

  Compiles the specified `pattern` into a matching object that has the
  same API as the regular expression object returned by `re.compile`.

  The `sep` parameter specifies the hierarchical path component
  separator to use. By default, it is the Unix-style forward slash
  (``"/"``), but it can be overridden with a sequence of alternative
  valid separator characters. Note that although `sep` *could* be set
  to both forward and back slashes (i.e. ``"/\\"``) to, theoretically,
  support both Unix- and Windows-style path components, this has the
  significant flaw that *both* characters can then be used as
  separators within the same path.

  The `flags` bit mask can contain all the standard `re` flags, in
  addition to the ``globre.EXACT`` flag. If ``EXACT`` is set, the
  returned regex includes the equivalent of a leading '^' and
  trailing '$', meaning that the regex must match the entire string,
  from beginning to end.

  If `split_prefix` is truthy, the return value becomes a tuple whose
  first element is any initial non-wildcarded string found in the
  pattern. The second element remains the regex object as before.
  For example, the pattern ``foo/**.ini`` would result in a tuple
  equivalent to ``('foo/', re.compile('foo/.*\\.ini'))``.

  Example:

  .. code:: python

    prefix, expr = globre.compile('/path/to**.ini', split_prefix=True)
    # prefix => '/path/to'

    names = [
      '/path/to/file.txt',
      '/path/to/config.ini',
      '/path/to/subdir/otherfile.txt',
      '/path/to/subdir/base.ini',
    ]

    for name in names:
      if not expr.match(name):
        # ignore the two ".txt" files
        continue
      # and do something with:
      # - /path/to/config.ini
      # - /path/to/subdir/base.ini

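
The anchoring, prefix and separator behavior described above can be
illustrated with the standard `re` module alone. This is only a sketch:
the ``literal_prefix`` helper and the ``SPECIAL`` set are hypothetical
names, and the choice of which characters end the prefix is an
assumption, not `globre`'s actual logic.

.. code:: python

  import re

  # Characters assumed to start a special glob sequence (an assumption
  # made for this sketch; globre may decide differently).
  SPECIAL = set("?*[{\\")


  def literal_prefix(pattern):
      """Return the part of ``pattern`` before its first special character."""
      for index, ch in enumerate(pattern):
          if ch in SPECIAL:
              return pattern[:index]
      return pattern


  # The regex that the text gives as equivalent to ``foo/**.ini``.
  expr = re.compile(r"foo/.*\.ini")

  # Unanchored at the end: the match may be followed by extra characters.
  loose = expr.match("foo/a.ini.bak") is not None
  # Anchored at both ends, like the described effect of the EXACT flag.
  exact = expr.fullmatch("foo/a.ini.bak") is not None

  # If both "/" and "\" are separators, a single star has to stop at
  # either of them, so either character ends a path component.
  star_either = re.compile(r"[^/\\]*\.conf")
  plain = star_either.fullmatch("bar.conf") is not None
  forward = star_either.fullmatch("foo/bar.conf") is not None
  backward = star_either.fullmatch("foo\\bar.conf") is not None

A literal prefix such as ``'/path/to'`` is useful for narrowing the
candidates cheaply (for example with ``str.startswith``) before the
regular expression is applied.
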
What About the ``glob`` Module
==============================

This package differs from the standard Python `glob` module in the
following critical ways:

* The `glob` module operates on the actual filesystem; `globre` can
  match file names as well as strings from any other source.

* The `glob` module provides a ``**`` "descending" matcher only in
  Python 3.5 and later, and only when called with ``recursive=True``.

* The `glob` module does not provide the ``{...}`` regular expression
  inlining feature.

* The `glob` module does not provide an alternate hierarchy separator
  beyond ``/`` or ``\\``.

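
For comparison, the following sketch shows the standard library side
of this contrast. It assumes Python 3.5 or later, where `glob.glob`
accepts ``recursive=True``; the file names are made up for the example.
`glob` has to look at real files (created here in a temporary
directory), and the string-only `fnmatch` module lets ``*`` cross
slashes.

.. code:: python

  import fnmatch
  import glob
  import os
  import tempfile

  with tempfile.TemporaryDirectory() as root:
      # Create config.ini and subdir/base.ini as empty files.
      os.mkdir(os.path.join(root, "subdir"))
      for relative in ("config.ini", os.path.join("subdir", "base.ini")):
          with open(os.path.join(root, relative), "w") as handle:
              handle.write("")

      # Escape the directory name in case it contains glob characters.
      base = glob.escape(root)
      top_level = glob.glob(os.path.join(base, "*.ini"))
      recursive = glob.glob(os.path.join(base, "**", "*.ini"), recursive=True)

      top_names = sorted(os.path.relpath(path, root) for path in top_level)
      all_names = sorted(os.path.relpath(path, root) for path in recursive)

  # fnmatch works on plain strings, but its "*" also matches "/".
  slash_crossing = fnmatch.fnmatchcase("/etc/foo/bar.conf", "/etc/*.conf")

Here ``top_names`` contains only ``config.ini``, ``all_names`` contains
both files, and ``slash_crossing`` is ``True``.
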
Raw data
{
"_id": null,
"home_page": "http://github.com/metagriffin/globre",
"name": "globre",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "python glob pattern matching regular expression",
"author": "metagriffin",
"author_email": "mg.pypi@uberdev.org",
"download_url": "https://files.pythonhosted.org/packages/5a/ce/a9e2f3317a458f8c591a1f95d4061d4e241f529ba678292acdcf2d804783/globre-0.1.5.tar.gz",
"platform": "any",
"bugtrack_url": null,
"license": "GPLv3+",
"summary": "A glob matching library, providing an interface similar to the \"re\" module.",
"version": "0.1.5",
"project_urls": {
"Download": "UNKNOWN",
"Homepage": "http://github.com/metagriffin/globre"
},
"split_keywords": [
"python",
"glob",
"pattern",
"matching",
"regular",
"expression"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5acea9e2f3317a458f8c591a1f95d4061d4e241f529ba678292acdcf2d804783",
"md5": "9dac11b1a0c822ea38ab2816f3194319",
"sha256": "ee214204f237e9114b8f61eeb61c2abd1e665ca3b59e5a6a0b070971c0bb12e2"
},
"downloads": -1,
"filename": "globre-0.1.5.tar.gz",
"has_sig": false,
"md5_digest": "9dac11b1a0c822ea38ab2816f3194319",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 20388,
"upload_time": "2016-10-29T20:19:27",
"upload_time_iso_8601": "2016-10-29T20:19:27.413346Z",
"url": "https://files.pythonhosted.org/packages/5a/ce/a9e2f3317a458f8c591a1f95d4061d4e241f529ba678292acdcf2d804783/globre-0.1.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2016-10-29 20:19:27",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "metagriffin",
"github_project": "globre",
"travis_ci": true,
"coveralls": false,
"github_actions": false,
"lcname": "globre"
}