b64-regex


Name: b64-regex
Version: 0.1.1
Home page: https://github.com/MythicManiac/b64-regex
Summary: Build regex patterns for search through b64 encoded text without decoding.
Upload time: 2023-07-18 10:40:33
Author: Mythic
Requires Python: >=3.8,<4.0
License: MIT
Requirements: none recorded

# Base64 Regex

[![pypi](https://img.shields.io/pypi/v/b64-regex)](https://pypi.org/project/b64-regex/)
[![test](https://github.com/MythicManiac/b64-regex/workflows/Test/badge.svg)](https://github.com/MythicManiac/b64-regex/actions)
[![codecov](https://codecov.io/gh/MythicManiac/b64-regex/branch/master/graph/badge.svg?token=D1IB10WPT7)](https://codecov.io/gh/MythicManiac/b64-regex)
[![python-versions](https://img.shields.io/pypi/pyversions/b64-regex.svg)](https://pypi.org/project/b64-regex/)

Search through base64-encoded text without decoding it.

## Usage

### Building a regex

To build a regex pattern that matches a specific string inside base64-encoded
data, use the `Segment.as_regex()` method:

```python
from b64_regex.recoder import Segment

segment = Segment(b"string-to-search")
print(segment.as_regex())

# Output:
# (?:c3RyaW5nLXRvLXNlYXJja[A-P]|[HXn3]N0cmluZy10by1zZWFyY2[g-j]|[BFJNRVZdhlptx159]zdHJpbmctdG8tc2VhcmNo)
```
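
The three alternatives in the output correspond to the three byte offsets (0,
1 or 2) at which the searched string can sit inside a 3-byte base64 group. The
sketch below reproduces the fixed middle part of each alternative using only
the standard library. `stable_cores` is a hypothetical helper written for
illustration, not part of `b64_regex`, and it leaves out the character classes
that cover the partially determined edge characters.

```python
import base64
from typing import List


def stable_cores(data: bytes) -> List[str]:
    """Return, per byte offset, the base64 characters fully determined by `data`."""
    cores = []
    for offset in range(3):
        # Put `offset` dummy bytes in front to shift the alignment.
        padded = b"\x00" * offset + data
        encoded = base64.b64encode(padded).decode("ascii").rstrip("=")
        # Index of the first character built only from bits of `data`.
        start = (8 * offset + 5) // 6
        # Number of characters whose 6 bits all come from real input bytes.
        end = (8 * len(padded)) // 6
        cores.append(encoded[start:end])
    return cores


for core in stable_cores(b"string-to-search"):
    print(core)

# Output:
# c3RyaW5nLXRvLXNlYXJja
# N0cmluZy10by1zZWFyY2
# zdHJpbmctdG8tc2VhcmNo
```

Each printed line matches the literal part of one alternative in the regex
above; the characters on either side of it depend on neighbouring bytes, which
is why the pattern uses character classes there.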

More advanced patterns can be built by combining segments with ordinary regex
syntax. For convenience, the `B64_CHARGROUP` variable contains the character
class `[a-zA-Z0-9\/\+]`.

```python
from b64_regex.recoder import Segment, B64_CHARGROUP

start_segment = Segment(b"patternPrefix(")
end_segment = Segment(b")patternSuffix")

full_regex = f"{start_segment.as_regex()}{B64_CHARGROUP}+{end_segment.as_regex()}"
print(full_regex)

# Output:
# (?:cGF0dGVyblByZWZpeC[g-j]|[HXn3]BhdHRlcm5QcmVmaXgo|[BFJNRVZdhlptx159]wYXR0ZXJuUHJlZml4K[A-P])[a-zA-Z0-9\/\+]+(?:KXBhdHRlcm5TdWZmaX[g-j]|[CSiy]lwYXR0ZXJuU3VmZml4|[AEIMQUYcgkosw048]pcGF0dGVyblN1ZmZpe[A-P])
```
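
A generated pattern is an ordinary regex, so it can be passed to the standard
`re` module. The sketch below hard-codes the pattern printed for
`b"string-to-search"` in the first example instead of importing `b64_regex`,
and the payloads are made up for illustration. It checks that the pattern
matches whatever byte offset the string has in the encoded data.

```python
import base64
import re

# Pattern copied from the Segment(b"string-to-search").as_regex() output.
PATTERN = re.compile(
    r"(?:c3RyaW5nLXRvLXNlYXJja[A-P]"
    r"|[HXn3]N0cmluZy10by1zZWFyY2[g-j]"
    r"|[BFJNRVZdhlptx159]zdHJpbmctdG8tc2VhcmNo)"
)


def found_at_offset(junk_len: int) -> bool:
    """Encode a made-up payload with `junk_len` leading bytes and search it."""
    payload = b"#" * junk_len + b"string-to-search" + b" trailing data"
    encoded = base64.b64encode(payload).decode("ascii")
    return PATTERN.search(encoded) is not None


# Offsets 0-5 cover each of the three alignments twice.
results = [found_at_offset(n) for n in range(6)]
print(results)

# Output:
# [True, True, True, True, True, True]
```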

### Decoding matches

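A base64 character boundary only coincides with a byte boundary every three
bytes, so roughly two thirds of matches start misaligned by 2 or 4 bits (about
a third each, assuming the searched string is equally likely to sit at any
byte offset). Decoding such a match requires prefixing it with one or two b64
characters to yield the right result.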

The `decode_all_alignments` function decodes the provided string at each of
the three bit alignments and strips the extra data introduced by the prefix
from each result. It cannot know which result is correct, however, so it
returns all three:

```python
from b64_regex.recoder import decode_all_alignments

match = "HBhdHRlcm5QcmVmaXgoZm9vLWJhci1jb250ZW50YWFhYWFhKXBhdHRlcm5TdWZmaXh"
for x in decode_all_alignments(match):
    print(x)

# Output:
# b'\x1c\x18]\x1d\x19\\\x9b\x94\x1c\x99Y\x9a^\n\x19\x9b\xdb\xcbX\x98\\\x8bX\xdb\xdb\x9d\x19[\x9d\x18XXXXXJ\\\x18]\x1d\x19\\\x9b\x94\xddY\x99\x9a^'
# b'patternPrefix(foo-bar-contentaaaaaa)patternSuffix'
# b'\xc1\x85\xd1\xd1\x95\xc9\xb9A\xc9\x95\x99\xa5\xe0\xa1\x99\xbd\xbc\xb5\x89\x85\xc8\xb5\x8d\xbd\xb9\xd1\x95\xb9\xd1\x85\x85\x85\x85\x85\x84\xa5\xc1\x85\xd1\xd1\x95\xc9\xb9M\xd5\x99\x99\xa5\xe1'
```
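
Since the searched string is known, one way to pick the right result is to
keep only the decodings that contain it. The sketch below illustrates this
with a standard-library stand-in: `decode_alignments_sketch` is a hypothetical
reimplementation written for this example, and it is only an assumption that
it behaves like `decode_all_alignments`.

```python
import base64
from typing import List


def decode_alignments_sketch(match: str) -> List[bytes]:
    """Decode `match` as if it started 0, 1 or 2 characters into a base64 group."""
    results = []
    for shift in range(3):
        # Prefix filler characters so the match lands on the assumed alignment.
        candidate = "A" * shift + match
        # A single leftover character carries only 6 bits and cannot be decoded.
        if len(candidate) % 4 == 1:
            candidate = candidate[:-1]
        # Pad to a multiple of four characters, as the decoder expects.
        candidate += "=" * (-len(candidate) % 4)
        decoded = base64.b64decode(candidate)
        # Drop the leading bytes that contain filler bits.
        results.append(decoded[shift:])
    return results


match = "HBhdHRlcm5QcmVmaXgoZm9vLWJhci1jb250ZW50YWFhYWFhKXBhdHRlcm5TdWZmaXh"
known = b"patternPrefix("
hits = [d for d in decode_alignments_sketch(match) if known in d]
print(hits)

# Output:
# [b'patternPrefix(foo-bar-contentaaaaaa)patternSuffix']
```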

## Future work

It should be possible to translate some regex features to work within the b64
context (such as string length selectors / character repeats).
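
One ingredient such a translation would probably need is the number of b64
characters that a run of raw bytes occupies, which depends on the alignment.
The sketch below shows only that arithmetic; `chars_spanned` is a hypothetical
helper and not part of `b64_regex`.

```python
def chars_spanned(n_bytes: int, byte_offset: int) -> int:
    """Count the base64 characters that hold at least one bit of the byte run."""
    if n_bytes < 1:
        raise ValueError("n_bytes must be at least 1")
    first_bit = 8 * byte_offset
    last_bit = first_bit + 8 * n_bytes - 1
    # Each base64 character carries 6 bits of the underlying byte stream.
    return last_bit // 6 - first_bit // 6 + 1


print([chars_spanned(14, offset) for offset in range(3)])

# Output:
# [19, 19, 20]
```

These counts agree with the three alternatives generated for the 14-byte
`patternPrefix(` segment above, counting each character class as one
character. A fixed byte length therefore does not map to a single fixed b64
length.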

            
