mutf8


Name: mutf8
Version: 1.0.6
Home page: http://github.com/TkTech/mutf8
Summary: Fast MUTF-8 encoder & decoder
Upload time: 2021-12-29 03:02:17
Author: Tyler Kennedy
Keywords: mutf-8, cesu-8, jvm
Requirements: No requirements were recorded.

![Tests](https://github.com/TkTech/mutf8/workflows/Tests/badge.svg?branch=master)

# mutf-8

This package contains simple encoders and decoders for the MUTF-8 character
encoding, implemented both in pure Python and in C. In most cases, it can also
parse the even rarer CESU-8.

These days, you'll most likely encounter MUTF-8 when working on files or
protocols related to the JVM. Strings in a Java `.class` file are encoded using
MUTF-8, as are strings passed through the JNI and strings exported by the
object serializer.
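
To make the format concrete, here is a minimal pure-Python sketch of the two rules that distinguish MUTF-8 from standard UTF-8: U+0000 is written as the two bytes `0xC0 0x80`, and characters above U+FFFF are written as a UTF-16 surrogate pair, with each surrogate encoded as three bytes. The name `sketch_encode_mutf8` is hypothetical and is not part of this package; it only illustrates the encoding and is not how the library is implemented.

```python
def sketch_encode_mutf8(text: str) -> bytes:
    """Illustrative MUTF-8 encoder (hypothetical helper, not the library's code)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            # NUL is never written as a single zero byte in MUTF-8.
            out += b"\xc0\x80"
        elif cp < 0x10000:
            # Characters in the Basic Multilingual Plane match UTF-8.
            # "surrogatepass" lets lone surrogates through unchanged.
            out += ch.encode("utf-8", "surrogatepass")
        else:
            # Supplementary characters become a surrogate pair,
            # and each surrogate is encoded as its own 3-byte sequence.
            cp -= 0x10000
            high = 0xD800 | (cp >> 10)
            low = 0xDC00 | (cp & 0x3FF)
            for surrogate in (high, low):
                out += chr(surrogate).encode("utf-8", "surrogatepass")
    return bytes(out)
```

Under these rules a supplementary character takes six bytes instead of the four that standard UTF-8 would use, and encoded strings never contain an embedded zero byte.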

This library was extracted from [Lawu][], a Python library for working with JVM
class files.

## 🎉 Installation

Install the package from PyPI:

```
pip install mutf8
```

Binary wheels are available for the following:

|                  | py3.6 | py3.7 | py3.8 | py3.9 |
| ---------------- | ----- | ----- | ----- | ----- |
| OS X (x86_64)    | y     | y     | y     | y     |
| Windows (x86_64) | y     | y     | y     | y     |
| Linux (x86_64)   | y     | y     | y     | y     |

If no binary wheel is available for your platform, the installer will attempt
to build the C extension from source with any C99 compiler. If the build fails,
the package falls back to a pure-Python version.

## Usage

Encoding and decoding are simple:

```python
from mutf8 import encode_modified_utf8, decode_modified_utf8

# Any bytes-like object works as input; ASCII is identical in MUTF-8.
data = b"Hello"
text = decode_modified_utf8(data)
encoded = encode_modified_utf8(text)
```

This module *does not* register itself globally as a codec, since importing
should be side-effect-free.
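
If an application does want a named codec, it can opt in explicitly. The following is a sketch using only the standard library's `codecs.register` and `codecs.CodecInfo`; the helper name `make_codec_search` and the codec name `"mutf8"` in the usage comment are assumptions for illustration and are not provided by this package.

```python
import codecs


def make_codec_search(name, encode, decode):
    """Build a search function for codecs.register() (hypothetical helper).

    `encode` maps str -> bytes and `decode` maps bytes -> str.
    """
    def search(requested):
        if requested != name:
            return None
        return codecs.CodecInfo(
            name=name,
            # Codec functions return (result, number_of_input_units_consumed).
            encode=lambda s, errors="strict": (encode(s), len(s)),
            decode=lambda b, errors="strict": (decode(bytes(b)), len(b)),
        )
    return search


# Hypothetical opt-in, run once at application start-up:
#   from mutf8 import encode_modified_utf8, decode_modified_utf8
#   codecs.register(
#       make_codec_search("mutf8", encode_modified_utf8, decode_modified_utf8)
#   )
#   "text".encode("mutf8")
```

Keeping registration in application code preserves the side-effect-free import described above. Note that this sketch ignores the `errors` argument.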

## 📈 Benchmarks

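The C extension is significantly faster than the pure-Python version, often by
20x to 40x. In the runs below, it completes roughly 22x as many decoding
operations and 25x as many encoding operations.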

<!-- BENCHMARK START -->

### MUTF-8 Decoding
| Name                         |   Min (μs) |   Max (μs) |   StdDev |           Ops |
|------------------------------|------------|------------|----------|---------------|
| cmutf8-decode_modified_utf8  |    0.00009 |    0.00080 |  0.00000 | 9957678.56358 |
| pymutf8-decode_modified_utf8 |    0.00190 |    0.06040 |  0.00000 |  450455.96019 |

### MUTF-8 Encoding
| Name                         |   Min (μs) |   Max (μs) |   StdDev |            Ops |
|------------------------------|------------|------------|----------|----------------|
| cmutf8-encode_modified_utf8  |    0.00008 |    0.00151 |  0.00000 | 11897361.05101 |
| pymutf8-encode_modified_utf8 |    0.00180 |    0.16650 |  0.00000 |   474390.98091 |
<!-- BENCHMARK END -->
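
For a rough comparison on your own machine, a sketch using the standard library's `timeit` module may be enough. The helper `ops_per_second` is hypothetical, and it is not the harness that produced the tables above, so absolute numbers will differ.

```python
import timeit


def ops_per_second(func, arg, number=100_000, repeat=5):
    """Estimate calls per second of func(arg) from the fastest of several runs."""
    timer = timeit.Timer(lambda: func(arg))
    # Each entry is the total time in seconds for `number` calls.
    best = min(timer.repeat(repeat=repeat, number=number))
    return number / best


# Hypothetical usage, assuming the package is installed:
#   from mutf8 import decode_modified_utf8
#   print(ops_per_second(decode_modified_utf8, b"Hello"))
```

The wrapping `lambda` adds a small constant overhead to every call, so the result slightly understates the speed of very fast functions.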

## C Extension

The C extension is optional. If a binary package is not available, or a C
compiler is not present, the pure-Python version will be used instead. If you
want to ensure you're using the C version, import it directly:

```python
# Raises ImportError if the C extension was not built or installed.
from mutf8.cmutf8 import decode_modified_utf8

# A surrogate pair (U+D840, U+DC00) encoded as two 3-byte sequences.
decode_modified_utf8(b'\xED\xA1\x80\xED\xB0\x80')
```
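
If you would rather prefer the C version but still tolerate its absence, the fallback can be written out explicitly. This is a sketch using `importlib.import_module` from the standard library; the helper `load_first_available` is hypothetical, and the module names in the usage comment are the ones shown in this README.

```python
import importlib


def load_first_available(*module_names):
    """Return the first module in module_names that imports successfully."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not built or not installed; try the next candidate.
            continue
    raise ImportError(f"none of {module_names!r} could be imported")


# Hypothetical usage: prefer the C extension, otherwise use the package default.
#   backend = load_first_available("mutf8.cmutf8", "mutf8")
#   backend.decode_modified_utf8(b"Hello")
```

Checking `backend.__name__` afterwards shows which implementation was actually loaded.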

[Lawu]: https://github.com/tktech/lawu



            

        