unicode-charnames


* Name: unicode-charnames
* Version: 15.1.0
* Home page: https://github.com/mlodewijck/unicode_charnames
* Summary: Look up Unicode character name or code point label and search in Unicode character names
* Upload time: 2023-11-11 18:39:59
* Author: Marc Lodewijck
* Requires Python: >=3.8
* License: MIT
* Keywords: unicode, unicode data, unicode characters, character names, characters
* Requirements: No requirements were recorded.

# unicode-charnames
This package supports Unicode version 15.1, released in September 2023.

The library provides:

* A function to get the character name (the normative character property “Name”) or the code point label (for characters that do not have character names) of a single Unicode character.
* A function to get the code point value (in the usual 4- to 6-digit hexadecimal format) corresponding to a Unicode character name; the search is case-sensitive and requires an exact string match.
* A function to search characters by character name; the search is case-insensitive but requires an exact substring match.
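
To make the search semantics in the last bullet concrete, here is a minimal sketch of a case-insensitive substring search over character names. It uses only the standard library's `unicodedata` module and is not the package's implementation; the helper name `search_names` is hypothetical, and the results depend on the Unicode version bundled with your Python interpreter.

```python
import unicodedata


def search_names(substring):
    """Yield (hex code point, name) pairs whose name contains `substring`.

    Hypothetical helper for illustration; not part of unicode-charnames.
    """
    # Character names are written in uppercase, so uppercasing the query
    # is enough to make the match case-insensitive.
    needle = substring.upper()
    for cp in range(0x110000):
        # The second argument is returned for code points without a name.
        name = unicodedata.name(chr(cp), None)
        if name is not None and needle in name:
            # 4- to 6-digit uppercase hexadecimal, zero-padded to 4 digits.
            yield f"{cp:04X}", name


results = list(search_names("no-break"))
for code, name in results:
    print(f"{code}\t{name}")
```

Scanning the whole code space on every call is slow; it is done here only to keep the sketch short.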

The generic term “character name” refers to the Unicode character “Name” property value for an encoded Unicode character. For code points that do not have character names (unassigned, reserved code points and other special code point types), the Unicode standard uses constructed Unicode code point labels, displayed between angle brackets, to stand in for character names.
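
As an illustration of how such labels are constructed, the sketch below builds a code point label from the General_Category value reported by the standard library's `unicodedata` module. It is an assumption-laden approximation, not the package's code: the function names are hypothetical, and the distinction between assigned and unassigned code points follows the Unicode version bundled with your Python interpreter, which may differ from 15.1.

```python
import unicodedata

# General_Category values whose code points never have a character name,
# mapped to the label prefix used for them.
_LABEL_PREFIXES = {"Cc": "control", "Co": "private-use", "Cs": "surrogate"}


def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and for code points ending in FFFE or FFFF."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def code_point_label(char):
    """Return a label such as '<control-0002>', or None if `char` has a name.

    Hypothetical helper for illustration; not part of unicode-charnames.
    """
    cp = ord(char)
    category = unicodedata.category(char)
    if category in _LABEL_PREFIXES:
        prefix = _LABEL_PREFIXES[category]
    elif category == "Cn":
        # Unassigned code points are either noncharacters or reserved.
        prefix = "noncharacter" if is_noncharacter(cp) else "reserved"
    else:
        return None
    return f"<{prefix}-{cp:04X}>"


print(code_point_label("\u0002"))
print(code_point_label("\uE000"))
```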

### Installation or upgrade
The easiest way to install the package is with pip:
```shell
pip install unicode-charnames
```

To update the package to the latest version:
```shell
pip install --upgrade unicode-charnames
```

### UCD version
To get the version of the Unicode character database currently used:
```python
>>> from unicode_charnames import UCD_VERSION
>>> UCD_VERSION
'15.1.0'
```
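
The version matters if you compare results with Python's built-in `unicodedata` module, which may ship a different Unicode version. The following sketch compares the two version strings; `parse_version` is a hypothetical helper, and the package version is copied in as a literal (the value shown above) so that the snippet runs without the package installed.

```python
import unicodedata


def parse_version(version):
    """Turn a dotted version string such as '15.1.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# Assumption: same value as unicode_charnames.UCD_VERSION shown above.
PACKAGE_UCD_VERSION = "15.1.0"
stdlib_version = unicodedata.unidata_version

if parse_version(stdlib_version) < parse_version(PACKAGE_UCD_VERSION):
    print(f"unicodedata ({stdlib_version}) is older than {PACKAGE_UCD_VERSION}")
elif parse_version(stdlib_version) > parse_version(PACKAGE_UCD_VERSION):
    print(f"unicodedata ({stdlib_version}) is newer than {PACKAGE_UCD_VERSION}")
else:
    print("Both use the same Unicode version")
```

Comparing tuples of integers avoids the pitfall of string comparison, where `'9.0.0'` would sort after `'15.1.0'`.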

### Example usage
```python
from unicode_charnames import charname, codepoint, search_charnames

# charname: character name, or code point label if the character has no name
for char in '龠💓\u00E5\u0002':
    print(charname(char))
# Output:
# CJK UNIFIED IDEOGRAPH-9FA0
# BEATING HEART
# LATIN SMALL LETTER A WITH RING ABOVE
# <control-0002>

# codepoint: case-sensitive lookup that requires an exact name match
for name in [
    'LATIN CAPITAL LETTER E WITH ACUTE',
    'SQUARE ERA NAME REIWA',
    'SUPERCALIFRAGILISTICEXPIALIDOCIOUS',
]:
    print(codepoint(name))
# Output:
# 00C9
# 32FF
# None

# search_charnames: case-insensitive substring search
for x in search_charnames('break'):
    print('\t'.join(x))
# Output:
# 00A0    NO-BREAK SPACE
# 2011    NON-BREAKING HYPHEN
# 202F    NARROW NO-BREAK SPACE
# 4DEA    HEXAGRAM FOR BREAKTHROUGH
# FEFF    ZERO WIDTH NO-BREAK SPACE
```
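
The code point values in the example are hexadecimal strings, not characters. If you need the character itself, a small conversion step is required. The sketch below uses a hypothetical helper, `char_from_hex`, and feeds it literal values matching the output above rather than calling the package, so it runs on its own.

```python
def char_from_hex(value):
    """Convert a hex code point string such as '00C9' to its character.

    Hypothetical helper; passes None through, since the example above
    shows None being printed for a name that does not exist.
    """
    if value is None:
        return None
    return chr(int(value, 16))


for value in ['00C9', '32FF', None]:
    print(repr(char_from_hex(value)))
```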

### Related resource
This implementation is based on the following resource: [Section 4.8, Name, in the Unicode core specification, version 15.1.0](https://www.unicode.org/versions/Unicode15.1.0/ch04.pdf#G2082).

### Licenses
The code is available under the [MIT license](https://github.com/mlodewijck/unicode_charnames/blob/master/LICENSE).

Usage of Unicode data files is governed by the [UNICODE TERMS OF USE](https://www.unicode.org/copyright.html). Further specifications of rights and restrictions pertaining to the use of the Unicode data files and software can be found in the [Unicode Data Files and Software License](https://www.unicode.org/license.txt), a copy of which is included as [UNICODE-LICENSE](https://github.com/mlodewijck/unicode_charnames/blob/master/UNICODE-LICENSE).



            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/mlodewijck/unicode_charnames",
    "name": "unicode-charnames",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "Unicode,Unicode data,Unicode characters,character names,characters",
    "author": "Marc Lodewijck",
    "author_email": "mlodewijck@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/1d/49/5c3dcc011851c1383b9d10f75add58aae59135b2217986ad2d95f7df9ff1/unicode_charnames-15.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Look up Unicode character name or code point label and search in Unicode character names",
    "version": "15.1.0",
    "project_urls": {
        "Bug Reports": "https://github.com/mlodewijck/unicode_charnames/issues",
        "Homepage": "https://github.com/mlodewijck/unicode_charnames",
        "Source": "https://github.com/mlodewijck/unicode_charnames/"
    },
    "split_keywords": [
        "unicode",
        "unicode data",
        "unicode characters",
        "character names",
        "characters"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1d495c3dcc011851c1383b9d10f75add58aae59135b2217986ad2d95f7df9ff1",
                "md5": "99dd77c1b7e6ccbd24c09ad4e1ce2786",
                "sha256": "f40862331247976ffbb07dc0d889c71651416ea09c0a65b5eb4723946ddae929"
            },
            "downloads": -1,
            "filename": "unicode_charnames-15.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "99dd77c1b7e6ccbd24c09ad4e1ce2786",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 279024,
            "upload_time": "2023-11-11T18:39:59",
            "upload_time_iso_8601": "2023-11-11T18:39:59.064864Z",
            "url": "https://files.pythonhosted.org/packages/1d/49/5c3dcc011851c1383b9d10f75add58aae59135b2217986ad2d95f7df9ff1/unicode_charnames-15.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-11 18:39:59",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "mlodewijck",
    "github_project": "unicode_charnames",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "unicode-charnames"
}
        