ucs-detect
==========

:Version: 1.0.7
:Author: Jeff Quast
:License: MIT
:Home page: https://github.com/jquast/ucs-detect
:Summary: Detects Unicode support of an interactive terminal
:Upload time: 2024-01-06 21:22:45
:Keywords: cjk combining console eastasian emojiemulator terminal unicode wcswidth wcwidth xterm zwj

Without any arguments,

::

    $ ucs-detect

``ucs-detect`` automatically tests the Unicode version and support level of a
terminal emulator for Wide characters, Emoji Zero Width Joiner (ZWJ) sequences,
Emoji Variation Selector-16 (VS-16) sequences, and Zero-Width or combining
characters, grouped by language. A brief report is then printed to stdout.

.. figure:: https://dxtz6bzwq9sxx.cloudfront.net/ucs-detect.gif
   :alt: video demonstration of running ucs-detect

Installation & Usage
--------------------

To install or upgrade:

::

   $ pip install -U ucs-detect


To use::

   $ ucs-detect


To run a detailed test and store a YAML report to disk::

   $ ucs-detect --save-yaml=data/my-terminal.yaml --limit-codepoints=5000 --limit-words=5000 --limit-errors=500

Test Results
------------

More than twenty modern terminals for Windows, Linux, and Mac were tested.
Their results have been collected into this repository, and a detailed
summary is published at https://ucs-detect.readthedocs.io/results.html

An article describing the development of ucs-detect and summarizing the results
for the 1.0.4 release of ucs-detect (November 2023) is published at
https://www.jeffquast.com/post/ucs-detect-test-results/

Individual YAML reports for these terminals may also be inspected in
the repository folder ``data``,
https://github.com/jquast/ucs-detect/tree/master/data

Please note that results will be shared with Terminal Emulator projects, and
this information may become out of date as they improve their support for
Unicode. Please do not expect the maintainers of ucs-detect to update these data
files. If you would like the report for a given Terminal to be corrected, feel
free to submit a pull request that updates the YAML data files.

Problem
-------

Many East Asian languages contain Wide (W) or Fullwidth (F) characters, meaning
that each character occupies 2 cells instead of 1. Further, many languages
contain special combining characters that are "zero width", meaning they do not
occupy any cells, only modifying the previous one as a "combining" character.
Finally, there are "Zero Width Joiner" and "Variation Selector-16" characters
that are used in sequence for Emoji characters.
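
To make the three cases concrete, here is a minimal sketch that estimates cell
width using only Python's standard ``unicodedata`` module. The function name
``approx_cell_width`` is hypothetical, and the rules are a deliberate
simplification: this is not the algorithm of the wcwidth_ library, and the
result depends on the Unicode version bundled with the Python interpreter.

.. code-block:: python

    import unicodedata


    def approx_cell_width(text: str) -> int:
        """Approximate the number of terminal cells needed to display text.

        Simplified assumption: marks and format characters take 0 cells,
        Wide (W) and Fullwidth (F) characters take 2, everything else 1.
        """
        width = 0
        for ch in text:
            if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
                # Combining marks, Variation Selector-16 (U+FE0F) and the
                # Zero Width Joiner (U+200D) occupy no cell of their own.
                continue
            if unicodedata.east_asian_width(ch) in ("W", "F"):
                width += 2
            else:
                width += 1
        return width


    print(approx_cell_width("abc"))                # plain ASCII
    print(approx_cell_width("コンニチハ"))          # Wide characters
    print(approx_cell_width("cafe\u0301"))         # combining acute accent

Summing per character like this cannot account for Emoji ZWJ sequences: woman,
ZWJ, rocket (``"\U0001F469\u200D\U0001F680"``) adds up to 4 cells here, while a
terminal that joins the sequence would be expected to draw a single glyph. That
gap between a table lookup and what a terminal actually renders is the subject
of the rest of this section.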

A terminal application that displays these characters may have trouble
determining how they will be displayed to the end-user. This problem
happens often, because the Unicode Consortium releases new versions
of the Unicode Standard periodically, but the source code of libraries
and applications is not updated at the same time, or at all!

Finally, a terminal emulator may have varying levels of support. For example, at
the time of this writing, Microsoft's `Terminal.exe`_ supports up to Unicode
15.0 for Wide characters, is missing support for 27 characters of Unicode 13.0,
has no support for Emoji ZWJ, fully supports all VS-16 sequences, but fails to
correctly categorize many Zero-Width characters for 88 or more of the world's
languages.


Solution
--------

The most important factor is to determine whether the Terminal Emulator complies
with the Specification_ published by the Python wcwidth_ library.

This program, ``ucs-detect``, is able to **automatically detect** the version
and feature level of Unicode that the connecting Terminal supports for
WIDE, ZERO, ZWJ, and VS-16 characters.

How it works
------------

This program relies on the `Query Cursor Position`_ terminal sequence, which
asks, *"where is the cursor?"*. The sequence itself is not displayed, and a
Terminal Emulator responds to it automatically.

By printing test strings, using this sequence to see how far the cursor
advanced, and comparing against the data tables of the wcwidth_ library,
we can test for compliance with the Python wcwidth_ library Specification_.
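
The idea can be sketched in a few lines of Python. This is not the source code
of ``ucs-detect``; the names ``parse_cursor_reply``, ``query_cursor`` and
``measure_width`` are hypothetical, and the sketch assumes a POSIX terminal
that answers the standard ``ESC [ 6 n`` request with ``ESC [ row ; column R``.
It has no timeout, so it will wait indefinitely on a terminal that never
replies.

.. code-block:: python

    import os
    import re
    import sys
    from typing import Tuple

    # A cursor position report looks like ESC [ <row> ; <column> R
    CPR_REPLY = re.compile(r"\x1b\[(\d+);(\d+)R")


    def parse_cursor_reply(reply: str) -> Tuple[int, int]:
        """Return (row, column) from a cursor position report."""
        match = CPR_REPLY.search(reply)
        if match is None:
            raise ValueError("not a cursor position report: %r" % (reply,))
        return int(match.group(1)), int(match.group(2))


    def query_cursor() -> Tuple[int, int]:
        """Ask the terminal where the cursor is (POSIX terminals only)."""
        import termios  # imported here because it is unavailable on Windows
        import tty

        fd = sys.stdin.fileno()
        saved = termios.tcgetattr(fd)
        try:
            tty.setcbreak(fd)            # no echo, no line buffering
            sys.stdout.write("\x1b[6n")  # "where is the cursor?"
            sys.stdout.flush()
            reply = ""
            while not reply.endswith("R"):
                reply += os.read(fd, 1).decode("ascii", "replace")
        finally:
            termios.tcsetattr(fd, termios.TCSADRAIN, saved)
        return parse_cursor_reply(reply)


    def measure_width(text: str) -> int:
        """Return how many columns the cursor advanced after printing text."""
        sys.stdout.write("\r")
        sys.stdout.flush()
        _, start = query_cursor()
        sys.stdout.write(text)
        sys.stdout.flush()
        _, end = query_cursor()
        sys.stdout.write("\r\x1b[K")     # back to column 1, erase the line
        sys.stdout.flush()
        return end - start

In an interactive session, ``measure_width("コ")`` would be expected to return
2 on a terminal that treats the character as Wide, and a different value on one
that does not. Comparing such measurements against the expected widths is the
kind of check described above.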

The use of `Query Cursor Position`_ is inspired by the `resize(1)`_ program
distributed with X11, which determines the terminal size over transports that
are not capable of communicating by signal or forwarding by environment value,
such as over a serial line. `resize(1)`_ simply moves to (999, 999) then asks,
"where is my cursor?" and the response is understood to be the terminal size.

UNICODE_VERSION (legacy)
------------------------

.. note:: This feature is planned for deprecation, see https://github.com/jquast/wcwidth/issues/104

Versions of *ucs-detect* prior to 1.0 served a single purpose: to print an
sh_-compatible line that exports ``UNICODE_VERSION``. To continue this purpose,
use ``--shell --quick``, for example::

    $ ucs-detect --shell --quick
    UNICODE_VERSION=15.0.0; export UNICODE_VERSION

It is designed to be used interactively::

    $ eval "$(ucs-detect --quick --shell)"
    $ echo $UNICODE_VERSION
    15.0.0

The environment variable ``UNICODE_VERSION`` is currently used by the Python
wcwidth_ library, which contains every past Unicode table version, to determine
how dependent Python programs, such as IPython_, render wide and zero-width
characters.
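
A program that wants to honor the exported value itself could read it as
follows. This is a minimal sketch using only the standard library; the helper
name ``unicode_version_from_env`` is hypothetical and this is not how the
wcwidth_ library performs its own lookup.

.. code-block:: python

    import os
    from typing import Mapping, Optional, Tuple


    def unicode_version_from_env(
        environ: Mapping[str, str] = os.environ,
    ) -> Optional[Tuple[int, ...]]:
        """Return UNICODE_VERSION as a tuple of integers, or None if unset.

        Raises ValueError if the value is not a dotted version number.
        """
        value = environ.get("UNICODE_VERSION", "").strip()
        if not value:
            return None
        return tuple(int(part) for part in value.split("."))


    # After: eval "$(ucs-detect --quick --shell)"
    print(unicode_version_from_env({"UNICODE_VERSION": "15.0.0"}))

Returning a tuple rather than the raw string allows versions to be compared
numerically, so that ``(9, 0, 0) < (15, 0, 0)`` holds where a string comparison
would not.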

History
=======

- 1.0.7 (2024-01-06): Add python 3.10 compatibility for yaml file save and
  update wcwidth requirement to 0.2.13.

- 1.0.6 (2023-12-15): Distribution fix for UDHR data and bugfix for python 3.8
  through 3.11. *ucs-detect* welcomes `@GalaxySnail
  <https://github.com/GalaxySnail/>`_ as a new project contributor.

- 1.0.5 (2023-11-13): Set minimum wcwidth_ release version requirement.

- 1.0.4 (2023-11-13): Add support for Emoji with VS-16 and more complete testing.
  Published test results.

- 1.0.3 (2023-10-28): Drop python 2 support. Add more advanced testing. Change
  default behavior when called without arguments; use ``ucs-detect --quick
  --shell`` for the new release to match the behavior of previous releases.

- 0.0.4 (2020-06-20): Initial releases and bugfixes.

.. _IPython: https://ipython.org/
.. _python-prompt-toolkit: https://github.com/prompt-toolkit/python-prompt-toolkit/blob/master/PROJECTS.rst#projects-using-prompt_toolkit
.. _sh: https://en.wikipedia.org/wiki/Bourne_shell
.. _wcwidth: https://github.com/jquast/wcwidth
.. _`Query Cursor Position`: https://blessed.readthedocs.io/en/latest/location.html#finding-the-cursor
.. _`resize(1)`: https://github.com/joejulian/xterm/blob/master/resize.c
.. _Specification: https://wcwidth.readthedocs.io/en/latest/specs.html
.. _`Terminal.exe`: https://ucs-detect.readthedocs.io/sw_results/Terminalexe.html#terminalexe

            
