<h1 align="center">
<!-- es7s/holms -->
<a href="##"><img align="left" src="https://s3.eu-north-1.amazonaws.com/dp2.dl/readme/es7s/holms/logo.png?v=2" width="160" height="64"></a>
<a href="##"><img align="center" src="https://s3.eu-north-1.amazonaws.com/dp2.dl/readme/es7s/holms/label.png" width="200" height="64"></a>
<a href="##"><img align="right" src="https://s3.eu-north-1.amazonaws.com/dp2.dl/readme/empty.png" width="160" height="64"></a>
</h1>
<div align="right">
<a href="##"><img src="https://img.shields.io/badge/python-3.10-3776AB?logo=python&logoColor=white&labelColor=333333"></a>
<a href="https://pepy.tech/project/holms/"><img alt="Downloads" src="https://pepy.tech/badge/holms"></a>
<a href="https://pypi.org/project/holms/"><img alt="PyPI" src="https://img.shields.io/pypi/v/holms"></a>
<a href='https://coveralls.io/github/es7s/holms?branch=master'><img src='https://coveralls.io/repos/github/es7s/holms/badge.svg?branch=master' alt='Coverage Status' /></a>
<a href="https://github.com/psf/black"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
<a href="##"><img src="https://wakatime.com/badge/user/8eb9e217-791b-436f-b729-81eb63e84b08/project/018b5923-4968-4029-ae8d-3776792f88d5.svg"></a>
</div>
<br>
CLI UTF-8 decomposer for text analysis, capable of displaying Unicode code point
names and categories, along with ASCII control characters, UTF-16 surrogate pair
pieces, parts of invalid UTF-8 sequences as separate bytes, etc.

Motivation
---------------------------
A necessity for a tool that can quickly identify otherwise indistinguishable
Unicode code points.
Installation
---------------------------
### With `pipx` (recommended)

    pipx install holms

### From git repository

    curl -sS https://github.com/es7s/holms/blob/master/install.sh | sh

Basic usage
---------------------------

    Usage: holms run [OPTIONS] [INPUT]

      Read data from INPUT file, find all valid UTF-8 byte sequences, decode them and display as
      separate Unicode code points. Use '-' as INPUT to read from stdin instead.

<div align="center">
<img alt="example001" width="49%" src="https://github.com/es7s/holms/assets/50381946/a9c9bcdd-42d5-4038-a23a-22b91bb7cc7d">
<img alt="example004" width="49%" src="https://github.com/es7s/holms/assets/50381946/fd1b4bc3-aacc-42af-8442-2db3c3984a13">
<img alt="example002" width="49%" src="https://github.com/es7s/holms/assets/50381946/0a126747-3b29-44da-9d94-ab5f01a63d68">
<img alt="example003" width="49%" src="https://github.com/es7s/holms/assets/50381946/8e217ae3-325c-4629-8cda-389882667aa4">
</div>
<details>
<summary>Plain text output</summary>
<!-- @sub:example001.png.txt -->
> holms run -u - <<<'1₂³⅘↉⏨'
0 U+ 31 ▕ 1 ▏ Nd DIGIT ONE
1 U+2082 ▕ ₂ ▏ No SUBSCRIPT TWO
4 U+ B3 ▕ ³ ▏ No SUPERSCRIPT THREE
6 U+2158 ▕ ⅘ ▏ No VULGAR FRACTION FOUR FIFTHS
9 U+2189 ▕ ↉ ▏ No VULGAR FRACTION ZERO THIRDS
c U+23E8 ▕ ⏨ ▏ So DECIMAL EXPONENT SYMBOL
<!-- @sub -->
<!-- @sub:example004.png.txt -->
> holms run -u - <<<'🌯👄🤡🎈🐳🐍'
00 U1F32F ▕🌯 ▏ So BURRITO
04 U1F444 ▕👄 ▏ So MOUTH
08 U1F921 ▕🤡 ▏ So CLOWN FACE
0c U1F388 ▕🎈 ▏ So BALLOON
10 U1F433 ▕🐳 ▏ So SPOUTING WHALE
14 U1F40D ▕🐍 ▏ So SNAKE
<!-- @sub -->
<!-- @sub:example002.png.txt -->
> holms run -u - <<<'aаͣāãâȧäåₐᵃa'
00 U+ 61 ▕ a ▏ Ll LATIN SMALL LETTER A
01 U+ 430 ▕ а ▏ Ll CYRILLIC SMALL LETTER A
03 U+ 363 ▕ ͣ ▏ Mn COMBINING LATIN SMALL LETTER A
05 U+ 101 ▕ ā ▏ Ll LATIN SMALL LETTER A WITH MACRON
07 U+ E3 ▕ ã ▏ Ll LATIN SMALL LETTER A WITH TILDE
09 U+ E2 ▕ â ▏ Ll LATIN SMALL LETTER A WITH CIRCUMFLEX
0b U+ 227 ▕ ȧ ▏ Ll LATIN SMALL LETTER A WITH DOT ABOVE
0d U+ E4 ▕ ä ▏ Ll LATIN SMALL LETTER A WITH DIAERESIS
0f U+ E5 ▕ å ▏ Ll LATIN SMALL LETTER A WITH RING ABOVE
11 U+2090 ▕ ₐ ▏ Lm LATIN SUBSCRIPT SMALL LETTER A
14 U+1D43 ▕ ᵃ ▏ Lm MODIFIER LETTER SMALL A
17 U+FF41 ▕a ▏ Ll FULLWIDTH LATIN SMALL LETTER A
<!-- @sub -->
<!-- @sub:example003.png.txt -->
> holms run -u - <<<'%‰∞8᪲?¿‽⚠⚠️'
00 U+ 25 ▕ % ▏ Po PERCENT SIGN
01 U+2030 ▕ ‰ ▏ Po PER MILLE SIGN
04 U+221E ▕ ∞ ▏ Sm INFINITY
07 U+ 38 ▕ 8 ▏ Nd DIGIT EIGHT
08 U+1AB2 ▕ ᪲ ▏ Mn COMBINING INFINITY
0b U+ 3F ▕ ? ▏ Po QUESTION MARK
0c U+ BF ▕ ¿ ▏ Po INVERTED QUESTION MARK
0e U+203D ▕ ‽ ▏ Po INTERROBANG
11 U+26A0 ▕ ⚠ ▏ So WARNING SIGN
14 U+26A0 ▕ ⚠ ▏ So WARNING SIGN
17 U+FE0F ▕ ️ ▏ Mn VARIATION SELECTOR-16
<!-- @sub -->
</details>
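
For readers who want to see what such a breakdown involves, here is a minimal Python sketch built on the standard `unicodedata` module. It is not how holms itself is implemented; the function name `decompose` and the `<unnamed>` placeholder are assumptions made for illustration only.

```python
import unicodedata


def decompose(text: str):
    """Yield (byte_offset, code_point, category, name) for each character."""
    offset = 0
    for ch in text:
        # Control codes have no Unicode name, so fall back to a placeholder.
        name = unicodedata.name(ch, "<unnamed>")
        yield offset, f"U+{ord(ch):04X}", unicodedata.category(ch), name
        # Offsets are counted in UTF-8 bytes, not in characters.
        offset += len(ch.encode("utf-8"))


for row in decompose("1₂³⅘"):
    print(*row)
```

The byte offsets advance by the encoded length of each character, which is why the second and third rows are three bytes apart.
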
Buffering
---------------------------------
The application works in two modes: **buffered** (the default if INPUT is a
file) and **unbuffered** (default when reading from stdin). Options `-b`/`-u`
explicitly override output mode regardless of the default setting.
In **buffered** mode the result begins to appear only after EOF is encountered
(i.e., the WHOLE file has been read to the buffer). This is suitable for short
and predictable inputs and produces the most compact output with fixed column
sizes.
The **unbuffered** mode comes in handy when input is an endless piped stream:
the results will be displayed in real time, as soon as the type of each byte
sequence is determined, but the output column widths are not fixed and can vary
as the process goes further.
> Despite the name, the app actually uses a tiny (4-byte) input buffer, as this is
> the only way to handle a UTF-8 stream and distinguish valid sequences from broken
> ones; in a truly unbuffered mode the output would consist of 7-bit ASCII characters
> (`0x00`-`0x7F`) and unrecognized binary data (`0x80`-`0xFF`) only, which is not
> what the application was made for.
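
The reason a 4-byte window is enough can be shown with a short Python sketch. This is an illustration under stated assumptions, not the actual holms code: the names `expected_length` and `scan` are hypothetical, and the stream is assumed to return the requested number of bytes unless it has reached EOF.

```python
import io
from typing import BinaryIO, Iterator, Union


def expected_length(lead: int) -> int:
    """Sequence length implied by a UTF-8 lead byte, or 0 if it cannot start one."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0  # continuation byte or a byte that is never valid as a lead byte


def scan(stream: BinaryIO) -> Iterator[Union[str, int]]:
    """Yield decoded characters (str) and undecodable bytes (int) in input order."""
    buf = b""
    while True:
        buf += stream.read(4 - len(buf))  # top the window up to at most 4 bytes
        if not buf:
            return
        n = expected_length(buf[0])
        char = None
        if n and len(buf) >= n:
            try:
                char = buf[:n].decode("utf-8")  # strict decoding
            except UnicodeDecodeError:
                pass
        if char is None:
            yield buf[0]  # report the offending byte on its own
            buf = buf[1:]
        else:
            yield char
            buf = buf[n:]


data = b"a\xc2\x80\x80\xe2\x82" + "\U0001F40D".encode("utf-8")
print(list(scan(io.BytesIO(data))))
```

A UTF-8 sequence is never longer than four bytes, so looking at the lead byte plus up to three following bytes is sufficient to decide whether the sequence is valid or whether the first byte has to be reported as raw binary data.
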
Configuration / Advanced usage
----------------------------------
[//]: # (@sub:help.txt)

    Options:
      -b, --buffered / -u, --unbuffered
                      Explicitly set to wait for EOF before processing the
                      output (buffered), or to stream the results in
                      parallel with reading, as soon as possible
                      (unbuffered). See BUFFERING section above for the
                      details.
      -m, --merge     Replace all sequences of repeating characters with one
                      of each, together with initial length of the sequence.
      -g, --group     Group the input by code points (=count unique), sort
                      descending and display counts instead of normal
                      output. Implies '--merge' and forces buffered ('-b')
                      mode. Specifying the option twice ('-gg') results in
                      grouping by code point category instead, while doing
                      it thrice ('-ggg') makes the app group the input by
                      super categories.
      -f, --format    Comma-separated list of columns to show (order is
                      preserved). Run 'holms format' to see the details.
      -n, --names     Display names instead of abbreviations. Affects `cat`
                      and `block` columns, but only if column in question is
                      already present on the screen. Note that these columns
                      can still display only the beginning of the attribute,
                      unless '-r' is provided.
      -a, --all       Display ALL columns.
      -r, --rigid     By default some columns can be compressed beyond the
                      nominal width, if all current values fit and there is
                      still space left. This option disables column
                      shrinking (but they still will be expanded when
                      needed).
      --decimal       Use decimal byte offsets instead of hexadecimal.
      --alt           Use alternative notation for control characters: caret
                      notation for ASCII C0, octal notation for ASCII C1.
      --oneline       Discard all newline characters (0x0a LINE FEED) from
                      the input.
      --no-table      Do not format results as a table, just apply the
                      colors to characters (equivalent to '-f char', implies
                      '-b'). Compatible with '-merge', '--format' and even '
                      --group'.
      --no-override   Do not replace control/whitespace code point markers
                      with distinguishable characters ('▯' to '↵', '␣' etc).
                      Run 'holms legend' to see the details.
      -?, --help      Show this message and exit.

[//]: # (@sub)
Examples
--------------------------
### Output column selection
The `-f`/`--format` option can be used to specify which columns to display. As an
alternative, the `-a`/`--all` option enables all currently available columns.
<details>
<summary><b>Column availability depending on operating mode</b></summary>
<div align="center">
<img alt="example010" src="https://github.com/es7s/holms/assets/50381946/62a6f354-1f30-4ee8-a8fc-533b1a980e03">
</div>
</details>
The `-m`/`--merge` option is also demonstrated here; it tells the app to collapse
runs of repeated characters into one line of the output while counting them:
<div align="center">
<img alt="example005" src="https://github.com/es7s/holms/assets/50381946/6da31546-0e50-4fa0-af69-0b7a8ed5d4c3">
</div>
<details>
<summary>Plain text output</summary>
<!-- @sub:example005.png.txt -->
> holms run -m phpstan.txt
000 U+2B ▕ + ▏ Sm PLUS SIGN
001+ U+2D ▕ - ▏ Pd 27× HYPHEN-MINUS
01c U+2B ▕ + ▏ Sm PLUS SIGN
01d U+20 ▕ ␣ ▏ Zs SPACE
01e U+2B ▕ + ▏ Sm PLUS SIGN
01f+ U+2D ▕ - ▏ Pd 27× HYPHEN-MINUS
03a U+2B ▕ + ▏ Sm PLUS SIGN
03b U+ A ▕ ↵ ▏ Cc ASCII C0 [LF] LINE FEED
03c U+7C ▕ | ▏ Sm VERTICAL LINE
03d+ U+20 ▕ ␣ ▏ Zs 27× SPACE
...
<!-- @sub -->
</details>
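
The merging idea itself is a plain run-length grouping. A minimal Python sketch, assuming nothing about the holms internals (`merge_runs` is a hypothetical helper name):

```python
from itertools import groupby


def merge_runs(text: str):
    """Collapse each run of identical characters into a (char, run_length) pair."""
    return [(ch, sum(1 for _ in run)) for ch, run in groupby(text)]


# The first line of an ASCII table border, similar to the example above.
print(merge_runs("+" + "-" * 27 + "+ +"))
```

Only adjacent repetitions are merged; the same character appearing again later starts a new run, which matches the separate `PLUS SIGN` rows in the output above.
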
### Reading from pipeline
There is an official Unicode Consortium data file included in the repository for
test purposes, named [confusables.txt](tests/data/confusables.txt). In the next
example we extract line **#3620** using `sed`, delete all TAB (`0x09`) characters
and feed the result to the application. The output shows various Unicode
dot/bullet code points:
<div align="center">
<img alt="example006" src="https://github.com/es7s/holms/assets/50381946/78a90c45-d331-46d9-998e-20c6c9a97f12">
</div>
<details>
<summary>Plain text output</summary>
<!-- @sub:example006.png.txt -->
> sed confusables.txt -Ee 'sg' -e '3620!d' |
holms run -
00 U+ B7 ▕ · ▏ Po MIDDLE DOT
02 U+1427 ▕ ᐧ ▏ Lo CANADIAN SYLLABICS FINAL MIDDLE DOT
05 U+ 387 ▕ · ▏ Po GREEK ANO TELEIA
07 U+2022 ▕ • ▏ Po BULLET
0a U+2027 ▕ ‧ ▏ Po HYPHENATION POINT
0d U+2219 ▕ ∙ ▏ Sm BULLET OPERATOR
10 U+22C5 ▕ ⋅ ▏ Sm DOT OPERATOR
13 U+30FB ▕・ ▏ Po KATAKANA MIDDLE DOT
16 U10101 ▕ 𐄁 ▏ Po AEGEAN WORD SEPARATOR DOT
1a U+FF65 ▕ ・ ▏ Po HALFWIDTH KATAKANA MIDDLE DOT
1d U+ A ▕ ↵ ▏ Cc ASCII C0 [LF] LINE FEED
<!-- @sub -->
</details>
### Code points / categories statistics
The `-g`/`--group` option can be used to count unique code points and to compute
the occurrence rate of each one:
<div align="center">
<img alt="example008" src="https://github.com/es7s/holms/assets/50381946/f89be555-cf7e-4766-90b2-61a02140c54e">
</div>
<details>
<summary>Plain text output</summary>
<!-- @sub:example008.png.txt -->
> holms run -g ./tests/data/confusables.txt
U+ 20 ▕ ␣ ▏ Zs 12.5% ███ 62732× SPACE
U+ 9 ▕ ⇥ ▏ Cc 7.3% █▊ 36745× ASCII C0 [HT] HORIZONTAL TABULATION
U+ 41 ▕ A ▏ Lu 6.1% █▍ 30555× LATIN CAPITAL LETTER A
U+ 49 ▕ I ▏ Lu 5.2% █▏ 26063× LATIN CAPITAL LETTER I
U+ 45 ▕ E ▏ Lu 5.0% █▏ 24992× LATIN CAPITAL LETTER E
U+ 54 ▕ T ▏ Lu 3.7% ▉ 18776× LATIN CAPITAL LETTER T
U+ 4C ▕ L ▏ Lu 3.7% ▉ 18763× LATIN CAPITAL LETTER L
U+200E ▕ ▯ ▏ Cf 3.7% ▉ 18494× LEFT-TO-RIGHT MARK
U+ A ▕ ↵ ▏ Cc 2.9% ▋ 14609× ASCII C0 [LF] LINE FEED
U+ 43 ▕ C ▏ Lu 2.9% ▋ 14450× LATIN CAPITAL LETTER C
...
<!-- @sub -->
</details>
When used twice (`-gg`) or thrice (`-ggg`), the application groups the input by
code point category or code point super category, respectively, which can be used
e.g. for frequency analysis:
<div align="center">
<img alt="example011" src="https://github.com/es7s/holms/assets/50381946/18018b0c-7978-48aa-b3be-4923167bb425">
<img alt="example012" src="https://github.com/es7s/holms/assets/50381946/1128d864-aad9-4203-ae9c-af2ea0f3ad9f">
</div>
<details>
<summary>Plain text output</summary>
<!-- @sub:example011.png.txt -->
> holms run -gg ./tests/data/confusables.txt
53.1% ██████████ 266233× Uppercase_Letter
12.5% ██▎ 62748× Space_Separator
10.2% █▉ 51356× Control
8.5% █▌ 42511× Decimal_Number
3.7% ▋ 18497× Format
3.0% ▌ 14832× Other_Letter
2.0% ▎ 9778× Math_Symbol
1.8% ▎ 9261× Close_Punctuation
1.8% ▎ 9259× Open_Punctuation
1.5% ▎ 7525× Other_Punctuation
...
<!-- @sub -->
<!-- @sub:example012.png.txt -->
> holms run -ggg ./tests/data/confusables.txt
56.7% ██████████ 284074× Letter
13.9% ██▍ 69853× Other(C)
12.5% ██▏ 62750× Separator(Z)
8.5% █▌ 42796× Number
5.9% █ 29571× Punctuation
2.2% ▍ 11072× Symbol
0.2% ▏ 965× Mark
<!-- @sub -->
</details>
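
The three grouping levels can be approximated with `collections.Counter` and `unicodedata.category`. The sketch below is an assumption-laden illustration rather than the holms algorithm: the function name `group` and its `level` parameter are hypothetical, and it prints two-letter category abbreviations instead of the long names shown above.

```python
import unicodedata
from collections import Counter


def group(text: str, level: int = 1):
    """Return (key, count, percent) rows sorted by descending count.

    level 1 groups by code point, level 2 by general category (e.g. 'Lu'),
    level 3 by the first letter of the category (e.g. 'L').
    """
    if level == 1:
        key = lambda ch: f"U+{ord(ch):04X}"
    elif level == 2:
        key = unicodedata.category
    else:
        key = lambda ch: unicodedata.category(ch)[0]
    counts = Counter(key(ch) for ch in text)
    total = sum(counts.values())
    return [(k, n, 100 * n / total) for k, n in counts.most_common()]


for row in group("AAB b", level=2):
    print(row)
```
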
### In-place type highlighting
When `--format` is specified as exactly one `char` column (`--format=char`),
the application omits all other columns and prints the original file contents,
while highlighting each character with a color that indicates its Unicode
category.
> Note that ASCII control codes, as well as Unicode ones, are kept
untouched and invisible.
<div align="center">
<img alt="example007" src="https://github.com/es7s/holms/assets/50381946/78ca318c-e295-41ff-b37d-d45d95842295">
</div>
<details>
<summary>Plain text output</summary>
<!-- @sub:example007.png.txt -->
> sed chars.txt -nEe 1,12p |
holms run --format=char -
! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z [ \ ] ^ _
` a b c d e f g h i j k l m n o
p q r s t u v w x y z { | } ~
¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯
° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß
à á â ã ä å æ ç è é ê ë ì í î ï
ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ
<!-- @sub -->
</details>
ASCII Latin letters (`A-Za-z`) are deliberately colored 50% gray instead of the
regular white: this can be extremely helpful when the task is to find
non-ASCII character(s) in a massive text of plain ASCII ones, or vice versa.
Below is a real example of broken characters that result from two
operations being applied in the wrong order: *UTF-8 decoding* and *URL %-based
unescaping*. This error differs from incorrect codepage selection errors,
which mess up the whole text or a part of it: here all byte sequences are valid
UTF-8 encoded code points, but the result differs from the original and is
completely unreadable nevertheless.
<div align="center">
<img alt="example015" src="https://github.com/es7s/holms/assets/50381946/738b5bbe-291f-4ade-bf97-66c1e8368281">
</div>
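
One way this kind of damage can arise is sketched below in Python with the standard `urllib.parse` module. The sample string is an assumption chosen for illustration and is not the data from the screenshot above.

```python
from urllib.parse import unquote, unquote_to_bytes

# "привет" encoded as UTF-8 and then percent-escaped.
escaped = "%D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82"

# Right order: unescape to raw bytes first, then decode those bytes as UTF-8.
right = unquote_to_bytes(escaped).decode("utf-8")

# Wrong order: the text is treated as already decoded, so every %XX escape
# becomes one code point in the range U+0000-U+00FF instead of one byte.
wrong = unquote(escaped, encoding="latin-1")

print(right)
print(len(wrong), wrong.encode("utf-8"))
```

The damaged string still encodes to perfectly valid UTF-8, which is why the problem cannot be detected by validation alone and has to be spotted by looking at the code points.
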
### ASCII C0 / C1 details
While developing the application I encountered what seemed at first to be
strange behaviour of the Python interpreter: it encoded C1 control bytes
as two bytes of UTF-8, while C0 control bytes were output as single bytes, as if
they had been encoded in plain ASCII. A bit of research followed.
According to [ISO/IEC 6429 (ECMA-48)](https://www.iso.org/standard/12782.html),
there are two types of ASCII control codes (to be precise, many more, but for
our purposes that is mostly irrelevant): C0 and C1. The first type includes ASCII
code points `0x00`-`0x1F` and `0x7F` (some authors also include the regular space
character `0x20` in this list), and its characteristic property is
that all C0 code points are encoded in UTF-8 **exactly the same** as they are in
7-bit US-ASCII ([ISO/IEC 646](https://www.iso.org/standard/4777.html)). This
helps to disambiguate which encoding is used even for broken byte
sequences, when the task is to tell whether a byte represents a sole code point
or is actually a part of a multibyte UTF-8 sequence.
However, C1 control codes are represented by the bytes `0x80`-`0x9F`, which also
serve as continuation bytes in multibyte UTF-8 sequences. In order to distinguish
the first use from the second, UTF-8 encodes them as two-byte sequences instead
(`0x80` → `0xC280`, etc.); this applies not only to control codes, but to all other
[ISO/IEC 8859](https://www.iso.org/standard/28245.html) code points starting
from `0x80`.
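
The difference is easy to reproduce in plain Python. In this small sketch `utf8_bytes` is a hypothetical helper name used only for illustration:

```python
def utf8_bytes(code_point: int) -> bytes:
    """Return the UTF-8 encoding of a single code point."""
    return chr(code_point).encode("utf-8")


# C0 codes and DEL stay single bytes; C1 codes and the rest of Latin-1 become two.
for cp in (0x0A, 0x1B, 0x7F, 0x80, 0x9F, 0xE9):
    print(f"U+{cp:04X} -> {utf8_bytes(cp).hex(' ')}")

# A lone 0x80 byte is a continuation byte without a lead byte, so it is not valid UTF-8.
try:
    b"\x80".decode("utf-8")
except UnicodeDecodeError as exc:
    print("invalid:", exc.reason)
```
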
With this in mind, let's see how the application reflects these differences.
The first command produces several 8-bit ASCII C1 control codes, which are
classified as raw binary/non-UTF-8 data, while the second command's output
consists of the very same code points encoded in UTF-8 (thanks to
Python's transparent Unicode support, we don't even need to bother much
about encodings):
<div align="center">
<img alt="example013" src="https://github.com/es7s/holms/assets/50381946/884d3269-6323-41f1-9eab-6dccd83c5d6d">
</div>
<details>
<summary>Plain text output</summary>
<!-- @sub:example013.png.txt -->
> { printf "\x80\x90\x9f" && python3 -c 'print("\x80\x90\x9f", end="")'; } |
holms run --names --decimal --all -
⏨0 #0 0x 80 -- ▕ ▯ ▏ NON UTF-8 BYTE 0x80 -- Binary
⏨1 #1 0x 90 -- ▕ ▯ ▏ NON UTF-8 BYTE 0x90 -- Binary
⏨2 #2 0x 9f -- ▕ ▯ ▏ NON UTF-8 BYTE 0x9F -- Binary
⏨3 #3 0x c2 80 U+80 ▕ ▯ ▏ ASCII C1 [PC] PADDING CHARACTER Latin-1 Supplem‥ Control
⏨5 #4 0x c2 90 U+90 ▕ ▯ ▏ ASCII C1 [DCS] DEVICE CONTROL STRING Latin-1 Supplem‥ Control
⏨7 #5 0x c2 9f U+9F ▕ ▯ ▏ ASCII C1 [APC] APPLICATION PROGRAM COMMAND Latin-1 Supplem‥ Control
<!-- @sub -->
</details>
Legend
------------------
The image below illustrates the color scheme developed for the app specifically,
to simplify distinguishing code points of one category from others.
<div align="center">
<img alt="example009" src="https://github.com/es7s/holms/assets/50381946/f9cac3b0-adab-45a3-a324-174ad7f06d44">
</div>
The most frequently encountered control codes also have unique replacement
characters, which makes it possible to recognize them without reading the label
or memorizing code point identifiers:
<div align="center">
<img alt="example014" src="https://github.com/es7s/holms/assets/50381946/2b77d06a-5e3d-4837-973c-78454e687113">
</div>
<details>
<summary><b>Unicode Blocks</b></summary>
<div align="center">
<img alt="blocks" src="https://github.com/es7s/holms/assets/50381946/8244553b-fc2d-419e-8b11-388ed0738bad"/>
</div>
</details>
Changelog
------------------
[CHANGES.rst](CHANGES.rst)
Raw data
{
"_id": null,
"home_page": null,
"name": "holms",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "analyzer, breakdown, console, terminal, text, unicode",
"author": null,
"author_email": "Aleksandr Shavykin <0.delameter@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/8f/c0/dde00bb1644b0989641551f572dd6fd08be1e5f81016009070397fa11c60/holms-1.6.0.tar.gz",
"platform": null,
"description": "<h1 align=\"center\">\n <!-- es7s/holms -->\n <a href=\"##\"><img align=\"left\" src=\"https://s3.eu-north-1.amazonaws.com/dp2.dl/readme/es7s/holms/logo.png?v=2\" width=\"160\" height=\"64\"></a>\n <a href=\"##\"><img align=\"center\" src=\"https://s3.eu-north-1.amazonaws.com/dp2.dl/readme/es7s/holms/label.png\" width=\"200\" height=\"64\"></a>\n <a href=\"##\"><img align=\"right\" src=\"https://s3.eu-north-1.amazonaws.com/dp2.dl/readme/empty.png\" width=\"160\" height=\"64\"></a>\n</h1>\n<div align=\"right\">\n <a href=\"##\"><img src=\"https://img.shields.io/badge/python-3.10-3776AB?logo=python&logoColor=white&labelColor=333333\"></a>\n <a href=\"https://pepy.tech/project/holms/\"><img alt=\"Downloads\" src=\"https://pepy.tech/badge/holms\"></a>\n <a href=\"https://pypi.org/project/holms/\"><img alt=\"PyPI\" src=\"https://img.shields.io/pypi/v/holms\"></a>\n <a href='https://coveralls.io/github/es7s/holms?branch=master'><img src='https://coveralls.io/repos/github/es7s/holms/badge.svg?branch=master' alt='Coverage Status' /></a>\n <a href=\"https://github.com/psf/black\"><img alt=\"Code style: black\" src=\"https://img.shields.io/badge/code%20style-black-000000.svg\"></a>\n <a href=\"##\"><img src=\"https://wakatime.com/badge/user/8eb9e217-791b-436f-b729-81eb63e84b08/project/018b5923-4968-4029-ae8d-3776792f88d5.svg\"></a>\n</div>\n<br>\n\nCLI UTF-8 decomposer for text analysis capable of displaying Unicode code point\nnames and categories, along with ASCII control characters, UTF-16 surrogate pair\npieces, invalid UTF-8 sequences parts as separate bytes, etc.\n\n\nMotivation\n---------------------------\n\nA necessity for a tool that can quickly identify otherwise indistinguishable\nUnicode code points.\n\n\nInstallation\n---------------------------\n### With `pipx` (recommended)\n pipx install holms\n\n### From git repository\n curl -sS https://github.com/es7s/holms/blob/master/install.sh | sh\n\n\nBasic usage\n---------------------------\n\n 
Usage: holms run [OPTIONS] [INPUT]\n \n Read data from INPUT file, find all valid UTF-8 byte sequences, decode them and display as\n separate Unicode code points. Use '-' as INPUT to read from stdin instead.\n\n<div align=\"center\">\n <img alt=\"example001\" width=\"49%\" src=\"https://github.com/es7s/holms/assets/50381946/a9c9bcdd-42d5-4038-a23a-22b91bb7cc7d\">\n <img alt=\"example004\" width=\"49%\" src=\"https://github.com/es7s/holms/assets/50381946/fd1b4bc3-aacc-42af-8442-2db3c3984a13\">\n <img alt=\"example002\" width=\"49%\" src=\"https://github.com/es7s/holms/assets/50381946/0a126747-3b29-44da-9d94-ab5f01a63d68\">\n <img alt=\"example003\" width=\"49%\" src=\"https://github.com/es7s/holms/assets/50381946/8e217ae3-325c-4629-8cda-389882667aa4\">\n</div>\n\n<details>\n <summary>Plain text output</summary>\n <!-- @sub:example001.png.txt -->\n\n > holms run -u - <<<'1\u2082\u00b3\u2158\u2189\u23e8'\n \n 0 U+ 31 \u2595 1 \u258f Nd DIGIT ONE\n 1 U+2082 \u2595 \u2082 \u258f No SUBSCRIPT TWO\n 4 U+ B3 \u2595 \u00b3 \u258f No SUPERSCRIPT THREE\n 6 U+2158 \u2595 \u2158 \u258f No VULGAR FRACTION FOUR FIFTHS\n 9 U+2189 \u2595 \u2189 \u258f No VULGAR FRACTION ZERO THIRDS\n c U+23E8 \u2595 \u23e8 \u258f So DECIMAL EXPONENT SYMBOL\n\n <!-- @sub -->\n <!-- @sub:example004.png.txt -->\n\n > holms run -u - <<<'\ud83c\udf2f\ud83d\udc44\ud83e\udd21\ud83c\udf88\ud83d\udc33\ud83d\udc0d'\n \n 00 U1F32F \u2595\ud83c\udf2f \u258f So BURRITO\n 04 U1F444 \u2595\ud83d\udc44 \u258f So MOUTH\n 08 U1F921 \u2595\ud83e\udd21 \u258f So CLOWN FACE\n 0c U1F388 \u2595\ud83c\udf88 \u258f So BALLOON\n 10 U1F433 \u2595\ud83d\udc33 \u258f So SPOUTING WHALE\n 14 U1F40D \u2595\ud83d\udc0d \u258f So SNAKE\n\n <!-- @sub -->\n <!-- @sub:example002.png.txt -->\n\n > holms run -u - <<<'a\u0430\u0363\u0101\u00e3\u00e2\u0227\u00e4\u00e5\u2090\u1d43\uff41'\n \n 00 U+ 61 \u2595 a \u258f Ll LATIN SMALL LETTER A\n 01 U+ 430 \u2595 \u0430 \u258f Ll CYRILLIC SMALL LETTER A\n 03 U+ 363 \u2595 \u0363 \u258f Mn 
COMBINING LATIN SMALL LETTER A\n 05 U+ 101 \u2595 \u0101 \u258f Ll LATIN SMALL LETTER A WITH MACRON\n 07 U+ E3 \u2595 \u00e3 \u258f Ll LATIN SMALL LETTER A WITH TILDE\n 09 U+ E2 \u2595 \u00e2 \u258f Ll LATIN SMALL LETTER A WITH CIRCUMFLEX\n 0b U+ 227 \u2595 \u0227 \u258f Ll LATIN SMALL LETTER A WITH DOT ABOVE\n 0d U+ E4 \u2595 \u00e4 \u258f Ll LATIN SMALL LETTER A WITH DIAERESIS\n 0f U+ E5 \u2595 \u00e5 \u258f Ll LATIN SMALL LETTER A WITH RING ABOVE\n 11 U+2090 \u2595 \u2090 \u258f Lm LATIN SUBSCRIPT SMALL LETTER A\n 14 U+1D43 \u2595 \u1d43 \u258f Lm MODIFIER LETTER SMALL A\n 17 U+FF41 \u2595\uff41 \u258f Ll FULLWIDTH LATIN SMALL LETTER A\n\n <!-- @sub -->\n <!-- @sub:example003.png.txt -->\n\n > holms run -u - <<<'%\u2030\u221e8\u1ab2?\u00bf\u203d\u26a0\u26a0\ufe0f'\n \n 00 U+ 25 \u2595 % \u258f Po PERCENT SIGN\n 01 U+2030 \u2595 \u2030 \u258f Po PER MILLE SIGN\n 04 U+221E \u2595 \u221e \u258f Sm INFINITY\n 07 U+ 38 \u2595 8 \u258f Nd DIGIT EIGHT\n 08 U+1AB2 \u2595 \u1ab2 \u258f Mn COMBINING INFINITY\n 0b U+ 3F \u2595 ? \u258f Po QUESTION MARK\n 0c U+ BF \u2595 \u00bf \u258f Po INVERTED QUESTION MARK\n 0e U+203D \u2595 \u203d \u258f Po INTERROBANG\n 11 U+26A0 \u2595 \u26a0 \u258f So WARNING SIGN\n 14 U+26A0 \u2595 \u26a0 \u258f So WARNING SIGN\n 17 U+FE0F \u2595 \ufe0f \u258f Mn VARIATION SELECTOR-16\n\n <!-- @sub -->\n</details> \n\n\nBuffering\n---------------------------------\n\nThe application works in two modes: **buffered** (the default if INPUT is a\nfile) and **unbuffered** (default when reading from stdin). Options `-b`/`-u`\nexplicitly override output mode regardless of the default setting.\n\nIn **buffered** mode the result begins to appear only after EOF is encountered\n(i.e., the WHOLE file has been read to the buffer). 
This is suitable for short\nand predictable inputs and produces the most compact output with fixed column\nsizes.\n\nThe **unbuffered** mode comes in handy when input is an endless piped stream:\nthe results will be displayed in real time, as soon as the type of each byte\nsequence is determined, but the output column widths are not fixed and can vary\nas the process goes further.\n\n> Despite the name, the app actually uses tiny (4 bytes) input buffer, but it's\n> the only way to handle UTF-8 stream and distinguish valid sequences from broken\n> ones; in truly unbuffered mode the output would consist of ASCII-7 characters\n> (`0x00`-`0x7F`) and unrecognized binary data (`0x80`-`0xFF`) only, which is not\n> something the application was made for.\n\n\nConfiguration / Advanced usage\n----------------------------------\n[//]: # (@sub:help.txt)\n\n Options:\n -b, --buffered / -u, --unbuffered\n Explicitly set to wait for EOF before processing the\n output (buffered), or to stream the results in\n parallel with reading, as soon as possible\n (unbuffered). See BUFFERING section above for the\n details.\n -m, --merge Replace all sequences of repeating characters with one\n of each, together with initial length of the sequence.\n -g, --group Group the input by code points (=count unique), sort\n descending and display counts instead of normal\n output. Implies '--merge' and forces buffered ('-b')\n mode. Specifying the option twice ('-gg') results in\n grouping by code point category instead, while doing\n it thrice ('-ggg') makes the app group the input by\n super categories.\n -f, --format Comma-separated list of columns to show (order is\n preserved). Run 'holms format' to see the details.\n -n, --names Display names instead of abbreviations. Affects `cat`\n and `block` columns, but only if column in question is\n already present on the screen. 
Note that these columns\n can still display only the beginning of the attribute,\n unless '-r' is provided.\n -a, --all Display ALL columns.\n -r, --rigid By default some columns can be compressed beyond the\n nominal width, if all current values fit and there is\n still space left. This option disables column\n shrinking (but they still will be expanded when\n needed).\n --decimal Use decimal byte offsets instead of hexadecimal.\n --alt Use alternative notation for control characters: caret\n notation for ASCII C0, octal notation for ASCII C1.\n --oneline Discard all newline characters (0x0a LINE FEED) from\n the input.\n --no-table Do not format results as a table, just apply the\n colors to characters (equivalent to '-f char', implies\n '-b'). Compatible with '-merge', '--format' and even '\n --group'.\n --no-override Do not replace control/whitespace code point markers\n with distinguishable characters ('\u25af' to '\u21b5', '\u2423' etc).\n Run 'holms legend' to see the details.\n -?, --help Show this message and exit.\n\n[//]: # (@sub)\n\nExamples\n--------------------------\n\n### Output column selection\n\nOption `-f`/`--filter` can be used to specify what columns to display. 
As an\nalternative, there is an `-a`/`--all` option that enables displaying of all\ncurrently available columns.\n\n<details>\n <summary><b>Column availability depending on operating mode</b></summary>\n\n <div align=\"center\">\n <img alt=\"example010\" src=\"https://github.com/es7s/holms/assets/50381946/62a6f354-1f30-4ee8-a8fc-533b1a980e03\">\n </div>\n</details>\n\nAlso `-m`/`--merge` option is demonstrated, which tells the app to collapse\nrepetitive characters into one line of the output while counting them:\n\n<div align=\"center\">\n <img alt=\"example005\" src=\"https://github.com/es7s/holms/assets/50381946/6da31546-0e50-4fa0-af69-0b7a8ed5d4c3\">\n</div>\n\n<details>\n <summary>Plain text output</summary>\n <!-- @sub:example005.png.txt -->\n\n > holms run -m phpstan.txt\n \n 000 U+2B \u2595 + \u258f Sm PLUS SIGN\n 001+ U+2D \u2595 - \u258f Pd 27\u00d7 HYPHEN-MINUS\n 01c U+2B \u2595 + \u258f Sm PLUS SIGN\n 01d U+20 \u2595 \u2423 \u258f Zs SPACE\n 01e U+2B \u2595 + \u258f Sm PLUS SIGN\n 01f+ U+2D \u2595 - \u258f Pd 27\u00d7 HYPHEN-MINUS\n 03a U+2B \u2595 + \u258f Sm PLUS SIGN\n 03b U+ A \u2595 \u21b5 \u258f Cc ASCII C0 [LF] LINE FEED\n 03c U+7C \u2595 | \u258f Sm VERTICAL LINE\n 03d+ U+20 \u2595 \u2423 \u258f Zs 27\u00d7 SPACE\n ...\n\n <!-- @sub -->\n</details>\n\n### Reading from pipeline\n\nThere is an official Unicode Consortium data file included in the repository for\ntest purposes, named [confusables.txt](tests/data/confusables.txt). In the next\nexample we extract line **#3620** using `sed`, delete all TAB (`0x08`) characters\nand feed the result to the application. 
The result demonstrates various Unicode\ndot/bullet code points:\n\n<div align=\"center\">\n <img alt=\"example006\" src=\"https://github.com/es7s/holms/assets/50381946/78a90c45-d331-46d9-998e-20c6c9a97f12\">\n</div>\n\n<details>\n <summary>Plain text output</summary>\n <!-- @sub:example006.png.txt -->\n\n > sed confusables.txt -Ee 'sg' -e '3620!d' |\n \u00a0\u00a0holms run -\n \n 00 U+ B7 \u2595 \u00b7 \u258f Po MIDDLE DOT\n 02 U+1427 \u2595 \u1427 \u258f Lo CANADIAN SYLLABICS FINAL MIDDLE DOT\n 05 U+ 387 \u2595 \u0387 \u258f Po GREEK ANO TELEIA\n 07 U+2022 \u2595 \u2022 \u258f Po BULLET\n 0a U+2027 \u2595 \u2027 \u258f Po HYPHENATION POINT\n 0d U+2219 \u2595 \u2219 \u258f Sm BULLET OPERATOR\n 10 U+22C5 \u2595 \u22c5 \u258f Sm DOT OPERATOR\n 13 U+30FB \u2595\u30fb \u258f Po KATAKANA MIDDLE DOT\n 16 U10101 \u2595 \ud800\udd01 \u258f Po AEGEAN WORD SEPARATOR DOT\n 1a U+FF65 \u2595 \uff65 \u258f Po HALFWIDTH KATAKANA MIDDLE DOT\n 1d U+ A \u2595 \u21b5 \u258f Cc ASCII C0 [LF] LINE FEED\n\n <!-- @sub -->\n</details>\n\n### Code points / categories statistics\n\n`-g`/`--group` option can be used to count unique code points, and to compute\nthe occurrence rate of each one:\n\n<div align=\"center\">\n <img alt=\"example008\" src=\"https://github.com/es7s/holms/assets/50381946/f89be555-cf7e-4766-90b2-61a02140c54e\">\n</div>\n\n<details>\n <summary>Plain text output</summary>\n <!-- @sub:example008.png.txt -->\n\n > holms run -g ./tests/data/confusables.txt\n \n U+ 20 \u2595 \u2423 \u258f Zs 12.5% \u2588\u2588\u2588 62732\u00d7 SPACE\n U+ 9 \u2595 \u21e5 \u258f Cc 7.3% \u2588\u258a 36745\u00d7 ASCII C0 [HT] HORIZONTAL TABULATION\n U+ 41 \u2595 A \u258f Lu 6.1% \u2588\u258d 30555\u00d7 LATIN CAPITAL LETTER A\n U+ 49 \u2595 I \u258f Lu 5.2% \u2588\u258f 26063\u00d7 LATIN CAPITAL LETTER I\n U+ 45 \u2595 E \u258f Lu 5.0% \u2588\u258f 24992\u00d7 LATIN CAPITAL LETTER E\n U+ 54 \u2595 T \u258f Lu 3.7% \u2589 18776\u00d7 LATIN CAPITAL LETTER T\n U+ 4C \u2595 L \u258f Lu 3.7% 
\u2589 18763\u00d7 LATIN CAPITAL LETTER L\n U+200E \u2595 \u25af \u258f Cf 3.7% \u2589 18494\u00d7 LEFT-TO-RIGHT MARK\n U+ A \u2595 \u21b5 \u258f Cc 2.9% \u258b 14609\u00d7 ASCII C0 [LF] LINE FEED\n U+ 43 \u2595 C \u258f Lu 2.9% \u258b 14450\u00d7 LATIN CAPITAL LETTER C\n ...\n\n <!-- @sub -->\n</details>\n\nWhen used twice (`-gg`) or thrice (`-ggg`), the application groups the input by\ncode point category or code point super category, respectively, which can be used\ne.g. for frequency domain analysis:\n\n<div align=\"center\">\n <img alt=\"example011\" src=\"https://github.com/es7s/holms/assets/50381946/18018b0c-7978-48aa-b3be-4923167bb425\">\n <img alt=\"example012\" src=\"https://github.com/es7s/holms/assets/50381946/1128d864-aad9-4203-ae9c-af2ea0f3ad9f\">\n</div>\n\n<details>\n <summary>Plain text output</summary>\n <!-- @sub:example011.png.txt -->\n\n > holms run -gg ./tests/data/confusables.txt\n \n 53.1% \u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 266233\u00d7 Uppercase_Letter\n 12.5% \u2588\u2588\u258e 62748\u00d7 Space_Separator\n 10.2% \u2588\u2589 51356\u00d7 Control\n 8.5% \u2588\u258c 42511\u00d7 Decimal_Number\n 3.7% \u258b 18497\u00d7 Format\n 3.0% \u258c 14832\u00d7 Other_Letter\n 2.0% \u258e 9778\u00d7 Math_Symbol\n 1.8% \u258e 9261\u00d7 Close_Punctuation\n 1.8% \u258e 9259\u00d7 Open_Punctuation\n 1.5% \u258e 7525\u00d7 Other_Punctuation\n ...\n\n <!-- @sub -->\n <!-- @sub:example012.png.txt -->\n\n > holms run -ggg ./tests/data/confusables.txt\n \n 56.7% \u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588 284074\u00d7 Letter\n 13.9% \u2588\u2588\u258d 69853\u00d7 Other(C)\n 12.5% \u2588\u2588\u258f 62750\u00d7 Separator(Z)\n 8.5% \u2588\u258c 42796\u00d7 Number\n 5.9% \u2588 29571\u00d7 Punctuation\n 2.2% \u258d 11072\u00d7 Symbol\n 0.2% \u258f 965\u00d7 Mark\n\n <!-- @sub -->\n</details>\n\n### In-place type highlighting\n\nWhen `--format` is specified exactly as a single `char` column: `--format=char`,\nthe 
application omits all the columns and prints the original file contents,\nwhile highlighting each character with a color that indicates its Unicode\ncategory.\n\n> Note that ASCII control codes, as well as Unicode ones, are kept\nuntouched and invisible.\n\n<div align=\"center\">\n <img alt=\"example007\" src=\"https://github.com/es7s/holms/assets/50381946/78ca318c-e295-41ff-b37d-d45d95842295\">\n</div>\n\n<details>\n <summary>Plain text output</summary>\n <!-- @sub:example007.png.txt -->\n\n > sed chars.txt -nEe 1,12p |\n \u00a0\u00a0holms run --format=char -\n \n ! \" # $ % & ' ( ) * + , - . /\n 0 1 2 3 4 5 6 7 8 9 : ; < = > ?\n @ A B C D E F G H I J K L M N O\n P Q R S T U V W X Y Z [ \\ ] ^ _\n ` a b c d e f g h i j k l m n o\n p q r s t u v w x y z { | } ~\n \u00a1 \u00a2 \u00a3 \u00a4 \u00a5 \u00a6 \u00a7 \u00a8 \u00a9 \u00aa \u00ab \u00ac \u00ad \u00ae \u00af\n \u00b0 \u00b1 \u00b2 \u00b3 \u00b4 \u00b5 \u00b6 \u00b7 \u00b8 \u00b9 \u00ba \u00bb \u00bc \u00bd \u00be \u00bf\n \u00c0 \u00c1 \u00c2 \u00c3 \u00c4 \u00c5 \u00c6 \u00c7 \u00c8 \u00c9 \u00ca \u00cb \u00cc \u00cd \u00ce \u00cf\n \u00d0 \u00d1 \u00d2 \u00d3 \u00d4 \u00d5 \u00d6 \u00d7 \u00d8 \u00d9 \u00da \u00db \u00dc \u00dd \u00de \u00df\n \u00e0 \u00e1 \u00e2 \u00e3 \u00e4 \u00e5 \u00e6 \u00e7 \u00e8 \u00e9 \u00ea \u00eb \u00ec \u00ed \u00ee \u00ef\n \u00f0 \u00f1 \u00f2 \u00f3 \u00f4 \u00f5 \u00f6 \u00f7 \u00f8 \u00f9 \u00fa \u00fb \u00fc \u00fd \u00fe \u00ff\n\n <!-- @sub -->\n</details>\n\n\nASCII Latin letters (`A-Za-z`) are deliberately colored 50% gray instead of the regular\nwhite: this can be extremely helpful when the task is to find\nnon-ASCII characters in a massive text of plain ASCII ones, or vice versa.\n\nBelow is a real example of broken characters that result from two\noperations being applied in the wrong order: *UTF-8 decoding* and *URL %-based\nunescaping*. 
This error differs from incorrect codepage selection errors,\nwhich mess up the whole text or a part of it: here all byte sequences are valid UTF-8\nencoded code points, but the result still differs from the original and is completely\nunreadable.\n\n<div align=\"center\">\n <img alt=\"example015\" src=\"https://github.com/es7s/holms/assets/50381946/738b5bbe-291f-4ade-bf97-66c1e8368281\">\n</div>\n\n\n### ASCII C0 / C1 details\n\nWhile developing the application I encountered what at first seemed to be strange\nbehaviour of the Python interpreter: it encoded C1 control codes\nas two UTF-8 bytes each, while C0 control codes were written as single bytes, as if\nthey had been encoded in plain ASCII. A bit of research explained why.\n\nAccording to [ISO/IEC 6429 (ECMA-48)](https://www.iso.org/standard/12782.html),\nthere are two types of ASCII control codes (strictly speaking there are more, but for\nour purposes that is mostly irrelevant): C0 and C1. The first type includes ASCII\ncode points `0x00`-`0x1F` and `0x7F` (some authors also include the regular space\ncharacter `0x20` in this list), and its characteristic property is\nthat all C0 code points are encoded in UTF-8 **exactly the same** as they are in\n7-bit US-ASCII ([ISO/IEC 646](https://www.iso.org/standard/4777.html)). This\nmakes it possible to tell, even in broken byte\nsequences, whether a byte represents a single code point\nor is actually part of a multibyte UTF-8 sequence.\n\nHowever, C1 control codes are represented by the bytes `0x80`-`0x9F`, which are also\nvalid continuation bytes in multibyte UTF-8 sequences. 
To distinguish the first\ntype from the second, UTF-8 encodes them as two-byte sequences instead (`0x80` \u2192\n`0xC280`, etc.); this applies not only to control codes, but to all other\n[ISO/IEC 8859](https://www.iso.org/standard/28245.html) code points starting\nfrom `0x80`.\n\nWith this in mind, let's see how the application reflects these differences.\nThe first command produces several 8-bit ASCII C1 control codes as raw bytes, which are\nclassified as raw binary/non-UTF-8 data, while the second command's output\nconsists of the very same code points encoded in UTF-8 (thanks to\nPython's transparent Unicode support, we don't even need to bother much\nabout encodings):\n\n<div align=\"center\">\n <img alt=\"example013\" src=\"https://github.com/es7s/holms/assets/50381946/884d3269-6323-41f1-9eab-6dccd83c5d6d\">\n</div>\n\n<details>\n <summary>Plain text output</summary>\n <!-- @sub:example013.png.txt -->\n\n > printf \"\\x80\\x90\\x9f\" && python3 -c 'print(\"\\x80\\x90\\x9f\", end=\"\")' |\n \u00a0\u00a0holms run --names --decimal --all -\n \n \u23e80 #0 0x 80 -- \u2595 \u25af \u258f NON UTF-8 BYTE 0x80 -- Binary\n \u23e81 #1 0x 90 -- \u2595 \u25af \u258f NON UTF-8 BYTE 0x90 -- Binary\n \u23e82 #2 0x 9f -- \u2595 \u25af \u258f NON UTF-8 BYTE 0x9F -- Binary\n \n \u23e83 #3 0x c2 80 U+80 \u2595 \u25af \u258f ASCII C1 [PC] PADDING CHARACTER Latin-1 Supplem\u2025 Control\n \u23e85 #4 0x c2 90 U+90 \u2595 \u25af \u258f ASCII C1 [DCS] DEVICE CONTROL STRING Latin-1 Supplem\u2025 Control\n \u23e87 #5 0x c2 9f U+9F \u2595 \u25af \u258f ASCII C1 [APC] APPLICATION PROGRAM COMMAND Latin-1 Supplem\u2025 Control\n\n <!-- @sub -->\n</details>\n\nLegend\n------------------\n\nThe image below illustrates the color scheme developed specifically for the app\nto make code points of one category easy to distinguish from the others.\n\n<div align=\"center\">\n <img alt=\"example009\" 
src=\"https://github.com/es7s/holms/assets/50381946/f9cac3b0-adab-45a3-a324-174ad7f06d44\">\n</div>\n\nThe most frequently encountered control codes also have unique character\nreplacements, which make it possible to recognize them without reading the label or\nmemorizing code point identifiers:\n\n<div align=\"center\">\n <img alt=\"example014\" src=\"https://github.com/es7s/holms/assets/50381946/2b77d06a-5e3d-4837-973c-78454e687113\">\n</div>\n\n<details>\n<summary><b>Unicode Blocks</b></summary>\n <div align=\"center\">\n <img alt=\"blocks\" src=\"https://github.com/es7s/holms/assets/50381946/8244553b-fc2d-419e-8b11-388ed0738bad\"/>\n </div>\n</details>\n\nChangelog\n------------------\n\n[CHANGES.rst](CHANGES.rst)\n",
"bugtrack_url": null,
"license": null,
"summary": "Text to Unicode code points breakdown",
"version": "1.6.0",
"project_urls": {
"Bug Tracker": "https://github.com/es7s/holms/issues",
"Changelog": "https://github.com/es7s/holms/blob/master/CHANGES.rst",
"Homepage": "https://github.com/es7s/holms"
},
"split_keywords": [
"analyzer",
" breakdown",
" console",
" terminal",
" text",
" unicode"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "0dd09d8f7a7899af19c7b3978baad02ee5edcf91ffc0972034eccec547477ee2",
"md5": "04fa4123402a2401a31810f0563bf7aa",
"sha256": "ff8ffa1e71741bfe30eca8a3960d69876018fe16a277e2b9d200888050c110d6"
},
"downloads": -1,
"filename": "holms-1.6.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "04fa4123402a2401a31810f0563bf7aa",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 45632,
"upload_time": "2024-08-06T23:22:22",
"upload_time_iso_8601": "2024-08-06T23:22:22.283824Z",
"url": "https://files.pythonhosted.org/packages/0d/d0/9d8f7a7899af19c7b3978baad02ee5edcf91ffc0972034eccec547477ee2/holms-1.6.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "8fc0dde00bb1644b0989641551f572dd6fd08be1e5f81016009070397fa11c60",
"md5": "097340c77f7667352cceb63e903ad7f1",
"sha256": "6b659c27de3f7640feb2bed993c6e46978ff5bf9da54714c3961346d58081282"
},
"downloads": -1,
"filename": "holms-1.6.0.tar.gz",
"has_sig": false,
"md5_digest": "097340c77f7667352cceb63e903ad7f1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 195614,
"upload_time": "2024-08-06T23:22:24",
"upload_time_iso_8601": "2024-08-06T23:22:24.014181Z",
"url": "https://files.pythonhosted.org/packages/8f/c0/dde00bb1644b0989641551f572dd6fd08be1e5f81016009070397fa11c60/holms-1.6.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-06 23:22:24",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "es7s",
"github_project": "holms",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "click",
"specs": [
[
"==",
"8.1.7"
]
]
},
{
"name": "es7s.commons",
"specs": [
[
"==",
"1.7.0"
]
]
},
{
"name": "pytermor",
"specs": [
[
"==",
"2.118.0.dev0"
]
]
}
],
"lcname": "holms"
}