# telegram-markdown-entities
## Stop fighting Telegram’s Markdown/HTML parser.
Ship text + entities for zero-escape, zero-surprise, exact rendering—where unsupported bits safely fall back to plain text.
If you’re tired of Telegram MarkdownV2/HTML parse errors:
- “Can’t parse entities” from MarkdownV2: special characters not escaped (`_ * [ ] ( ) ~ > # + - = | { } . !` and the backtick), or escaping in the wrong context (e.g., inside code).
- Unbalanced delimiters: missing or misplaced `*`, `_`, `` ` ``, `~~`, `||`, or triple-backtick code fences.
- Illegal nesting/overlap: e.g., mixing `**bold _italic**_` or putting styles inside code/pre.
- Broken links: `[label](url)` where the URL contains unencoded spaces or parentheses, or the brackets are unmatched.
- HTML tag issues: unknown or disallowed tags/attributes, mis-nested tags like `<b><i>…</b></i>`, unclosed tags.
- Double parsing: sending both `parse_mode` and manual entities leads to surprises; Telegram’s parser still interferes.
- Edge text: underscores in words/URLs, emoji/ZWJ sequences that shift what the parser thinks are boundaries.
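
For comparison, the escaping burden that `parse_mode=MarkdownV2` places on the caller looks roughly like the sketch below. `escape_markdown_v2` is a hypothetical helper written for illustration, not part of this library, and it only covers plain text; code spans and link URLs have their own escaping rules that it does not handle.

```python
import re

# Characters that must be backslash-escaped in MarkdownV2 plain text.
_MDV2_SPECIALS = "_*[]()~`>#+-=|{}.!"

def escape_markdown_v2(text: str) -> str:
    """Backslash-escape every reserved character (hypothetical helper)."""
    pattern = f"([{re.escape(_MDV2_SPECIALS)}])"
    return re.sub(pattern, r"\\\1", text)

print(escape_markdown_v2("Price: 1.50 (approx.) - see file_name.py!"))
```

With the entities-only approach none of this is needed, because punctuation in the text is never interpreted as markup.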
## The new paradigm: entities-only (no parse_mode)
We don’t ask Telegram to parse.
We send text + entities, so **formatting is explicit and deterministic**.
- No escaping ever. Special characters stay as-is; styles are applied by offsets/lengths, not by punctuation.
- UTF-16–correct offsets. Emoji, non-BMP symbols, ZWJ sequences—handled; entity bounds stay valid.
- No illegal overlaps. code/pre are atomic; we prevent forbidden nests before sending.
- Graceful fallback. Anything we don’t support is left as plain text—safe, readable, no runtime errors.
- Future-proof. Parser changes on Telegram’s side don’t break you; your rendering remains stable.
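
The UTF-16 point is the one that most often goes wrong when entities are built by hand. The sketch below is an illustration of the arithmetic, not the library's code; `utf16_len` is a hypothetical helper name.

```python
def utf16_len(s: str) -> int:
    """Length of s in UTF-16 code units, the unit Telegram uses for offsets."""
    return len(s.encode("utf-16-le")) // 2

text = "Deploy 🚀 done"
target = "done"

# Python indexes strings by code point; Telegram counts UTF-16 code units.
start = text.index(target)  # code-point index of "done"
entity = {
    "type": "bold",
    "offset": utf16_len(text[:start]),  # the rocket emoji counts as two units
    "length": utf16_len(target),
}
print(entity)  # {'type': 'bold', 'offset': 10, 'length': 4}
```

Using `text.index(target)` directly as the offset would give 9 and shift the bold span one position to the left.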
## Convert Markdown (or any) text into valid plain-text Telegram messages with ease
This library takes a string written in standard Markdown (such as the
output of a language model or contents of a README) and returns two
objects:
1. **Plain text** with all Markdown delimiters removed.
2. A list of **message entity dictionaries** that tell the Telegram
Bot API how to format the text (bold, italic, links, lists,
block quotes, etc.).
By sending the text together with the `entities` array (and *not*
specifying a `parse_mode`) you avoid the pitfalls of Telegram’s own
Markdown parser – there’s no need to escape special characters, and
your messages render exactly as intended.
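
To make the two return values concrete, here is a toy converter that handles only `**bold**`. It illustrates the delimiter-stripping idea under simplified assumptions and is not the library's implementation; `toy_bold_to_entities` and `utf16_len` are hypothetical names.

```python
import re

def utf16_len(s: str) -> int:
    """Length in UTF-16 code units."""
    return len(s.encode("utf-16-le")) // 2

def toy_bold_to_entities(md: str):
    """Strip **bold** delimiters and record each span as an entity dict."""
    pieces = []    # plain-text fragments, joined at the end
    entities = []
    offset = 0     # running UTF-16 length of the plain text so far
    pos = 0        # read position in the Markdown source
    for m in re.finditer(r"\*\*(.+?)\*\*", md):
        before = md[pos:m.start()]
        pieces.append(before)
        offset += utf16_len(before)
        inner = m.group(1)
        entities.append(
            {"type": "bold", "offset": offset, "length": utf16_len(inner)}
        )
        pieces.append(inner)
        offset += utf16_len(inner)
        pos = m.end()
    pieces.append(md[pos:])
    return "".join(pieces), entities

text, entities = toy_bold_to_entities("Hi **there**, 1+1=2!")
print(text)      # Hi there, 1+1=2!
print(entities)  # [{'type': 'bold', 'offset': 3, 'length': 5}]
```

Note that `+`, `=` and `!` pass through untouched: the entity carries the formatting, so the text needs no escaping.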
## Features
* **Inline formatting**: supports bold (`**text**`), italic (`*text*` or
`_text_`), underline (`__text__`), strikethrough (`~~text~~`),
spoilers (`||text||`), inline code (`` `code` ``), code blocks
(```lang\ncode```), and links (`[label](url)`).
* **Headings**: lines starting with `#` are converted to bold text.
* **Block quotes**: lines beginning with `>` produce a `blockquote`
entity; prefixing the quote with `||` (e.g. `>|| quote`) marks it
as **collapsed/expandable**. This maps to the `collapsed` flag on
Telegram’s `messageEntityBlockquote` type.
* **Lists**: unordered lists use Unicode bullets – `•`, `◦` and `▪`
depending on nesting depth – and indent with non‑breaking spaces; ordered
lists align numbers using figure spaces and support nested numbering.
* **Nested formatting**: bold inside italics, links inside quotes and
other combinations all work as expected.
* **UTF‑16 offsets**: entity offsets and lengths are calculated
according to the UTF‑16 code unit rules used by Telegram.
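
The list rendering described above can be pictured with a small sketch. This is an assumption-laden illustration, not the library's code: `render_item` is a hypothetical name, and both the indent width of two non-breaking spaces per level and the reuse of the last bullet for deeper levels are guesses.

```python
BULLETS = ("•", "◦", "▪")  # one bullet per nesting depth, as listed above
NBSP = "\u00a0"            # non-breaking space, so leading indent is preserved

def render_item(depth: int, body: str, indent_width: int = 2) -> str:
    """Render one unordered list item as plain text (hypothetical helper)."""
    # Assumption: levels deeper than the bullet table reuse the last bullet.
    bullet = BULLETS[min(depth, len(BULLETS) - 1)]
    return f"{NBSP * (indent_width * depth)}{bullet} {body}"

for depth, body in [(0, "Item 1"), (1, "Nested item"), (2, "Deeper item")]:
    print(render_item(depth, body))
```

Because list structure is expressed with ordinary characters rather than entities, it survives in any client that displays the plain text.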
## Installation
Install the package from PyPI:
```bash
pip install telegram-markdown-entities
```
Requires Python 3.7 or newer. **There are no external dependencies.**
## Usage
Here’s a minimal example of how to use the library with the Bot API:
```python
import requests

from telegram_markdown import parse_markdown_to_entities

# Placeholders: replace with your own bot token and target chat ID.
TOKEN = "YOUR_BOT_TOKEN"
CHAT_ID = 123456789

md = """
# Heading Example

>|| This is a collapsed quote\n> It continues here.

* Item 1
  * Nested item
1. First
2. Second\n continuation

Inline example: **bold**, _italic_, [link](https://example.com) and `code`.
"""

# Returns the plain text and a list of entity dictionaries.
text, entities = parse_markdown_to_entities(md)

# Send via the HTTP API; parse_mode is deliberately omitted.
payload = {
    'chat_id': CHAT_ID,
    'text': text,
    'entities': entities,
}
response = requests.post(
    f'https://api.telegram.org/bot{TOKEN}/sendMessage',
    json=payload,
    timeout=10,
)
response.raise_for_status()  # surface HTTP errors instead of ignoring them
```
The `text` variable will contain the plain message, with the Markdown
list and quote markers stripped (list items are rendered with the Unicode
bullets or aligned numbers described under Features), and `entities` will
be a list of dictionaries like `{'type': 'bold', 'offset': 0, 'length': 6}`.
Pass these directly to `sendMessage`. There is no need to set the
`parse_mode` parameter.
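
When debugging, it helps to see which substring each entity actually covers. The helper below is a sketch with a hypothetical name (`entity_text`); the sample `text` and `entities` values are hand-written stand-ins, not output captured from the library.

```python
def entity_text(text: str, entity: dict) -> str:
    """Return the substring an entity covers, honoring UTF-16 offsets."""
    raw = text.encode("utf-16-le")
    start = entity["offset"] * 2          # two bytes per UTF-16 code unit
    end = start + entity["length"] * 2
    return raw[start:end].decode("utf-16-le")

# Hand-written sample data for illustration.
text = "Deploy 🚀 done"
entities = [{"type": "bold", "offset": 10, "length": 4}]

for e in entities:
    print(e["type"], repr(entity_text(text, e)))  # bold 'done'
```

Slicing the Python string directly (`text[10:14]`) would return `'one'` here, because the emoji occupies two UTF-16 code units but only one code point.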
## Packaging and publishing
This project uses a modern `pyproject.toml` with [`setuptools`](https://setuptools.pypa.io).
To build a source distribution and wheel, install the build tool and
run:
```bash
pip install build
python -m build
```
Distributions will be created in the `dist/` directory. To upload
them to the Python Package Index (PyPI), install `twine` and run:
```bash
pip install twine
twine upload dist/*
```
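You will be prompted for your PyPI credentials. PyPI no longer accepts
account passwords for uploads, so use an API token: enter `__token__` as
the username and the token as the password. See
<https://packaging.python.org/tutorials/packaging-projects/> for
full details.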
## License
MIT – see the [LICENSE](LICENSE) file for details.
## Links
[Telegram message entities](https://core.telegram.org/api/entities)
Raw data
{
"_id": null,
"home_page": null,
"name": "telegram-markdown-entities",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "telegram, markdown, entities, bot, tg",
"author": null,
"author_email": "Lebedev Nikolay <fifo.mail@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/e5/af/104800c2cc2d90683da200e26013a5e57da667c42cc2472e8cf4b0375070/telegram_markdown_entities-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Convert Markdown into plain text and Telegram message entities.",
"version": "0.1.1",
"project_urls": {
"Issues": "https://github.com/bakaut/telegram-markdown-entities/issues",
"Source": "https://github.com/bakaut/telegram-markdown-entities"
},
"split_keywords": [
"telegram",
" markdown",
" entities",
" bot",
" tg"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "019257bc2fb0c576a4554e06b7f8d54c3ecffa07b8149f03bbd429958db9c469",
"md5": "5aa3422ddfea072fbf0bad4ac6721ef3",
"sha256": "7628812f8d38242cb2d5abb3e542c7f4e1dd07f4602dc0d345bb13a79b2af93b"
},
"downloads": -1,
"filename": "telegram_markdown_entities-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5aa3422ddfea072fbf0bad4ac6721ef3",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 14459,
"upload_time": "2025-09-14T14:18:45",
"upload_time_iso_8601": "2025-09-14T14:18:45.548631Z",
"url": "https://files.pythonhosted.org/packages/01/92/57bc2fb0c576a4554e06b7f8d54c3ecffa07b8149f03bbd429958db9c469/telegram_markdown_entities-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "e5af104800c2cc2d90683da200e26013a5e57da667c42cc2472e8cf4b0375070",
"md5": "b073459bb1d9f176baad688579148374",
"sha256": "37fba9058a5c48847fca50362b9f554f3c875c7f1da838053797c5f4fc98ad8c"
},
"downloads": -1,
"filename": "telegram_markdown_entities-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "b073459bb1d9f176baad688579148374",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 16654,
"upload_time": "2025-09-14T14:18:46",
"upload_time_iso_8601": "2025-09-14T14:18:46.720363Z",
"url": "https://files.pythonhosted.org/packages/e5/af/104800c2cc2d90683da200e26013a5e57da667c42cc2472e8cf4b0375070/telegram_markdown_entities-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-14 14:18:46",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "bakaut",
"github_project": "telegram-markdown-entities",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "telegram-markdown-entities"
}