* **Instead of using latexcodec, I encourage you to consider pylatexenc, which is far superior:** https://github.com/phfaist/pylatexenc
* Download: http://pypi.python.org/pypi/latexcodec/#downloads
* Documentation: http://latexcodec.readthedocs.org/
* Development: http://github.com/mcmtroffaes/latexcodec/
The codec provides a convenient way of converting between text written
in LaTeX and Unicode. Since it is not a LaTeX compiler, it is more
appropriate for short chunks of text, such as a paragraph or the
values of a BibTeX entry, and it is not appropriate for a full LaTeX
document. In particular, its behavior on LaTeX commands that do
not simply select characters is intended to keep the Unicode
representation understandable to a human reader, but the result is not
canonical and may require hand tuning to produce the desired effect.
The encoder makes a best effort to replace Unicode characters outside
of the range used as LaTeX input (ASCII by default) with a LaTeX
command that selects the character. More technically, the Unicode code
point is replaced by a LaTeX command that selects a glyph that
reasonably represents the code point. Unicode characters with special
uses in LaTeX are replaced by their LaTeX equivalents. For example,

====================== ===================
original text          encoded LaTeX
====================== ===================
``¥``                  ``\yen``
``ü``                  ``\"u``
``\N{NO-BREAK SPACE}`` ``~``
``~``                  ``\textasciitilde``
``%``                  ``\%``
``#``                  ``\#``
``\textbf{x}``         ``\textbf{x}``
====================== ===================
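
To make the encoding table concrete, here is a minimal sketch of a lookup-based encoder. It does not call latexcodec and is not its implementation: the names ``TOY_ENCODE_TABLE`` and ``toy_encode`` are hypothetical, and the table holds only the rows shown above. The one refinement is that a control word such as ``\yen`` gets a separating space when an ASCII letter follows, so the letter is not absorbed into the command name.

```python
import re

# Toy table mirroring the rows above (assumption: NOT the codec's real table).
TOY_ENCODE_TABLE = {
    "\u00a5": "\\yen",           # yen sign
    "\u00fc": '\\"u',            # u with diaeresis
    "\u00a0": "~",               # no-break space
    "~": "\\textasciitilde",
    "%": "\\%",
    "#": "\\#",
}

# A control word is a backslash followed only by ASCII letters.
CONTROL_WORD = re.compile(r"\\[A-Za-z]+")


def toy_encode(text: str) -> str:
    """Replace each character found in the toy table; pass the rest through."""
    out = []
    for i, ch in enumerate(text):
        repl = TOY_ENCODE_TABLE.get(ch, ch)
        out.append(repl)
        nxt = text[i + 1:i + 2]
        # Separate a control word from a following ASCII letter.
        if CONTROL_WORD.fullmatch(repl) and nxt.isascii() and nxt.isalpha():
            out.append(" ")
    return "".join(out)


print(toy_encode("50% of \u00a5100"))
print(toy_encode("\\textbf{x}"))
```

Backslashes and braces are passed through untouched in this sketch, which is why ``\textbf{x}`` survives encoding unchanged, as in the last row of the table.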
The decoder makes a best effort to replace LaTeX commands that select
characters with the Unicode character they select. For
example,

===================== ======================
original LaTeX        decoded unicode
===================== ======================
``\yen``              ``¥``
``\"u``               ``ü``
``~``                 ``\N{NO-BREAK SPACE}``
``\textasciitilde``   ``~``
``\%``                ``%``
``\#``                ``#``
``\textbf{x}``        ``\textbf {x}``
``#``                 ``#``
===================== ======================
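
The decoding direction can be sketched the same way, as a single regular-expression pass over control sequences. This is an illustration under stated assumptions, not latexcodec's lexer: ``TOY_DECODE_TABLE``, ``TOKEN`` and ``toy_decode`` are hypothetical names, only the rows above are covered, and the spacing canonicalization shown in the ``\textbf`` row is deliberately left out.

```python
import re

# Toy table mirroring the rows above (assumption: NOT the codec's real table).
TOY_DECODE_TABLE = {
    "\\yen": "\u00a5",
    '\\"u': "\u00fc",
    "~": "\u00a0",
    "\\textasciitilde": "~",
    "\\%": "%",
    "\\#": "#",
}

# Candidate tokens, tried in order: the accent form \"u, a control word,
# a control symbol (backslash plus one non-letter), or a bare tilde.
TOKEN = re.compile(r'\\"u|\\[A-Za-z]+|\\[^A-Za-z]|~')


def toy_decode(latex: str) -> str:
    """Replace known tokens; unknown control sequences pass through unchanged."""
    return TOKEN.sub(
        lambda m: TOY_DECODE_TABLE.get(m.group(0), m.group(0)), latex
    )


print(toy_decode('f\\"ur 100\\% \\textbf{x}'))
```

Because the substitution is a single pass, the ``~`` produced from ``\textasciitilde`` is not decoded a second time into a no-break space.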
In addition, comments are dropped (including the final newline that
marks the end of a comment), paragraphs are canonicalized into double
newlines, and other newlines are left as is. Spacing after LaTeX
commands is also canonicalized.

For example,

::

    hi % bye
    there\par world
    \textbf {awesome}

is decoded as

::

    hi there

    world
    \textbf {awesome}
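
The comment and paragraph handling in this example can be approximated with two regular-expression substitutions. This is a rough sketch, not the codec's lexer: ``toy_normalize`` is a hypothetical name, and the comment pattern assumes that a ``%`` preceded by a backslash is always an escaped percent sign, which is wrong for input such as ``\\%``.

```python
import re


def toy_normalize(latex: str) -> str:
    """Drop comments and turn \\par into a paragraph break (toy version)."""
    # An unescaped % starts a comment that runs through the end of the line,
    # newline included.
    without_comments = re.sub(r"(?<!\\)%[^\n]*\n?", "", latex)
    # \par, plus any blanks after it, becomes a double newline.
    return re.sub(r"\\par(?![A-Za-z])[ \t]*", "\n\n", without_comments)


sample = "hi % bye\nthere\\par world\n\\textbf {awesome}\n"
print(toy_normalize(sample))
```

Run on the input above, the sketch reproduces the decoded text shown: the comment and its newline disappear, so ``hi`` and ``there`` end up on one line, and ``\par`` becomes a blank line.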
When decoding, LaTeX commands that do not directly select characters
(for example, macros and formatting commands) are passed through
unchanged. The same happens for LaTeX commands that select characters
but are not yet recognized by the codec. Either case can result in a
hybrid Unicode string in which some characters stand for themselves
and others are parts of unexpanded commands.
Consequently, backslashes will at times be left intact to mark
the start of a potentially unrecognized control sequence.
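
One practical consequence is that decoded text may need a check for leftover commands before it is shown to a reader. The sketch below is an assumption about how such a check could look, not part of latexcodec: ``leftover_commands`` is a hypothetical helper that lists every control sequence still present in an already decoded string.

```python
import re

# A backslash followed by letters (control word) or by one other
# character (control symbol).
CONTROL_SEQUENCE = re.compile(r"\\(?:[A-Za-z]+|[^A-Za-z])")


def leftover_commands(decoded: str) -> list:
    """Return the control sequences that survived decoding, in order."""
    return CONTROL_SEQUENCE.findall(decoded)


hybrid = "f\u00fcr \\textbf {x} and \\mystery"
print(leftover_commands(hybrid))  # ['\\textbf', '\\mystery']
```

An empty list means the string contains no backslash-introduced commands; a non-empty one points at the pieces that may need hand tuning.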
Given the numerous and changing packages providing such LaTeX
commands, the codec will never be complete, and new translations of
unrecognized Unicode characters or unrecognized LaTeX symbols are always welcome.
Raw data
{
"_id": null,
"home_page": "https://github.com/mcmtroffaes/latexcodec",
"name": "latexcodec",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "",
"author": "Matthias C. M. Troffaes",
"author_email": "matthias.troffaes@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/98/e7/ed339caf3662976949e4fdbfdf4a6db818b8d2aa1cf2b5f73af89e936bba/latexcodec-3.0.0.tar.gz",
"platform": "any",
"bugtrack_url": null,
"license": "MIT",
"summary": "A lexer and codec to work with LaTeX code in Python.",
"version": "3.0.0",
"project_urls": {
"Download": "http://pypi.python.org/pypi/latexcodec",
"Homepage": "https://github.com/mcmtroffaes/latexcodec"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b0bfea8887e9f31a8f93ca306699d11909c6140151393a4216f0d9f85a004077",
"md5": "88f1b09249106cfb89a62487600487ca",
"sha256": "6f3477ad5e61a0a99bd31a6a370c34e88733a6bad9c921a3ffcfacada12f41a7"
},
"downloads": -1,
"filename": "latexcodec-3.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "88f1b09249106cfb89a62487600487ca",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 18150,
"upload_time": "2024-03-06T14:51:37",
"upload_time_iso_8601": "2024-03-06T14:51:37.872152Z",
"url": "https://files.pythonhosted.org/packages/b0/bf/ea8887e9f31a8f93ca306699d11909c6140151393a4216f0d9f85a004077/latexcodec-3.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "98e7ed339caf3662976949e4fdbfdf4a6db818b8d2aa1cf2b5f73af89e936bba",
"md5": "19edb0931c2cb7cbb0e49e5829365c20",
"sha256": "917dc5fe242762cc19d963e6548b42d63a118028cdd3361d62397e3b638b6bc5"
},
"downloads": -1,
"filename": "latexcodec-3.0.0.tar.gz",
"has_sig": false,
"md5_digest": "19edb0931c2cb7cbb0e49e5829365c20",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 31023,
"upload_time": "2024-03-06T14:51:39",
"upload_time_iso_8601": "2024-03-06T14:51:39.283461Z",
"url": "https://files.pythonhosted.org/packages/98/e7/ed339caf3662976949e4fdbfdf4a6db818b8d2aa1cf2b5f73af89e936bba/latexcodec-3.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-03-06 14:51:39",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "mcmtroffaes",
"github_project": "latexcodec",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"lcname": "latexcodec"
}