Chardet: The Universal Character Encoding Detector
--------------------------------------------------

.. image:: https://img.shields.io/travis/chardet/chardet/stable.svg
   :alt: Build status
   :target: https://travis-ci.org/chardet/chardet

.. image:: https://img.shields.io/coveralls/chardet/chardet/stable.svg
   :target: https://coveralls.io/r/chardet/chardet

.. image:: https://img.shields.io/pypi/v/chardet.svg
   :target: https://warehouse.python.org/project/chardet/
   :alt: Latest version on PyPI

.. image:: https://img.shields.io/pypi/l/chardet.svg
   :alt: License


Detects
 - ASCII, UTF-8, UTF-16 (2 variants), UTF-32 (4 variants)
 - Big5, GB2312, EUC-TW, HZ-GB-2312, ISO-2022-CN (Traditional and Simplified Chinese)
 - EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP (Japanese)
 - EUC-KR, ISO-2022-KR, Johab (Korean)
 - KOI8-R, MacCyrillic, IBM855, IBM866, ISO-8859-5, windows-1251 (Cyrillic)
 - ISO-8859-5, windows-1251 (Bulgarian)
 - ISO-8859-1, windows-1252, MacRoman (Western European languages)
 - ISO-8859-7, windows-1253 (Greek)
 - ISO-8859-8, windows-1255 (Visual and Logical Hebrew)
 - TIS-620 (Thai)


.. note::
   Our ISO-8859-2 and windows-1250 (Hungarian) probers have been temporarily
   disabled until we can retrain the models.

Requires Python 3.7+.

Installation
------------

Install from `PyPI <https://pypi.org/project/chardet/>`_::

    pip install chardet

Documentation
-------------

User documentation is available at https://chardet.readthedocs.io/.

Command-line Tool
-----------------

chardet comes with a command-line script, ``chardetect``, which reports the
detected encoding of one or more files::

    % chardetect somefile someotherfile
    somefile: windows-1252 with confidence 0.5
    someotherfile: ascii with confidence 1.0

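
The same detection can be driven from Python. The following is a minimal sketch, not the script's actual implementation: ``chardet.detect`` is the library's own function, which takes bytes and returns a dict that includes ``encoding`` and ``confidence`` keys, while ``format_result`` and ``detect_file`` are hypothetical helper names introduced here for illustration.

```python
from pathlib import Path


def format_result(name, result):
    """Render a detection result in the same shape as the output shown above.

    `result` is expected to be a dict with 'encoding' and 'confidence' keys,
    such as the one returned by chardet.detect().
    """
    return f"{name}: {result['encoding']} with confidence {result['confidence']}"


def detect_file(path):
    """Read a file as raw bytes and ask chardet to guess its encoding."""
    # Third-party import kept inside the function (an illustrative choice)
    # so that format_result stays usable where chardet is not installed.
    import chardet

    return chardet.detect(Path(path).read_bytes())
```

With chardet installed, ``print(format_result("somefile", detect_file("somefile")))`` would print one line per file in the style of the command-line report. The whole file is read into memory here, which is fine for small inputs but is an assumption of this sketch rather than a requirement of the library.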
About
-----

This is a continuation of Mark Pilgrim's excellent original chardet port from C, and of
`charade <https://github.com/sigmavirus24/charade>`_, the Python 3-compatible fork by
`Ian Cordasco <https://github.com/sigmavirus24>`_.

:maintainer: Dan Blanchard

Raw data
{
"_id": null,
"home_page": "https://github.com/chardet/chardet",
"name": "chardet",
"maintainer": "Daniel Blanchard",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "dan.blanchard@gmail.com",
"keywords": "encoding,i18n,xml",
"author": "Mark Pilgrim",
"author_email": "mark@diveintomark.org",
"download_url": "https://files.pythonhosted.org/packages/f3/0d/f7b6ab21ec75897ed80c17d79b15951a719226b9fababf1e40ea74d69079/chardet-5.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "LGPL",
"summary": "Universal encoding detector for Python 3",
"version": "5.2.0",
"project_urls": {
"Documentation": "https://chardet.readthedocs.io/",
"GitHub Project": "https://github.com/chardet/chardet",
"Homepage": "https://github.com/chardet/chardet",
"Issue Tracker": "https://github.com/chardet/chardet/issues"
},
"split_keywords": [
"encoding",
"i18n",
"xml"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "386ff5fbc992a329ee4e0f288c1fe0e2ad9485ed064cac731ed2fe47dcc38cbf",
"md5": "b9eda7cd7d1582e269bd8eb7ffc4fcad",
"sha256": "e1cf59446890a00105fe7b7912492ea04b6e6f06d4b742b2c788469e34c82970"
},
"downloads": -1,
"filename": "chardet-5.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b9eda7cd7d1582e269bd8eb7ffc4fcad",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 199385,
"upload_time": "2023-08-01T19:23:00",
"upload_time_iso_8601": "2023-08-01T19:23:00.661021Z",
"url": "https://files.pythonhosted.org/packages/38/6f/f5fbc992a329ee4e0f288c1fe0e2ad9485ed064cac731ed2fe47dcc38cbf/chardet-5.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f30df7b6ab21ec75897ed80c17d79b15951a719226b9fababf1e40ea74d69079",
"md5": "cc2d8cc9a751641463b4f7cfecad2ffa",
"sha256": "1b3b6ff479a8c414bc3fa2c0852995695c4a026dcd6d0633b2dd092ca39c1cf7"
},
"downloads": -1,
"filename": "chardet-5.2.0.tar.gz",
"has_sig": false,
"md5_digest": "cc2d8cc9a751641463b4f7cfecad2ffa",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 2069618,
"upload_time": "2023-08-01T19:23:02",
"upload_time_iso_8601": "2023-08-01T19:23:02.662671Z",
"url": "https://files.pythonhosted.org/packages/f3/0d/f7b6ab21ec75897ed80c17d79b15951a719226b9fababf1e40ea74d69079/chardet-5.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-08-01 19:23:02",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "chardet",
"github_project": "chardet",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "chardet"
}
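
The ``digests`` entries above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library; ``sha256_matches`` is a hypothetical helper name, and the local file path passed to it is an assumption about where the download was saved.

```python
import hashlib


def sha256_matches(path, expected_hex, chunk_size=1 << 16):
    """Return True if the file at `path` has the given SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash the file in fixed-size chunks so large archives are not
        # loaded into memory all at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

For example, after saving the sdist locally, ``sha256_matches("chardet-5.2.0.tar.gz", "1b3b6ff479a8c414bc3fa2c0852995695c4a026dcd6d0633b2dd092ca39c1cf7")`` should return ``True`` if the file is intact.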