PyHyphen

Name: PyHyphen

Version: 4.0.4

Home page: https://github.com/dr-leo/PyHyphen

Summary: The hyphenation library of LibreOffice and FireFox wrapped for Python

Upload time: 2024-07-30 08:50:27

Author: Dr. Leo & Regis Behmo

=================================
PyHyphen - hyphenation for Python
=================================

(c) 2008-2024 PyHyphen developers

Contact: fhaxbox66@gmail.com

Project home: https://github.com/dr-leo/PyHyphen

Mailing list: https://groups.google.com/group/pyhyphen


.. contents::

0. Quickstart
=============

With Python 3.7 or higher and a current version of pip, issue::

    $ pip install pyhyphen
    $ python
    >>> from hyphen import Hyphenator
    >>> # Download and install the hyphenation dict for German, if needed
    >>> h = Hyphenator('de_DE')  # `language` defaults to 'en_US'
    >>> s = 'Politikverdrossenheit'
    >>> h.pairs(s)
    [['Po', 'litikverdrossenheit'],
    ['Poli', 'tikverdrossenheit'],
    ['Politik', 'verdrossenheit'],
    ['Politikver', 'drossenheit'],
    ['Politikverdros', 'senheit'],
    ['Politikverdrossen', 'heit']]
    >>> h.syllables(s)
    ['Po', 'li', 'tik', 'ver', 'dros', 'sen', 'heit']
    >>> h.wrap(s, 5)
    ['Poli-', 'tikverdrossenheit']
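
In the output above, each entry of ``pairs()`` looks like a split of the word at one of the boundaries reported by ``syllables()``. The following is a minimal sketch of that relationship in plain Python. It does not call PyHyphen; ``pairs_from_syllables`` is a hypothetical helper written for illustration, and it assumes plain hyphenation without replacements (where the spelling changes at the break).

```python
def pairs_from_syllables(syllables):
    """Return every [head, tail] split that falls on a syllable boundary.

    Hypothetical helper for illustration; not part of the PyHyphen API.
    """
    return [
        ["".join(syllables[:i]), "".join(syllables[i:])]
        for i in range(1, len(syllables))
    ]


print(pairs_from_syllables(["beau", "ti", "ful"]))
# [['beau', 'tiful'], ['beauti', 'ful']]
```

Applied to the syllables of ``'Politikverdrossenheit'`` shown above, the helper yields the same six pairs as the quickstart session.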

1. Overview
================

PyHyphen is a Pythonic interface to the hyphenation library used in projects such as LibreOffice and the Mozilla suite.
It comes with tools to download, install and uninstall hyphenation dictionaries from LibreOffice's Git repository.
PyHyphen provides the **hyphen** package.

``hyphen.textwrap2`` is a modified version of the familiar ``textwrap`` module
that wraps text to a specified width, hyphenating words where needed. See the code examples below.

PyHyphen supports Python 3.7 or higher.

1.1 Content of the hyphen package
---------------------------------

The ``hyphen`` package contains the following:

- the ``hyphen.Hyphenator`` class: each instance can hyphenate and wrap words using a dictionary compatible with the hyphenation feature of
  LibreOffice and Mozilla. Required dictionaries are automatically downloaded at runtime if not already installed.
- the ``hyphen.dictools`` module: functions for tasks such as downloading and installing dictionaries from a configurable repository. After
  installation of PyHyphen, the LibreOffice repository is used by default. Dictionaries are stored in the platform-specific user's app directory.
- the ``hyphen.hnj`` C extension module, which does all the groundwork. It
  contains the high-quality `C library libhyphen <http://sourceforge.net/projects/hunspell/files/Hyphen/>`_.
  It supports hyphenation with replacements as well as compound words.


1.2 The 'textwrap2' module
--------------------------

This module is an enhanced, though backwards-compatible, version of the ``textwrap`` module from the Python standard library. It adds
hyphenation functionality to ``textwrap``. To this end, a new keyword parameter ``use_hyphenator``, which defaults to ``None``, has been added to the ``__init__`` method
of the ``TextWrapper`` class. It can be set to any hyphenator object.
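
To illustrate how a wrapper can work with "any hyphenator object", here is a minimal sketch of greedy line filling that only relies on an object having a ``wrap(word, width)`` method. It is not the ``textwrap2`` implementation and does not import PyHyphen. ``StubHyphenator`` and ``fill_with_hyphenation`` are hypothetical names, the syllable table is hard-coded, and the stub's behaviour of returning an empty list when no split fits is an assumption made for this sketch.

```python
class StubHyphenator:
    """Hypothetical stand-in that exposes only ``wrap(word, width)``."""

    def __init__(self, table):
        self.table = table  # maps a word to its list of syllables

    def wrap(self, word, width):
        syllables = self.table.get(word, [word])
        # Try the longest head first; head plus '-' must fit in `width`.
        for i in range(len(syllables) - 1, 0, -1):
            head = "".join(syllables[:i])
            if len(head) + 1 <= width:
                return [head + "-", "".join(syllables[i:])]
        return []  # assumption of this sketch: empty list means "no split fits"


def fill_with_hyphenation(text, width, hyphenator):
    """Greedily fill lines, asking `hyphenator` to split words that overflow."""
    lines, current = [], ""
    for word in text.split():
        while word:
            candidate = f"{current} {word}" if current else word
            if len(candidate) <= width:
                current = candidate
                break
            # Space left on this line after the separating blank, if any.
            room = width - len(current) - (1 if current else 0)
            parts = hyphenator.wrap(word, room) if room > 1 else []
            if parts:
                head, word = parts
                current = f"{current} {head}" if current else head
            elif not current:
                # Unsplittable word longer than the line: let it overflow.
                current, word = word, ""
            lines.append(current)
            current = ""
    if current:
        lines.append(current)
    return "\n".join(lines)


stub = StubHyphenator({"beautiful": ["beau", "ti", "ful"]})
print(fill_with_hyphenation("a beautiful day", 10, stub))
```

With a width of 10, the word ``beautiful`` does not fit after ``a``, so the sketch asks the hyphenator for a head that fits the remaining 8 columns and carries the tail over to the next line.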

2. Code examples
================

::

    >>> from hyphen import Hyphenator
    >>> # Create some hyphenators
    >>> h_de = Hyphenator('de_DE')
    >>> h_en = Hyphenator('en_US')

    >>> # Now hyphenate some words
    >>> h_en.pairs('beautiful')
    [['beau', 'tiful'], ['beauti', 'ful']]

    >>> h_en.wrap('beautiful', 6)
    ['beau-', 'tiful']

    >>> h_en.wrap('beautiful', 7)
    ['beauti-', 'ful']

    >>> h_en.syllables('beautiful')
    ['beau', 'ti', 'ful']

    >>> from hyphen.textwrap2 import fill
    >>> long_text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce vehicula rhoncus nulla et vulputate. In et risus dignissim erat dapibus iaculis ac ut nunc. Etiam vestibulum elit eget purus fermentum, eu finibus velit eleifend.'
    >>> print(fill(long_text, width=40, use_hyphenator=h_en))
    Lorem ipsum dolor sit amet, consectetur
    adipiscing elit. Fusce vehicula rhoncus
    nulla et vulputate. In et risus dignis-
    sim erat dapibus iaculis ac ut nunc.
    Etiam vestibulum elit eget purus fermen-
    tum, eu finibus velit eleifend.

Simply creating a ``Hyphenator`` object for a language automatically downloads the
corresponding dictionary if it is not already installed.
For the HTTP connection to the LibreOffice server, PyHyphen uses the
familiar `requests <https://www.python-requests.org>`_
library. Requests are fully configurable to handle proxies etc.
Alternatively, dictionaries may be manually
installed and listed with the ``dictools`` module::

    >>> from hyphen.dictools import install, list_installed

    >>> # Download and install some dictionaries in the default directory using the default
    >>> # repository, usually the LibreOffice website
    >>> for lang in ['de_DE', 'en_US']:
    ...     install(lang)  # provide kwargs to configure the HTTP request
    ...
    >>> # Show locales of installed dictionaries
    >>> list_installed()
    ['de', 'de_DE', 'en_PH', 'en_US']


3. Installation
===============

PyHyphen is pip-installable from PyPI. In most scenarios the easiest way to install PyHyphen is to type from the shell prompt::

    $ pip install pyhyphen

Besides the source distribution, there is a wheel on PyPI for Windows. As the
C extension uses the limited C API, the wheel should work on all Python versions >= 3.7.

Building PyHyphen from source under Linux or macOS should be straightforward. On Windows, the wheel is installed by default, so no C compiler is needed.

4. Managing dictionaries
========================

The ``dictools`` module contains a non-exhaustive list of available language strings that can be used to instantiate ``Hyphenator`` objects as shown above::

    >>> from hyphen import dictools
    >>> dictools.LANGUAGES
    ['af_ZA', 'an_ES', 'ar', 'be_BY', 'bg_BG', 'bn_BD', 'br_FR', 'ca', 'cs_CZ',
     'da_DK', 'de', 'el_GR', 'en', 'es_ES', 'et_EE', 'fr_FR', 'gd_GB', 'gl',
     'gu_IN', 'he_IL', 'hi_IN', 'hr_HR', 'hu_HU', 'it_IT', 'ku_TR', 'lt_LT',
     'lv_LV', 'ne_NP', 'nl_NL', 'no', 'oc_FR', 'pl_PL', 'prj', 'pt_BR', 'pt_PT',
     'ro', 'ru_RU', 'si_LK', 'sk_SK', 'sl_SI', 'sr', 'sv_SE', 'sw_TZ', 'te_IN',
     'th_TH', 'uk_UA', 'zu_ZA']

The downloaded dictionary files are stored in a local data folder, along with a
``dictionaries.json`` file that lists the downloaded files and the associated
locales::

    $ ls ~/.local/share/pyhyphen
    dictionaries.json  hyph_de_DE.dic  hyph_en_US.dic

    $ cat ~/.local/share/pyhyphen/dictionaries.json
    {
      "de": {
        "file": "hyph_de_DE.dic",
        "url": "http://cgit.freedesktop.org/libreoffice/dictionaries/plain/de/hyph_de_DE.dic"
      },
      "de_DE": {
        "file": "hyph_de_DE.dic",
        "url": "http://cgit.freedesktop.org/libreoffice/dictionaries/plain/de/hyph_de_DE.dic"
      },
      "en_PH": {
        "file": "hyph_en_US.dic",
        "url": "http://cgit.freedesktop.org/libreoffice/dictionaries/plain/en/hyph_en_US.dic"
      },
      "en_US": {
        "file": "hyph_en_US.dic",
        "url": "http://cgit.freedesktop.org/libreoffice/dictionaries/plain/en/hyph_en_US.dic"
      }
    }

Each entry of the ``dictionaries.json`` file contains both the file name of the
dictionary and the URL from which it was downloaded.
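
Since several locales can point at the same dictionary file, it can be handy to invert the index. The sketch below parses a copy of the sample ``dictionaries.json`` content shown above with the standard ``json`` module and groups locales by file. It reads from an embedded string rather than from disk; ``SAMPLE_INDEX`` and ``locales_by_file`` are hypothetical names used only for this illustration.

```python
import json
from collections import defaultdict

# Copy of the sample index shown above, embedded so the sketch is self-contained.
SAMPLE_INDEX = """
{
  "de": {
    "file": "hyph_de_DE.dic",
    "url": "http://cgit.freedesktop.org/libreoffice/dictionaries/plain/de/hyph_de_DE.dic"
  },
  "de_DE": {
    "file": "hyph_de_DE.dic",
    "url": "http://cgit.freedesktop.org/libreoffice/dictionaries/plain/de/hyph_de_DE.dic"
  },
  "en_PH": {
    "file": "hyph_en_US.dic",
    "url": "http://cgit.freedesktop.org/libreoffice/dictionaries/plain/en/hyph_en_US.dic"
  },
  "en_US": {
    "file": "hyph_en_US.dic",
    "url": "http://cgit.freedesktop.org/libreoffice/dictionaries/plain/en/hyph_en_US.dic"
  }
}
"""


def locales_by_file(index_text):
    """Group the locale keys of a dictionaries.json document by dictionary file."""
    index = json.loads(index_text)
    grouped = defaultdict(list)
    for locale, entry in sorted(index.items()):
        grouped[entry["file"]].append(locale)
    return dict(grouped)


for filename, locales in locales_by_file(SAMPLE_INDEX).items():
    print(filename, locales)
```

For the sample data, two dictionary files serve four locales: ``de`` and ``de_DE`` share one file, ``en_PH`` and ``en_US`` the other.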


5. Contributing and reporting bugs
=====================================

Ask questions in the Google group (https://groups.google.com/group/pyhyphen) or send an e-mail to the authors.

Browse or fork the repository and report bugs at PyHyphen's `project site on GitHub <https://github.com/dr-leo/PyHyphen>`_.

Before submitting a PR, run the unit tests::

    $ make test

6. License
============

Without prejudice to third-party licenses, PyHyphen is distributed under the Apache 2.0 license. PyHyphen ships with third-party code, including the hyphenation library hyphen.c and a patched version of the Python standard module textwrap.


7. Changelog
======================



            
