dcl

* Name: dcl
* Version: 1.0.0
* Home page: https://github.com/Kreusada/python-dcl
* Summary: Python library used for diacritic manipulation.
* Upload time: 2021-08-19 21:54:22
* Author: Kreusada
* Requirements: No requirements were recorded.

# Diacritics Library

[![Code Style: Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Imports: isort](https://user-images.githubusercontent.com/6032823/111363465-600fe880-8690-11eb-8377-ec1d4d5ff981.png)](https://github.com/PyCQA/isort)
[![PRs welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](http://makeapullrequest.com)

This library adds diacritics to strings and removes them from strings.

### Getting started

Start by importing the module:

```py
import dcl
```

DCL currently supports a multitude of diacritics:

* acute
* breve
* caron
* cedilla
* grave
* interpunct
* macron
* ogonek
* ring
* ring_and_acute
* slash
* stroke
* stroke_and_acute
* tilde
* tittle
* umlaut/diaresis
* umlaut_and_macron

Each diacritic has its own attribute, which is directly accessible from the ``dcl`` module.

```py
dcl.acute('a')
>>> 'á'
```

These attributes return a ``Character`` object, a small wrapper around the
resulting character. Its attributes give further information about the
character and the diacritic applied to it.

```py
char = dcl.ogonek('a')

repr(char)
>>> "<ogonek 'ą'>"

char.character  # the same as str(char)
>>> 'ą'

char.diacritic  # some return <unprintable>
>>> '˛'

char.diacritic_name
>>> 'ogonek'

char.raw  # returns the raw representation of our character
>>> '\U00000105'

char.raw_diacritic 
>>> '\U000002db'
```
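
The ``raw`` values above are the code point escapes of the characters. As a rough illustration using only the standard library (this is not dcl's own code, and the helper name ``raw_escape`` is made up for this sketch), the same strings can be derived with ``ord``:

```py
import unicodedata


def raw_escape(char: str) -> str:
    """Return a single character as an 8-digit backslash-U escape string."""
    return f"\\U{ord(char):08x}"


print(raw_escape("ą"))        # \U00000105
print(raw_escape("˛"))        # \U000002db
print(unicodedata.name("ą"))  # LATIN SMALL LETTER A WITH OGONEK
```

This also shows why ``raw`` and ``raw_diacritic`` differ: one is the code point of the combined letter, the other the code point of the standalone ogonek sign.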

Not every letter can take every diacritic. For example, the letter ``h`` cannot
take a cedilla. In that case a ``DiacriticError`` is raised, which you can
import from ``dcl.errors``.

```py
import dcl
from dcl.errors import DiacriticError

try:
    char = dcl.cedilla('h')
except DiacriticError as e:
    print(e)
else:
    print(repr(char))

>>> Character h cannot take a cedilla diacritic
```
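
To get a feel for why some letter and diacritic pairs are rejected, here is a minimal standard-library sketch, assuming a pair is only valid when Unicode has a single precomposed character for it. This is an assumption for illustration: dcl's own rules may differ from Unicode's, and ``compose`` and ``NoComposedFormError`` are hypothetical names, not part of dcl.

```py
import unicodedata


class NoComposedFormError(ValueError):
    """Hypothetical error for this sketch; dcl raises DiacriticError instead."""


def compose(letter: str, combining_mark: str) -> str:
    """Combine a letter with a combining mark into one precomposed character."""
    combined = unicodedata.normalize("NFC", letter + combining_mark)
    if len(combined) != 1:
        # NFC left more than one code point, so no single character exists for the pair.
        raise NoComposedFormError(f"Character {letter} has no precomposed form with this mark")
    return combined


COMBINING_OGONEK = "\u0328"

print(compose("a", COMBINING_OGONEK))  # ą
try:
    compose("q", COMBINING_OGONEK)
except NoComposedFormError as e:
    print(e)
```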

If you prefer, you can also use the ``DiacriticApplicant`` object from
``dcl.objects``. The functions above use this object too, and the principle is
virtually the same, except that the diacritics are accessed as properties and
the class simply holds the string. As with the functions above, these
properties return the same ``Character`` object.

```py
from dcl.objects import DiacriticApplicant

da = DiacriticApplicant('a')
repr(da.ogonek)
>>> "<ogonek 'ą'>"
```
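
The property-based design can be pictured with a small stand-in class. This is only a sketch of the pattern, not the real ``DiacriticApplicant``: the class name ``Applicant`` is hypothetical, it supports just two diacritics, and it returns plain strings rather than ``Character`` objects.

```py
import unicodedata


class Applicant:
    """Hypothetical stand-in that holds a letter and exposes diacritics as properties."""

    def __init__(self, letter: str) -> None:
        self.letter = letter

    def _compose(self, combining_mark: str) -> str:
        # NFC merges the letter and the combining mark when a precomposed form exists.
        return unicodedata.normalize("NFC", self.letter + combining_mark)

    @property
    def ogonek(self) -> str:
        return self._compose("\u0328")

    @property
    def acute(self) -> str:
        return self._compose("\u0301")


applicant = Applicant("a")
print(applicant.ogonek)  # ą
print(applicant.acute)   # á
```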

There is also the ``clean_diacritics`` function, accessible straight from the ``dcl`` module.
This function strips all diacritics from a string.

```py
dcl.clean_diacritics("Krëûšàdå")
>>> 'Kreusada'

dcl.clean_diacritics("Café")
>>> 'Cafe'
```
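
For comparison, a common standard-library way to strip diacritics is to decompose the string and drop the combining marks. The sketch below shows that technique; it is not a description of how ``clean_diacritics`` is implemented, and ``strip_diacritics`` is a hypothetical name.

```py
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after splitting characters into base letter plus marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Krëûšàdå"))  # Kreusada
print(strip_diacritics("Café"))      # Cafe
```

Letters such as ``ø`` or ``ł`` have no Unicode decomposition, so this approach leaves them unchanged.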

Along with this function, there's also ``count_diacritics``, ``get_diacritics`` and ``has_diacritics``.

The ``has_diacritics`` function simply checks if the string contains a character
with a diacritic.

```py
dcl.has_diacritics("Café")
>>> True

dcl.has_diacritics("dcl")
>>> False
```

The ``get_diacritics`` function gets all the diacritics in a string.
It returns a dictionary with one entry per character that carries a diacritic:
the key is the character's index in the string, and the value is its
``Character`` representation.

```py
dcl.get_diacritics("Café")
>>> {3: <acute 'é'>}

dcl.get_diacritics("Krëûšàdå")
>>> {2: <umlaut 'ë'>, 3: <circumflex 'û'>, 4: <caron 'š'>, 5: <grave 'à'>, 7: <ring 'å'>}
```
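
The index-to-diacritic mapping can be approximated with the standard library. The sketch below is an illustration under assumptions, not dcl's implementation: ``find_diacritics`` is a hypothetical name, the input is assumed to use precomposed characters, and the values are Unicode mark names rather than ``Character`` objects.

```py
import unicodedata


def find_diacritics(text: str) -> dict:
    """Map the index of each accented character to the name of its first combining mark."""
    found = {}
    for index, ch in enumerate(text):
        decomposed = unicodedata.normalize("NFD", ch)
        marks = [c for c in decomposed if unicodedata.combining(c)]
        if marks:
            found[index] = unicodedata.name(marks[0])
    return found


print(find_diacritics("Café"))              # {3: 'COMBINING ACUTE ACCENT'}
print(sorted(find_diacritics("Krëûšàdå")))  # [2, 3, 4, 5, 7]
```

Counting diacritics or checking for their presence then reduces to ``len(...)`` or ``bool(...)`` of this dictionary.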

The ``count_diacritics`` function counts the number of diacritics in a string. The actual
implementation of this simply returns the dictionary length from ``get_diacritics``.

```py
dcl.count_diacritics("Café")
>>> 1
```

### Creating an end user program

Building a program on top of this library is pretty simple. Here is a base
idea to start from:

```py
import dcl
import string

from dcl.errors import DiacriticError

char = input("Enter a character: ")
# Require exactly one ASCII letter; a bare `in` test would also accept "" or "ab".
if len(char) != 1 or char not in string.ascii_letters:
    print("Please enter a single letter from a-z or A-Z.")
else:
    accent = input("Enter an accent, you can choose from the following: " + ", ".join(dcl.diacritic_list) + "\n")
    if not dcl.isdiacritictype(accent):
        print("That was not a valid accent.")
    else:
        try:
            function = getattr(dcl, accent)  # or dcl.objects.DiacriticApplicant
            output = function(char)
        except DiacriticError as e:
            print(e)
        else:
            print(str(output))
```

It's worth checking that the provided accent is a diacritic type before calling ``getattr``.
Without the check, the user could name any other attribute of the module, such as ``__file__``.
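
Another way to avoid calling ``getattr`` on user input is to dispatch through an explicit allowlist. The sketch below shows that pattern with the standard library only; the ``COMBINING_MARKS`` table and ``apply_diacritic`` function are hypothetical and cover just a few diacritics, so treat it as an illustration rather than a replacement for dcl.

```py
import unicodedata

# Hypothetical allowlist for this sketch: diacritic name -> combining mark.
COMBINING_MARKS = {
    "acute": "\u0301",
    "grave": "\u0300",
    "tilde": "\u0303",
    "ogonek": "\u0328",
}


def apply_diacritic(name: str, letter: str) -> str:
    """Apply a diacritic chosen by name, rejecting anything outside the allowlist."""
    if name not in COMBINING_MARKS:
        raise ValueError(f"{name!r} is not a supported diacritic")
    return unicodedata.normalize("NFC", letter + COMBINING_MARKS[name])


print(apply_diacritic("acute", "e"))  # é
try:
    apply_diacritic("__file__", "e")
except ValueError as e:
    print(e)
```

Because only the keys of the dictionary are accepted, arbitrary attribute names never reach a lookup on the module.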

You can also create a program that removes diacritics from a string. It's made easy!

```py
import dcl

text = input("Enter the string which you want to be cleared of diacritics: ")
print("Here is your cleaned string: " + dcl.clean_diacritics(text))
```

Or perhaps your program needs to count the number of diacritics contained
in a string.

```py
import dcl

text = input("This program will count the number of diacritics contained in your input. Enter a string: ")
count = dcl.count_diacritics(text)
# Pick the singular or plural sentence so the grammar matches the count.
if count == 1:
    print("There is 1 diacritic in your string.")
else:
    print(f"There are {count} diacritics in your string.")
```


            

        