Name | unicode-rbnf |
Version | 2.2.0 |
home_page | None |
Summary | Rule-based number formatting using Unicode CLDR data |
upload_time | 2024-12-31 17:16:32 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.8.0 |
license | MIT |
keywords | rbnf, unicode, number, format |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# Unicode RBNF
A pure Python implementation of [rule-based number formatting](https://icu-project.org/docs/papers/a_rule_based_approach_to_number_spellout/) (RBNF) using the [Unicode Common Locale Data Repository](https://cldr.unicode.org) (CLDR).
This lets you spell out numbers for a large number of locales:
``` python
from unicode_rbnf import RbnfEngine
engine = RbnfEngine.for_language("en")
assert engine.format_number(1234).text == "one thousand two hundred thirty-four"
```
Different formatting purposes are supported as well, depending on the locale:
``` python
from unicode_rbnf import RbnfEngine, FormatPurpose
engine = RbnfEngine.for_language("en")
assert engine.format_number(1999, FormatPurpose.CARDINAL).text == "one thousand nine hundred ninety-nine"
assert engine.format_number(1999, FormatPurpose.YEAR).text == "nineteen ninety-nine"
assert engine.format_number(11, FormatPurpose.ORDINAL).text == "eleventh"
```
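The purposes above line up with the CLDR ruleset naming convention (names such as `spellout-cardinal`, `spellout-ordinal`, and `spellout-numbering-year`). As a rough sketch of that idea only, and not a description of how `unicode_rbnf` selects rulesets internally, the hypothetical helper `guess_purpose` below classifies a ruleset name by its prefix. The `Purpose` enum is a stand-in defined for this example, not the library's `FormatPurpose`.

``` python
from enum import Enum
from typing import Optional


class Purpose(Enum):
    """Stand-in for illustration; not the library's FormatPurpose."""

    CARDINAL = "cardinal"
    ORDINAL = "ordinal"
    YEAR = "year"


def guess_purpose(ruleset_name: str) -> Optional[Purpose]:
    """Classify a CLDR-style ruleset name by its prefix (assumed convention)."""
    # Check the most specific prefix first, because
    # "spellout-numbering-year" also starts with "spellout-numbering".
    if ruleset_name.startswith("spellout-numbering-year"):
        return Purpose.YEAR
    if ruleset_name.startswith("spellout-ordinal"):
        return Purpose.ORDINAL
    if ruleset_name.startswith(("spellout-cardinal", "spellout-numbering")):
        return Purpose.CARDINAL
    return None


print(guess_purpose("spellout-cardinal-feminine"))
print(guess_purpose("spellout-numbering-year"))
```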
For locales with multiple genders, cases, etc., the different texts are accessible in the result of `format_number`:
``` python
from unicode_rbnf import RbnfEngine
engine = RbnfEngine.for_language("de")
print(engine.format_number(1))
```
Result:
```
FormatResult(
text='eins',
text_by_ruleset={
'spellout-numbering': 'eins',
'spellout-cardinal-neuter': 'ein',
'spellout-cardinal-masculine': 'ein',
'spellout-cardinal-feminine': 'eine',
'spellout-cardinal-n': 'einen',
'spellout-cardinal-r': 'einer',
'spellout-cardinal-s': 'eines',
'spellout-cardinal-m': 'einem'
}
)
```
The `text` property of the result holds the text of the ruleset with the shortest name (least specific).
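To make that selection rule concrete, here is a minimal sketch that works on a plain dictionary copied from the German result above rather than on a live `FormatResult`. The helper name `default_text` is made up for this example, and how the library breaks ties between equally short names is not covered here.

``` python
# Ruleset texts copied from the German example output above.
text_by_ruleset = {
    "spellout-numbering": "eins",
    "spellout-cardinal-neuter": "ein",
    "spellout-cardinal-masculine": "ein",
    "spellout-cardinal-feminine": "eine",
    "spellout-cardinal-n": "einen",
    "spellout-cardinal-r": "einer",
    "spellout-cardinal-s": "eines",
    "spellout-cardinal-m": "einem",
}


def default_text(texts: dict) -> str:
    """Return the text of the ruleset with the shortest name."""
    shortest_name = min(texts, key=len)
    return texts[shortest_name]


print(default_text(text_by_ruleset))  # eins
# A specific gender or case is looked up by its ruleset name.
print(text_by_ruleset["spellout-cardinal-feminine"])  # eine
```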
## Supported locales
See: https://github.com/unicode-org/cldr/tree/release-44/common/rbnf
## Engine implementation
Not [all features](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/classRuleBasedNumberFormat.html) of the RBNF engine are implemented. The following features are available:
* Literal text (`hundred`)
* Quotient substitution (`<<` or `←←`)
* Remainder substitution (`>>` or `→→`)
* Optional substitution (`[...]`)
* Rule substitution (`←%ruleset_name←`)
* Rule replacement (`=%ruleset_name=`)
* Special rules:
  * Negative numbers (`-x`)
  * Improper fractions (`x.x`)
  * Not a number (`NaN`)
  * Infinity (`Inf`)
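To illustrate how quotient, remainder, and optional substitutions interact, here is a toy interpreter. It is a sketch under simplifying assumptions, not this package's engine: the rule table is hand-written (loosely modeled on English spellout rules) instead of being loaded from CLDR, the names `RULES` and `spell_out` are invented for the example, and only integers up to 999,999 are handled.

``` python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]

# Each rule is (base value, divisor, rule text).
RULES = [(value, 1, word) for value, word in enumerate(ONES)]
RULES += [(20 + 10 * i, 10, f"{word}[->>]") for i, word in enumerate(TENS)]
RULES += [(100, 100, "<< hundred[ >>]"), (1000, 1000, "<< thousand[ >>]")]


def spell_out(number: int) -> str:
    if number < 0:
        # Negative-number rule, in the spirit of "-x: minus >>".
        return "minus " + spell_out(-number)
    if number > 999_999:
        raise ValueError("toy rule table only covers up to 999,999")

    # Pick the rule with the greatest base value not exceeding the number.
    _base, divisor, text = max(rule for rule in RULES if rule[0] <= number)
    quotient, remainder = divmod(number, divisor)

    if remainder == 0:
        # Optional substitution: drop "[...]" when there is no remainder.
        text = re.sub(r"\[[^\]]*\]", "", text)
    else:
        text = text.replace("[", "").replace("]", "")

    if "<<" in text:  # quotient substitution
        text = text.replace("<<", spell_out(quotient))
    if ">>" in text:  # remainder substitution
        text = text.replace(">>", spell_out(remainder))
    return text


print(spell_out(1234))  # one thousand two hundred thirty-four
print(spell_out(100))   # one hundred
```

For 1234, the `1000` rule yields quotient 1 and remainder 234, so `<<` becomes "one" and `>>` recurses into the `100` rule, then the `30` rule.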
Some features that will need to be added eventually:
* Proper fraction rules (`0.x`)
* Preceding remainder substitution (`>>>` or `→→→`)
* Number format strings (`==`)
* Decimal format patterns (`#,##0.00`)
* Plural replacements (`$(ordinal,one{st}...)`)
Raw data
{
"_id": null,
"home_page": null,
"name": "unicode-rbnf",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8.0",
"maintainer_email": null,
"keywords": "rbnf, unicode, number, format",
"author": null,
"author_email": "Michael Hansen <mike@rhasspy.org>",
"download_url": "https://files.pythonhosted.org/packages/e0/26/8f6c0658ed5ccafcfd8b723a576463921925df548065a97f2bca6cfda791/unicode_rbnf-2.2.0.tar.gz",
"platform": "any",
"description": "# Unicode RBNF\n\nA pure Python implementation of [rule based number formatting](https://icu-project.org/docs/papers/a_rule_based_approach_to_number_spellout/) (RBNF) using the [Unicode Common Locale Data Repository](https://cldr.unicode.org) (CLDR).\n\nThis lets you spell out numbers for a large number of locales:\n\n``` python\nfrom unicode_rbnf import RbnfEngine\n\nengine = RbnfEngine.for_language(\"en\")\nassert engine.format_number(1234).text == \"one thousand two hundred thirty-four\"\n```\n\nDifferent formatting purposes are supported as well, depending on the locale:\n\n``` python\nfrom unicode_rbnf import RbnfEngine, FormatPurpose\n\nengine = RbnfEngine.for_language(\"en\")\nassert engine.format_number(1999, FormatPurpose.CARDINAL).text == \"one thousand nine hundred ninety-nine\"\nassert engine.format_number(1999, FormatPurpose.YEAR).text == \"nineteen ninety-nine\"\nassert engine.format_number(11, FormatPurpose.ORDINAL).text == \"eleventh\"\n```\n\nFor locales with multiple genders, cases, etc., the different texts are accessible in the result of `format_number`:\n\n``` python\nfrom unicode_rbnf import RbnfEngine\n\nengine = RbnfEngine.for_language(\"de\")\nprint(engine.format_number(1))\n```\n\nResult:\n\n```\nFormatResult(\n text='eins',\n text_by_ruleset={\n 'spellout-numbering': 'eins',\n 'spellout-cardinal-neuter': 'ein',\n 'spellout-cardinal-masculine': 'ein',\n 'spellout-cardinal-feminine': 'eine',\n 'spellout-cardinal-n': 'einen',\n 'spellout-cardinal-r': 'einer',\n 'spellout-cardinal-s': 'eines',\n 'spellout-cardinal-m': 'einem'\n }\n)\n```\n\nThe `text` property of the result holds the text of the ruleset with the shortest name (least specific).\n\n## Supported locales\n\nSee: https://github.com/unicode-org/cldr/tree/release-44/common/rbnf\n\n## Engine implementation\n\nNot [all features](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/classRuleBasedNumberFormat.html) of the RBNF engine are implemented. 
The following features are available:\n\n* Literal text (`hundred`)\n* Quotient substitution (`<<` or `\u2190\u2190`)\n* Reminder substitution (`>>` or `\u2192\u2192`)\n* Optional substitution (`[...]`)\n* Rule substituton (`\u2190%ruleset_name\u2190`)\n* Rule replacement (`=%ruleset_name=`)\n* Special rules:\n * Negative numbers (`-x`)\n * Improper fractions (`x.x`)\n * Not a number (`NaN`)\n * Infinity (`Inf`)\n \nSome features that will need to be added eventually:\n\n* Proper fraction rules (`0.x`)\n* Preceding reminder substitution (`>>>` or `\u2192\u2192\u2192`)\n* Number format strings (`==`)\n* Decimal format patterns (`#,##0.00`)\n* Plural replacements (`$(ordinal,one{st}...)`)\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Rule-based number formatting using Unicode CLDR data",
"version": "2.2.0",
"project_urls": {
"Source Code": "https://github.com/rhasspy/unicode-rbnf"
},
"split_keywords": [
"rbnf",
" unicode",
" number",
" format"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b230d58a56d0ce40f57f5048b39c29e1da5d81dcedcd68df61009eb8762ca4f7",
"md5": "115fb7fd93840b85c854493f302ba9ec",
"sha256": "af181ae284593d580c2892ad2955fb11ead1f2a571e9c7c0807a4d08daf4875c"
},
"downloads": -1,
"filename": "unicode_rbnf-2.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "115fb7fd93840b85c854493f302ba9ec",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8.0",
"size": 133168,
"upload_time": "2024-12-31T17:16:30",
"upload_time_iso_8601": "2024-12-31T17:16:30.035031Z",
"url": "https://files.pythonhosted.org/packages/b2/30/d58a56d0ce40f57f5048b39c29e1da5d81dcedcd68df61009eb8762ca4f7/unicode_rbnf-2.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e0268f6c0658ed5ccafcfd8b723a576463921925df548065a97f2bca6cfda791",
"md5": "83a9e35ef126645974842525d9a017cc",
"sha256": "ee46827f56bfeec678c29d929e7056e2e73faebc90072b7cca803cb11929c3ee"
},
"downloads": -1,
"filename": "unicode_rbnf-2.2.0.tar.gz",
"has_sig": false,
"md5_digest": "83a9e35ef126645974842525d9a017cc",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8.0",
"size": 81359,
"upload_time": "2024-12-31T17:16:32",
"upload_time_iso_8601": "2024-12-31T17:16:32.301931Z",
"url": "https://files.pythonhosted.org/packages/e0/26/8f6c0658ed5ccafcfd8b723a576463921925df548065a97f2bca6cfda791/unicode_rbnf-2.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-31 17:16:32",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "rhasspy",
"github_project": "unicode-rbnf",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"tox": true,
"lcname": "unicode-rbnf"
}