Inverse Text Normalization
==========================

.. image:: https://img.shields.io/pypi/v/itnpy2.svg
   :target: https://pypi.python.org/pypi/itnpy2
   :alt: PyPI Version

.. image:: https://img.shields.io/pypi/pyversions/itnpy2.svg
   :target: https://pypi.python.org/pypi/itnpy2/
   :alt: Supported Python versions

.. image:: https://github.com/barseghyanartur/itnpy/workflows/test/badge.svg
   :target: https://github.com/barseghyanartur/itnpy/actions
   :alt: Build Status

.. image:: https://readthedocs.org/projects/itnpy2/badge/?version=latest
   :target: http://itnpy2.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status

.. image:: https://img.shields.io/badge/license-MIT-blue.svg
   :target: https://github.com/barseghyanartur/itnpy/blob/main/LICENSE
   :alt: MIT

A simple, deterministic, and extensible approach to
`inverse text normalization <https://www.google.com/search?q=inverse+text+normalization>`__
(ITN) for numbers.

Overview
--------

This package converts raw spoken-form text (speech recognition output) into
user-friendly written-form text. It works best for converting spoken numbers
into digits, and for other translation tasks that do not change word order.
A `csv <https://github.com/barseghyanartur/itnpy/blob/master/assets/vocab.csv>`__
file defines the basic rules for transforming spoken tokens into
written tokens, and extra pre- and post-processing may be applied for more specific
formatting requirements, e.g. dates, measurements, and money.
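
To make the idea of token-level rules plus post-processing concrete, here is a
minimal, self-contained sketch. It does not use this package's API or its CSV
file: the vocabulary tables and the names ``words_to_number``,
``spoken_to_written``, and ``format_money`` are all hypothetical and exist only
for illustration. It also assumes each run of number words forms a single
cardinal number, so digit sequences such as "one two three" are not handled.

.. code-block:: python

    import re

    # Hypothetical spoken-to-written vocabulary (an assumption for this sketch;
    # the package's real rules live in its CSV file and are not reproduced here).
    UNITS = {
        word: value
        for value, word in enumerate(
            "zero one two three four five six seven eight nine ten eleven "
            "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
            "nineteen".split()
        )
    }
    TENS = {
        word: 10 * value
        for value, word in enumerate(
            "twenty thirty forty fifty sixty seventy eighty ninety".split(),
            start=2,
        )
    }
    SCALES = {"hundred": 100, "thousand": 1_000, "million": 1_000_000}
    NUMBER_WORDS = set(UNITS) | set(TENS) | set(SCALES)


    def words_to_number(tokens):
        """Combine a run of number words into one integer."""
        total, current = 0, 0
        for token in tokens:
            if token in UNITS:
                current += UNITS[token]
            elif token in TENS:
                current += TENS[token]
            elif token == "hundred":
                # A bare "hundred" counts as "one hundred".
                current = max(current, 1) * 100
            else:
                # "thousand" or "million": close the current group.
                total += max(current, 1) * SCALES[token]
                current = 0
        return total + current


    def spoken_to_written(text):
        """Replace each run of number words with digits; keep word order."""
        written, run = [], []
        for token in text.split():
            if token in NUMBER_WORDS:
                run.append(token)
                continue
            if run:
                written.append(str(words_to_number(run)))
                run = []
            written.append(token)
        if run:
            written.append(str(words_to_number(run)))
        return " ".join(written)


    def format_money(text):
        """Post-processing step: rewrite '<digits> dollars' as '$<digits>'."""
        return re.sub(r"\b(\d+) dollars\b", r"$\1", text)

Used together, the two stages behave as follows:

.. code-block:: python

    print(spoken_to_written("i have twenty three apples"))
    # i have 23 apples
    print(format_money(spoken_to_written("it costs twenty three dollars")))
    # it costs $23

Keeping the formatting step separate from the token rules mirrors the split
described above: the vocabulary stays small, and domain-specific formatting is
layered on afterwards.
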

----

.. image:: https://raw.githubusercontent.com/barseghyanartur/itnpy/master/assets/terminal.png
   :target: https://raw.githubusercontent.com/barseghyanartur/itnpy/master/assets/terminal.png
   :alt: Terminal

These examples were produced by running this
`script <https://github.com/barseghyanartur/itnpy/blob/master/scripts/docs.py>`__.

Installation
------------

This package supports Python 3.7 and later.

To install from `PyPI <https://pypi.org/project/itnpy2>`__:

.. code-block:: shell

    pip install itnpy2

To install locally in editable mode, run the following in the root folder of this repository:

.. code-block:: shell

    pip install -e .

Tests
-----

To run tests, use ``pytest`` in the root folder of this repository:

.. code-block:: shell

    pytest

Issues
------

This package has been verified on a limited set of
`test cases <https://github.com/barseghyanartur/itnpy/tree/master/tests/assets/>`__.
If you find a translation mistake, feel free to open a pull request that updates
`failing.csv <https://github.com/barseghyanartur/itnpy/blob/master/tests/assets/inverse_normalize_numbers/failing.csv>`__
with the input, the expected output, and the mistake. Thanks!

Citation
--------

If you find this work useful, please consider citing it.

.. code-block:: text

    @misc{hsu2022itn,
      title = {A simple, deterministic, and extensible approach to inverse text normalization for numbers},
      author = {Brandhsu},
      howpublished = {https://github.com/barseghyanartur/itnpy},
      year = {2022}
    }