itnpy2


:Name: itnpy2
:Version: 0.0.7
:Home page: https://github.com/barseghyanartur/itnpy
:Summary: A simple, deterministic, and extensible approach to inverse text normalization for numbers
:Upload time: 2022-12-21 21:56:34
:Maintainer: Artur Barseghyan
:Author: Brandhsu
:Requires Python: >=3.7
:License: MIT
:Keywords: inverse text normalization, natural language processing, speech recognition, itn, nlp, asr
:Requirements: pandas, numpy

Inverse Text Normalization
==========================

.. image:: https://img.shields.io/pypi/v/itnpy2.svg
   :target: https://pypi.python.org/pypi/itnpy2
   :alt: PyPI Version

.. image:: https://img.shields.io/pypi/pyversions/itnpy2.svg
    :target: https://pypi.python.org/pypi/itnpy2/
    :alt: Supported Python versions

.. image:: https://github.com/barseghyanartur/itnpy/workflows/test/badge.svg
   :target: https://github.com/barseghyanartur/itnpy/actions
   :alt: Build Status

.. image:: https://readthedocs.org/projects/itnpy2/badge/?version=latest
    :target: http://itnpy2.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status

.. image:: https://img.shields.io/badge/license-MIT-blue.svg
   :target: https://github.com/barseghyanartur/itnpy/blob/main/LICENSE
   :alt: MIT

A simple, deterministic, and extensible approach to 
`inverse text normalization <https://www.google.com/search?q=inverse+text+normalization>`__
(ITN) for numbers.

Overview
--------

This package converts raw spoken-form text, such as speech recognition output,
into user-friendly written-form text. It works best for converting spoken
numbers into digits, and for other translation tasks that do not change word order.
A `CSV <https://github.com/barseghyanartur/itnpy/blob/master/assets/vocab.csv>`__
file defines the basic rules for transforming spoken tokens into
written tokens; extra pre- and post-processing may be applied for more specific
formatting requirements, e.g. dates, measurements, and money.
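
The idea of token-level rules plus optional post-processing can be sketched
in a few lines of plain Python. This is a minimal illustration only, not the
package's implementation or API: the vocabulary dictionaries and the
``words_to_number``, ``inverse_normalize_sketch`` and ``format_money`` helpers
are hypothetical names invented for this example, and the real rules live in
the CSV file linked above.

.. code-block:: python

    import re

    # Hypothetical hand-written vocabulary (an assumption for this sketch);
    # the package itself reads its rules from a CSV file.
    UNITS = {
        "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
        "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
        "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
        "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
        "nineteen": 19,
    }
    TENS = {
        "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
    }
    SCALES = {"thousand": 1_000, "million": 1_000_000}
    NUMBER_WORDS = set(UNITS) | set(TENS) | set(SCALES) | {"hundred"}


    def words_to_number(tokens):
        """Fold a run of number words into one integer.

        Handles only simple cardinal phrases; digit sequences such as
        "one two three" are out of scope for this sketch.
        """
        total, current = 0, 0
        for tok in tokens:
            if tok in UNITS:
                current += UNITS[tok]
            elif tok in TENS:
                current += TENS[tok]
            elif tok == "hundred":
                current = max(current, 1) * 100
            else:  # "thousand" or "million" closes the current group
                total += max(current, 1) * SCALES[tok]
                current = 0
        return total + current


    def inverse_normalize_sketch(text):
        """Replace runs of number words with digits, keeping word order."""
        out, run = [], []
        for tok in text.split():
            if tok in NUMBER_WORDS:
                run.append(tok)
                continue
            if run:
                out.append(str(words_to_number(run)))
                run = []
            out.append(tok)
        if run:
            out.append(str(words_to_number(run)))
        return " ".join(out)


    def format_money(text):
        """Example post-processing step: '2500 dollars' becomes '$2500'."""
        return re.sub(r"\b(\d+) dollars\b", r"$\1", text)


    print(inverse_normalize_sketch("i have twenty three apples"))
    # i have 23 apples
    print(format_money(inverse_normalize_sketch("two thousand five hundred dollars")))
    # $2500

Keeping the number conversion separate from the money formatting mirrors the
split described above: the core step never reorders words, and anything more
specific is layered on afterwards.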

----

.. image:: https://raw.githubusercontent.com/barseghyanartur/itnpy/master/assets/terminal.png
   :target: https://raw.githubusercontent.com/barseghyanartur/itnpy/master/assets/terminal.png
   :alt: Terminal

These examples were produced by running this
`script <https://github.com/barseghyanartur/itnpy/blob/master/scripts/docs.py>`__.

Installation
------------

This package supports Python 3.7 and later.

To install from `PyPI <https://pypi.org/project/itnpy2>`__:

.. code-block:: shell

    pip install itnpy2

To install locally in editable mode, run this from the root of a cloned copy of the repository:

.. code-block:: shell

   pip install -e .

Tests
-----

To run tests, use ``pytest`` in the root folder of this repository:

.. code-block:: shell

    pytest

Issues
------

This package has been verified on a limited set of
`test cases <https://github.com/barseghyanartur/itnpy/tree/master/tests/assets/>`__.
If you find a translation mistake, feel free to open a pull request that adds it to
`failing.csv <https://github.com/barseghyanartur/itnpy/blob/master/tests/assets/inverse_normalize_numbers/failing.csv>`__
with the input, the expected output, and the mistaken output. Thanks!

Citation
--------

If you find this work useful, please consider citing it.

.. code-block:: text

   @misc{hsu2022itn,
     title        = {A simple, deterministic, and extensible approach to inverse text normalization for numbers},
     author       = {Brandhsu},
     howpublished = {https://github.com/barseghyanartur/itnpy},
     year         = {2022}
   }

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/barseghyanartur/itnpy",
    "name": "itnpy2",
    "maintainer": "Artur Barseghyan",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "artur.barseghyan@gmail.com",
    "keywords": "inverse text normalization,natural language processing,speech recognition,itn,nlp,asr",
    "author": "Brandhsu",
    "author_email": "brandondhsu@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/77/28/e3fccdc8d5747faf82b4d85dbb43472e446eefee98fa574baa0e4a2b94de/itnpy2-0.0.7.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A simple, deterministic, and extensible approach to inverse text normalization for numbers",
    "version": "0.0.7",
    "split_keywords": [
        "inverse text normalization",
        "natural language processing",
        "speech recognition",
        "itn",
        "nlp",
        "asr"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "cd4e9c9879f3c7a045fae5b787568915",
                "sha256": "a1b8fd82edc98be9ed99e527bbacf72febdd65a81f4a3d926723de02b03b0c0d"
            },
            "downloads": -1,
            "filename": "itnpy2-0.0.7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "cd4e9c9879f3c7a045fae5b787568915",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 6688,
            "upload_time": "2022-12-21T21:56:32",
            "upload_time_iso_8601": "2022-12-21T21:56:32.097663Z",
            "url": "https://files.pythonhosted.org/packages/28/19/28e2c85e7f1fcb61c0960cf8a96d2781f484db23be42ae96cd8d2adba187/itnpy2-0.0.7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "e0b99aae67dbf081ca4f4f6c91ea9ebb",
                "sha256": "67466fe9bd00c9e11ca6250e6f39dc84ee86f2005a2c679251e65e1ee5c5a116"
            },
            "downloads": -1,
            "filename": "itnpy2-0.0.7.tar.gz",
            "has_sig": false,
            "md5_digest": "e0b99aae67dbf081ca4f4f6c91ea9ebb",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 9241,
            "upload_time": "2022-12-21T21:56:34",
            "upload_time_iso_8601": "2022-12-21T21:56:34.077850Z",
            "url": "https://files.pythonhosted.org/packages/77/28/e3fccdc8d5747faf82b4d85dbb43472e446eefee98fa574baa0e4a2b94de/itnpy2-0.0.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-12-21 21:56:34",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "barseghyanartur",
    "github_project": "itnpy",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "requirements": [
        {
            "name": "pandas",
            "specs": [
                [
                    "~=",
                    "1.3"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    "~=",
                    "1.5"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "<=",
                    "1.21.5"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "~=",
                    "1.24"
                ]
            ]
        }
    ],
    "tox": true,
    "lcname": "itnpy2"
}
        