==================================================
Indicate: Transliterate Indic Languages to English
==================================================

.. image:: https://app.travis-ci.com/in-rolls/indicate.svg?branch=master
    :target: https://travis-ci.org/in-rolls/indicate
.. image:: https://img.shields.io/pypi/v/indicate.svg
    :target: https://pypi.python.org/pypi/indicate
.. image:: https://readthedocs.org/projects/indicate/badge/?version=latest
    :target: http://notnews.readthedocs.io/en/latest/?badge=latest
    :alt: Documentation Status
.. image:: https://static.pepy.tech/badge/indicate
    :target: https://pepy.tech/project/indicate

Transliterations to/from Indian languages are still generally low quality. One problem is access to data. Another is that there is no standard transliteration.
For Hindi--English, we build a novel dataset of names using ESPNcricinfo. For instance, see `here <https://www.espncricinfo.com/hindi/series/pakistan-tour-of-england-2021-1239529/england-vs-pakistan-1st-odi-1239537/full-scorecard>`__ for the Hindi version of the `English scorecard <https://www.espncricinfo.com/series/pakistan-tour-of-england-2021-1239529/england-vs-pakistan-1st-odi-1239537/full-scorecard>`__.
We also create a dataset from `election affidavits <https://affidavit.eci.gov.in/CandidateCustomFilter>`__.
We also use the `Google Dakshina dataset <https://github.com/google-research-datasets/dakshina>`__.

Because there is no single standard way to transliterate, we provide k-best transliterations.
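
This README does not show the k-best interface itself, so the snippet below only sketches the idea: given several candidate romanizations with scores, keep the k highest-scoring ones. The helper name ``k_best`` and the candidate scores are hypothetical and are not part of ``indicate``.

```python
import heapq


def k_best(candidates, k=3):
    """Return the k highest-scoring (text, score) pairs, best first."""
    return heapq.nlargest(k, candidates, key=lambda pair: pair[1])


# Made-up scores for romanizations of a single name, for illustration only.
candidates = [
    ("chaudhary", 0.41),
    ("choudhary", 0.33),
    ("chowdhury", 0.17),
    ("chaudhri", 0.09),
]
top = k_best(candidates, k=2)
print(top)
```

Returning several ranked candidates lets the caller pick whichever spelling matches their own convention.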

Install
-------
We strongly recommend installing ``indicate`` inside a Python virtual environment
(see the `venv documentation <https://docs.python.org/3/library/venv.html#creating-virtual-environments>`__).

::

    pip install indicate

General API
-----------
1. ``transliterate.hindi2english`` takes Hindi text and transliterates it into English.

Examples
--------
::

  from indicate import transliterate

  # Transliterate Hindi (Devanagari) text into the English alphabet
  english_transliterated = transliterate.hindi2english("हिंदी")
  print(english_transliterated)

Output::

  hindi
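
When transliterating many strings, a small wrapper can avoid repeated model calls for duplicate inputs. The sketch below assumes only what the example above shows, namely that ``transliterate.hindi2english`` takes one string and returns one string. The helper name ``transliterate_all`` is hypothetical, and the function to call is passed in as an argument so the helper does not depend on how the model is loaded.

```python
def transliterate_all(texts, transliterate_fn):
    """Apply transliterate_fn to each text, calling it once per distinct input."""
    cache = {}
    results = []
    for text in texts:
        if text not in cache:
            cache[text] = transliterate_fn(text)
        results.append(cache[text])
    return results


# Intended use (requires indicate to be installed):
#   from indicate import transliterate
#   names = transliterate_all(["हिंदी", "हिंदी"], transliterate.hindi2english)
```

The results come back in the same order as the inputs, so they can be zipped with the original list.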

Functions
----------
We expose one function, which takes Hindi text and transliterates it to English.

- **transliterate.hindi2english(input)**

  - What it does:

    - Converts the given Hindi text into the English alphabet

  - Output

    - Returns the transliterated text in the English alphabet

Data
----
The datasets used to train the model:

- `Indian Election affidavits <https://affidavit.eci.gov.in/CandidateCustomFilter>`__

- `Google Dakshina dataset <https://github.com/google-research-datasets/dakshina>`__

- `ESPNcricinfo <https://www.espncricinfo.com/hindi/series/pakistan-tour-of-england-2021-1239529/england-vs-pakistan-1st-odi-1239537/full-scorecard>`__ (for example, the Hindi version of the `English scorecard <https://www.espncricinfo.com/series/pakistan-tour-of-england-2021-1239529/england-vs-pakistan-1st-odi-1239537/full-scorecard>`__)

- `IIT Bombay English-Hindi Corpus <https://www.cfilt.iitb.ac.in/iitb_parallel/>`__

Evaluation
----------
The model was evaluated on the test split of the Google Dakshina dataset, where it predicted 73.64% exact matches.
`Indic-trans <https://github.com/libindic/indic-trans>`__ predicted 63.12% exact matches on the Google Dakshina dataset.
Below are the edit distance metrics on the test dataset (0.0 means an exact match; the farther a value is from 0.0,
the greater the difference between the predicted text and the actual text).

.. image:: https://github.com/in-rolls/indicate/raw/master/images/h2e_ed.png
   :width: 400
   :alt: Edit distance metrics of model on Google Dakshina test dataset
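
The scoring script is not part of this README, so the following is a minimal sketch of the two kinds of metric mentioned above: the share of exact matches and a Levenshtein edit distance. Dividing the distance by the length of the longer string, so that 0.0 means an exact match, is an assumption about how the plotted values were scaled, and the function names are illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(predicted, actual):
    """Edit distance scaled by the longer length (assumed normalization)."""
    longest = max(len(predicted), len(actual))
    return edit_distance(predicted, actual) / longest if longest else 0.0


def exact_match_rate(predictions, references):
    """Fraction of predictions identical to their reference (non-empty lists)."""
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)
```

For example, ``exact_match_rate(["hindi", "dilli"], ["hindi", "delhi"])`` counts one match out of two references.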


Authors
-------

Rajashekar Chintalapati and Gaurav Sood

Contributor Code of Conduct
---------------------------------

The project welcomes contributions from everyone! In fact, it depends on
it. To maintain this welcoming atmosphere, and to collaborate in a fun
and productive way, we expect contributors to the project to abide by
the `Contributor Code of
Conduct <http://contributor-covenant.org/version/1/0/0/>`__.

License
----------

The package is released under the `MIT
License <https://opensource.org/licenses/MIT>`__.

            
