:Name: jaconv
:Version: 0.4.0
:Home page: https://github.com/ikegami-yukino/jaconv
:Summary: Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, Zenkaku and more
:Upload time: 2024-07-25 16:35:24
:Author: Yukino Ikegami
:License: MIT License
:Keywords: Japanese converter, Japanese, text preprocessing, half-width kana, Hiragana, Katakana, Hankaku, Zenkaku, transliteration, Julius

jaconv
==========
|coveralls| |pyversion| |version| |license| |download|

jaconv (Japanese Converter) is a pure-Python interconverter for Hiragana, Katakana, Hankaku (half-width characters), and Zenkaku (full-width characters).

`Japanese README <https://github.com/ikegami-yukino/jaconv/blob/master/README_JP.rst>`_ is available.

INSTALLATION
==============

::

 $ pip install jaconv


USAGE
============

See also the `documentation <http://ikegami-yukino.github.io/jaconv/jaconv.html>`_.

.. code:: python

  import jaconv

  # Hiragana to Katakana
  jaconv.hira2kata('ともえまみ')
  # => 'トモエマミ'

  # Hiragana to half-width Katakana
  jaconv.hira2hkata('ともえまみ')
  # => 'ﾄﾓｴﾏﾐ'

  # Katakana to Hiragana
  jaconv.kata2hira('巴マミ')
  # => '巴まみ'

  # half-width character to full-width character
  # defaults: kana=True, ascii=False, digit=False
  jaconv.h2z('ﾃｨﾛ･ﾌｨﾅｰﾚ')
  # => 'ティロ・フィナーレ'

  # half-width character to full-width character
  # but only ascii characters
  jaconv.h2z('abc', kana=False, ascii=True, digit=False)
  # => 'ａｂｃ'

  # half-width character to full-width character
  # but only digit characters
  jaconv.h2z('123', kana=False, ascii=False, digit=True)
  # => '１２３'

  # half-width character to full-width character
  # except half-width Katakana
  jaconv.h2z('ｱabc123', kana=False, digit=True, ascii=True)
  # => 'ｱａｂｃ１２３'

  # an alias of h2z
  jaconv.hankaku2zenkaku('ﾃｨﾛ･ﾌｨﾅｰﾚabc123')
  # => 'ティロ・フィナーレabc123'

  # full-width character to half-width character
  # defaults: kana=True, ascii=False, digit=False
  jaconv.z2h('ティロ・フィナーレ')
  # => 'ﾃｨﾛ・ﾌｨﾅｰﾚ'

  # full-width character to half-width character
  # but only ascii characters
  jaconv.z2h('ａｂｃ', kana=False, ascii=True, digit=False)
  # => 'abc'

  # full-width character to half-width character
  # but only digit characters
  jaconv.z2h('１２３', kana=False, ascii=False, digit=True)
  # => '123'

  # full-width character to half-width character
  # except full-width Katakana
  jaconv.z2h('アａｂｃ１２３', kana=False, digit=True, ascii=True)
  # => 'アabc123'

  # an alias of z2h
  jaconv.zenkaku2hankaku('ティロ・フィナーレａｂｃ１２３')
  # => 'ﾃｨﾛ･ﾌｨﾅｰﾚａｂｃ１２３'

  # normalize
  jaconv.normalize('ティロ･フィナ〜レ', 'NFKC')
  # => 'ティロ・フィナーレ'

  # Hiragana to alphabet
  jaconv.kana2alphabet('じゃぱん')
  # => 'japan'

  # Alphabet to Hiragana
  jaconv.alphabet2kana('japan')
  # => 'じゃぱん'

  # Katakana to Alphabet
  jaconv.kata2alphabet('ケツイ')
  # => 'ketsui'

  # Alphabet to Katakana
  jaconv.alphabet2kata('namba')
  # => 'ナンバ'

  # Hiragana to Julius's phoneme format
  jaconv.hiragana2julius('てんきすごくいいいいいい')
  # => 't e N k i s u g o k u i:'
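
As a rough mental model, the Hiragana/Katakana and ASCII width conversions shown above can be pictured as fixed code-point offsets inside Unicode. The sketch below uses only the standard library and is not jaconv's implementation; every name in it (``hira_to_kata``, ``ascii_to_fullwidth``, and so on) is a hypothetical helper introduced for illustration.

.. code:: python

  # Illustrative sketch only: NOT jaconv's implementation.
  # All names below are hypothetical helpers.

  KANA_OFFSET = 0x60      # distance from U+3041 to U+30A1
  WIDTH_OFFSET = 0xFEE0   # distance from U+0021 to U+FF01

  # Tables for str.translate: {source code point: target code point}
  HIRA_TO_KATA = {cp: cp + KANA_OFFSET for cp in range(0x3041, 0x3097)}
  KATA_TO_HIRA = {cp + KANA_OFFSET: cp for cp in range(0x3041, 0x3097)}
  ASCII_TO_FULL = {cp: cp + WIDTH_OFFSET for cp in range(0x21, 0x7F)}
  FULL_TO_ASCII = {full: half for half, full in ASCII_TO_FULL.items()}


  def hira_to_kata(text):
      """Map Hiragana to full-width Katakana; other characters pass through."""
      return text.translate(HIRA_TO_KATA)


  def kata_to_hira(text):
      """Map full-width Katakana to Hiragana; other characters pass through."""
      return text.translate(KATA_TO_HIRA)


  def ascii_to_fullwidth(text):
      """Map printable ASCII (except space) to its full-width form."""
      return text.translate(ASCII_TO_FULL)


  def fullwidth_to_ascii(text):
      """Map full-width ASCII forms back to plain ASCII."""
      return text.translate(FULL_TO_ASCII)


  print(hira_to_kata('ともえまみ'))      # => 'トモエマミ'
  print(kata_to_hira('巴マミ'))          # => '巴まみ'
  print(ascii_to_fullwidth('abc123'))    # => 'ａｂｃ１２３'

Half-width Katakana does not fit this offset model: its code points are ordered differently and voiced sound marks are separate characters, so a conversion needs an explicit lookup table rather than arithmetic.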


NOTE
============

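``jaconv.normalize`` extends ``unicodedata.normalize`` for Japanese language processing. In addition to the standard Unicode normalization, it applies the following character replacements: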

.. code::

    '〜' => 'ー'
    '～' => 'ー'
    "’" => "'"
    '”' => '"'
    '“' => '``'
    '―' => '-'
    '‐' => '-'
    '˗' => '-'
    '֊' => '-'
    '‐' => '-'
    '‑' => '-'
    '‒' => '-'
    '–' => '-'
    '⁃' => '-'
    '⁻' => '-'
    '₋' => '-'
    '−' => '-'
    '﹣' => 'ー'
    '－' => 'ー'
    '—' => 'ー'
    '―' => 'ー'
    '━' => 'ー'
    '─' => 'ー'
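
The table above can be read as a character-replacement pass combined with ``unicodedata.normalize``. The following is a minimal standard-library sketch of that idea, not jaconv's source: ``normalize_sketch`` and ``REPLACEMENTS`` are hypothetical names, only a subset of the table is included, and running the replacements before normalization is an assumption.

.. code:: python

    import unicodedata

    # Hypothetical subset of the replacement table, written as escapes
    # so that visually similar characters cannot be confused.
    REPLACEMENTS = str.maketrans({
        '\u301c': '\u30fc',  # wave dash -> prolonged sound mark
        '\uff5e': '\u30fc',  # full-width tilde -> prolonged sound mark
        '\u2019': "'",       # right single quotation mark
        '\u201d': '"',       # right double quotation mark
        '\u201c': '``',      # left double quotation mark
        '\u2013': '-',       # en dash
        '\u2212': '-',       # minus sign
        '\u2014': '\u30fc',  # em dash -> prolonged sound mark
    })


    def normalize_sketch(text, form='NFKC'):
        """Apply the replacements, then standard Unicode normalization.

        Assumption: replacements run first; jaconv's real order may differ.
        """
        return unicodedata.normalize(form, text.translate(REPLACEMENTS))


    # Plain NFKC folds the full-width tilde to ASCII '~' ...
    print(unicodedata.normalize('NFKC', '\uff5e'))  # => '~'
    # ... while the extra replacement turns it into a prolonged sound mark.
    print(normalize_sketch('\uff5e'))               # => 'ー'

The replacements matter because plain NFKC leaves the wave dash (U+301C) unchanged and folds the full-width tilde to an ASCII tilde, neither of which is usually wanted where a prolonged sound mark was intended.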




.. |coveralls| image:: https://coveralls.io/repos/ikegami-yukino/jaconv/badge.svg?branch=master&service=github
    :target: https://coveralls.io/github/ikegami-yukino/jaconv?branch=master
    :alt: coveralls.io

.. |pyversion| image:: https://img.shields.io/pypi/pyversions/jaconv.svg

.. |version| image:: https://img.shields.io/pypi/v/jaconv.svg
    :target: http://pypi.python.org/pypi/jaconv/
    :alt: latest version

.. |license| image:: https://img.shields.io/pypi/l/jaconv.svg
    :target: http://pypi.python.org/pypi/jaconv/
    :alt: license

.. |download| image:: https://static.pepy.tech/personalized-badge/neologdn?period=total&units=international_system&left_color=black&right_color=blue&left_text=Downloads
    :target: https://pepy.tech/project/neologdn
    :alt: download


CHANGES
=======

0.4.0 (2024-07-26)
-------------------
- Add stub files according to PEP 561 for mypy (thanks @ernix)

0.3.4 (2023-02-18)
-------------------
- Fix support for Python 2.7 to 3.4 (thanks @manjuu-eater)
- Support Python 3.11

0.3.3 (2022-12-31)
-------------------
- Support Python 3.10
- Restore support for Python 2.7 to 3.4 (thanks @manjuu-eater)
- Fix a bug in z2h and h2z when all flags are off (thanks @manjuu-eater)

0.3.1 (2022-12-14)
-------------------
- Fix an infinite loop bug in alphabet2kana (thanks @frog42)

0.3 (2021-03-29)
-------------------
- Fix bug (alphabet2kana) thanks @Cuddlemuffin007
- Support Python 3.8 and 3.9
- Add handy functions: alphabet2kata and kata2alphabet. thanks @kokimame
- Add a function for Julius: hiragana2julius

0.2.4 (2018-02-04)
-------------------
- Fix bug (kana2alphabet)
- Support Python 3.7
- No longer support Python 2.6
- Add aliases of z2h -> zenkaku2hankaku and h2z -> hankaku2zenkaku

0.2.3 (2018-02-03)
-------------------
- Fix bugs (alphabet2kana, kana2alphabet) thanks @letuananh

0.2.2 (2018-01-22)
-------------------
- Fix bug (kana2alphabet) thanks @kokimame
- Support Python 3.6

0.2.1 (2017-09-14)
-------------------
- Fix bugs (alphabet2kana, kana2alphabet)

0.2 (2015-04-02)
------------------

- Change module name jctconv -> jaconv
- Add alphabet and Hiragana interconversion (alphabet2kana, kana2alphabet)

0.1.1 (2015-03-12)
------------------

- Support Windows
- Support Python 3.5


0.1 (2014-11-24)
------------------

- Add some Japanese characters to the conversion table (ゝゞ・「」。、)
- Decrease memory usage
- Some function names are deprecated (hankaku2zenkaku, zenkaku2hankaku, H2K, H2hK, K2H)


0.0.7 (2014-03-22)
------------------

- z2h and h2z allow mojimoji-like selection of target character types.
- Fix a bug in half-width Kana conversion.


            

        