ailia-tokenizer


Name: ailia-tokenizer
Version: 1.4.1.0
Home page: https://ailia.jp/
Summary: ailia Tokenizer
Upload time: 2024-11-08 12:48:27
Maintainer: None
Docs URL: None
Author: ax Inc.
Requires Python: >3.6
License: https://ailia.ai/en/license/
Requirements: No requirements were recorded.
# ailia Tokenizer Python API

!! CAUTION !!
“ailia” IS NOT OPEN SOURCE SOFTWARE (OSS).
As long as users comply with the conditions stated in the [License Document](https://ailia.ai/license/), they may use the Software free of charge; however, the Software is fundamentally paid software.

## About ailia Tokenizer

ailia Tokenizer is an NLP tokenizer that can be used from Unity or C++ (and, via this package, from Python). It provides an API for converting text into tokens (sequences of symbols) that AI models can handle, and for converting tokens back into text.
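For a quick feel of the API, here is a minimal round-trip sketch in Python. The class and method names (`WhisperTokenizer`, `from_pretrained`, `decode`) are assumptions modeled on the Transformers-style interface; consult the API specification linked below for the actual names.

```
import ailia_tokenizer

# ASSUMPTION: a Transformers-style tokenizer class and factory method;
# the real class names are listed in the API specification.
tokenizer = ailia_tokenizer.WhisperTokenizer.from_pretrained()

# Text -> tokens (symbol IDs the model can handle).
encoded = tokenizer("Hello, world!")
print(encoded["input_ids"])

# Tokens -> text.
print(tokenizer.decode(encoded["input_ids"]))
```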

Traditionally, tokenization has been performed with PyTorch's Transformers. However, since Transformers only works with Python, applications on Android or iOS have not been able to perform tokenization.
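For comparison, the traditional Python-only approach looks like this with the Hugging Face Transformers library (the model name is just an example):

```
from transformers import AutoTokenizer

# Requires a Python runtime, which is unavailable in typical
# Android/iOS applications.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

encoded = tokenizer("Hello, world!")
print(encoded["input_ids"])
print(tokenizer.decode(encoded["input_ids"]))
```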

ailia Tokenizer solves this problem by performing NLP tokenization directly, without relying on PyTorch's Transformers. This makes it possible to perform tokenization on Android and iOS as well.

Since ailia Tokenizer includes MeCab and SentencePiece, complex tokenizations, such as those for BERT Japanese or Sentence Transformers, can be performed on the device.
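For reference, the sketch below shows what SentencePiece tokenization looks like with the standalone `sentencepiece` Python package; ailia Tokenizer bundles equivalent functionality as native code, so no Python runtime is required on the device (the model file path is a placeholder):

```
import sentencepiece as spm

# Load a trained SentencePiece model (placeholder path).
sp = spm.SentencePieceProcessor(model_file="spiece.model")

ids = sp.encode("こんにちは世界", out_type=int)  # text -> token IDs
print(ids)
print(sp.decode(ids))                            # token IDs -> text
```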

## Install from pip

You can install the free evaluation package of ailia Tokenizer with the following command.

```
pip3 install ailia_tokenizer
```

## Install from package

You can also install ailia Tokenizer from the downloaded package with the following commands.

```
python3 bootstrap.py
pip3 install ./
```

## API specification

https://github.com/axinc-ai/ailia-sdk


            

Raw data

            {
    "_id": null,
    "home_page": "https://ailia.jp/",
    "name": "ailia-tokenizer",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">3.6",
    "maintainer_email": null,
    "keywords": null,
    "author": "ax Inc.",
    "author_email": "contact@axinc.jp",
    "download_url": "https://files.pythonhosted.org/packages/1a/6f/03fae078d7b9feb8edd5dc00d01522dd0f0a3ad8217a7505323226cfff65/ailia_tokenizer-1.4.1.0.tar.gz",
    "platform": null,
    "description": "# ailia Tokenizer Python API\n\n!! CAUTION !!\n\u201cailia\u201d IS NOT OPEN SOURCE SOFTWARE (OSS).\nAs long as user complies with the conditions stated in [License Document](https://ailia.ai/license/), user may use the Software for free of charge, but the Software is basically paid software.\n\n## About ailia Tokenizer\n\nThe ailia Tokenizer is an NLP tokenizer that can be used from Unity or C++. The tokenizer is an API for converting text into tokens (sequences of symbols) that AI can handle, or for converting tokens back into text.\n\nTraditionally, tokenization has been performed using Pytorch's Transformers. However, since Transformers only work with Python, there has been an issue of not being able to tokenize from applications on Android or iOS.\n\nWith ailia Tokenizer, this problem is solved by directly performing NLP tokenization without using Pytorch's Transforms. This makes it possible to perform tokenization on Android and iOS as well.\n\nSince ailia Tokenizer includes Mecab and SentencePiece, it is possible to perform complex tokenizations, such as those for BERT Japanese or Sentence Transformer, on the device.\n\n## Install from pip\n\nYou can install the ailia SDK free evaluation package with the following command.\n\n```\npip3 install ailia_tokenizer\n```\n\n## Install from package\n\nYou can install the ailia SDK from Package with the following command.\n\n```\npython3 bootstrap.py\npip3 install ./\n```\n\n## API specification\n\nhttps://github.com/axinc-ai/ailia-sdk\n\n",
    "bugtrack_url": null,
    "license": "https://ailia.ai/en/license/",
    "summary": "ailia Tokenizer",
    "version": "1.4.1.0",
    "project_urls": {
        "Homepage": "https://ailia.jp/"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0cb4380fea407859c5f28f5e54ad0f679544f007ab90fe4b2fa7b96929f00dbe",
                "md5": "e41604eaa7b3dfb39e5ce774f902f6ba",
                "sha256": "629374675c11a80c7cc21411ad40283fa4c0b0a08aa03d41c3946b9bf30c9ef8"
            },
            "downloads": -1,
            "filename": "ailia_tokenizer-1.4.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e41604eaa7b3dfb39e5ce774f902f6ba",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">3.6",
            "size": 17510120,
            "upload_time": "2024-11-08T12:48:23",
            "upload_time_iso_8601": "2024-11-08T12:48:23.630105Z",
            "url": "https://files.pythonhosted.org/packages/0c/b4/380fea407859c5f28f5e54ad0f679544f007ab90fe4b2fa7b96929f00dbe/ailia_tokenizer-1.4.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1a6f03fae078d7b9feb8edd5dc00d01522dd0f0a3ad8217a7505323226cfff65",
                "md5": "5748e19236e98e0b5345c9673526e925",
                "sha256": "6632ea90e0d7471248793b788b774ea7a971e63a509e6210cd7c6f38735ff76c"
            },
            "downloads": -1,
            "filename": "ailia_tokenizer-1.4.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "5748e19236e98e0b5345c9673526e925",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">3.6",
            "size": 17295314,
            "upload_time": "2024-11-08T12:48:27",
            "upload_time_iso_8601": "2024-11-08T12:48:27.187236Z",
            "url": "https://files.pythonhosted.org/packages/1a/6f/03fae078d7b9feb8edd5dc00d01522dd0f0a3ad8217a7505323226cfff65/ailia_tokenizer-1.4.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-08 12:48:27",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "ailia-tokenizer"
}
        