# `asian_mtl`
This repository contains the code and documentation for the machine translation models used in EasierMTL's API.
It is an improved version of the models in the original repository: [EasierMTL/chinese-translation-app](https://github.com/EasierMTL/chinese-translation-app/tree/main/server/chinese_translation_api)
## Supported Translators
All translators support dynamic quantization! [Our benchmarks](#benchmarks) indicate that quantization roughly doubles inference speed while losing less than 1% BLEU.
- `ChineseToEnglishTranslator()`
- `EnglishToChineseTranslator()`
## Getting Started
```bash
pip install asian-mtl
```
Here's a simple example:
```python
from asian_mtl.models.base import ChineseToEnglishTranslator
translator = ChineseToEnglishTranslator()
# Quantize for better CPU production performance!
translator.quantize()
prediction = translator.predict("我爱ECSE484.")
print(prediction)
# prediction will be:
# "I love ECSE 484."
```
And you're good to go!
If you are contributing, run:
```bash
# https://stackoverflow.com/questions/59882884/vscode-doesnt-show-poetry-virtualenvs-in-select-interpreter-option
poetry config virtualenvs.in-project true
# shows the name of the current environment
poetry env list
poetry install
```
## Usage
When using the quantized models in this repository, make sure to set `torch.set_num_threads(1)`; otherwise the quantized models can end up slower than their vanilla counterparts. This is not set under the hood because doing so could invasively interfere with user setups.
## Evaluation
See [`scripts`](./scripts) for evaluation scripts.
To run the scripts, simply run:
```bash
# Run the evaluation CLI with the Helsinki config
python ./scripts/evaluation/eval.py -c ./scripts/evaluation/configs/helsinki.yaml
```
Edit the config [`helsinki.yaml`](./scripts/evaluation/configs/helsinki.yaml) to toggle quantization or to adapt the evaluation to your use case.
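The authoritative schema is the config file itself; as an illustration only (these key names are hypothetical — check the real `helsinki.yaml`), a config might look like:

```yaml
model: Helsinki-NLP/opus-mt-zh-en
quantize: true # toggle dynamic quantization for the benchmark run
n_samples: 100 # number of evaluation sentences
```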
### Benchmarks
Here are some basic benchmarks of models in this repository:
| Model | Quantized? | N | BLEU | Runtime |
| -------------------------- | ---------- | --- | ----- | ------- |
| Helsinki-NLP/opus-mt-zh-en | No | 100 | 0.319 | 27s |
| | Yes | 100 | 0.306 | 13.5s |
The benchmarks described in the [docs](./docs/evaluation/EVALUATION_REG.md) are a little out-of-date.