funasr-onnx

Name: funasr-onnx
Version: 0.3.1
Home page: https://github.com/alibaba-damo-academy/FunASR.git
Summary: FunASR: A Fundamental End-to-End Speech Recognition Toolkit
Author: Speech Lab of DAMO Academy, Alibaba Group
License: MIT
Keywords: funasr, asr
Upload time: 2024-03-14 01:36:30
# ONNXRuntime-python

## Install `funasr-onnx`

Install from pip:

```shell
pip install -U funasr-onnx
# For users in China, you can install from a mirror:
# pip install -U funasr-onnx -i https://mirror.sjtu.edu.cn/pypi/web/simple
# If you want to export a .onnx file, you also need modelscope and funasr:
pip install -U modelscope funasr
# For users in China, you can install from a mirror:
# pip install -U modelscope funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple
```

Or install from source code:

```shell
git clone https://github.com/alibaba-damo-academy/FunASR.git && cd FunASR
cd funasr/runtime/python/onnxruntime
pip install -e ./
# For users in China, you can install from a mirror:
# pip install -e ./ -i https://mirror.sjtu.edu.cn/pypi/web/simple
```
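
The pip block above notes that exporting a `.onnx` file requires `modelscope` and `funasr`. The sketch below is a hedged illustration of that export step, assuming FunASR's `AutoModel` export interface; the exact argument names may differ across FunASR releases, so check the documentation of your installed version.

```python
# Hedged sketch: export a ModelScope model to ONNX with FunASR's AutoModel.
# Argument names (e.g. type=, quantize=) may differ between FunASR versions.
from funasr import AutoModel

# Load the PyTorch model from ModelScope (downloads on first use).
model = AutoModel(model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")

# Writes model.onnx (and model_quant.onnx when quantize=True) into the model directory.
model.export(type="onnx", quantize=False)
```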

## Inference with runtime

### Speech Recognition

#### Paraformer

```python
from funasr_onnx import Paraformer
from pathlib import Path

model_dir = "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
model = Paraformer(model_dir, batch_size=1, quantize=True)

wav_path = ['{}/.cache/modelscope/hub/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav'.format(Path.home())]

result = model(wav_path)
print(result)
```

- `model_dir`: the model name on ModelScope, or a local path downloaded from ModelScope. If a local path is given, it should contain `model.onnx`, `config.yaml`, and `am.mvn`
- `batch_size`: `1` (Default), the batch size during inference
- `device_id`: `-1` (Default), infer on CPU. To infer on GPU, set it to the GPU id (please make sure `onnxruntime-gpu` is installed)
- `quantize`: `False` (Default), load `model.onnx` from `model_dir`. If set to `True`, load `model_quant.onnx` from `model_dir`
- `intra_op_num_threads`: `4` (Default), the number of threads used for intra-op parallelism on CPU

Input: wav file(s); supported input types: `str`, `np.ndarray`, `List[str]`

Output: `List[str]`, the recognition result
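
Per the supported input types above, a raw waveform can also be passed directly as a NumPy array, for example loaded with `soundfile`. A brief sketch, assuming the wav defined above is 16 kHz mono to match the 16k model:

```python
import soundfile

# read the example wav defined above; the 16k model expects 16 kHz audio
speech, sample_rate = soundfile.read(wav_path[0])

# np.ndarray is one of the documented input types
result = model(speech)
print(result)
```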

#### Paraformer-online

### Voice Activity Detection

#### FSMN-VAD

```python
from funasr_onnx import Fsmn_vad
from pathlib import Path

model_dir = "damo/speech_fsmn_vad_zh-cn-16k-common-pytorch"
wav_path = '{}/.cache/modelscope/hub/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/example/vad_example.wav'.format(Path.home())

model = Fsmn_vad(model_dir)

result = model(wav_path)
print(result)
```

- `model_dir`: the model name on ModelScope, or a local path downloaded from ModelScope. If a local path is given, it should contain `model.onnx`, `config.yaml`, and `am.mvn`
- `batch_size`: `1` (Default), the batch size during inference
- `device_id`: `-1` (Default), infer on CPU. To infer on GPU, set it to the GPU id (please make sure `onnxruntime-gpu` is installed)
- `quantize`: `False` (Default), load `model.onnx` from `model_dir`. If set to `True`, load `model_quant.onnx` from `model_dir`
- `intra_op_num_threads`: `4` (Default), the number of threads used for intra-op parallelism on CPU

Input: wav file(s); supported input types: `str`, `np.ndarray`, `List[str]`

Output: the detected speech segments
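
The options listed above are constructor arguments. A brief sketch that loads the quantized model and, assuming `onnxruntime-gpu` is installed, runs it on GPU 0:

```python
# quantized model on GPU 0 with 2 intra-op threads (requires onnxruntime-gpu)
model = Fsmn_vad(model_dir, device_id=0, quantize=True, intra_op_num_threads=2)

result = model(wav_path)
print(result)
```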

#### FSMN-VAD-online

```python
from funasr_onnx import Fsmn_vad_online
import soundfile
from pathlib import Path

model_dir = "damo/speech_fsmn_vad_zh-cn-16k-common-pytorch"
wav_path = '{}/.cache/modelscope/hub/damo/speech_fsmn_vad_zh-cn-16k-common-pytorch/example/vad_example.wav'.format(Path.home())

model = Fsmn_vad_online(model_dir)


# online VAD: feed the waveform chunk by chunk (1600 samples = 100 ms at 16 kHz)
speech, sample_rate = soundfile.read(wav_path)
speech_length = speech.shape[0]

sample_offset = 0
step = 1600
param_dict = {'in_cache': []}
for sample_offset in range(0, speech_length, min(step, speech_length - sample_offset)):
    # mark the last chunk as final so the model can flush its internal state
    if sample_offset + step >= speech_length - 1:
        step = speech_length - sample_offset
        is_final = True
    else:
        is_final = False
    param_dict['is_final'] = is_final
    segments_result = model(audio_in=speech[sample_offset: sample_offset + step],
                            param_dict=param_dict)
    if segments_result:
        print(segments_result)
```

- `model_dir`: the model name on ModelScope, or a local path downloaded from ModelScope. If a local path is given, it should contain `model.onnx`, `config.yaml`, and `am.mvn`
- `batch_size`: `1` (Default), the batch size during inference
- `device_id`: `-1` (Default), infer on CPU. To infer on GPU, set it to the GPU id (please make sure `onnxruntime-gpu` is installed)
- `quantize`: `False` (Default), load `model.onnx` from `model_dir`. If set to `True`, load `model_quant.onnx` from `model_dir`
- `intra_op_num_threads`: `4` (Default), the number of threads used for intra-op parallelism on CPU

Input: wav file(s); supported input types: `str`, `np.ndarray`, `List[str]`

Output: the detected speech segments

### Punctuation Restoration

#### CT-Transformer

```python
from funasr_onnx import CT_Transformer

model_dir = "damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch"
model = CT_Transformer(model_dir)

text_in="跨境河流是养育沿岸人民的生命之源长期以来为帮助下游地区防灾减灾中方技术人员在上游地区极为恶劣的自然条件下克服巨大困难甚至冒着生命危险向印方提供汛期水文资料处理紧急事件中方重视印方在跨境河流问题上的关切愿意进一步完善双方联合工作机制凡是中方能做的我们都会去做而且会做得更好我请印度朋友们放心中国在上游的任何开发利用都会经过科学规划和论证兼顾上下游的利益"
result = model(text_in)
print(result[0])
```

- `model_dir`: the model name on ModelScope, or a local path downloaded from ModelScope. If a local path is given, it should contain `model.onnx`, `config.yaml`, and `am.mvn`
- `device_id`: `-1` (Default), infer on CPU. To infer on GPU, set it to the GPU id (please make sure `onnxruntime-gpu` is installed)
- `quantize`: `False` (Default), load `model.onnx` from `model_dir`. If set to `True`, load `model_quant.onnx` from `model_dir`
- `intra_op_num_threads`: `4` (Default), the number of threads used for intra-op parallelism on CPU

Input: `str`, the raw text of an ASR result

Output: the punctuation result; `result[0]` is the punctuated text
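
Since the CT-Transformer consumes raw ASR text, it can be chained directly after the Paraformer runtime shown earlier. A minimal sketch; the output indexing follows the examples above:

```python
from pathlib import Path
from funasr_onnx import Paraformer, CT_Transformer

wav_path = ['{}/.cache/modelscope/hub/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav'.format(Path.home())]

asr_model = Paraformer("damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch")
punc_model = CT_Transformer("damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch")

asr_result = asr_model(wav_path)         # documented output: List[str]
punc_result = punc_model(asr_result[0])  # punctuate the first recognition result
print(punc_result[0])
```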

#### CT-Transformer-online

```python
from funasr_onnx import CT_Transformer_VadRealtime

model_dir = "damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727"
model = CT_Transformer_VadRealtime(model_dir)

text_in = "跨境河流是养育沿岸|人民的生命之源长期以来为帮助下游地区防灾减灾中方技术人员|在上游地区极为恶劣的自然条件下克服巨大困难甚至冒着生命危险|向印方提供汛期水文资料处理紧急事件中方重视印方在跨境河流问题上的关切|愿意进一步完善双方联合工作机制|凡是|中方能做的我们|都会去做而且会做得更好我请印度朋友们放心中国在上游的|任何开发利用都会经过科学|规划和论证兼顾上下游的利益"

vads = text_in.split("|")
rec_result_all=""
param_dict = {"cache": []}
for vad in vads:
    result = model(vad, param_dict=param_dict)
    rec_result_all += result[0]

print(rec_result_all)
```

- `model_dir`: the model name on ModelScope, or a local path downloaded from ModelScope. If a local path is given, it should contain `model.onnx`, `config.yaml`, and `am.mvn`
- `device_id`: `-1` (Default), infer on CPU. To infer on GPU, set it to the GPU id (please make sure `onnxruntime-gpu` is installed)
- `quantize`: `False` (Default), load `model.onnx` from `model_dir`. If set to `True`, load `model_quant.onnx` from `model_dir`
- `intra_op_num_threads`: `4` (Default), the number of threads used for intra-op parallelism on CPU

Input: `str`, the raw text of an ASR result

Output: the punctuation result; `result[0]` is the punctuated text
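
For independent streams, the example above implies creating a fresh cache dict per session so state does not leak between them. A small sketch; `punctuate_stream` is a hypothetical helper, not part of `funasr_onnx`:

```python
def punctuate_stream(model, segments):
    # one cache per stream, carried across calls as in the example above
    param_dict = {"cache": []}
    out = ""
    for seg in segments:
        out += model(seg, param_dict=param_dict)[0]
    return out

print(punctuate_stream(model, text_in.split("|")))
```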

## Performance benchmark

Please refer to the [benchmark](https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/docs/benchmark_onnx.md).

## Acknowledgements

1. This project is maintained by the [FunASR community](https://github.com/alibaba-damo-academy/FunASR).
2. We partially referred to [SWHL](https://github.com/RapidAI/RapidASR) for the onnxruntime implementation (Paraformer model only).

            
