funasr-torch

Name	funasr-torch
Version	0.1.3
home_page	https://github.com/alibaba-damo-academy/FunASR.git
Summary	FunASR: A Fundamental End-to-End Speech Recognition Toolkit
upload_time	2024-07-26 16:45:46
maintainer	None
docs_url	None
author	Speech Lab of DAMO Academy, Alibaba Group
requires_python	None
license	The MIT License
keywords	funasr paraformer funasr_torch
requirements	No requirements were recorded.

# Libtorch-python

## Export the model

### Install [modelscope and funasr](https://github.com/alibaba-damo-academy/FunASR#installation)

```shell
# pip3 install torch torchaudio
pip install -U modelscope funasr
# For users in China, you can install from a mirror instead:
# pip install -U modelscope funasr -i https://mirror.sjtu.edu.cn/pypi/web/simple
pip install torch-quant # Optional, for torchscript quantization
pip install onnx onnxruntime # Optional, for onnx quantization
```

### Export the [TorchScript model](https://github.com/alibaba-damo-academy/FunASR/tree/main/funasr/export)

```shell
python -m funasr.export.export_model --model-name damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch --export-dir ./export --type torch --quantize True
```
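The demo further down points `model_dir` at `<export-dir>/<model-name>` and expects `model.torchscript`, `config.yaml`, and `am.mvn` there. A minimal sketch for checking an export before loading it, assuming that directory layout holds; `expected_model_dir` and `missing_files` are hypothetical helpers, not part of `funasr` or `funasr_torch`:

```python
from pathlib import Path
from typing import List

# Files the demo section lists for the model directory. A quantized export
# may use different file names (assumption); adjust this tuple if so.
REQUIRED_FILES = ("model.torchscript", "config.yaml", "am.mvn")


def expected_model_dir(export_dir: str, model_name: str) -> Path:
    """Assumed layout: <export-dir>/<model-name>, inferred from the demo path."""
    return Path(export_dir) / model_name


def missing_files(model_dir: Path) -> List[str]:
    """Return the required files that are not present in model_dir."""
    return [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]


if __name__ == "__main__":
    model_dir = expected_model_dir(
        "./export",
        "damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
    )
    print(model_dir, "missing:", missing_files(model_dir))
```

An empty list from `missing_files` suggests the directory is ready to pass to `Paraformer`; it does not verify that the files themselves are valid.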

## Install `funasr_torch`

Install from pip:

```shell
pip install -U funasr_torch
# For users in China, you can install from a mirror instead:
# pip install -U funasr_torch -i https://mirror.sjtu.edu.cn/pypi/web/simple
```

Or install from source:

```shell
git clone https://github.com/alibaba/FunASR.git && cd FunASR
cd funasr/runtime/python/libtorch
pip install -e ./
# For users in China, you can install from a mirror instead:
# pip install -e ./ -i https://mirror.sjtu.edu.cn/pypi/web/simple
```

## Run the demo

- `model_dir`: the model directory, which contains `model.torchscript`, `config.yaml`, and `am.mvn`.
- Input: WAV audio; supported input types: `str`, `np.ndarray`, `List[str]`.
- Output: `List[str]`, the recognition results.
- Example:

     ```python
     from funasr_torch import Paraformer

     model_dir = "/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch"
     model = Paraformer(model_dir, batch_size=1)

     wav_path = ['/nfs/zhifu.gzf/export/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/example/asr_example.wav']

     result = model(wav_path)
     print(result)
     ```
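Since the model accepts a `List[str]` of paths, a folder of recordings can be passed in one call. A minimal sketch, assuming results come back one per input and in input order (the README does not state this); `transcribe_dir` is a hypothetical helper, and `model` stands for any callable with the documented `List[str]` to `List[str]` behavior, such as the `Paraformer` instance above:

```python
from pathlib import Path
from typing import Callable, Dict, List


def transcribe_dir(
    model: Callable[[List[str]], List[str]], wav_dir: str
) -> Dict[str, str]:
    """Run the model on every .wav file in wav_dir and map path -> text.

    Assumes the model returns one result per input path, in input order.
    """
    wav_paths = sorted(str(p) for p in Path(wav_dir).glob("*.wav"))
    if not wav_paths:
        return {}
    results = model(wav_paths)
    if len(results) != len(wav_paths):
        raise ValueError("expected one result per input file")
    return dict(zip(wav_paths, results))
```

With a loaded model this would be called as `transcribe_dir(model, "/path/to/wavs")`, where the directory path is a placeholder.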

## Performance benchmark

Please refer to the [benchmark](https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/docs/benchmark_libtorch.md).

## Speed

Environment: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz

Test audio: [WAV, 5.53 s, averaged over 100 runs](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav)

| Backend  | RTF (FP32) |
|:--------:|:----------:|
| PyTorch  |   0.110    |
| Libtorch |   0.048    |
|   ONNX   |   0.038    |
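
RTF (real-time factor) is conventionally processing time divided by audio duration, so lower is faster and values below 1 mean faster than real time. The sketch below shows how such a figure could be measured and what the table implies for the 5.53 s test clip; `real_time_factor` and `seconds_per_utterance` are hypothetical helpers, and this is not necessarily the script used to produce the numbers above.

```python
import time
from typing import Callable


def real_time_factor(
    run: Callable[[], object], audio_seconds: float, repeats: int = 100
) -> float:
    """Average wall-clock time of run() over `repeats` calls, divided by audio length."""
    start = time.perf_counter()
    for _ in range(repeats):
        run()
    avg_seconds = (time.perf_counter() - start) / repeats
    return avg_seconds / audio_seconds


def seconds_per_utterance(rtf: float, audio_seconds: float) -> float:
    """Invert the definition: processing time = RTF * audio duration."""
    return rtf * audio_seconds


if __name__ == "__main__":
    audio_seconds = 5.53  # length of the test clip
    table = {"PyTorch": 0.110, "Libtorch": 0.048, "ONNX": 0.038}
    for backend, rtf in table.items():
        cost = seconds_per_utterance(rtf, audio_seconds)
        speedup = table["PyTorch"] / rtf
        print(f"{backend}: {cost:.3f} s per clip, {speedup:.2f}x vs PyTorch")
```

For example, an RTF of 0.048 on the 5.53 s clip works out to roughly 0.27 s of compute per utterance. To time a real model, `run` could be `lambda: model(wav_path)` with the objects from the demo.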

## Acknowledgements

This project is maintained by the [FunASR community](https://github.com/alibaba-damo-academy/FunASR).

Raw data

{
    "_id": null,
    "home_page": "https://github.com/alibaba-damo-academy/FunASR.git",
    "name": "funasr-torch",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "funasr, paraformer, funasr_torch",
    "author": "Speech Lab of DAMO Academy, Alibaba Group",
    "author_email": "funasr@list.alibaba-inc.com",
    "download_url": "https://files.pythonhosted.org/packages/e1/dc/a281424e32df8a0ce4c5e6b20f553611b356406d85d0b102f4748d21a9dc/funasr_torch-0.1.3.tar.gz",
    "platform": "Any",
    "bugtrack_url": null,
    "license": "The MIT License",
    "summary": "FunASR: A Fundamental End-to-End Speech Recognition Toolkit",
    "version": "0.1.3",
    "project_urls": {
        "Homepage": "https://github.com/alibaba-damo-academy/FunASR.git"
    },
    "split_keywords": [
        "funasr",
        " paraformer",
        " funasr_torch"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5feddd20ba905c82d9aaf470ed5700237db8299615600d443800140b1e0eb53d",
                "md5": "32f56162c01bf41ed0f57ad1d3d2b163",
                "sha256": "63e2d93321f195e13c1fece239c766dd13d3bb1f31480ce99569b9b0f95edb43"
            },
            "downloads": -1,
            "filename": "funasr_torch-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "32f56162c01bf41ed0f57ad1d3d2b163",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 2300,
            "upload_time": "2024-07-26T16:45:44",
            "upload_time_iso_8601": "2024-07-26T16:45:44.504547Z",
            "url": "https://files.pythonhosted.org/packages/5f/ed/dd20ba905c82d9aaf470ed5700237db8299615600d443800140b1e0eb53d/funasr_torch-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e1dca281424e32df8a0ce4c5e6b20f553611b356406d85d0b102f4748d21a9dc",
                "md5": "1e2997dd729938c794d1b9de79e217c3",
                "sha256": "d87387a2d1c9faa85c090918878c5205340ad72bd1cd7e6f6b20669b553bd83d"
            },
            "downloads": -1,
            "filename": "funasr_torch-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "1e2997dd729938c794d1b9de79e217c3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 2570,
            "upload_time": "2024-07-26T16:45:46",
            "upload_time_iso_8601": "2024-07-26T16:45:46.648146Z",
            "url": "https://files.pythonhosted.org/packages/e1/dc/a281424e32df8a0ce4c5e6b20f553611b356406d85d0b102f4748d21a9dc/funasr_torch-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-26 16:45:46",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "alibaba-damo-academy",
    "github_project": "FunASR",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "funasr-torch"
}
