nvidia-riva-client

Name: nvidia-riva-client
Version: 2.15.0
Home page: https://github.com/nvidia-riva/python-clients
Summary: Python implementation of the Riva Client API
Upload time: 2024-03-23 11:26:11
Author / maintainer: Anton Peganov
Requires Python: >=3.7
License: MIT
Keywords: deep learning, machine learning, gpu, nlp, asr, tts, nmt, nvidia, speech, language, riva, client
Requirements: setuptools, grpcio-tools
[![License](https://img.shields.io/badge/license-MIT-green)](https://opensource.org/licenses/MIT)
# NVIDIA Riva Clients

NVIDIA Riva is a GPU-accelerated SDK for building Speech AI applications that are customized for your use
case and deliver real-time performance. This repository provides example command-line clients.

## Main API

- `riva.client.ASRService` is a class for speech recognition,
- `riva.client.TTSService` is a class for speech synthesis,
- `riva.client.NLPService` is a class for natural language processing.
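
For example, here is a minimal offline-recognition sketch. It assumes a Riva server listening on `localhost:50051` and a 16-bit PCM WAV file `audio.wav`; `add_audio_file_specs_to_config` is a helper shipped with this package.

```python
import riva.client

# Connect to a running Riva server (adjust the URI for your deployment).
auth = riva.client.Auth(uri="localhost:50051")
asr_service = riva.client.ASRService(auth)

config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    language_code="en-US",
    max_alternatives=1,
    enable_automatic_punctuation=True,
)
# Fill in sample rate and channel count from the WAV header.
riva.client.add_audio_file_specs_to_config(config, "audio.wav")

with open("audio.wav", "rb") as f:
    response = asr_service.offline_recognize(f.read(), config)
print(response.results[0].alternatives[0].transcript)
```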

## CLI interface

- **Automatic Speech Recognition (ASR)**
    - `scripts/asr/riva_streaming_asr_client.py` demonstrates streaming transcription in several threads and can print timestamps,
    - `scripts/asr/transcribe_file.py` performs streaming transcription,
    - `scripts/asr/transcribe_file_offline.py` performs offline transcription,
    - `scripts/asr/transcribe_mic.py` performs streaming transcription of audio acquired through a microphone.
- **Speech Synthesis (TTS)**
    - `scripts/tts/talk.py` synthesizes audio for text in streaming or offline mode.
- **Natural Language Processing (NLP)**
    - `scripts/nlp/intentslot_client.py` recognizes intents and slots in input sentences,
    - `scripts/nlp/ner_client.py` detects named entities in input sentences,
    - `scripts/nlp/punctuation_client.py` restores punctuation and capitalization in input sentences,
    - `scripts/nlp/qa_client.py` queries a document with a natural language query and prints the answer found in the document,
    - `scripts/nlp/text_classify_client.py` classifies input sentences,
    - `scripts/nlp/eval_intent_slot.py` prints intents and slots classification reports for test data.
  
## Installation

1. Create a ``conda`` environment and activate it.
2. To install from source:
    - Clone the ``riva-python-clients`` repo and change to the repo root.
    - Run the following commands:

```bash
git clone https://github.com/nvidia-riva/python-clients.git
cd python-clients
git submodule init
git submodule update --remote --recursive
pip install -r requirements.txt
python3 setup.py bdist_wheel
pip install --force-reinstall dist/*.whl
```
3. Alternatively, install from PyPI with `pip`:
```bash
pip install nvidia-riva-client
```
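
As a quick sanity check (not part of the official instructions), verify that the package is importable:
```bash
python3 -c "import riva.client"
```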

If you would like to use input and output audio devices
(the scripts `scripts/asr/transcribe_file_rt.py`, `scripts/asr/transcribe_mic.py`, `scripts/tts/talk.py`, or the module
`riva.client.audio_io`), you will need to install `PyAudio`.
```bash
conda install -c anaconda pyaudio
```

For NLP evaluation you will need `transformers` and `sklearn` libraries.
```bash
pip install -U scikit-learn
pip install -U transformers
```

## Before using microphone and audio output devices on Unix

You may need to run the following commands (as root, or with `sudo`) and then restart your session:
```bash
adduser $USER audio
adduser $USER pulse-access
```

## Usage

### Server

Before running the Riva clients, please set up a server. The simplest
way to do this is to follow the
[quick start guide](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/quick-start-guide.html#local-deployment-using-quick-start-scripts).
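
For reference, the quick-start flow looks roughly like the sketch below. The resource version and script names are taken from the guide for Riva 2.15.0 and may change between releases, so treat this as an outline rather than exact commands; it assumes the NGC CLI is installed and configured.

```bash
# Download the Riva quick-start scripts from NGC.
ngc registry resource download-version "nvidia/riva/riva_quickstart:2.15.0"
cd riva_quickstart_v2.15.0

# Download models and start the server (Docker is required).
bash riva_init.sh
bash riva_start.sh
```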

### CLI

All CLI scripts are in the `scripts` directory. Each script includes a description of
its purpose and parameters.

#### ASR

Detailed documentation is available [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/apis/development-cpp.html).

For transcription in streaming mode, you can use `scripts/asr/transcribe_file.py`.
```bash
python scripts/asr/transcribe_file.py \
    --input-file data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav
```

You can watch the transcript grow in real time if you set `--simulate-realtime` and `--show-intermediate`.
```bash
python scripts/asr/transcribe_file.py \
    --input-file data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav \
    --simulate-realtime \
    --show-intermediate
```

You can listen to the audio while it is being transcribed (this requires PyAudio and access to audio devices).
```bash
python scripts/asr/transcribe_file.py \
    --input-file data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav \
    --play-audio \
    --show-intermediate
```
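
The same streaming transcription is available from the Python API. Here is a minimal sketch; it again assumes a server on `localhost:50051` and a WAV file `audio.wav`, and uses the `AudioChunkFileIterator` and `print_streaming` helpers shipped with this package.

```python
import riva.client

auth = riva.client.Auth(uri="localhost:50051")
asr_service = riva.client.ASRService(auth)

config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    language_code="en-US",
    max_alternatives=1,
    enable_automatic_punctuation=True,
)
# Fill in sample rate and channel count from the WAV header.
riva.client.add_audio_file_specs_to_config(config, "audio.wav")
streaming_config = riva.client.StreamingRecognitionConfig(
    config=config, interim_results=True
)

# Stream the file in chunks and print transcripts as they arrive.
audio_chunks = riva.client.AudioChunkFileIterator("audio.wav", 4800)
riva.client.print_streaming(
    asr_service.streaming_response_generator(audio_chunks, streaming_config),
    show_intermediate=True,
)
```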

Offline transcription is performed as follows.
```bash
python scripts/asr/transcribe_file_offline.py \
    --input-file data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav
```

You can improve the transcription of this audio with word boosting.
```bash
python scripts/asr/transcribe_file_offline.py \
  --input-file data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav \
  --boosted-lm-words AntiBERTa \
  --boosted-lm-words ABlooper \
  --boosted-lm-score 20.0
```
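
Word boosting is also exposed through the Python API via a helper in this package; here is a short sketch under the same assumptions as the offline example above:

```python
import riva.client

auth = riva.client.Auth(uri="localhost:50051")
asr_service = riva.client.ASRService(auth)

wav = "data/examples/en-US_AntiBERTa_for_word_boosting_testing.wav"
config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    language_code="en-US",
)
riva.client.add_audio_file_specs_to_config(config, wav)
# Bias the recognizer toward rare domain words.
riva.client.add_word_boosting_to_config(config, ["AntiBERTa", "ABlooper"], 20.0)

with open(wav, "rb") as f:
    response = asr_service.offline_recognize(f.read(), config)
print(response.results[0].alternatives[0].transcript)
```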

#### NLP

You can provide inputs to `scripts/nlp/intentslot_client.py` and `scripts/nlp/punctuation_client.py`
either through command-line arguments or interactively.
```bash
python scripts/nlp/intentslot_client.py --query "What is the weather tomorrow?"
```
or
```bash
python scripts/nlp/intentslot_client.py --interactive
```
For the punctuation client, the commands look similar.
```bash
python scripts/nlp/punctuation_client.py --query "can you prove that you are self aware"
```
or
```bash
python scripts/nlp/punctuation_client.py --interactive
```
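
These clients wrap `riva.client.NLPService`. Below is a hedged sketch of punctuation restoration through the Python API; the model name `riva-punctuation-en-US` is the quick-start default deployment name and is an assumption here.

```python
import riva.client

auth = riva.client.Auth(uri="localhost:50051")
nlp_service = riva.client.NLPService(auth)

response = nlp_service.transform_text(
    ["can you prove that you are self aware"],
    model_name="riva-punctuation-en-US",  # assumption: default deployment name
)
print(response.text[0])  # the punctuated, capitalized string
```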

The **NER** client can output one of the following: label name, span start, or span end.
```bash
python scripts/nlp/ner_client.py \
  --query "Where is San Francisco?" "Jensen Huang is the CEO of NVIDIA Corporation." \
  --test label
```
or
```bash
python scripts/nlp/ner_client.py \
  --query "Where is San Francisco?" "Jensen Huang is the CEO of NVIDIA Corporation." \
  --test span_start
```
or
```bash
python scripts/nlp/ner_client.py \
  --query "Where is San Francisco?" "Jensen Huang is the CEO of NVIDIA Corporation." \
  --test span_end
```

Provide a query and a context to the **QA** client.
```bash
python scripts/nlp/qa_client.py \
  --query "How many gigatons of carbon dioxide was released in 2005?" \
  --context "In 2010 the Amazon rainforest experienced another severe drought, in some ways "\
"more extreme than the 2005 drought. The affected region was approximate 1,160,000 square "\
"miles (3,000,000 km2) of rainforest, compared to 734,000 square miles (1,900,000 km2) in "\
"2005. The 2010 drought had three epicenters where vegetation died off, whereas in 2005 the "\
"drought was focused on the southwestern part. The findings were published in the journal "\
"Science. In a typical year the Amazon absorbs 1.5 gigatons of carbon dioxide; during 2005 "\
"instead 5 gigatons were released and in 2010 8 gigatons were released."
```
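
A corresponding Python sketch, using the `natural_query` method that `qa_client.py` is built on (the call shape is hedged; check the script for exact usage):

```python
import riva.client

auth = riva.client.Auth(uri="localhost:50051")
nlp_service = riva.client.NLPService(auth)

context = (
    "In a typical year the Amazon absorbs 1.5 gigatons of carbon dioxide; "
    "during 2005 instead 5 gigatons were released and in 2010 8 gigatons "
    "were released."
)
response = nlp_service.natural_query(
    "How many gigatons of carbon dioxide was released in 2005?", context
)
print(response)  # the answer and its score are in the response's results
```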

**Text classification** requires only a query.
```bash
python scripts/nlp/text_classify_client.py --query "How much sun does california get?"
```

#### TTS

Run the ``scripts/tts/talk.py`` script, and you will be prompted to enter text for speech
synthesis. Set the `--play-audio` option, and the synthesized speech will be played.
```bash
python scripts/tts/talk.py --play-audio
```

You can write the output to a file.
```bash
python scripts/tts/talk.py --output 'my_synth_speech.wav'
```

You can use streaming mode (audio fragments are returned to the client as soon as they are ready).
```bash
python scripts/tts/talk.py --stream --play-audio
```
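
The Python API mirrors this through the synthesis service class (named `riva.client.TTSService` in the Main API section above; in the released package it is typically exported as `riva.client.SpeechSynthesisService`). Here is a minimal offline-synthesis sketch; the voice name and sample rate are assumptions, so use a voice actually deployed on your server.

```python
import wave

import riva.client

auth = riva.client.Auth(uri="localhost:50051")
tts_service = riva.client.SpeechSynthesisService(auth)

sample_rate_hz = 44100
response = tts_service.synthesize(
    "Hello, this is a test of Riva speech synthesis.",
    voice_name="English-US.Female-1",  # assumption: a voice deployed on the server
    language_code="en-US",
    sample_rate_hz=sample_rate_hz,
)

# response.audio holds raw 16-bit PCM samples; wrap them in a WAV container.
with wave.open("my_synth_speech.wav", "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)  # 16-bit
    out.setframerate(sample_rate_hz)
    out.writeframes(response.audio)
```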

### API

See the tutorial notebooks in the `tutorials` directory.


## Documentation

Additional documentation on the Riva Speech Skills SDK can be found [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/).


## License

This client code is MIT-licensed. See the LICENSE file for full details.

            
