# OpenVoiceOS Dinkum Listener
Documentation can be found in [the technical manual](https://openvoiceos.github.io/ovos-technical-manual/speech_service/)
## Install
`pip install ovos-dinkum-listener[extras]` to install this package and the default
plugins. Note that by default, either `tensorflow` or `tflite_runtime` will need
to be installed separately for wakeword detection.
> If you are unable to install `tflite_runtime` on your platform, you can find wheels
> at https://whl.smartgic.io/. For example, for Python 3.11 on x86_64:
> `pip install https://whl.smartgic.io/tflite_runtime-2.13.0-cp311-cp311-linux_x86_64.whl`
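
Picking the right wheel means matching the Python version and CPU architecture of your machine. The sketch below derives a file name that follows the naming pattern of the example URL above. `tflite_wheel_name` is a hypothetical helper written for this illustration, and it does not check that a wheel with that name is actually published.

```python
import platform
import sys


def tflite_wheel_name(version="2.13.0", py=None, machine=None):
    """Build a wheel file name following the pattern of the example URL.

    Hypothetical helper: it only formats a name, it does not verify that
    the wheel exists on the server.
    """
    py = py or sys.version_info[:2]          # e.g. (3, 11)
    machine = machine or platform.machine()  # e.g. "x86_64"
    tag = f"cp{py[0]}{py[1]}"                # e.g. "cp311"
    return f"tflite_runtime-{version}-{tag}-{tag}-linux_{machine}.whl"


if __name__ == "__main__":
    # Print a candidate URL for the running interpreter.
    print("https://whl.smartgic.io/" + tflite_wheel_name())
```

If the printed URL does not exist, browse the wheel index manually and pick the closest match for your platform.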
Without `extras`, wakeword and STT audio upload will be disabled unless you install
[`ovos-backend-client`](https://github.com/OpenVoiceOS/ovos-backend-client) separately. You will also need to manually install,
and possibly configure, the STT, WW, and VAD modules as described below.
Using [ovos-vad-plugin-silero](https://github.com/OpenVoiceOS/ovos-vad-plugin-silero)
is strongly recommended.
## Configuration
You can set the wake word, VAD, STT, and microphone plugins in the configuration.
For example, to run under macOS you should use https://github.com/OpenVoiceOS/ovos-microphone-plugin-sounddevice
A non-exhaustive list of config options:
```
{
  "stt": {
    "module": "ovos-stt-plugin-server",
    "fallback_module": "",
    "ovos-stt-plugin-server": {"url": "https://stt.openvoiceos.com/stt"}
  },
  "listener": {
    // NOTE, multiple hotwords are supported, these fields define the main wake_word,
    // this is equivalent to setting "active": true in the "hotwords" section
    // see "hotwords" section at https://github.com/OpenVoiceOS/ovos-config/blob/dev/ovos_config/mycroft.conf
    "wake_word": "hey_mycroft",
    "stand_up_word": "wake_up",
    "microphone": {
      "module": "ovos-microphone-plugin-alsa"
    },
    "VAD": {
      // recommended plugin: "ovos-vad-plugin-silero"
      "module": "ovos-vad-plugin-silero",
      "ovos-vad-plugin-silero": {"threshold": 0.2},
      "ovos-vad-plugin-webrtcvad": {"vad_mode": 3}
    },
    // Seconds of speech before voice command has begun
    "speech_begin": 0.1,
    // Seconds of silence before a voice command has finished
    "silence_end": 0.5,
    // Settings used by microphone to set recording timeout with and without speech detected
    "recording_timeout": 10.0,
    // Settings used by microphone to set recording timeout without speech detected.
    "recording_timeout_with_silence": 3.0,
    // max time allowed without user speaking before exiting RECORDING mode
    "recording_mode_max_silence_seconds": 30.0,
    // Setting to remove all silence/noise from start and end of recorded speech (only non-streaming)
    "remove_silence": true,
    // continuous listen is an experimental setting, it removes the need for
    // wake words and uses VAD only, a streaming STT is strongly recommended
    // NOTE: depending on hardware this may cause mycroft to hear its own TTS responses as questions
    "continuous_listen": false,
    // hybrid listen is an experimental setting,
    // it will not require a wake word for X seconds after a user interaction
    // this means you don't need to say "hey mycroft" for follow up questions
    "hybrid_listen": false,
    // number of seconds to wait for an interaction before requiring wake word again
    "listen_timeout": 45
  }
}
```
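
To make the `speech_begin` and `silence_end` settings more concrete, here is a minimal sketch of how such thresholds could turn a stream of per-frame VAD decisions into the boundaries of a voice command. It is an illustration under stated assumptions, not the listener's actual implementation: `segment_command` and its arguments are hypothetical names, and the real voice loop also applies the recording timeouts listed above.

```python
import math


def segment_command(vad_flags, frame_seconds, speech_begin=0.1, silence_end=0.5):
    """Return (start, end) frame indexes of the first voice command.

    vad_flags: iterable of booleans, one per audio frame (True = speech).
    Returns None if no command starts, and (start, None) if the command
    starts but not enough trailing silence is seen to close it.
    The end index is exclusive.
    """
    # Convert the thresholds from seconds to whole frames.
    begin_frames = max(1, math.ceil(speech_begin / frame_seconds - 1e-9))
    end_frames = max(1, math.ceil(silence_end / frame_seconds - 1e-9))

    speech_run = 0   # consecutive speech frames before the command starts
    silence_run = 0  # consecutive silent frames after the command started
    start = None
    for i, is_speech in enumerate(vad_flags):
        if start is None:
            speech_run = speech_run + 1 if is_speech else 0
            if speech_run >= begin_frames:
                start = i - speech_run + 1  # first frame of the speech run
        else:
            silence_run = 0 if is_speech else silence_run + 1
            if silence_run >= end_frames:
                return start, i - silence_run + 1
    return None if start is None else (start, None)
```

With 0.1 second frames and the default values, a command starts on the first speech frame and ends once five consecutive silent frames have been seen. A larger `silence_end` tolerates longer pauses mid-sentence at the cost of a slower response.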
## Tips and tricks
### Saving Transcriptions
You can save recordings to file. This should be your first step when diagnosing problems: is the audio intelligible? Is it being cropped? Is it too noisy, or too quiet?
> Set `"save_utterances": true` in your [listener config](https://github.com/OpenVoiceOS/ovos-config/blob/V0.0.13a19/ovos_config/mycroft.conf#L436); recordings will be saved to `~/.local/share/mycroft/listener/utterances`
If the recorded audio sounds good to you, you may need a different STT plugin: the one you are using may not cope well with your microphone, or may simply be weak for your language.
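
Listening to the recordings is the best check, but a few numbers can help spot cropped or very quiet audio quickly. The sketch below reports duration, peak, and RMS level for each recording using only the Python standard library. It assumes the saved files are 16-bit PCM WAV files with a `.wav` extension, which the text above does not confirm; `describe_wav` is a hypothetical helper.

```python
import math
import struct
import wave
from pathlib import Path


def describe_wav(path):
    """Return duration in seconds plus peak and RMS level (0..1) of a WAV file.

    Assumption: the file holds 16-bit PCM samples.
    """
    with wave.open(str(path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        n_frames = wav.getnframes()
        rate = wav.getframerate()
        raw = wav.readframes(n_frames)
    # Unpack little-endian signed 16-bit samples (channels stay interleaved).
    samples = struct.unpack(f"<{len(raw) // 2}h", raw)
    if not samples:
        return {"seconds": 0.0, "peak": 0.0, "rms": 0.0}
    peak = max(abs(s) for s in samples) / 32768
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768
    return {"seconds": n_frames / rate, "peak": peak, "rms": rms}


if __name__ == "__main__":
    folder = Path.home() / ".local/share/mycroft/listener/utterances"
    if folder.is_dir():
        for wav_path in sorted(folder.glob("*.wav")):
            print(wav_path.name, describe_wav(wav_path))
```

A very short duration suggests the command is being cut off, while a peak close to zero points to a microphone gain problem rather than an STT problem.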
### Wrong Transcriptions
If specific words or utterances are consistently transcribed wrong, you can work around this to some extent with the [ovos-utterance-corrections-plugin](https://github.com/OpenVoiceOS/ovos-utterance-corrections-plugin)
> You can define word-level replacements in `~/.local/share/mycroft/word_corrections.json`
For example, Whisper STT often gets artist names wrong; this lets you correct them:
```json
{
"Jimmy Hendricks": "Jimi Hendrix",
"Eric Klapptern": "Eric Clapton",
"Eric Klappton": "Eric Clapton"
}
```
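
To preview what a corrections file would do to a transcription, the replacement step can be approximated in a few lines. This is a sketch, not the plugin's code: `apply_corrections` is a hypothetical function, and matching whole words case-insensitively is an assumption made here, since the plugin's exact matching rules are not described above.

```python
import json
import re


def apply_corrections(utterance, corrections):
    """Replace every wrong phrase in the utterance with its correction.

    Assumption: phrases match on word boundaries, ignoring case.
    """
    for wrong, right in corrections.items():
        pattern = r"\b" + re.escape(wrong) + r"\b"
        # A callable replacement keeps backslashes in `right` literal.
        utterance = re.sub(pattern, lambda _m, r=right: r, utterance,
                           flags=re.IGNORECASE)
    return utterance


corrections = json.loads("""
{
    "Jimmy Hendricks": "Jimi Hendrix",
    "Eric Klapptern": "Eric Clapton",
    "Eric Klappton": "Eric Clapton"
}
""")

if __name__ == "__main__":
    print(apply_corrections("play Jimmy Hendricks", corrections))
```

Running candidate transcriptions through a helper like this is a quick way to check that a new entry catches the misspelling you saw in the logs.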
### Silence Removal
By default, OVOS applies VAD (Voice Activity Detection) to crop silence from the audio sent to STT. This improves both performance and accuracy (it reduces hallucinations in plugins like FasterWhisper).
Depending on your microphone and VAD plugin, this might remove too much audio.
> Set `"remove_silence": false` in your [listener config](https://github.com/OpenVoiceOS/ovos-config/blob/V0.0.13a19/ovos_config/mycroft.conf#L452); this will send the full audio recording to STT
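
The idea behind silence removal can be sketched as trimming non-speech frames from both ends of a recording. The code below is an illustration only: `trim_silence` is a hypothetical function, the `is_speech` callable stands in for a VAD plugin, and the `pad` argument is an invented knob that shows how keeping a little context avoids clipping the first or last syllable.

```python
def trim_silence(frames, is_speech, pad=0):
    """Drop non-speech frames from the start and end of a recording.

    frames: list of audio frames (any objects).
    is_speech: callable returning True for frames that contain speech
               (a stand-in for a VAD plugin, not a real plugin API).
    pad: number of extra frames to keep on each side of the speech.
    """
    flags = [bool(is_speech(frame)) for frame in frames]
    if True not in flags:
        return []  # nothing but silence
    first = flags.index(True)
    last = len(flags) - 1 - flags[::-1].index(True)
    first = max(0, first - pad)
    last = min(len(frames) - 1, last + pad)
    # Silence between the first and last speech frame is kept.
    return frames[first:last + 1]
```

If a VAD marks the soft onset of a word as silence, a trim like this cuts it off, which is the "removing too much audio" symptom described above.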
### Listen Sound
Does your listen sound contain speech? Some users replace the "ding" sound with words such as "yes?"
In this case the listen sound will be sent to STT and might negatively affect the transcription.
> Set `"instant_listen": false` in your [listener config](https://github.com/OpenVoiceOS/ovos-config/blob/V0.0.13a19/ovos_config/mycroft.conf#L519); this will drop the listen sound audio from the STT audio buffer. In this case you will need to wait for the listen sound to finish before speaking your command
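
Each tip above changes one key in the `listener` section of the configuration. As a sketch of how such overrides could be merged into a config file without touching other settings, consider the helper below. `set_listener_options` is a hypothetical function, the path `~/.config/mycroft/mycroft.conf` is assumed to be the user config location, and the sketch only handles plain JSON files without `//` comments.

```python
import json
from pathlib import Path


def set_listener_options(conf_path, **options):
    """Merge key/value pairs into the "listener" section of a JSON config.

    Assumption: the file, if present, is plain JSON without comments.
    """
    conf_path = Path(conf_path)
    text = conf_path.read_text().strip() if conf_path.exists() else ""
    conf = json.loads(text) if text else {}
    # Only the given keys change; everything else is preserved.
    conf.setdefault("listener", {}).update(options)
    conf_path.parent.mkdir(parents=True, exist_ok=True)
    conf_path.write_text(json.dumps(conf, indent=2))
    return conf


if __name__ == "__main__":
    # Assumed user config location; adjust it for your installation.
    user_conf = Path.home() / ".config/mycroft/mycroft.conf"
    print(set_listener_options(user_conf, save_utterances=True))
```

Editing the file by hand works just as well; the point is that only the keys you override need to appear in your own config.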
## Credits
Voice Loop state machine implementation by [@Synesthesiam](https://github.com/synesthesiam) for [mycroft-dinkum](https://github.com/MycroftAI/mycroft-dinkum)