.. image:: https://travis-ci.org/wiseman/py-webrtcvad.svg?branch=master
   :target: https://travis-ci.org/wiseman/py-webrtcvad

py-webrtcvad
============
This is a Python interface to the WebRTC Voice Activity Detector
(VAD). It is compatible with Python 2 and Python 3.

A `VAD <https://en.wikipedia.org/wiki/Voice_activity_detection>`_
classifies a piece of audio data as being voiced or unvoiced. It can
be useful for telephony and speech recognition.

The VAD that Google developed for the `WebRTC <https://webrtc.org/>`_
project is reportedly one of the best available, being fast, modern
and free.

How to use it
-------------
0. Install the webrtcvad module::

     pip install webrtcvad

1. Create a ``Vad`` object::

     import webrtcvad
     vad = webrtcvad.Vad()

2. Optionally, set its aggressiveness mode, which is an integer
   between 0 and 3. 0 is the least aggressive about filtering out
   non-speech, 3 is the most aggressive. (You can also set the mode
   when you create the VAD, e.g. ``vad = webrtcvad.Vad(3)``)::

     vad.set_mode(1)

3. Give it a short segment ("frame") of audio. The WebRTC VAD only
   accepts 16-bit mono PCM audio, sampled at 8000, 16000, 32000 or 48000 Hz.
   A frame must be either 10, 20, or 30 ms in duration::

     # Run the VAD on 10 ms of silence. The result should be False.
     sample_rate = 16000
     frame_duration = 10  # ms
     # Two zero bytes per 16-bit sample, 160 samples for 10 ms at 16 kHz.
     frame = b'\x00\x00' * int(sample_rate * frame_duration / 1000)
     print('Contains speech: %s' % vad.is_speech(frame, sample_rate))

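The frame-size rule in step 3 comes down to simple arithmetic, sketched below. The helper name ``frame_bytes`` and the constants are hypothetical and are not part of the ``webrtcvad`` API; the sketch only encodes the sample rates, durations and 16-bit mono format stated above.

```python
# Hypothetical helper, not part of webrtcvad: computes how many bytes one
# frame must contain for the rates and durations listed in step 3.
VALID_RATES = (8000, 16000, 32000, 48000)  # Hz
VALID_DURATIONS_MS = (10, 20, 30)
BYTES_PER_SAMPLE = 2  # 16-bit mono PCM


def frame_bytes(sample_rate, frame_duration_ms):
    """Return the byte length of one frame, or raise ValueError."""
    if sample_rate not in VALID_RATES:
        raise ValueError('unsupported sample rate: %r' % (sample_rate,))
    if frame_duration_ms not in VALID_DURATIONS_MS:
        raise ValueError('unsupported duration: %r' % (frame_duration_ms,))
    samples = sample_rate * frame_duration_ms // 1000
    return samples * BYTES_PER_SAMPLE
```

For the silence example in step 3 (16000 Hz, 10 ms) this gives 160 samples, i.e. a 320-byte frame.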
See `example.py
<https://github.com/wiseman/py-webrtcvad/blob/master/example.py>`_ for
a more detailed example that will process a .wav file, find the voiced
segments, and write each one as a separate .wav.
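As a rough idea of what such processing involves, the sketch below splits raw PCM bytes into fixed-size frames and groups consecutive voiced frames. It is a deliberately simplified illustration and not the code from ``example.py``: the function names ``split_frames`` and ``voiced_runs`` are hypothetical, it applies no smoothing, and it takes the classifier as a plain callable so it can be tried without audio files.

```python
def split_frames(pcm, sample_rate, frame_duration_ms):
    """Yield consecutive whole frames of 16-bit mono PCM bytes.

    A trailing partial frame is dropped, because the VAD only accepts
    frames of exactly 10, 20 or 30 ms.
    """
    n = sample_rate * frame_duration_ms // 1000 * 2  # 2 bytes per sample
    for start in range(0, len(pcm) - n + 1, n):
        yield pcm[start:start + n]


def voiced_runs(frames, is_speech):
    """Yield the joined bytes of each run of consecutive voiced frames.

    ``is_speech`` is any callable taking one frame and returning a bool.
    """
    run = []
    for frame in frames:
        if is_speech(frame):
            run.append(frame)
        elif run:
            yield b''.join(run)
            run = []
    if run:
        yield b''.join(run)


# With a real detector (assumes ``vad`` and ``sample_rate`` from the steps above):
# frames = split_frames(pcm, sample_rate, 30)
# segments = list(voiced_runs(frames, lambda f: vad.is_speech(f, sample_rate)))
```

Because a single misclassified frame splits a segment here, real recordings usually need some tolerance for short gaps before the output is useful.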
How to run unit tests
---------------------
To run unit tests::

     pip install -e ".[dev]"
     python setup.py test

History
-------
2.0.10

    Fixed memory leak. Thank you, `bond005
    <https://github.com/bond005>`_!

2.0.9

    Improved example code. Added WebRTC license.

2.0.8

    Fixed Windows compilation errors. Thank you, `xiongyihui
    <https://github.com/xiongyihui>`_!

Raw data
{
"_id": null,
"home_page": "https://github.com/wiseman/py-webrtcvad",
"name": "webrtcvad123",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "speechrecognition asr voiceactivitydetection vad webrtc",
"author": "Will Wang",
"author_email": "willat0412@163.com",
"download_url": "https://files.pythonhosted.org/packages/d9/14/dc70197b83caa186f2ff8eb9d611478ca4c35f66fc2befd85a0b92e21de1/webrtcvad123-2.0.11.dev0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Python interface to the Google WebRTC Voice Activity Detector (VAD)",
"version": "2.0.11.dev0",
"project_urls": {
"Homepage": "https://github.com/wiseman/py-webrtcvad"
},
"split_keywords": [
"speechrecognition",
"asr",
"voiceactivitydetection",
"vad",
"webrtc"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "d914dc70197b83caa186f2ff8eb9d611478ca4c35f66fc2befd85a0b92e21de1",
"md5": "1d3a5ce011d5620ea21597844dc53c3e",
"sha256": "00bbb478872863cdb88a9a517a38d50b62fc5b7f8bbcac6aa4640b3f4fec17db"
},
"downloads": -1,
"filename": "webrtcvad123-2.0.11.dev0.tar.gz",
"has_sig": false,
"md5_digest": "1d3a5ce011d5620ea21597844dc53c3e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 80964,
"upload_time": "2023-12-08T07:13:51",
"upload_time_iso_8601": "2023-12-08T07:13:51.877590Z",
"url": "https://files.pythonhosted.org/packages/d9/14/dc70197b83caa186f2ff8eb9d611478ca4c35f66fc2befd85a0b92e21de1/webrtcvad123-2.0.11.dev0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-12-08 07:13:51",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "wiseman",
"github_project": "py-webrtcvad",
"travis_ci": true,
"coveralls": false,
"github_actions": false,
"lcname": "webrtcvad123"
}