.. image:: https://travis-ci.org/wiseman/py-webrtcvad.svg?branch=master
    :target: https://travis-ci.org/wiseman/py-webrtcvad

py-webrtcvad
============

This is a Python interface to the WebRTC Voice Activity Detector
(VAD). It is compatible with Python 2 and Python 3.

A `VAD <https://en.wikipedia.org/wiki/Voice_activity_detection>`_
classifies a piece of audio data as being voiced or unvoiced. It can
be useful for telephony and speech recognition.

The VAD that Google developed for the `WebRTC <https://webrtc.org/>`_
project is reportedly one of the best available, being fast, modern
and free.

How to use it
-------------

0. Install the webrtcvad module::

     pip install webrtcvad

1. Create a ``Vad`` object::

     import webrtcvad
     vad = webrtcvad.Vad()

2. Optionally, set its aggressiveness mode, which is an integer
   between 0 and 3. 0 is the least aggressive about filtering out
   non-speech, 3 is the most aggressive. (You can also set the mode
   when you create the VAD, e.g. ``vad = webrtcvad.Vad(3)``)::

     vad.set_mode(1)

3. Give it a short segment ("frame") of audio. The WebRTC VAD only
   accepts 16-bit mono PCM audio, sampled at 8000, 16000, or 32000 Hz.
   A frame must be either 10, 20, or 30 ms in duration::

     # Run the VAD on 10 ms of silence. The result should be False.
     sample_rate = 16000
     frame_duration = 10  # ms
     # Each 16-bit sample is 2 bytes. Integer division keeps the repeat
     # count an int, which Python 3 requires for bytes repetition.
     frame = b'\x00\x00' * (sample_rate * frame_duration // 1000)
     print('Contains speech: %s' % vad.is_speech(frame, sample_rate))

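
Longer recordings have to be cut into frames of an accepted length before they can be passed to the VAD. The following is a minimal sketch of one way to do that, using only the standard library. ``frame_generator`` and the constants are hypothetical names, not part of ``webrtcvad``, and the accepted rates and durations are taken from the text above.

```python
VALID_SAMPLE_RATES = (8000, 16000, 32000)  # Hz, as listed in step 3
VALID_FRAME_DURATIONS = (10, 20, 30)       # ms, as listed in step 3
BYTES_PER_SAMPLE = 2                       # 16-bit mono PCM


def frame_generator(pcm, sample_rate, frame_duration_ms):
    """Yield consecutive fixed-size frames from raw 16-bit mono PCM bytes.

    Hypothetical helper, not part of webrtcvad. Trailing bytes that do
    not fill a whole frame are dropped.
    """
    if sample_rate not in VALID_SAMPLE_RATES:
        raise ValueError('unsupported sample rate: %r' % (sample_rate,))
    if frame_duration_ms not in VALID_FRAME_DURATIONS:
        raise ValueError('unsupported frame duration: %r' % (frame_duration_ms,))
    # Samples per frame, then bytes per frame.
    frame_bytes = sample_rate * frame_duration_ms // 1000 * BYTES_PER_SAMPLE
    for start in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        yield pcm[start:start + frame_bytes]


# 95 ms of silence at 16 kHz: three whole 30 ms frames, remainder dropped.
silence = b'\x00\x00' * (16000 * 95 // 1000)
frames = list(frame_generator(silence, 16000, 30))
print('%d frames of %d bytes' % (len(frames), len(frames[0])))
```

Each yielded frame can then be passed to ``vad.is_speech(frame, sample_rate)`` as in step 3.
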
See `example.py
<https://github.com/wiseman/py-webrtcvad/blob/master/example.py>`_ for
a more detailed example that will process a .wav file, find the voiced
segments, and write each one as a separate .wav.
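
Before processing a .wav file it can help to check that it matches the format the VAD accepts. The sketch below uses the standard-library ``wave`` module. ``read_wave`` here is an illustrative helper written for this README and is not claimed to match the function in ``example.py``. The accepted sample rates are the ones listed in step 3.

```python
import contextlib
import os
import tempfile
import wave


def read_wave(path):
    """Return (pcm_bytes, sample_rate) for a 16-bit mono WAV file.

    Illustrative helper. Raises ValueError if the file is not in a
    format described as acceptable in step 3.
    """
    with contextlib.closing(wave.open(path, 'rb')) as wf:
        if wf.getnchannels() != 1:
            raise ValueError('expected mono audio')
        if wf.getsampwidth() != 2:
            raise ValueError('expected 16-bit samples')
        sample_rate = wf.getframerate()
        if sample_rate not in (8000, 16000, 32000):
            raise ValueError('unsupported sample rate: %d' % sample_rate)
        return wf.readframes(wf.getnframes()), sample_rate


# Write one second of 16 kHz silence to a temporary file and read it back.
path = os.path.join(tempfile.mkdtemp(), 'silence.wav')
with contextlib.closing(wave.open(path, 'wb')) as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(16000)
    wf.writeframes(b'\x00\x00' * 16000)

pcm, sample_rate = read_wave(path)
print('%d bytes at %d Hz' % (len(pcm), sample_rate))
```

The returned bytes are raw PCM, so they can be sliced into 10, 20, or 30 ms frames and classified one frame at a time.
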
How to run unit tests
---------------------

To run unit tests::

    pip install -e ".[dev]"
    python setup.py test

Raw data
{
"_id": null,
"home_page": "https://github.com/wiseman/py-webrtcvad",
"name": "webrtcvad",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "speechrecognition asr voiceactivitydetection vad webrtc",
"author": "John Wiseman",
"author_email": "jjwiseman@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/89/34/e2de2d97f3288512b9ea56f92e7452f8207eb5a0096500badf9dfd48f5e6/webrtcvad-2.0.10.tar.gz",
"platform": "",
"bugtrack_url": null,
"license": "MIT",
"summary": "Python interface to the Google WebRTC Voice Activity Detector (VAD)",
"version": "2.0.10",
"split_keywords": [
"speechrecognition",
"asr",
"voiceactivitydetection",
"vad",
"webrtc"
],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "213d2848aeebbbd22485d4ad630b5fdb",
"sha256": "f1bed2fb25b63fb7b1a55d64090c993c9c9167b28485ae0bcdd81cf6ede96aea"
},
"downloads": -1,
"filename": "webrtcvad-2.0.10.tar.gz",
"has_sig": false,
"md5_digest": "213d2848aeebbbd22485d4ad630b5fdb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 66156,
"upload_time": "2017-01-07T23:05:18",
"upload_time_iso_8601": "2017-01-07T23:05:18.732212Z",
"url": "https://files.pythonhosted.org/packages/89/34/e2de2d97f3288512b9ea56f92e7452f8207eb5a0096500badf9dfd48f5e6/webrtcvad-2.0.10.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2017-01-07 23:05:18",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "wiseman",
"github_project": "py-webrtcvad",
"travis_ci": true,
"coveralls": false,
"github_actions": false,
"lcname": "webrtcvad"
}