# Live Audio Capture
![Python Version](https://img.shields.io/badge/python-3.9%2B-blue)
![License](https://img.shields.io/badge/license-MIT-green)
![PyPI Version](https://img.shields.io/pypi/v/live_audio_capture)
[![Documentation](https://img.shields.io/badge/docs-live_audio_capture-blue)](https://sami-rajichi.github.io/live_audio_capture/)
**Live Audio Capture** is a cross-platform Python package for capturing, processing, and analyzing live microphone audio in real time. It provides a flexible interface for voice activity detection (VAD), noise reduction, audio visualization, and more. It is aimed at projects such as voice assistants, transcription tools, and real-time audio analysis applications.
---
## Why Use Live Audio Capture?
### Key Advantages
1. **Cross-Platform Support**: Works seamlessly on Windows, macOS, and Linux.
2. **Real-Time Processing**: Captures and processes audio in real time with minimal latency.
3. **Voice Activity Detection (VAD)**: Dynamically detects speech and stops recording during silence.
4. **Noise Reduction**: Advanced noise reduction algorithms powered by the `noisereduce` package for cleaner audio.
5. **Customizable**: Highly configurable parameters for sampling rate, chunk duration, noise reduction, and more.
6. **Real-Time Visualization**: Visualize audio waveforms, frequency spectra, and spectrograms in real time.
7. **Easy to Use**: Simple API for quick integration into your projects.
---
## Use Cases
- **Voice Assistants**: Capture and process user commands in real time.
- **Transcription Tools**: Record and transcribe audio with noise reduction.
- **Real-Time Audio Analysis**: Analyze audio signals for frequency, volume, and other metrics.
- **Educational Tools**: Teach audio processing and visualization concepts.
- **Security Systems**: Detect and record audio events in real time.
---
## Features
- **Live Audio Capture**: Capture audio from the microphone in real time.
- **Voice Activity Detection (VAD)**: Automatically detect speech and stop recording during silence.
- **Noise Reduction**: Reduce background noise using the `noisereduce` package, which employs spectral gating techniques.
- **Real-Time Visualization**: Visualize audio waveforms, frequency spectra, and spectrograms.
- **Multiple Output Formats**: Save recordings in WAV, MP3, or OGG formats.
- **Customizable Parameters**:
  - Sampling rate
  - Chunk duration
  - VAD aggressiveness
  - Noise reduction settings
  - Low-pass filter cutoff frequency
- **Cross-Platform**: Works on Windows, macOS, and Linux.
---
## Installation
### Requirements
- Python 3.9 or higher
- FFmpeg (for audio file handling)
- Microphone access
### Install the Package
You can install the package via pip:
```bash
pip install live_audio_capture
```
### Install FFmpeg
- **Linux** (Debian/Ubuntu, using `apt`):
  ```bash
  sudo apt update
  sudo apt install ffmpeg
  ```
- **macOS** (using Homebrew):
  ```bash
  brew install ffmpeg
  ```
- **Windows**: Download FFmpeg from [https://ffmpeg.org/download.html](https://ffmpeg.org/download.html) and add it to your system's `PATH`.
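
After installing, it can help to confirm that FFmpeg is actually reachable from Python before starting a capture. The following is a minimal sketch using only the standard library; the helper name `ffmpeg_available` is hypothetical and is not part of this package.

```python
import shutil
import subprocess


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the executable is on PATH and `-version` exits cleanly.

    Hypothetical helper for illustration; not part of live_audio_capture.
    """
    path = shutil.which(executable)
    if path is None:
        # Not found on PATH at all.
        return False
    try:
        result = subprocess.run(
            [path, "-version"],
            capture_output=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Found, but could not be executed or did not respond in time.
        return False
    return result.returncode == 0
```

Calling `ffmpeg_available()` at the start of a script gives an early, readable failure instead of an error in the middle of a recording.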
---
## Usage
### Basic Example
Capture audio with voice activity detection and save it to a file:
```python
from live_audio_capture import LiveAudioCapture

# Initialize the audio capture
capture = LiveAudioCapture(
    sampling_rate=16000,          # Sample rate in Hz
    chunk_duration=0.1,           # Duration of each audio chunk in seconds
    enable_noise_canceling=True,  # Enable noise reduction
    aggressiveness=2,             # VAD aggressiveness level (0-3)
)

# Start recording with VAD
capture.listen_and_record_with_vad(
    output_file="output.wav",  # Save the recording to this file
    silence_duration=2.0,      # Stop recording after 2 seconds of silence
    format="wav",              # Output format
)

# Stop the capture
capture.stop()
```
### Real-Time Visualization
Visualize audio in real-time:
```python
from live_audio_capture import LiveAudioCapture, AudioVisualizer

# Initialize the audio capture
capture = LiveAudioCapture(sampling_rate=44100, chunk_duration=0.1)

# Initialize the audio visualizer
visualizer = AudioVisualizer(sampling_rate=44100, chunk_duration=0.1)

# Stream audio and visualize it
for audio_chunk in capture.stream_audio():
    visualizer.add_audio_chunk(audio_chunk)
```
### Advanced Example
Use all available parameters for maximum customization:
```python
from live_audio_capture import LiveAudioCapture

# Initialize the audio capture with all parameters
capture = LiveAudioCapture(
    sampling_rate=16000,
    chunk_duration=0.1,
    audio_format="f32le",
    channels=1,
    aggressiveness=3,
    enable_beep=True,
    enable_noise_canceling=True,
    low_pass_cutoff=7500.0,
    stationary_noise_reduction=True,
    prop_decrease=1.0,
    n_std_thresh_stationary=1.5,
    n_jobs=1,
    use_torch=False,
    device="cpu",
    calibration_duration=2.0,
    use_adaptive_threshold=True,
)

# Start recording with VAD
capture.listen_and_record_with_vad(
    output_file="output.wav",
    silence_duration=2.0,
    format="wav",
)

# Stop the capture
capture.stop()
```
---
## Features and Arguments
### `LiveAudioCapture` Parameters
- **`sampling_rate`**: Sample rate in Hz (default: `16000`).
- **`chunk_duration`**: Duration of each audio chunk in seconds (default: `0.1`).
- **`audio_format`**: Audio format for FFmpeg output (default: `"f32le"`).
- **`channels`**: Number of audio channels (default: `1` for mono).
- **`aggressiveness`**: VAD aggressiveness level (0-3, default: `1`).
- **`enable_beep`**: Play beep sounds when recording starts/stops (default: `True`).
- **`enable_noise_canceling`**: Enable noise reduction using the `noisereduce` package (default: `False`).
- **`low_pass_cutoff`**: Low-pass filter cutoff frequency in Hz (default: `7500.0`).
- **`stationary_noise_reduction`**: Enable stationary noise reduction (default: `False`).
- **`prop_decrease`**: Proportion to reduce noise by (default: `1.0`).
- **`n_std_thresh_stationary`**: Threshold for stationary noise reduction (default: `1.5`).
- **`n_jobs`**: Number of parallel jobs for noise reduction (default: `1`).
- **`use_torch`**: Use PyTorch for noise reduction (default: `False`).
- **`device`**: Device for PyTorch noise reduction (default: `"cpu"`).
- **`calibration_duration`**: Duration of calibration for adaptive thresholding (default: `2.0`).
- **`use_adaptive_threshold`**: Enable adaptive thresholding for VAD (default: `True`).
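
To see how `sampling_rate`, `chunk_duration`, `channels`, and `low_pass_cutoff` relate to each other, the arithmetic can be sketched as below. This is an illustration only: the helper names are hypothetical, and how the package sizes its chunks internally is not confirmed here. It assumes 4 bytes per sample, which matches 32-bit float audio (`"f32le"`).

```python
def chunk_layout(sampling_rate: int = 16000,
                 chunk_duration: float = 0.1,
                 channels: int = 1,
                 bytes_per_sample: int = 4) -> dict:
    """Derive per-chunk sizes from the capture settings.

    Hypothetical helper; bytes_per_sample=4 assumes 32-bit float ("f32le").
    """
    # Samples per channel contained in one chunk.
    samples_per_chunk = round(sampling_rate * chunk_duration)
    return {
        "samples_per_chunk": samples_per_chunk,
        "bytes_per_chunk": samples_per_chunk * channels * bytes_per_sample,
        # Highest frequency representable at this sampling rate.
        "nyquist_hz": sampling_rate / 2,
    }


def cutoff_is_valid(low_pass_cutoff: float, sampling_rate: int) -> bool:
    """A low-pass cutoff is only meaningful strictly below the Nyquist frequency."""
    return 0 < low_pass_cutoff < sampling_rate / 2
```

With the documented defaults, one chunk holds 1600 samples (6400 bytes), and the default cutoff of 7500 Hz sits just under the 8000 Hz Nyquist limit. If you lower `sampling_rate`, the cutoff has to be lowered as well for the filter to remain meaningful.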
---
## Recommendations
1. **Use Threading for Real-Time Listening**: Run the capture in a separate thread so that listening does not block the main program. You can then end the capture from anywhere in your script by calling the `.stop()` method.
2. **Use a High-Quality Microphone**: For best results, use a microphone with good noise cancellation.
3. **Adjust VAD Aggressiveness**: Higher aggressiveness levels may reduce false positives but can also miss softer speech.
4. **Enable Noise Reduction**: If you're working in a noisy environment, enable noise reduction for cleaner audio.
5. **Test on Your Platform**: Test the package on your target platform to ensure compatibility.
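
The threading recommendation above can be sketched as follows. This is a minimal example, not the package's prescribed pattern: `record_in_background` is a hypothetical helper, and it assumes only that the capture object exposes `listen_and_record_with_vad()` and `stop()` as used in the examples in this README. That calling `.stop()` from another thread ends an in-progress recording is also an assumption.

```python
import threading


def record_in_background(capture, output_file="output.wav",
                         silence_duration=2.0, audio_format="wav"):
    """Run the blocking VAD recording call on a daemon thread.

    Hypothetical helper; `capture` is any object with a
    listen_and_record_with_vad(output_file=..., silence_duration=..., format=...)
    method, as shown in the usage examples.
    """
    worker = threading.Thread(
        target=capture.listen_and_record_with_vad,
        kwargs={
            "output_file": output_file,
            "silence_duration": silence_duration,
            "format": audio_format,
        },
        daemon=True,  # do not keep the interpreter alive for this thread
    )
    worker.start()
    return worker


# Usage (requires the package, FFmpeg, and a microphone):
#   from live_audio_capture import LiveAudioCapture
#   capture = LiveAudioCapture(sampling_rate=16000, chunk_duration=0.1)
#   worker = record_in_background(capture)
#   # the main program keeps running here
#   capture.stop()
#   worker.join(timeout=5)
```

Marking the thread as a daemon means a forgotten `.stop()` will not prevent the program from exiting, at the cost of the recording being cut off abruptly in that case.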
---
## Technical Details
### Voice Activity Detection (VAD)
The VAD system uses an energy-based approach with adaptive thresholding. It calculates the energy of each audio chunk and compares it to a dynamically adjusted threshold. Hysteresis is applied to avoid rapid toggling between speech and silence states.
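
The idea can be illustrated with a small, self-contained sketch. It is not the package's actual implementation: the class name `EnergyVAD`, its fields, and all threshold values are assumptions chosen for illustration. Energy is the mean squared amplitude of a chunk, the threshold tracks a running noise-floor estimate, and hysteresis comes from using a higher ratio to enter the speech state than to leave it.

```python
from dataclasses import dataclass


def chunk_energy(samples) -> float:
    """Mean squared amplitude of one chunk of float samples."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


@dataclass
class EnergyVAD:
    """Illustrative energy-based VAD; all default values are assumptions."""
    noise_floor: float = 1e-4  # running estimate of background energy
    start_ratio: float = 4.0   # enter speech above noise_floor * start_ratio
    stop_ratio: float = 2.0    # leave speech below noise_floor * stop_ratio
    adapt_rate: float = 0.05   # smoothing factor for the noise floor
    min_floor: float = 1e-8    # keeps the threshold above zero
    in_speech: bool = False

    def update(self, samples) -> bool:
        """Classify one chunk and return True while speech is active."""
        energy = chunk_energy(samples)
        if self.in_speech:
            # Hysteresis: leaving speech requires dropping below a lower threshold.
            if energy < self.noise_floor * self.stop_ratio:
                self.in_speech = False
        elif energy > self.noise_floor * self.start_ratio:
            self.in_speech = True
        if not self.in_speech:
            # Adapt the noise floor only on chunks classified as silence.
            self.noise_floor += self.adapt_rate * (energy - self.noise_floor)
            self.noise_floor = max(self.noise_floor, self.min_floor)
        return self.in_speech
```

A chunk whose energy falls between the two thresholds keeps whatever state the detector is already in, which is what prevents rapid toggling at the edges of words.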
### Noise Reduction
The `noisereduce` package is used for noise reduction. It employs spectral gating techniques to remove background noise while preserving speech. You can choose between stationary and non-stationary noise reduction, and even use PyTorch for GPU-accelerated processing.
### Real-Time Visualization
The visualization module provides insights into:
- **Waveform**: The amplitude of the audio signal over time.
- **Frequency Spectrum**: The distribution of frequencies in the audio signal.
- **Spectrogram**: A visual representation of the spectrum of frequencies over time.
- **Volume Meter**: Real-time volume levels.
- **Volume History**: A history of volume levels over time.
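
As an illustration of what a volume meter typically computes, the sketch below converts a chunk of float samples to an RMS level in decibels relative to full scale (dBFS). It is a generic formula, not necessarily what this package's visualizer does, and the function name `rms_dbfs` and the `-100 dB` floor are assumptions.

```python
import math


def rms_dbfs(samples, floor_db: float = -100.0) -> float:
    """RMS level of float samples in [-1.0, 1.0], in dB relative to full scale.

    Hypothetical helper; silence and empty chunks are reported as floor_db.
    """
    if not samples:
        return floor_db
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms <= 0.0:
        # log10(0) is undefined, so clamp digital silence to the floor.
        return floor_db
    return max(floor_db, 20.0 * math.log10(rms))
```

Under this convention a full-scale sine wave reads about -3 dBFS and halving the amplitude lowers the reading by about 6 dB. Appending each chunk's value to a fixed-length buffer, for example a `collections.deque` with `maxlen` set, would give a simple volume history.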
---
## Contributing
Contributions are welcome! Please read the [Contributing Guidelines](https://github.com/sami-rajichi/live_audio_capture/blob/main/CONTRIBUTING.md) for details.
---
## License
This project is licensed under the MIT License. See the [LICENSE](https://github.com/sami-rajichi/live_audio_capture/blob/main/LICENSE) file for details.
---
## Support
For questions, issues, or feature requests, please open an issue on [GitHub](https://github.com/sami-rajichi/live_audio_capture/issues).
---
## Final Words
If you find this package useful, please consider leaving a ⭐ star on the [GitHub repository](https://github.com/sami-rajichi/live_audio_capture). Your support motivates us to keep improving! If you have any suggestions for optimization or new features, don't hesitate to reach out. We'd love to hear from you!
Raw data
{
"_id": null,
"home_page": "https://github.com/sami-rajichi/live_audio_capture",
"name": "live-audio-capture",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "audio capture ffmpeg real-time visualization",
"author": "Sami RAJICHI",
"author_email": "semi.rajichi@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/46/71/b67aecb434b0607a427110455927b083db053a5b81eda11cd8e27eee5561/live_audio_capture-0.4.1.tar.gz",
"platform": null,
"description": "# Live Audio Capture\r\n\r\n![Python Version](https://img.shields.io/badge/python-3.9%2B-blue)\r\n![License](https://img.shields.io/badge/license-MIT-green)\r\n![PyPI Version](https://img.shields.io/pypi/v/live_audio_capture)\r\n[![Documentation](https://img.shields.io/badge/docs-live_audio_capture-blue)](https://sami-rajichi.github.io/live_audio_capture/)\r\n\r\n**Live Audio Capture** is a cross-platform Python package designed for capturing, processing, and analyzing live audio from a microphone in real-time. It provides a robust and flexible interface for voice activity detection (VAD), noise reduction, audio visualization, and more. Whether you're building a voice assistant, a transcription tool, or a real-time audio analysis application, this package has you covered.\r\n\r\n---\r\n\r\n## Why Use Live Audio Capture?\r\n\r\n### Key Advantages\r\n1. **Cross-Platform Support**: Works seamlessly on Windows, macOS, and Linux.\r\n2. **Real-Time Processing**: Captures and processes audio in real-time with minimal latency.\r\n3. **Voice Activity Detection (VAD)**: Dynamically detects speech and stops recording during silence.\r\n4. **Noise Reduction**: Advanced noise reduction algorithms powered by the `noisereduce` package for cleaner audio.\r\n5. **Customizable**: Highly configurable parameters for sampling rate, chunk duration, noise reduction, and more.\r\n6. **Real-Time Visualization**: Visualize audio waveforms, frequency spectra, and spectrograms in real-time.\r\n7. 
**Easy to Use**: Simple API for quick integration into your projects.\r\n\r\n---\r\n\r\n## Use Cases\r\n- **Voice Assistants**: Capture and process user commands in real-time.\r\n- **Transcription Tools**: Record and transcribe audio with noise reduction.\r\n- **Real-Time Audio Analysis**: Analyze audio signals for frequency, volume, and other metrics.\r\n- **Educational Tools**: Teach audio processing and visualization concepts.\r\n- **Security Systems**: Detect and record audio events in real-time.\r\n\r\n---\r\n\r\n## Features\r\n- **Live Audio Capture**: Capture audio from the microphone in real-time.\r\n- **Voice Activity Detection (VAD)**: Automatically detect speech and stop recording during silence.\r\n- **Noise Reduction**: Reduce background noise using the `noisereduce` package, which employs spectral gating techniques.\r\n- **Real-Time Visualization**: Visualize audio waveforms, frequency spectra, and spectrograms.\r\n- **Multiple Output Formats**: Save recordings in WAV, MP3, or OGG formats.\r\n- **Customizable Parameters**:\r\n - Sampling rate\r\n - Chunk duration\r\n - VAD aggressiveness\r\n - Noise reduction settings\r\n - Low-pass filter cutoff frequency\r\n- **Cross-Platform**: Works on Windows, macOS, and Linux.\r\n\r\n---\r\n\r\n## Installation\r\n\r\n### Requirements\r\n- Python 3.9 or higher\r\n- FFmpeg (for audio file handling)\r\n- Microphone access\r\n\r\n### Install the Package\r\nYou can install the package via pip:\r\n\r\n```bash\r\npip install live_audio_capture\r\n```\r\n\r\n### Install FFmpeg\r\n- **Linux**:\r\n ```bash\r\n sudo apt update\r\n sudo apt install ffmpeg\r\n ```\r\n- **macOS** (using Homebrew):\r\n ```bash\r\n brew install ffmpeg\r\n ```\r\n- **Windows**: Download FFmpeg from [https://ffmpeg.org/download.html](https://ffmpeg.org/download.html) and add it to your system's `PATH`.\r\n\r\n---\r\n\r\n## Usage\r\n\r\n### Basic Example\r\nCapture audio with voice activity detection and save it to a file:\r\n\r\n```python\r\nfrom 
live_audio_capture import LiveAudioCapture\r\n\r\n# Initialize the audio capture\r\ncapture = LiveAudioCapture(\r\n sampling_rate=16000, # Sample rate in Hz\r\n chunk_duration=0.1, # Duration of each audio chunk in seconds\r\n enable_noise_canceling=True, # Enable noise reduction\r\n aggressiveness=2, # VAD aggressiveness level (0-3)\r\n)\r\n\r\n# Start recording with VAD\r\ncapture.listen_and_record_with_vad(\r\n output_file=\"output.wav\", # Save the recording to this file\r\n silence_duration=2.0, # Stop recording after 2 seconds of silence\r\n format=\"wav\", # Output format\r\n)\r\n\r\n# Stop the capture\r\ncapture.stop()\r\n```\r\n\r\n### Real-Time Visualization\r\nVisualize audio in real-time:\r\n\r\n```python\r\nfrom live_audio_capture import LiveAudioCapture, AudioVisualizer\r\n\r\n# Initialize the audio capture\r\ncapture = LiveAudioCapture(sampling_rate=44100, chunk_duration=0.1)\r\n\r\n# Initialize the audio visualizer\r\nvisualizer = AudioVisualizer(sampling_rate=44100, chunk_duration=0.1)\r\n\r\n# Stream audio and visualize it\r\nfor audio_chunk in capture.stream_audio():\r\n visualizer.add_audio_chunk(audio_chunk)\r\n```\r\n\r\n### Advanced Example\r\nUse all available parameters for maximum customization:\r\n\r\n```python\r\nfrom live_audio_capture import LiveAudioCapture\r\n\r\n# Initialize the audio capture with all parameters\r\ncapture = LiveAudioCapture(\r\n sampling_rate=16000,\r\n chunk_duration=0.1,\r\n audio_format=\"f32le\",\r\n channels=1,\r\n aggressiveness=3,\r\n enable_beep=True,\r\n enable_noise_canceling=True,\r\n low_pass_cutoff=7500.0,\r\n stationary_noise_reduction=True,\r\n prop_decrease=1.0,\r\n n_std_thresh_stationary=1.5,\r\n n_jobs=1,\r\n use_torch=False,\r\n device=\"cpu\",\r\n calibration_duration=2.0,\r\n use_adaptive_threshold=True,\r\n)\r\n\r\n# Start recording with VAD\r\ncapture.listen_and_record_with_vad(\r\n output_file=\"output.wav\",\r\n silence_duration=2.0,\r\n format=\"wav\",\r\n)\r\n\r\n# Stop the 
capture\r\ncapture.stop()\r\n```\r\n\r\n---\r\n\r\n## Features and Arguments\r\n\r\n### `LiveAudioCapture` Parameters\r\n- **`sampling_rate`**: Sample rate in Hz (default: `16000`).\r\n- **`chunk_duration`**: Duration of each audio chunk in seconds (default: `0.1`).\r\n- **`audio_format`**: Audio format for FFmpeg output (default: `\"f32le\"`).\r\n- **`channels`**: Number of audio channels (default: `1` for mono).\r\n- **`aggressiveness`**: VAD aggressiveness level (0-3, default: `1`).\r\n- **`enable_beep`**: Play beep sounds when recording starts/stops (default: `True`).\r\n- **`enable_noise_canceling`**: Enable noise reduction using the `noisereduce` package (default: `False`).\r\n- **`low_pass_cutoff`**: Low-pass filter cutoff frequency (default: `7500.0`).\r\n- **`stationary_noise_reduction`**: Enable stationary noise reduction (default: `False`).\r\n- **`prop_decrease`**: Proportion to reduce noise by (default: `1.0`).\r\n- **`n_std_thresh_stationary`**: Threshold for stationary noise reduction (default: `1.5`).\r\n- **`n_jobs`**: Number of parallel jobs for noise reduction (default: `1`).\r\n- **`use_torch`**: Use PyTorch for noise reduction (default: `False`).\r\n- **`device`**: Device for PyTorch noise reduction (default: `\"cpu\"`).\r\n- **`calibration_duration`**: Duration of calibration for adaptive thresholding (default: `2.0`).\r\n- **`use_adaptive_threshold`**: Enable adaptive thresholding for VAD (default: `True`).\r\n\r\n---\r\n\r\n## Recommendations\r\n1. **Use Threading for Real-Time Listening**: It is highly recommended to use threading for real-time audio listening. This allows you to easily stop the audio capture in any script using the `.stop()` method without blocking the main program.\r\n2. **Use a High-Quality Microphone**: For best results, use a microphone with good noise cancellation.\r\n3. **Adjust VAD Aggressiveness**: Higher aggressiveness levels may reduce false positives but can also miss softer speech.\r\n4. 
**Enable Noise Reduction**: If you're working in a noisy environment, enable noise reduction for cleaner audio.\r\n5. **Test on Your Platform**: Test the package on your target platform to ensure compatibility.\r\n\r\n---\r\n\r\n## Technical Details\r\n\r\n### Voice Activity Detection (VAD)\r\nThe VAD system uses an energy-based approach with adaptive thresholding. It calculates the energy of each audio chunk and compares it to a dynamically adjusted threshold. Hysteresis is applied to avoid rapid toggling between speech and silence states.\r\n\r\n### Noise Reduction\r\nThe `noisereduce` package is used for noise reduction. It employs spectral gating techniques to remove background noise while preserving speech. You can choose between stationary and non-stationary noise reduction, and even use PyTorch for GPU-accelerated processing.\r\n\r\n### Real-Time Visualization\r\nThe visualization module provides insights into:\r\n- **Waveform**: The amplitude of the audio signal over time.\r\n- **Frequency Spectrum**: The distribution of frequencies in the audio signal.\r\n- **Spectrogram**: A visual representation of the spectrum of frequencies over time.\r\n- **Volume Meter**: Real-time volume levels.\r\n- **Volume History**: A history of volume levels over time.\r\n\r\n---\r\n\r\n## Contributing\r\nContributions are welcome! Please read the [Contributing Guidelines](https://github.com/sami-rajichi/live_audio_capture/blob/main/CONTRIBUTING.md) for details.\r\n\r\n---\r\n\r\n## License\r\nThis project is licensed under the MIT License. 
See the [LICENSE](https://github.com/sami-rajichi/live_audio_capture/blob/main/LICENSE) file for details.\r\n\r\n---\r\n\r\n## Support\r\nFor questions, issues, or feature requests, please open an issue on [GitHub](https://github.com/sami-rajichi/live_audio_capture/issues).\r\n\r\n---\r\n\r\n## Final Words\r\nIf you find this package useful, please consider leaving a \u2b50 star on the [GitHub repository](https://github.com/sami-rajichi/live_audio_capture). Your support motivates us to keep improving! If you have any suggestions for optimization or new features, don't hesitate to reach out. We'd love to hear from you!\r\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A cross-platform utility for capturing live audio from a microphone using FFmpeg.",
"version": "0.4.1",
"project_urls": {
"Bug Reports": "https://github.com/sami-rajichi/live_audio_capture/issues",
"Homepage": "https://sami-rajichi.github.io/live_audio_capture/",
"Source": "https://github.com/sami-rajichi/live_audio_capture"
},
"split_keywords": [
"audio",
"capture",
"ffmpeg",
"real-time",
"visualization"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c69e1f1da5ea03c9fd24e326c347ed4ccf793260e0e8209dc5200c6aee189c44",
"md5": "526796ef1b9ca2979df7beb359c9176c",
"sha256": "d03b7a2e3e4d3d6ff3171195914367af056eb42f135ab0dd586e408858ba255d"
},
"downloads": -1,
"filename": "live_audio_capture-0.4.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "526796ef1b9ca2979df7beb359c9176c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 22327,
"upload_time": "2025-01-18T18:56:13",
"upload_time_iso_8601": "2025-01-18T18:56:13.490517Z",
"url": "https://files.pythonhosted.org/packages/c6/9e/1f1da5ea03c9fd24e326c347ed4ccf793260e0e8209dc5200c6aee189c44/live_audio_capture-0.4.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "4671b67aecb434b0607a427110455927b083db053a5b81eda11cd8e27eee5561",
"md5": "f03119c4c4ff037c1f45277ac73ad58b",
"sha256": "bebfc37a20e91b6a467543cfcbb6ba27cbaf9ca60297ecfb00d960099d735faa"
},
"downloads": -1,
"filename": "live_audio_capture-0.4.1.tar.gz",
"has_sig": false,
"md5_digest": "f03119c4c4ff037c1f45277ac73ad58b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 665876,
"upload_time": "2025-01-18T18:56:22",
"upload_time_iso_8601": "2025-01-18T18:56:22.016822Z",
"url": "https://files.pythonhosted.org/packages/46/71/b67aecb434b0607a427110455927b083db053a5b81eda11cd8e27eee5561/live_audio_capture-0.4.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-18 18:56:22",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "sami-rajichi",
"github_project": "live_audio_capture",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "numpy",
"specs": [
[
"==",
"1.26.4"
]
]
},
{
"name": "scipy",
"specs": [
[
"==",
"1.12.0"
]
]
},
{
"name": "pydub",
"specs": [
[
"==",
"0.25.1"
]
]
},
{
"name": "pyqtgraph",
"specs": [
[
"==",
"0.13.7"
]
]
},
{
"name": "noisereduce",
"specs": [
[
"==",
"3.0.3"
]
]
},
{
"name": "sounddevice",
"specs": [
[
"==",
"0.4.6"
]
]
},
{
"name": "simpleaudio",
"specs": [
[
"==",
"1.0.4"
]
]
},
{
"name": "pytest",
"specs": [
[
"==",
"8.3.4"
]
]
},
{
"name": "flake8",
"specs": [
[
"==",
"7.1.1"
]
]
},
{
"name": "twine",
"specs": [
[
"==",
"6.0.1"
]
]
},
{
"name": "torch",
"specs": [
[
"==",
"2.2.1"
]
]
},
{
"name": "mkdocs",
"specs": [
[
"==",
"1.6.1"
]
]
},
{
"name": "mkdocs-material",
"specs": [
[
"==",
"9.5.50"
]
]
},
{
"name": "mkdocstrings",
"specs": [
[
"==",
"0.27.0"
]
]
},
{
"name": "mkdocstrings-python",
"specs": [
[
"==",
"1.13.0"
]
]
}
],
"lcname": "live-audio-capture"
}