# Moonshine STT Plugin
This plugin provides Speech-to-Text functionality using [Moonshine](https://github.com/usefulsensors/moonshine), a family of speech-to-text models optimized for fast and accurate automatic speech recognition (ASR) on resource-constrained devices.
## Features
- **Fast and Accurate**: Moonshine processes 10-second audio segments 5x faster than Whisper while maintaining the same (or better!) WER
- **Resource Efficient**: Optimized for edge devices and resource-constrained environments
- **Variable Length Processing**: Compute requirements scale with input audio length (unlike Whisper's fixed 30-second chunks)
- **Multiple Models**: Support for both `moonshine/tiny` (~190MB) and `moonshine/base` (~400MB) models
- **Device Flexibility**: ONNX runtime automatically selects optimal execution provider
- **Smart Sample Rate Handling**: Automatic detection and high-quality resampling of WebRTC audio (48kHz → 16kHz)
- **WebRTC Optimized**: Seamless integration with Stream video calling infrastructure
- **Efficient Model Loading**: ONNX version loads models on-demand for optimal memory usage
## Installation
### From PyPI + GitHub (Required)
Since the Moonshine ONNX package is not available on PyPI, it must be installed separately from GitHub:
```bash
# 1. Install the core plugin from PyPI
pip install getstream-plugins-moonshine
# 2. Install the moonshine model dependency from GitHub
pip install "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx"
```
### With uv
```bash
# Install both dependencies
uv add getstream-plugins-moonshine
uv add "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx"
```
### Development Installation (uv)
If your project uses **uv**, add both dependencies to your `pyproject.toml`:
```toml
[project]
dependencies = [
    # … other deps …
    "getstream-plugins-moonshine>=0.1.0",
    "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx",
]

[tool.uv.sources]
getstream-plugins-moonshine = { path = "getstream/plugins/moonshine" }  # for local development
```
Then:
```bash
uv sync # installs both dependencies
```
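To confirm that both pieces are importable after installation, a quick smoke test like the one below can help. Note that the `moonshine_onnx` module name is an assumption based on the upstream repository layout, not something verified against this release:

```python
# Smoke test: verify both packages import.
# The "moonshine_onnx" module name is an assumption, not confirmed for this release.
import importlib

for module in ("getstream.plugins.moonshine", "moonshine_onnx"):
    importlib.import_module(module)
    print(f"imported {module} OK")
```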
## Usage
```python
from typing import Any

from getstream.plugins.moonshine import MoonshineSTT
from getstream.video.rtc.track_util import PcmData

# Initialize with default settings (base model, 16kHz)
stt = MoonshineSTT()

# Or customize the configuration
stt = MoonshineSTT(
    model_name="moonshine/tiny",  # Use the smaller, faster model
    sample_rate=16000,            # Moonshine's native sample rate
    min_audio_length_ms=500,      # Minimum audio length for transcription
    # ONNX runtime will automatically select the best execution provider
)

# Set up event handlers
@stt.on("transcript")
async def on_transcript(text: str, user: Any, metadata: dict):
    print(f"Final transcript: {text}")
    print(f"Confidence: {metadata.get('confidence', 'N/A')}")
    print(f"Processing time: {metadata.get('processing_time_ms', 'N/A')}ms")

@stt.on("error")
async def on_error(error: Exception):
    print(f"STT Error: {error}")

# Process audio data
pcm_data = PcmData(samples=audio_bytes, sample_rate=16000, format="s16")
await stt.process_audio(pcm_data)

# Clean up
await stt.close()
```
## Model Selection
Moonshine offers two model variants with different trade-offs:
| Model | Size | Parameters | Speed | Accuracy | Use Case |
|-------|------|------------|-------|----------|----------|
| `moonshine/tiny` | ~190MB | 27M | Faster | Good | Resource-constrained devices, real-time applications |
| `moonshine/base` | ~400MB | 61M | Fast | Better | **Default choice** - balanced performance and accuracy |
**Default Model**: The plugin uses `moonshine/base` by default as it provides the best balance of accuracy and performance for most use cases.
**Choosing a Model**:
- Use `moonshine/tiny` for maximum speed on very resource-constrained devices
- Use `moonshine/base` for better accuracy with still excellent performance (recommended)
**Model Name Validation** (sketched below):
- Strict validation prevents silent fallbacks to wrong models
- Supports both short names (`"tiny"`, `"base"`) and full names (`"moonshine/tiny"`, `"moonshine/base"`)
- Clear error messages list all valid options when invalid model is specified
- Canonical model names ensure consistent behavior across different input formats
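The validation behavior described above can be sketched as follows. The helper name, mapping table, and error wording are illustrative assumptions, not the plugin's actual internals:

```python
# Illustrative sketch of model-name normalization and validation.
# The helper name and error message are assumptions, not the plugin's implementation.
_VALID_MODELS = {
    "tiny": "moonshine/tiny",
    "base": "moonshine/base",
    "moonshine/tiny": "moonshine/tiny",
    "moonshine/base": "moonshine/base",
}

def resolve_model_name(name: str) -> str:
    """Map short or full names to a canonical model name, or fail loudly."""
    try:
        return _VALID_MODELS[name]
    except KeyError:
        valid = ", ".join(sorted(set(_VALID_MODELS)))
        raise ValueError(
            f"Unknown Moonshine model {name!r}; valid options: {valid}"
        ) from None

resolve_model_name("tiny")  # -> "moonshine/tiny"
```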
## Sample Rate Handling
The Moonshine plugin automatically handles sample rate conversion for optimal transcription quality: WebRTC audio arriving at 48kHz is detected and resampled to Moonshine's native 16kHz before transcription, so no manual preprocessing is required. A rough sketch of that conversion follows.
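The helper below uses `scipy.signal.resample_poly` to perform an equivalent 48kHz → 16kHz downsample on int16 PCM. It is illustrative only; the plugin's actual resampler may differ:

```python
# Sketch of a 48 kHz -> 16 kHz conversion for int16 PCM.
# Roughly equivalent to what the plugin does internally; actual implementation may differ.
import numpy as np
from scipy.signal import resample_poly

def resample_48k_to_16k(samples: np.ndarray) -> np.ndarray:
    """Downsample int16 PCM from 48 kHz to Moonshine's native 16 kHz."""
    audio = samples.astype(np.float32) / 32768.0    # normalize to [-1, 1]
    resampled = resample_poly(audio, up=1, down=3)  # 48000 / 3 = 16000
    return np.clip(resampled * 32768.0, -32768, 32767).astype(np.int16)
```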
## Events
The plugin emits the following events:
- **transcript**: Final transcription result
  - `text` (str): The transcribed text
  - `user` (any): User metadata passed to `process_audio()`
  - `metadata` (dict): Additional information including model name, duration, etc.
- **error**: Error during transcription
  - `error` (Exception): The error that occurred
Note: Unlike streaming STT services, Moonshine doesn't emit `partial_transcript` events as it processes complete audio chunks.
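As an illustration, handlers can accumulate final results for a session (continuing the `stt` instance from the Usage section). The specific metadata keys used below (`model`, `duration`) are assumptions drawn from the event description above:

```python
# Collect final transcripts for a session.
# The "model" and "duration" metadata keys are assumptions, not confirmed field names.
transcripts: list[dict] = []

@stt.on("transcript")
async def collect_transcript(text: str, user, metadata: dict):
    transcripts.append({
        "text": text,
        "user": user,
        "model": metadata.get("model"),        # assumed key
        "duration": metadata.get("duration"),  # assumed key
    })
```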