# Moonshine STT Plugin
This plugin provides Speech-to-Text functionality using [Moonshine](https://github.com/usefulsensors/moonshine), a family of speech-to-text models optimized for fast and accurate automatic speech recognition (ASR) on resource-constrained devices.
## Features
- **Fast and Accurate**: Moonshine processes 10-second audio segments 5x faster than Whisper while maintaining the same (or better!) WER
- **Resource Efficient**: Optimized for edge devices and resource-constrained environments
- **Variable Length Processing**: Compute requirements scale with input audio length (unlike Whisper's fixed 30-second chunks)
- **Multiple Models**: Support for both `moonshine/tiny` (~190MB) and `moonshine/base` (~400MB) models
- **Device Flexibility**: ONNX runtime automatically selects optimal execution provider
- **Smart Sample Rate Handling**: Automatic detection and high-quality resampling of WebRTC audio (48kHz → 16kHz)
- **WebRTC Optimized**: Seamless integration with Stream video calling infrastructure
- **Efficient Model Loading**: ONNX version loads models on-demand for optimal memory usage
## Installation
### From PyPI + GitHub (Required)
Since the Moonshine ONNX package is not available on PyPI, it must be installed separately from GitHub:
```bash
# 1. Install the core plugin from PyPI
pip install getstream-plugins-moonshine
# 2. Install the moonshine model dependency from GitHub
pip install "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx"
```
### With uv
```bash
# Install both dependencies
uv add getstream-plugins-moonshine
uv add "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx"
```
### Development Installation (uv)
If your project uses **uv**, add both dependencies to your `pyproject.toml`:
```toml
[project]
dependencies = [
    # … other deps …
    "getstream-plugins-moonshine>=0.1.0",
    "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx",
]

[tool.uv.sources]
getstream-plugins-moonshine = { path = "getstream/plugins/moonshine" }  # for local development
```
Then:
```bash
uv sync # installs both dependencies
```
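To confirm that both pieces are importable after installation, a quick smoke test like the one below can help. Note that the `moonshine_onnx` module name is an assumption based on the upstream repository layout, not something verified against this release:

```python
# Smoke test: verify both packages import.
# The "moonshine_onnx" module name is an assumption, not confirmed for this release.
import importlib

for module in ("getstream.plugins.moonshine", "moonshine_onnx"):
    importlib.import_module(module)
    print(f"imported {module} OK")
```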
## Usage
```python
from typing import Any

from getstream.plugins.moonshine import MoonshineSTT
from getstream.video.rtc.track_util import PcmData

# Initialize with default settings (base model, 16kHz)
stt = MoonshineSTT()

# Or customize the configuration
stt = MoonshineSTT(
    model_name="moonshine/tiny",  # Use the smaller, faster model
    sample_rate=16000,            # Moonshine's native sample rate
    min_audio_length_ms=500,      # Minimum audio length for transcription
    # ONNX runtime will automatically select the best execution provider
)

# Set up event handlers
@stt.on("transcript")
async def on_transcript(text: str, user: Any, metadata: dict):
    print(f"Final transcript: {text}")
    print(f"Confidence: {metadata.get('confidence', 'N/A')}")
    print(f"Processing time: {metadata.get('processing_time_ms', 'N/A')}ms")

@stt.on("error")
async def on_error(error: Exception):
    print(f"STT Error: {error}")

# Process audio data
pcm_data = PcmData(samples=audio_bytes, sample_rate=16000, format="s16")
await stt.process_audio(pcm_data)

# Clean up
await stt.close()
```
## Model Selection
Moonshine offers two model variants with different trade-offs:
| Model | Size | Parameters | Speed | Accuracy | Use Case |
|-------|------|------------|-------|----------|----------|
| `moonshine/tiny` | ~190MB | 27M | Faster | Good | Resource-constrained devices, real-time applications |
| `moonshine/base` | ~400MB | 61M | Fast | Better | **Default choice** - balanced performance and accuracy |
**Default Model**: The plugin uses `moonshine/base` by default as it provides the best balance of accuracy and performance for most use cases.
**Choosing a Model**:
- Use `moonshine/tiny` for maximum speed on very resource-constrained devices
- Use `moonshine/base` for better accuracy with still excellent performance (recommended)
**Model Name Validation** (sketched below):
- Strict validation prevents silent fallbacks to wrong models
- Supports both short names (`"tiny"`, `"base"`) and full names (`"moonshine/tiny"`, `"moonshine/base"`)
- Clear error messages list all valid options when invalid model is specified
- Canonical model names ensure consistent behavior across different input formats
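The validation behavior described above can be sketched as follows. The helper name, mapping table, and error wording are illustrative assumptions, not the plugin's actual internals:

```python
# Illustrative sketch of model-name normalization and validation.
# The helper name and error message are assumptions, not the plugin's implementation.
_VALID_MODELS = {
    "tiny": "moonshine/tiny",
    "base": "moonshine/base",
    "moonshine/tiny": "moonshine/tiny",
    "moonshine/base": "moonshine/base",
}

def resolve_model_name(name: str) -> str:
    """Map short or full names to a canonical model name, or fail loudly."""
    try:
        return _VALID_MODELS[name]
    except KeyError:
        valid = ", ".join(sorted(set(_VALID_MODELS)))
        raise ValueError(
            f"Unknown Moonshine model {name!r}; valid options: {valid}"
        ) from None

resolve_model_name("tiny")  # -> "moonshine/tiny"
```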
## Sample Rate Handling
The Moonshine plugin automatically handles sample rate conversion for optimal transcription quality: WebRTC audio arriving at 48kHz is detected and resampled to Moonshine's native 16kHz before transcription, so no manual preprocessing is required. A rough sketch of that conversion follows.
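The helper below uses `scipy.signal.resample_poly` to perform an equivalent 48kHz → 16kHz downsample on int16 PCM. It is illustrative only; the plugin's actual resampler may differ:

```python
# Sketch of a 48 kHz -> 16 kHz conversion for int16 PCM.
# Roughly equivalent to what the plugin does internally; actual implementation may differ.
import numpy as np
from scipy.signal import resample_poly

def resample_48k_to_16k(samples: np.ndarray) -> np.ndarray:
    """Downsample int16 PCM from 48 kHz to Moonshine's native 16 kHz."""
    audio = samples.astype(np.float32) / 32768.0    # normalize to [-1, 1]
    resampled = resample_poly(audio, up=1, down=3)  # 48000 / 3 = 16000
    return np.clip(resampled * 32768.0, -32768, 32767).astype(np.int16)
```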
## Events
The plugin emits the following events:
- **transcript**: Final transcription result
  - `text` (str): The transcribed text
  - `user` (any): User metadata passed to `process_audio()`
  - `metadata` (dict): Additional information including model name, duration, etc.
- **error**: Error during transcription
  - `error` (Exception): The error that occurred
Note: Unlike streaming STT services, Moonshine doesn't emit `partial_transcript` events as it processes complete audio chunks.
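As an illustration, handlers can accumulate final results for a session (continuing the `stt` instance from the Usage section). The specific metadata keys used below (`model`, `duration`) are assumptions drawn from the event description above:

```python
# Collect final transcripts for a session.
# The "model" and "duration" metadata keys are assumptions, not confirmed field names.
transcripts: list[dict] = []

@stt.on("transcript")
async def collect_transcript(text: str, user, metadata: dict):
    transcripts.append({
        "text": text,
        "user": user,
        "model": metadata.get("model"),        # assumed key
        "duration": metadata.get("duration"),  # assumed key
    })
```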