# Deepgram Speech-to-Text Plugin
A high-quality Speech-to-Text (STT) plugin for Vision Agents that uses the Deepgram API.
## Installation
```bash
uv add vision-agents-plugins-deepgram
```
## Usage
```python
import asyncio

from vision_agents.plugins import deepgram
from getstream.video.rtc.track_util import PcmData


async def main():
    # Initialize with the API key from the DEEPGRAM_API_KEY environment variable
    stt = deepgram.STT()

    # Or specify the API key directly
    # stt = deepgram.STT(api_key="your_deepgram_api_key")

    # Register event handlers
    @stt.on("transcript")
    def on_transcript(text, user, metadata):
        print(f"Final transcript from {user}: {text}")

    @stt.on("partial_transcript")
    def on_partial(text, user, metadata):
        print(f"Partial transcript from {user}: {text}")

    # Process audio (here: 1000 samples of 16-bit silence)
    pcm_data = PcmData(samples=b"\x00\x00" * 1000, sample_rate=48000, format="s16")
    await stt.process_audio(pcm_data)

    # When done
    await stt.close()


# The awaits above must run inside an event loop
asyncio.run(main())
```
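The usage example feeds the plugin raw signed 16-bit (`s16`) PCM bytes. For local experiments it can help to generate a short test tone instead of silence. The following is a minimal sketch using only the standard library; `sine_s16` is a hypothetical helper, not part of the plugin, and it assumes `PcmData` accepts mono little-endian `s16` bytes in the same way as the silence buffer above.

```python
import math
import struct


def sine_s16(freq_hz, duration_s, sample_rate=48000, amplitude=0.5):
    """Return mono signed 16-bit little-endian PCM bytes for a sine tone.

    Hypothetical helper for building test audio; not part of the plugin.
    """
    n_samples = round(duration_s * sample_rate)
    peak = int(amplitude * 32767)  # stay well inside the int16 range
    samples = (
        int(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n_samples)
    )
    return struct.pack(f"<{n_samples}h", *samples)


# 20 ms of a 440 Hz tone at 48 kHz: 960 samples, 2 bytes each
tone = sine_s16(440.0, 0.02)
print(len(tone))
```

The resulting bytes could then be passed as `samples=tone` with `sample_rate=48000` and `format="s16"`, mirroring the call in the usage example.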
## Configuration Options
- `api_key`: Deepgram API key (default: read from the `DEEPGRAM_API_KEY` environment variable)
- `options`: Deepgram options for configuring the transcription. See the [Deepgram Listen V1 Connect API documentation](https://github.com/deepgram/deepgram-python-sdk/blob/main/websockets-reference.md#%EF%B8%8F-parameters) for more details.
- `sample_rate`: Sample rate of the audio in Hz (default: 16000)
- `language`: Language code for transcription (default: "en-US")
- `keep_alive_interval`: Interval, in seconds, between keep-alive messages (default: 1.0)
- `connection_timeout`: Time, in seconds, to wait for the Deepgram connection to be established (default: 15.0)
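As a rough illustration of how these options fit together, the sketch below collects the documented defaults into a keyword-argument dict. `build_stt_kwargs` is a hypothetical helper, not part of the plugin, and it assumes that each option listed above is accepted as a keyword argument by `deepgram.STT` (the usage example only confirms this for `api_key`).

```python
import os


def build_stt_kwargs(env=None):
    """Collect the documented STT options into a keyword-argument dict.

    Hypothetical helper for illustration; not part of the plugin.
    """
    env = os.environ if env is None else env
    kwargs = {
        "sample_rate": 16000,        # Hz, documented default
        "language": "en-US",         # documented default
        "keep_alive_interval": 1.0,  # seconds, documented default
        "connection_timeout": 15.0,  # seconds, documented default
    }
    api_key = env.get("DEEPGRAM_API_KEY")
    if api_key:
        # Pass the key explicitly only when it is set; otherwise the plugin
        # is documented to read DEEPGRAM_API_KEY on its own.
        kwargs["api_key"] = api_key
    return kwargs


kwargs = build_stt_kwargs({"DEEPGRAM_API_KEY": "your_deepgram_api_key"})
print(kwargs)

# Assumed usage, requires the plugin to be installed:
# from vision_agents.plugins import deepgram
# stt = deepgram.STT(**kwargs)
```

Keeping the options in one dict makes it easy to override a single value, for example `kwargs["language"] = "en-GB"`, before constructing the STT instance.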
## Requirements
- Python 3.10+
- deepgram-sdk>=5.0.0,<5.1
- numpy>=2.2.6,<2.3
## Raw data
{
"_id": null,
"home_page": null,
"name": "vision-agents-plugins-deepgram",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "AI, STT, agents, deepgram, speech-to-text, transcription, voice agents",
"author": null,
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/5c/ac/1be8032e9c2682a9c44d791ef59a5578c3198ac78d098cef629509d2f852/vision_agents_plugins_deepgram-0.1.8.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Deepgram STT integration for Vision Agents",
"version": "0.1.8",
"project_urls": {
"Documentation": "https://visionagents.ai/",
"Source": "https://github.com/GetStream/Vision-Agents",
"Website": "https://visionagents.ai/"
},
"split_keywords": [
"ai",
" stt",
" agents",
" deepgram",
" speech-to-text",
" transcription",
" voice agents"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "f582a83a84a866c23f107fcf8478266f80c27b97d10105bbfa55c088c8c97fd3",
"md5": "abb649c747afe34c7161bb3124f197fe",
"sha256": "93faa7c1ef070dd7c433ae616cb97c51c2bb7f914e22e860046932752d29a9d2"
},
"downloads": -1,
"filename": "vision_agents_plugins_deepgram-0.1.8-py3-none-any.whl",
"has_sig": false,
"md5_digest": "abb649c747afe34c7161bb3124f197fe",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 14879,
"upload_time": "2025-10-22T01:51:01",
"upload_time_iso_8601": "2025-10-22T01:51:01.066458Z",
"url": "https://files.pythonhosted.org/packages/f5/82/a83a84a866c23f107fcf8478266f80c27b97d10105bbfa55c088c8c97fd3/vision_agents_plugins_deepgram-0.1.8-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "5cac1be8032e9c2682a9c44d791ef59a5578c3198ac78d098cef629509d2f852",
"md5": "cb10fc6fbd34e8e1a6319b50f3c9fb45",
"sha256": "cca999333ac40f11788cb974b8101b56b4960710224e76ceae9ad9371de936ed"
},
"downloads": -1,
"filename": "vision_agents_plugins_deepgram-0.1.8.tar.gz",
"has_sig": false,
"md5_digest": "cb10fc6fbd34e8e1a6319b50f3c9fb45",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 6050,
"upload_time": "2025-10-22T01:51:01",
"upload_time_iso_8601": "2025-10-22T01:51:01.849169Z",
"url": "https://files.pythonhosted.org/packages/5c/ac/1be8032e9c2682a9c44d791ef59a5578c3198ac78d098cef629509d2f852/vision_agents_plugins_deepgram-0.1.8.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-22 01:51:01",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "GetStream",
"github_project": "Vision-Agents",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "vision-agents-plugins-deepgram"
}