# Voice Agent Python client for Speechmatics Real-Time API
[License](https://github.com/speechmatics/speechmatics-python-voice/blob/master/LICENSE)
An SDK for working with the Speechmatics Real-Time API optimised for use in voice agents or transcription services.
It uses the Python Real-Time API to process transcription results from the STT engine and combine them into manageable speaker segments. Taking advantage of speaker diarization, the transcript is grouped by individual speaker, with advanced options to focus on and/or ignore specific speakers.
See [OVERVIEW.md](OVERVIEW.md) for more information.
## Installation
```bash
pip install speechmatics-voice
```
## Requirements
You must have a valid Speechmatics API key to use this SDK. You can get one from the [Speechmatics Console](https://console.speechmatics.com).
Store it as the `SPEECHMATICS_API_KEY` environment variable, either in your `.env` file or via `export SPEECHMATICS_API_KEY="your_api_key_here"` in your terminal.
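The client reads the key from the environment, so it is worth checking for it up front. A minimal standard-library check (a missing key would otherwise surface later as a failed connection):

```python
import os

# Warn early if the API key is missing; a missing key would otherwise
# surface later as a failed WebSocket connection.
api_key = os.environ.get("SPEECHMATICS_API_KEY")
if not api_key:
    print("SPEECHMATICS_API_KEY is not set - see the Requirements section above")
```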
## Quick Start
Below is a basic example of how to use the SDK to transcribe audio from a microphone.
```python
import asyncio
from speechmatics.rt import Microphone
from speechmatics.voice import (
    VoiceAgentClient,
    VoiceAgentConfig,
    EndOfUtteranceMode,
    AgentServerMessageType,
    SpeakerSegment,
)

async def main():
    # Configure the voice agent
    config = VoiceAgentConfig(
        end_of_utterance_silence_trigger=0.5,
        enable_diarization=True,
        end_of_utterance_mode=EndOfUtteranceMode.ADAPTIVE,
    )

    # Initialize microphone
    mic = Microphone(
        sample_rate=16000,
        chunk_size=160,
    )

    if not mic.start():
        print("Microphone not available")
        return

    # Create client and register event handlers
    async with VoiceAgentClient(config=config) as client:

        # Handle interim transcription segments
        @client.on(AgentServerMessageType.ADD_INTERIM_SEGMENTS)
        def handle_interim_segments(message):
            segments: list[SpeakerSegment] = message["segments"]
            for segment in segments:
                print(f"Speaker {segment.speaker_id}: {segment.text}")

        # Handle finalized transcription segments
        @client.on(AgentServerMessageType.ADD_SEGMENTS)
        def handle_final_segments(message):
            segments: list[SpeakerSegment] = message["segments"]
            for segment in segments:
                print(f"Speaker {segment.speaker_id}: {segment.text}")

        # Handle user started speaking event
        @client.on(AgentServerMessageType.USER_SPEECH_STARTED)
        def handle_speech_started(message):
            print("User started speaking")

        # Handle user stopped speaking event
        @client.on(AgentServerMessageType.USER_SPEECH_ENDED)
        def handle_speech_ended(message):
            print("User stopped speaking")

        # Connect and start processing audio
        await client.connect()

        while True:
            frame = await mic.read(160)
            await client.send_audio(frame)

if __name__ == "__main__":
    asyncio.run(main())
```
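The segment handlers above receive `SpeakerSegment` objects; a common pattern is to accumulate finalized segments into a per-speaker transcript. A minimal sketch of that pattern (using a stand-in dataclass with only the two fields the handlers above use; the real `speechmatics.voice` class carries more metadata):

```python
from dataclasses import dataclass

@dataclass
class SpeakerSegment:
    # Stand-in mirroring the two fields used in the quick-start handlers.
    speaker_id: str
    text: str

def append_segments(transcript: dict, segments: list) -> dict:
    """Accumulate finalized segment text per speaker."""
    for seg in segments:
        transcript.setdefault(seg.speaker_id, []).append(seg.text)
    return transcript

# Simulate two batches of finalized segments arriving over a session.
transcript = {}
append_segments(transcript, [SpeakerSegment("S1", "Hello."), SpeakerSegment("S2", "Hi there.")])
append_segments(transcript, [SpeakerSegment("S1", "How are you?")])
print(transcript)  # {'S1': ['Hello.', 'How are you?'], 'S2': ['Hi there.']}
```

Calling `append_segments` from `handle_final_segments` in the quick start would build this transcript incrementally as the session runs.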
## Examples
The `examples/` directory contains practical demonstrations of the Voice Agent SDK. See the README in the `examples/voice` directory for more information.
## Documentation
- **SDK Overview**: See [OVERVIEW.md](OVERVIEW.md) for comprehensive API documentation
- **Speechmatics API**: https://docs.speechmatics.com
## License
[MIT](LICENSE)