# Cartesia Text-to-Speech Plugin
High-quality **Text-to-Speech** (TTS) plugin for [GetStream](https://getstream.io/) backed by the
[Cartesia](https://github.com/cartesia-ai/cartesia-python) Sonic model. It lets a Python
process speak PCM audio into a Stream call.
## Installation
Install from PyPI (installs both `getstream` and the Cartesia SDK):
```bash
pip install "getstream-plugins-cartesia[webrtc]"
```
If you already have the Stream SDK in your project, just add the Cartesia plugin:
```bash
pip install cartesia getstream-plugins-cartesia
```
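A quick import check (purely illustrative) confirms both packages resolved:

```bash
python -c "from getstream.plugins.cartesia import CartesiaTTS; print('Cartesia plugin import OK')"
```

If this prints without an `ImportError`, the plugin and its Cartesia dependency are installed.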
## Usage
```python
from getstream.plugins.cartesia import CartesiaTTS
from getstream.video.rtc.audio_track import AudioStreamTrack

async def speak():
    # Option A: read key from env var (CARTESIA_API_KEY)
    tts = CartesiaTTS()

    # Option B: pass explicitly
    # tts = CartesiaTTS(api_key="<your key>")

    # Audio MUST be 16-kHz, 16-bit PCM (matches Cartesia Sonic model)
    track = AudioStreamTrack(framerate=16000)
    tts.set_output_track(track)

    # Listen for every raw PCM chunk that gets produced
    @tts.on("audio")
    def _on_audio(chunk: bytes, user):
        print("🔊 got", len(chunk), "bytes of audio")

    await tts.send("Hello from Cartesia!")
# Run inside an event-loop, e.g. `asyncio.run(speak())`
```
## Configuration Options
- `api_key` (str, optional) – Cartesia API key (falls back to `CARTESIA_API_KEY`).
- `model_id` (str) – Cartesia model to synthesize with (`"sonic-2"` by default).
- `voice_id` (str | None) – Cartesia voice to use (pass `None` for model default).
- `sample_rate` (int) – Target sample rate in Hz; defaults to `16000`. It must match the
  `AudioStreamTrack.framerate` you attach; if they differ, a `TypeError` is raised early
  so you don't get distorted audio on the call (see the sketch after the event list below).
Events emitted:

- `audio` – each raw PCM chunk; arguments: `chunk: bytes`, `user: dict | None`
- `error` – any exception raised during synthesis
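Putting the options above together, a constructor call with everything spelled out might look like the sketch below. The single-argument `error` handler signature is an assumption; the README only lists the event names.

```python
from getstream.plugins.cartesia import CartesiaTTS
from getstream.video.rtc.audio_track import AudioStreamTrack

# All values shown are the documented defaults, spelled out explicitly.
tts = CartesiaTTS(
    api_key="<your key>",   # or omit and rely on CARTESIA_API_KEY
    model_id="sonic-2",     # default Sonic model
    voice_id=None,          # None -> model's default voice
    sample_rate=16000,      # must equal the track framerate below
)

# Framerate must match sample_rate, otherwise a TypeError is raised early.
track = AudioStreamTrack(framerate=16000)
tts.set_output_track(track)

@tts.on("error")
def _on_error(exc):
    # Assumed single-argument handler; surfaces exceptions raised during synthesis.
    print("Cartesia TTS error:", exc)
```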
## Requirements
- Python 3.10+
- `cartesia>=2.0.5` (automatically installed)
## Testing
Run the offline unit tests:
```bash
pytest -q getstream/plugins/cartesia/tests
```
To additionally exercise the live Cartesia API, set `CARTESIA_API_KEY` in your
environment; the integration test will then run automatically.
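For example, to run the full suite including the live integration test in one shot (placeholder key shown):

```bash
export CARTESIA_API_KEY="<your key>"
pytest -q getstream/plugins/cartesia/tests
```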
---
💡 See `examples/tts_cartesia/` for a fully working bot that joins a Stream call
and greets participants using this plugin.