| Field | Value |
| --- | --- |
| Name | deepgram-sdk |
| Version | 5.3.0 |
| home_page | None |
| Summary | None |
| upload_time | 2025-11-03 15:24:02 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | <4.0,>=3.8 |
| license | MIT |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# Deepgram Python SDK

The official Python SDK for Deepgram's automated speech recognition, text-to-speech, and language understanding APIs. Power your applications with world-class speech and Language AI models.
## Documentation
Comprehensive API documentation and guides are available at [developers.deepgram.com](https://developers.deepgram.com).
### Migrating From Earlier Versions
- [v2 to v3+](./docs/Migrating-v2-to-v3.md)
- [v3+ to v5](./docs/Migrating-v3-to-v5.md) (current)
## Installation
Install the Deepgram Python SDK using pip:
```bash
pip install deepgram-sdk
```
## Reference
- **[API Reference](./reference.md)** - Complete reference for all SDK methods and parameters
- **[WebSocket Reference](./websockets-reference.md)** - Detailed documentation for real-time WebSocket connections
## Usage
### Quick Start
The Deepgram SDK provides both synchronous and asynchronous clients for all major use cases:
#### Real-time Speech Recognition (Listen v2)
Listen v2 streams audio to our newest and most advanced speech recognition model, which adds contextual turn detection ([WebSocket Reference](./websockets-reference.md#listen-v2-connect)):
```python
from deepgram import DeepgramClient
from deepgram.core.events import EventType

client = DeepgramClient()

with client.listen.v2.connect(
    model="flux-general-en",
    encoding="linear16",
    sample_rate="16000"
) as connection:
    def on_message(message):
        print(f"Received {message.type} event")

    connection.on(EventType.OPEN, lambda _: print("Connection opened"))
    connection.on(EventType.MESSAGE, on_message)
    connection.on(EventType.CLOSE, lambda _: print("Connection closed"))
    connection.on(EventType.ERROR, lambda error: print(f"Error: {error}"))

    # Start listening and send audio data
    connection.start_listening()
```
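The comment in the example above mentions sending audio data. As a minimal sketch of how that data might be prepared, assuming raw 16-bit mono PCM at 16 kHz to match `encoding="linear16"` and `sample_rate="16000"`, the helper below splits a buffer into fixed-duration chunks. The name `pcm_chunks` and the 80 ms chunk size are illustrative choices, not part of the SDK; how each chunk is handed to the connection is described in the WebSocket Reference.

```python
from typing import Iterator


def pcm_chunks(
    audio: bytes,
    sample_rate: int = 16000,
    chunk_ms: int = 80,
    sample_width: int = 2,
) -> Iterator[bytes]:
    """Yield consecutive slices of raw mono PCM, each covering chunk_ms of audio.

    Hypothetical helper: sample_width=2 corresponds to 16-bit (linear16) samples.
    The final chunk may be shorter than the others.
    """
    bytes_per_chunk = sample_rate * sample_width * chunk_ms // 1000
    for start in range(0, len(audio), bytes_per_chunk):
        yield audio[start:start + bytes_per_chunk]


# One second of 16-bit mono silence stands in for real microphone or file audio.
silence = bytes(16000 * 2)
chunks = list(pcm_chunks(silence))
print(len(chunks), len(chunks[0]), len(chunks[-1]))
```

Small, evenly sized chunks keep latency predictable for real-time use; the right size for a given application is a tuning decision.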
#### File Transcription
Transcribe pre-recorded audio files ([API Reference](./reference.md#listen-v1-media-transcribe-file)):
```python
from deepgram import DeepgramClient

client = DeepgramClient()

with open("audio.wav", "rb") as audio_file:
    response = client.listen.v1.media.transcribe_file(
        request=audio_file.read(),
        model="nova-3"
    )
    print(response.results.channels[0].alternatives[0].transcript)
```
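The attribute chain `response.results.channels[0].alternatives[0].transcript` raises if a list is empty. A defensive accessor could look roughly like the sketch below. The helper name `first_transcript` is hypothetical, and the `SimpleNamespace` objects are stand-ins that only mimic the attribute path used above; they are not SDK types.

```python
from types import SimpleNamespace


def first_transcript(response) -> str:
    """Return the first channel's first alternative transcript, or "" if absent."""
    results = getattr(response, "results", None)
    channels = getattr(results, "channels", None) or []
    if not channels:
        return ""
    alternatives = getattr(channels[0], "alternatives", None) or []
    if not alternatives:
        return ""
    return getattr(alternatives[0], "transcript", "") or ""


# Stand-in objects shaped like the response in the example above.
fake_response = SimpleNamespace(
    results=SimpleNamespace(
        channels=[
            SimpleNamespace(
                alternatives=[SimpleNamespace(transcript="hello world")]
            )
        ]
    )
)
empty_response = SimpleNamespace(results=SimpleNamespace(channels=[]))

print(first_transcript(fake_response))
print(repr(first_transcript(empty_response)))
```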
#### Text-to-Speech
Generate natural-sounding speech from text ([API Reference](./reference.md#speak-v1-audio-generate)):
```python
from deepgram import DeepgramClient

client = DeepgramClient()

response = client.speak.v1.audio.generate(
    text="Hello, this is a sample text to speech conversion."
)

# Save the audio file
with open("output.mp3", "wb") as audio_file:
    audio_file.write(response.stream.getvalue())
```
#### Text Analysis
Analyze text for sentiment, topics, and intents ([API Reference](./reference.md#read-v1-text-analyze)):
```python
from deepgram import DeepgramClient

client = DeepgramClient()

response = client.read.v1.text.analyze(
    request={"text": "Hello, world!"},
    language="en",
    sentiment=True,
    summarize=True,
    topics=True,
    intents=True
)
```
#### Voice Agent (Conversational AI)
Build interactive voice agents ([WebSocket Reference](./websockets-reference.md#agent-v1-connect)):
```python
from deepgram import DeepgramClient
from deepgram.extensions.types.sockets import (
    AgentV1SettingsMessage, AgentV1Agent, AgentV1AudioConfig,
    AgentV1AudioInput, AgentV1Listen, AgentV1ListenProvider,
    AgentV1Think, AgentV1OpenAiThinkProvider, AgentV1SpeakProviderConfig,
    AgentV1DeepgramSpeakProvider
)

client = DeepgramClient()

with client.agent.v1.connect() as agent:
    settings = AgentV1SettingsMessage(
        audio=AgentV1AudioConfig(
            input=AgentV1AudioInput(encoding="linear16", sample_rate=44100)
        ),
        agent=AgentV1Agent(
            listen=AgentV1Listen(
                provider=AgentV1ListenProvider(type="deepgram", model="nova-3")
            ),
            think=AgentV1Think(
                provider=AgentV1OpenAiThinkProvider(
                    type="open_ai", model="gpt-4o-mini"
                )
            ),
            speak=AgentV1SpeakProviderConfig(
                provider=AgentV1DeepgramSpeakProvider(
                    type="deepgram", model="aura-2-asteria-en"
                )
            )
        )
    )

    agent.send_settings(settings)
    agent.start_listening()
```
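The nested constructors above can be hard to scan. The sketch below mirrors the same nesting as a plain dictionary, which may help when logging or comparing configurations. It is only an outline of the values used in the example: `settings_outline` and `providers` are hypothetical names, and nothing here claims to be the exact payload the SDK sends.

```python
import json

# Plain-dict outline of the settings built in the example above.
settings_outline = {
    "audio": {"input": {"encoding": "linear16", "sample_rate": 44100}},
    "agent": {
        "listen": {"provider": {"type": "deepgram", "model": "nova-3"}},
        "think": {"provider": {"type": "open_ai", "model": "gpt-4o-mini"}},
        "speak": {"provider": {"type": "deepgram", "model": "aura-2-asteria-en"}},
    },
}


def providers(outline: dict) -> dict:
    """Map each agent stage (listen, think, speak) to its (type, model) pair."""
    return {
        stage: (config["provider"]["type"], config["provider"]["model"])
        for stage, config in outline["agent"].items()
    }


print(json.dumps(settings_outline, indent=2))
print(providers(settings_outline))
```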
### Complete SDK Reference
For comprehensive documentation of all available methods, parameters, and options:
- **[API Reference](./reference.md)** - Complete reference for REST API methods including:
  - Listen (Speech-to-Text): File transcription, URL transcription, and media processing
  - Speak (Text-to-Speech): Audio generation and voice synthesis
  - Read (Text Intelligence): Text analysis, sentiment, summarization, and topic detection
  - Manage: Project management, API keys, and usage analytics
  - Auth: Token generation and authentication management
- **[WebSocket Reference](./websockets-reference.md)** - Detailed documentation for real-time connections:
  - Listen v1/v2: Real-time speech recognition with different model capabilities
  - Speak v1: Real-time text-to-speech streaming
  - Agent v1: Conversational voice agents with integrated STT, LLM, and TTS
## Authentication
The Deepgram SDK supports two authentication methods:
### Access Token Authentication
Use access tokens for temporary or scoped access (recommended for client-side applications):
```python
from deepgram import DeepgramClient
# Explicit access token
client = DeepgramClient(access_token="YOUR_ACCESS_TOKEN")
# Or via environment variable DEEPGRAM_TOKEN
client = DeepgramClient()
# Generate access tokens using your API key
auth_client = DeepgramClient(api_key="YOUR_API_KEY")
token_response = auth_client.auth.v1.tokens.grant()
token_client = DeepgramClient(access_token=token_response.access_token)
```
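Because access tokens are temporary, a server that hands them out may want to reuse a token until shortly before it expires rather than granting a new one on every request. The sketch below shows one way to do that. Everything in it is an assumption for illustration: `TokenCache` is not part of the SDK, and the `fetch` callable, which returns a `(token, lifetime_in_seconds)` pair, stands in for whatever wraps `auth.v1.tokens.grant()` in your application.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Reuse a short-lived token until it is within `margin` seconds of expiry."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
        margin: float = 5.0,
    ) -> None:
        self._fetch = fetch      # hypothetical: returns (token, lifetime in seconds)
        self._clock = clock      # injectable so the cache can be tested without sleeping
        self._margin = margin
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch()
            self._token = token
            self._expires_at = now + lifetime
        return self._token


# Demonstration with a fake clock and a fake fetcher.
fake_now = [0.0]
grant_calls = []


def fake_fetch() -> Tuple[str, float]:
    grant_calls.append(1)
    return (f"token-{len(grant_calls)}", 30.0)


cache = TokenCache(fake_fetch, clock=lambda: fake_now[0])
first = cache.get()
fake_now[0] = 10.0
second = cache.get()   # still fresh, no new grant
fake_now[0] = 26.0
third = cache.get()    # inside the 5 s margin, so a new token is fetched
print(first, second, third)
```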
### API Key Authentication
Use your Deepgram API key for server-side applications:
```python
from deepgram import DeepgramClient
# Explicit API key
client = DeepgramClient(api_key="YOUR_API_KEY")
# Or via environment variable DEEPGRAM_API_KEY
client = DeepgramClient()
```
### Environment Variables
The SDK automatically discovers credentials from these environment variables:
- `DEEPGRAM_TOKEN` - Your access token (takes precedence)
- `DEEPGRAM_API_KEY` - Your Deepgram API key
**Precedence:** Explicit parameters > Environment variables
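The precedence rules above can be sketched as a small resolver. This is an illustration of the stated ordering, not the SDK's implementation: `resolve_credentials` is a hypothetical function, and the choice to prefer an explicit access token over an explicit API key when both are passed is an assumption.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_credentials(
    access_token: Optional[str] = None,
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (kind, value), preferring explicit parameters over environment variables."""
    env = os.environ if env is None else env
    # Explicit parameters win over anything in the environment.
    if access_token:
        return ("access_token", access_token)  # assumption: token beats key when both given
    if api_key:
        return ("api_key", api_key)
    # Among environment variables, DEEPGRAM_TOKEN takes precedence.
    if env.get("DEEPGRAM_TOKEN"):
        return ("access_token", env["DEEPGRAM_TOKEN"])
    if env.get("DEEPGRAM_API_KEY"):
        return ("api_key", env["DEEPGRAM_API_KEY"])
    raise ValueError("no Deepgram credentials found")


example_env = {"DEEPGRAM_TOKEN": "env-token", "DEEPGRAM_API_KEY": "env-key"}
print(resolve_credentials(env=example_env))
print(resolve_credentials(api_key="explicit-key", env=example_env))
```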
## Async Client
The SDK provides full async/await support for non-blocking operations:
```python
import asyncio

from deepgram import AsyncDeepgramClient
from deepgram.core.events import EventType  # needed for connection.on below

async def main():
    client = AsyncDeepgramClient()

    # Async file transcription
    with open("audio.wav", "rb") as audio_file:
        response = await client.listen.v1.media.transcribe_file(
            request=audio_file.read(),
            model="nova-3"
        )

    # Async WebSocket connection
    async with client.listen.v2.connect(
        model="flux-general-en",
        encoding="linear16",
        sample_rate="16000"
    ) as connection:
        async def on_message(message):
            print(f"Received {message.type} event")

        connection.on(EventType.MESSAGE, on_message)
        await connection.start_listening()

asyncio.run(main())
```
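One practical benefit of the async client is running several requests concurrently. The sketch below shows the general pattern with `asyncio.gather` and a semaphore that caps the number of requests in flight. The names `transcribe_all` and `fake_transcribe` are hypothetical; `fake_transcribe` only simulates latency and stands in for a coroutine that would call `transcribe_file` on an `AsyncDeepgramClient`.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def transcribe_all(
    paths: Sequence[str],
    transcribe: Callable[[str], Awaitable[str]],
    limit: int = 2,
) -> List[str]:
    """Run `transcribe` over all paths with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def worker(path: str) -> str:
        async with semaphore:
            return await transcribe(path)

    # gather preserves input order in its result list.
    return list(await asyncio.gather(*(worker(path) for path in paths)))


async def fake_transcribe(path: str) -> str:
    """Stand-in for a real async transcription call."""
    await asyncio.sleep(0.01)
    return f"transcript of {path}"


results = asyncio.run(
    transcribe_all(["a.wav", "b.wav", "c.wav"], fake_transcribe)
)
print(results)
```

Capping concurrency is a design choice that helps avoid rate-limit responses when many files are queued at once.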
## Exception Handling
The SDK provides detailed error information for debugging and error handling:
```python
from deepgram import DeepgramClient
from deepgram.core.api_error import ApiError

client = DeepgramClient()

# Load the audio to transcribe
with open("audio.wav", "rb") as audio_file:
    audio_data = audio_file.read()

try:
    response = client.listen.v1.media.transcribe_file(
        request=audio_data,
        model="nova-3"
    )
except ApiError as e:
    print(f"Status Code: {e.status_code}")
    print(f"Error Details: {e.body}")
    print(f"Request ID: {e.headers.get('x-dg-request-id', 'N/A')}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
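For logging, the three fields printed above (status code, body, and request ID) can be condensed into a single line. The sketch below takes those values as plain arguments so it runs without the SDK. `summarize_error` and its category labels are hypothetical, and the case-insensitive header lookup is a defensive assumption rather than documented behavior.

```python
from typing import Any, Mapping, Optional


def summarize_error(
    status_code: Optional[int],
    body: Any,
    headers: Optional[Mapping[str, str]],
) -> str:
    """Build a one-line log message from the fields of an API error."""
    request_id = "N/A"
    for name, value in (headers or {}).items():
        if name.lower() == "x-dg-request-id":
            request_id = value
            break

    # Illustrative categories; adjust to whatever your application needs.
    if status_code is None:
        kind = "unknown"
    elif status_code in (401, 403):
        kind = "auth"
    elif status_code == 429:
        kind = "rate_limit"
    elif 400 <= status_code < 500:
        kind = "client"
    elif status_code >= 500:
        kind = "server"
    else:
        kind = "other"

    return f"[{kind}] status={status_code} request_id={request_id} body={body!r}"


print(summarize_error(429, {"err_msg": "slow down"}, {"X-DG-Request-Id": "abc"}))
```

Inside an `except ApiError as e:` block this would be called as `summarize_error(e.status_code, e.body, e.headers)`.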
## Advanced Features
### Raw Response Access
Access raw HTTP response data including headers:
```python
from deepgram import DeepgramClient

client = DeepgramClient()

# Load the audio to transcribe
with open("audio.wav", "rb") as audio_file:
    audio_data = audio_file.read()

response = client.listen.v1.media.with_raw_response.transcribe_file(
    request=audio_data,
    model="nova-3"
)

print(response.headers)  # Access response headers
print(response.data)     # Access the response object
```
### Request Configuration
Configure timeouts, retries, and other request options:
```python
from deepgram import DeepgramClient

# Global client configuration
client = DeepgramClient(timeout=30.0)

# Load the audio to transcribe
with open("audio.wav", "rb") as audio_file:
    audio_data = audio_file.read()

# Per-request configuration
response = client.listen.v1.media.transcribe_file(
    request=audio_data,
    model="nova-3",
    request_options={
        "timeout_in_seconds": 60,
        "max_retries": 3
    }
)
```
### Custom HTTP Client
Use a custom httpx client for advanced networking features:
```python
import httpx
from deepgram import DeepgramClient

client = DeepgramClient(
    httpx_client=httpx.Client(
        # httpx 0.26+ takes `proxy=`; the older `proxies=` argument
        # was removed in httpx 0.28.
        proxy="http://proxy.example.com",
        timeout=httpx.Timeout(30.0)
    )
)
```
### Retry Configuration
The SDK automatically retries failed requests with exponential backoff:
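```python
from deepgram import DeepgramClient

client = DeepgramClient()

# Load the audio to transcribe
with open("audio.wav", "rb") as audio_file:
    audio_data = audio_file.read()

# Automatic retries for 408, 429, and 5xx status codes
response = client.listen.v1.media.transcribe_file(
    request=audio_data,
    model="nova-3",
    request_options={"max_retries": 3}
)
```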
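To make the retry behavior concrete, the sketch below shows the general shape of a retry policy with exponential backoff for the status codes listed above. It is not the SDK's implementation: the helper names, the 0.5 second base delay, and the 8 second cap are assumptions chosen for illustration, and any jitter the SDK may apply is omitted.

```python
from typing import List


def should_retry(status_code: int) -> bool:
    """Return True for 408, 429, and any 5xx status code."""
    return status_code in (408, 429) or 500 <= status_code <= 599


def backoff_delays(max_retries: int, base: float = 0.5, cap: float = 8.0) -> List[float]:
    """Delays in seconds before each retry: base * 2**attempt, capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


print([code for code in (400, 404, 408, 429, 500, 503) if should_retry(code)])
print(backoff_delays(3))
```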
## Contributing
We welcome contributions to improve this SDK! However, please note that this library is primarily generated from our API specifications.
### Development Setup
1. **Install Poetry** (if not already installed):
   ```bash
   curl -sSL https://install.python-poetry.org | python - -y --version 1.5.1
   ```
2. **Install dependencies**:
   ```bash
   poetry install
   ```
3. **Install example dependencies**:
   ```bash
   poetry run pip install -r examples/requirements.txt
   ```
4. **Run tests**:
   ```bash
   poetry run pytest -rP .
   ```
5. **Run examples**:
   ```bash
   python -u examples/listen/v2/connect/main.py
   ```
### Contribution Guidelines
See our [CONTRIBUTING](./CONTRIBUTING.md) guide.
### Requirements
- Python 3.8+
- See `pyproject.toml` for full dependency list
## Community Code of Conduct
Please see our community [code of conduct](https://developers.deepgram.com/code-of-conduct) before contributing to this project.
## License
This project is licensed under the MIT License - see the [LICENSE](./LICENSE) file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "deepgram-sdk",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.8",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/2d/9c/4529cc5818e9305ac9be3c24545249ad57418cbc3736c3f1c0a8397b59f5/deepgram_sdk-5.3.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": null,
"version": "5.3.0",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "58e2cda09edad156199cc9e330533f6b72cb5276c0d476cab7f1744be7ffa16e",
"md5": "852bf3cce6cc25fc141d7a3ce84f9282",
"sha256": "431418fdffbd93cdf6a78a168984e3df3cb696818ced1cfc52ce336e0bc6a7fe"
},
"downloads": -1,
"filename": "deepgram_sdk-5.3.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "852bf3cce6cc25fc141d7a3ce84f9282",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.8",
"size": 390669,
"upload_time": "2025-11-03T15:24:01",
"upload_time_iso_8601": "2025-11-03T15:24:01.078727Z",
"url": "https://files.pythonhosted.org/packages/58/e2/cda09edad156199cc9e330533f6b72cb5276c0d476cab7f1744be7ffa16e/deepgram_sdk-5.3.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "2d9c4529cc5818e9305ac9be3c24545249ad57418cbc3736c3f1c0a8397b59f5",
"md5": "6de0f51ba2d012dcdbfd159102630cf6",
"sha256": "4e682a53f64c26dc49d8fd70865eae1e98236d313870d1bcf5f107f125e53793"
},
"downloads": -1,
"filename": "deepgram_sdk-5.3.0.tar.gz",
"has_sig": false,
"md5_digest": "6de0f51ba2d012dcdbfd159102630cf6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.8",
"size": 148179,
"upload_time": "2025-11-03T15:24:02",
"upload_time_iso_8601": "2025-11-03T15:24:02.436789Z",
"url": "https://files.pythonhosted.org/packages/2d/9c/4529cc5818e9305ac9be3c24545249ad57418cbc3736c3f1c0a8397b59f5/deepgram_sdk-5.3.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-11-03 15:24:02",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "deepgram-sdk"
}