pipecat-ai 0.0.31

- Summary: An open source framework for voice (and multimodal) assistants
- Upload time: 2024-06-13 22:35:26
- Requires Python: >=3.7
- License: BSD 2-Clause License
- Keywords: webrtc, audio, video, ai

            <div align="center">
 <img alt="pipecat" width="300px" height="auto" src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/pipecat.png">
</div>

# Pipecat

[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat)

`pipecat` is a framework for building voice (and multimodal) conversational agents. Things like personal coaches, meeting assistants, [story-telling toys for kids](https://storytelling-chatbot.fly.dev/), customer support bots, [intake flows](https://www.youtube.com/watch?v=lDevgsp9vn0), and snarky social companions.

Take a look at some example apps:

<p float="left">
    <a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/simple-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/simple-chatbot/image.png" width="280" /></a>&nbsp;
    <a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/storytelling-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/storytelling-chatbot/image.png" width="280" /></a>
    <br/>
    <a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/translation-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/translation-chatbot/image.png" width="280" /></a>&nbsp;
    <a href="https://github.com/pipecat-ai/pipecat/tree/main/examples/moondream-chatbot"><img src="https://raw.githubusercontent.com/pipecat-ai/pipecat/main/examples/moondream-chatbot/image.png" width="280" /></a>
</p>

## Getting started with voice agents

You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when you’re ready. You can also add a 📞 telephone number, 🖼️ image output, 📺 video input, swap in different LLMs, and more.

```shell
# install the module
pip install pipecat-ai

# set up an .env file with API keys
cp dot-env.template .env
```
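
The examples typically read these keys from the environment. As a minimal sketch, you can load the `.env` file with [python-dotenv](https://pypi.org/project/python-dotenv/); the variable names below are hypothetical, so check `dot-env.template` for the actual ones:

```python
# Load API keys from .env into the process environment.
# Variable names here are hypothetical; see dot-env.template for the real ones.
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()
daily_api_key = os.getenv("DAILY_API_KEY")
elevenlabs_api_key = os.getenv("ELEVENLABS_API_KEY")
```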

By default, in order to minimize dependencies, only the basic framework functionality is available. Some third-party AI services require additional dependencies that you can install with:

```shell
pip install "pipecat-ai[option,...]"
```

Your project may or may not need these, so they're made available as optional requirements. Here is a list:

- **AI services**: `anthropic`, `azure`, `deepgram`, `google`, `fal`, `moondream`, `openai`, `playht`, `silero`, `whisper`
- **Transports**: `local`, `websocket`, `daily`
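
For example, a bot that uses Daily for transport, OpenAI for the LLM, and Silero for voice activity detection would install:

```shell
pip install "pipecat-ai[daily,openai,silero]"
```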

## Code examples

- [foundational](https://github.com/pipecat-ai/pipecat/tree/main/examples/foundational) — small snippets that build on each other, introducing one or two concepts at a time
- [example apps](https://github.com/pipecat-ai/pipecat/tree/main/examples/) — complete applications that you can use as starting points for development

## A simple voice agent running locally

Here is a very basic Pipecat bot that greets a user when they join a real-time session. We'll use [Daily](https://daily.co) for real-time media transport, and [ElevenLabs](https://elevenlabs.io/) for text-to-speech.

```python
# app.py

import asyncio
import aiohttp

from pipecat.frames.frames import EndFrame, TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.pipeline.runner import PipelineRunner
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport

async def main():
  async with aiohttp.ClientSession() as session:
    # Use Daily as a real-time media transport (WebRTC)
    transport = DailyTransport(
      room_url=...,
      token=...,
      bot_name="Bot Name",
      params=DailyParams(audio_out_enabled=True))

    # Use Eleven Labs for Text-to-Speech
    tts = ElevenLabsTTSService(
      aiohttp_session=session,
      api_key=...,
      voice_id=...)

    # Simple pipeline that will process text to speech and output the result
    pipeline = Pipeline([tts, transport.output()])

    # Create a runner that can execute one or more pipeline tasks
    runner = PipelineRunner()

    # Wrap the pipeline in a task for the runner to execute
    task = PipelineTask(pipeline)

    # Register an event handler to play audio when a
    # participant joins the transport WebRTC session
    @transport.event_handler("on_participant_joined")
    async def on_new_participant_joined(transport, participant):
      participant_name = participant["info"]["userName"] or ''
      # Queue a TextFrame that will get spoken by the TTS service (Eleven Labs)
      await task.queue_frames([TextFrame(f"Hello there, {participant_name}!"), EndFrame()])

    # Run the pipeline task
    await runner.run(task)

if __name__ == "__main__":
  asyncio.run(main())
```

Run it with:

```shell
python app.py
```

Daily provides a prebuilt WebRTC user interface. Whilst the app is running, you can visit `https://<yourdomain>.daily.co/<room_url>` and listen to the bot say hello!


## WebRTC for production use

WebSockets are fine for server-to-server communication or for initial development. But for production use, client-server audio should travel over a protocol designed for real-time media transport. (For an explanation of the difference between WebSockets and WebRTC, see [this post.](https://www.daily.co/blog/how-to-talk-to-an-llm-with-your-voice/#webrtc))

One way to get up and running quickly with WebRTC is to sign up for a Daily developer account. Daily gives you SDKs and global infrastructure for audio (and video) routing. Every account gets 10,000 audio/video/transcription minutes free each month.

Sign up [here](https://dashboard.daily.co/u/signup) and [create a room](https://docs.daily.co/reference/rest-api/rooms) in the developer Dashboard.
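
If you prefer to script room creation rather than use the Dashboard, here is a minimal sketch against Daily's REST API, assuming your API key is exported as `DAILY_API_KEY` (see the room-creation docs linked above for the full set of options):

```shell
# Create a Daily room named "pipecat-demo" (the name is arbitrary).
curl -X POST https://api.daily.co/v1/rooms \
  -H "Authorization: Bearer $DAILY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "pipecat-demo"}'
```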

## What is VAD?

Voice Activity Detection &mdash; very important for knowing when a user has finished speaking to your bot. If you are not using push-to-talk and want Pipecat to detect when the user has finished talking, VAD is essential for a natural-feeling conversation.

Pipecat makes use of WebRTC VAD by default when using a WebRTC transport layer. Optionally, you can use Silero VAD for improved accuracy at the cost of higher CPU usage.

```shell
pip install "pipecat-ai[silero]"
```

The first time you run your bot with Silero, startup may take a while whilst it downloads and caches the model in the background. You can check the progress of this in the console.
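
As a sketch, enabling Silero VAD on the Daily transport looks roughly like this. The `pipecat.vad.silero` import path and the `vad_enabled`/`vad_analyzer` parameter names are assumptions for this release; check the source tree if they have moved:

```python
# A sketch of enabling Silero VAD on the Daily transport.
# Import path and parameter names are assumptions for this release.
from pipecat.vad.silero import SileroVADAnalyzer
from pipecat.transports.services.daily import DailyParams, DailyTransport

transport = DailyTransport(
    room_url=...,
    token=...,
    bot_name="Bot Name",
    params=DailyParams(
        audio_in_enabled=True,    # receive user audio so VAD has input
        audio_out_enabled=True,
        vad_enabled=True,
        vad_analyzer=SileroVADAnalyzer(),
    ),
)
```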


## Hacking on the framework itself

_Note that you may need to set up a virtual environment before following the instructions below. For instance, you might need to run the following from the root of the repo:_

```shell
python3 -m venv venv
source venv/bin/activate
```

From the root of this repo, run the following:

```shell
pip install -r dev-requirements.txt -r {env}-requirements.txt
python -m build
```

This builds the package. To use the package locally (e.g. to run sample files), run:

```shell
pip install --editable .
```

If you want to use this package from another directory, you can run:

```shell
pip install path_to_this_repo
```

### Running tests

From the root directory, run:

```shell
pytest --doctest-modules --ignore-glob="*to_be_updated*" src tests
```

## Setting up your editor

This project uses strict [PEP 8](https://peps.python.org/pep-0008/) formatting.

### Emacs

You can use [use-package](https://github.com/jwiegley/use-package) to install the [py-autopep8](https://codeberg.org/ideasman42/emacs-py-autopep8) package and configure `autopep8` arguments:

```elisp
(use-package py-autopep8
  :ensure t
  :defer t
  :hook ((python-mode . py-autopep8-mode))
  :config
  (setq py-autopep8-options '("-a" "-a" "--max-line-length=100")))
```

`autopep8` is installed into the `venv` environment described above, so you should be able to use [pyvenv-auto](https://github.com/ryotaro612/pyvenv-auto) to automatically load that environment inside Emacs.

```elisp
(use-package pyvenv-auto
  :ensure t
  :defer t
  :hook ((python-mode . pyvenv-auto-run)))
```

### Visual Studio Code

Install the [autopep8](https://marketplace.visualstudio.com/items?itemName=ms-python.autopep8) extension. Then edit the user settings (_Ctrl-Shift-P_ `Open User Settings (JSON)`) to set it as the default Python formatter, enable formatting on save, and configure `autopep8` arguments:

```json
"[python]": {
    "editor.defaultFormatter": "ms-python.autopep8",
    "editor.formatOnSave": true
},
"autopep8.args": [
    "-a",
    "-a",
    "--max-line-length=100"
],
```

## Getting help

➡️ [Join our Discord](https://discord.gg/pipecat)

➡️ [Reach us on X](https://x.com/pipecat_ai)

            
