stream-translator-gpt 2024.4.24 (PyPI)

- Summary: Command line tool to transcribe & translate audio from livestreams in real time
- Uploaded: 2024-04-24 15:27:44
- Requires Python: >=3.8
- Homepage: https://github.com/ionic-bond/stream-translator-gpt
- Keywords: translator, translation, translate, transcribe, yt-dlp, vad, whisper, faster-whisper, whisper-api, gpt, gemini
- Requirements: numpy, scipy, yt-dlp, ffmpeg-python, sounddevice, openai-whisper, faster-whisper, openai, google-generativeai

# stream-translator-gpt

Command line utility to transcribe or translate audio from livestreams in real time. Uses [yt-dlp](https://github.com/yt-dlp/yt-dlp) to 
get livestream URLs from various services and [Whisper](https://github.com/openai/whisper) / [Faster-Whisper](https://github.com/SYSTRAN/faster-whisper) for transcription.

This fork optimizes the audio slicing logic using [VAD](https://github.com/snakers4/silero-vad),
introduces the [GPT API](https://platform.openai.com/api-keys) / [Gemini API](https://aistudio.google.com/app/apikey) to support translation into languages other than English, and adds support for input from audio devices.
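
Conceptually, the VAD is used to locate speech segments so that the audio can be cut at pauses instead of at fixed intervals. The sketch below shows the general idea with silero-vad; it is illustrative only, not this project's actual slicing code, and the input file path is a placeholder.

```
# Minimal sketch of VAD-based slicing with silero-vad (illustrative only).
import torch

# Load the Silero VAD model and its helper functions from torch.hub.
model, utils = torch.hub.load(repo_or_dir='snakers4/silero-vad', model='silero_vad')
get_speech_timestamps, _, read_audio, _, _ = utils

SAMPLING_RATE = 16000
audio = read_audio('chunk.wav', sampling_rate=SAMPLING_RATE)  # placeholder file

# Speech segments are returned in samples; the gaps between them are natural
# points at which to slice the stream before feeding it to Whisper.
for segment in get_speech_timestamps(audio, model, sampling_rate=SAMPLING_RATE):
    print(f"speech {segment['start'] / SAMPLING_RATE:.2f}s -> {segment['end'] / SAMPLING_RATE:.2f}s")
```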

Try it on Colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ionic-bond/stream-translator-gpt/blob/main/stream_translator.ipynb)

## Prerequisites

**Linux or Windows:**

1. Python >= 3.8 (3.10 or later recommended)
2. [**Install CUDA 11 on your system.**](https://developer.nvidia.com/cuda-11-8-0-download-archive) (Faster-Whisper is not yet compatible with CUDA 12.)
3. [**Install cuDNN into your CUDA directory**](https://developer.nvidia.com/cuda-downloads) if you want to use **Faster-Whisper**.
4. [**Install PyTorch (with CUDA) in your Python environment.**](https://pytorch.org/get-started/locally/)
5. [**Create a Google API key**](https://aistudio.google.com/app/apikey) if you want to use the **Gemini API** for translation. (Recommended; the free tier allows 60 requests per minute.)
6. [**Create an OpenAI API key**](https://platform.openai.com/api-keys) if you want to use the **Whisper API** for transcription or the **GPT API** for translation.

**If you are on Windows, you also need to:**

1. [**Install and add ffmpeg to your PATH.**](https://www.thewindowsclub.com/how-to-install-ffmpeg-on-windows-10#:~:text=Click%20New%20and%20type%20the,Click%20OK%20to%20apply%20changes.)
2. Install [**yt-dlp**](https://github.com/yt-dlp/yt-dlp) and add it to your PATH.
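
A quick, optional way to confirm the setup above (PyTorch seeing CUDA, ffmpeg and yt-dlp on the PATH) is the small check below. It is not part of the tool itself, just a convenience sketch.

```
# Optional environment check: CUDA visibility and required executables on PATH.
import shutil

import torch

print('CUDA available to PyTorch:', torch.cuda.is_available())
for tool in ('ffmpeg', 'yt-dlp'):
    print(f'{tool} on PATH:', shutil.which(tool) is not None)
```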

## Installation

**Install the release version from PyPI (recommended):**

```
pip install stream-translator-gpt
stream-translator-gpt
```

or

**Clone the master branch from GitHub:**

```
git clone https://github.com/ionic-bond/stream-translator-gpt.git
pip install -r ./stream-translator-gpt/requirements.txt
python3 ./stream-translator-gpt/translator.py
```

## Usage

- Transcribe a live stream (uses **Whisper** by default):

    ```stream-translator-gpt {URL} --model large --language {input_language}```

- Transcribe with **Faster-Whisper**:

    ```stream-translator-gpt {URL} --model large --language {input_language} --use_faster_whisper```

- Transcribe with the **Whisper API**:

    ```stream-translator-gpt {URL} --language {input_language} --use_whisper_api --openai_api_key {your_openai_key}```

- Translate into another language with **Gemini**:

    ```stream-translator-gpt {URL} --model large --language ja --gpt_translation_prompt "Translate from Japanese to Chinese" --google_api_key {your_google_key}```
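
    Translation in this mode goes through the google-generativeai dependency; a request has roughly the shape sketched below. This is illustrative only (the model name and prompt are placeholders), not the tool's internal code.

    ```
    # Rough shape of a Gemini translation request via google-generativeai
    # (illustrative only; model name and prompt are placeholders).
    import google.generativeai as genai

    genai.configure(api_key='{your_google_key}')
    model = genai.GenerativeModel('gemini-pro')
    response = model.generate_content('Translate from Japanese to Chinese: こんにちは、配信を始めます。')
    print(response.text)
    ```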

- Translate into another language with **GPT**:

    ```stream-translator-gpt {URL} --model large --language ja --gpt_translation_prompt "Translate from Japanese to Chinese" --openai_api_key {your_openai_key}```

- Use the **Whisper API** and **Gemini** at the same time:

    ```stream-translator-gpt {URL} --model large --language ja --use_whisper_api --openai_api_key {your_openai_key} --gpt_translation_prompt "Translate from Japanese to Chinese" --google_api_key {your_google_key}```

- Use a local video/audio file as input:

    ```stream-translator-gpt /path/to/file --model large --language {input_language}```

- Use the computer microphone as input:

    ```stream-translator-gpt device --model large --language {input_language}```
    
    This uses the system's default audio input device.

    To use a different audio input device, run `stream-translator-gpt device --print_all_devices` to get the device index, then run the CLI with `--device_index {index}`.

    To use the audio output of another program as input, you need to [**enable stereo mix**](https://www.howtogeek.com/39532/how-to-enable-stereo-mix-in-windows-7-to-record-audio/).
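
    If you prefer to inspect devices from Python, the sounddevice package (one of this tool's dependencies) can list them; this is roughly the information `--print_all_devices` reports. The snippet below is only a convenience, not part of the tool.

    ```
    # List audio devices and the default input/output pair via sounddevice.
    import sounddevice as sd

    print(sd.query_devices())   # table of available input/output devices
    print(sd.default.device)    # (default input index, default output index)
    ```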

- Send results to Cqhttp:

    ```stream-translator-gpt {URL} --model large --language {input_language} --cqhttp_url {your_cqhttp_url} --cqhttp_token {your_cqhttp_token}```

- Send results to Discord:

    ```stream-translator-gpt {URL} --model large --language {input_language} --discord_webhook_url {your_discord_webhook_url}```

- Save results to an .srt subtitle file:

    ```stream-translator-gpt {URL} --model large --language ja --gpt_translation_prompt "Translate from Japanese to Chinese" --google_api_key {your_google_key} --hide_transcribe_result --output_timestamps --output_file_path ./result.srt```
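
    For reference, a SubRip (.srt) file consists of numbered entries, each with a millisecond-precision time range followed by the subtitle text. The entry below only illustrates the format; timings and text are placeholders, not real output.

    ```
    1
    00:00:00,000 --> 00:00:04,500
    (first translated line)

    2
    00:00:04,500 --> 00:00:08,200
    (second translated line)
    ```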

            
