Name | pytranscript |
Version | 0.3.0 |
home_page | None |
Summary | CLI to transcript and translate audio and video files |
upload_time | 2024-11-11 14:42:21 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.12 |
license | MIT License Copyright (c) [2024] [arnaud-ma] Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
keywords | transcript, translation, audio, video, subtitles |
requirements | No requirements were recorded. |
# Pytranscript 🎙️
Pytranscript is a powerful Python library and command-line tool designed to seamlessly convert video or audio files into text and translate them into various languages. It acts as a simple yet effective wrapper around [Vosk](https://alphacephei.com/vosk/), [ffmpeg](https://ffmpeg.org/), and [deep-translator](https://pypi.org/project/deep-translator/), making the transcription and translation process straightforward.
## Prerequisites
Before using pytranscript, ensure you have the following dependencies installed:
- [ffmpeg](https://ffmpeg.org/download.html) for audio conversion.
- [vosk-models](https://alphacephei.com/vosk/models) required for speech recognition. You must pass the path to your specific model via the `--model` argument.
## Installation
```bash
pip install pytranscript
```
## Usage
### Command Line
```bash
pytranscript INPUT_FILE [OPTIONS]
```
### Options
- `-m, --model` - Path to the Vosk model directory. Always required.
- `-o, --output` - Output file where the text will be saved. Default: input file name with `.txt` extension.
- `-f, --format` - Format of the transcript. Must be one of 'csv', 'json', 'srt', 'txt', 'vtt' or 'all'. Default: input file extension.
- `-li, --lang_input` - Language of the input / the model. Default: auto.
- `-lo, --lang_output` - Language to translate the text to. Default: no translation.
- `-s, --start` - Start time of the audio to transcribe in seconds.
- `-e, --end` - End time of the audio to transcribe in seconds.
- `--max_size` - Stops the transcription once the output file reaches the specified size in bytes. Takes precedence over the `--end` option.
- `--keep-wav` - Keep the converted audio wav file after the process is done.
- `-v, --verbosity` - Verbosity level. 0: no output; 1: errors only; 2: errors, info, and progress bar; 3: debug. Default: 2.
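The default output naming described above (the input file name with the format's extension) can be sketched with `pathlib`. Note that `default_output` is a hypothetical helper written for illustration, not part of pytranscript's API:

```python
from pathlib import Path

def default_output(input_file: str, fmt: str = "txt") -> str:
    """Replace the input file's extension with the transcript format's extension."""
    return str(Path(input_file).with_suffix(f".{fmt}"))

default_output("video.mp4")         # "video.txt"
default_output("talk.mp4", "srt")   # "talk.srt"
```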
## Example
The most basic usage is:
```bash
pytranscript video.mp4 -m vosk-model-en-us-aspire-0.2 -lo fr -f srt
```
Where `vosk-model-en-us-aspire-0.2` is the Vosk model directory. The text will be translated from English to French, and the output will be saved in `video.srt`.
The `--keep-wav` option is useful when you want to run several transcriptions of the same file: the converted `.wav` file can be reused for each run, saving conversion time.
⚠️ The `.wav` file is cropped according to the `--start` and `--end` options.
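The conversion-and-crop step can be pictured as an ffmpeg invocation. The sketch below is an assumption about how such a command might be assembled (the helper `build_ffmpeg_cmd` is hypothetical, not pytranscript's actual code); Vosk recognizers expect 16 kHz mono WAV input, which is what the `-ar`/`-ac` flags request:

```python
def build_ffmpeg_cmd(src, dst, start=0.0, end=None):
    """Build an ffmpeg command converting `src` to a 16 kHz mono WAV,
    cropped to [start, end] seconds -- the input format Vosk expects."""
    cmd = ["ffmpeg", "-y", "-i", src, "-ss", str(start)]
    if end is not None:
        cmd += ["-to", str(end)]  # only crop the tail when an end time is given
    cmd += ["-ar", "16000", "-ac", "1", dst]
    return cmd
```

This mirrors why the cropping warning above matters: the crop happens at conversion time, so a kept `.wav` only covers the requested time window.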
### API
The API provides a `Transcript` object containing the timestamps and text. Its `translate` method returns another `Transcript` object with the translated text. The file output produced by the CLI simply calls the corresponding `to_{format}` method of the `Transcript` object.
A reproduction of the previous example using the API:
```python
import pytranscript as pt
wav_file = pt.to_valid_wav("video.mp4", "video.wav", start=0, end=None)
transcript = pt.transcribe(wav_file, model="vosk-model-en-us-aspire-0.2", max_size=None)
transcript_fr, errors = transcript.translate("fr")
transcript_fr.write("video.srt")
```
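To make the `to_{format}` idea concrete, here is a minimal sketch of how an SRT renderer could turn `(start, end, text)` segments (times in seconds) into SRT blocks. The function names and segment shape are illustrative assumptions, not pytranscript's actual internals; the timestamp format itself follows the SRT convention (`HH:MM:SS,mmm` with a comma before the milliseconds):

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """Render an iterable of (start_seconds, end_seconds, text) as SRT."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)

print(to_srt([(0.0, 1.25, "Bonjour")]))
# 1
# 00:00:00,000 --> 00:00:01,250
# Bonjour
```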
## Contributing
Contributions are welcome! For major changes, please open an issue first to discuss what you would like to change.
Tests can be run with `pytest`. Run `ruff format .` to format the code before committing.
Raw data
{
"_id": null,
"home_page": null,
"name": "pytranscript",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.12",
"maintainer_email": null,
"keywords": "transcript, translation, audio, video, subtitles",
"author": null,
"author_email": "arnaud-ma <arnaudma.code@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/e5/e5/8b1a728050e3c455b609b64a61bf81a819a9ca4c96d5c9b7f36ef2988d46/pytranscript-0.3.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License Copyright (c) [2024] [arnaud-ma] Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
"summary": "CLI to transcript and translate audio and video files",
"version": "0.3.0",
"project_urls": {
"Documentation": "https://github.com/arnaud-ma/pytranscript#readme",
"Issues": "https://github.com/arnaud-ma/pytranscript/issues",
"Repository": "https://github.com/arnaud-ma/pytranscript",
"Source": "https://github.com/arnaud-ma/pytranscript"
},
"split_keywords": [
"transcript",
" translation",
" audio",
" video",
" subtitles"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "bb3c6d4e90f973b6cbb862b49a93f4f6d351804070d4a26e367cf54f5496af81",
"md5": "f1997202c5c779c751dd79cc8db7644b",
"sha256": "c8bca0ad0a94b530ad7275d7073b40a8c3c643f7f6b6b98c371f5c39689aa163"
},
"downloads": -1,
"filename": "pytranscript-0.3.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f1997202c5c779c751dd79cc8db7644b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.12",
"size": 10050,
"upload_time": "2024-11-11T14:42:19",
"upload_time_iso_8601": "2024-11-11T14:42:19.223163Z",
"url": "https://files.pythonhosted.org/packages/bb/3c/6d4e90f973b6cbb862b49a93f4f6d351804070d4a26e367cf54f5496af81/pytranscript-0.3.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e5e58b1a728050e3c455b609b64a61bf81a819a9ca4c96d5c9b7f36ef2988d46",
"md5": "d59a3eb30d52751640b2d90946d78649",
"sha256": "cb1042521411a2b528e03cc8fe51c29f2ceb2fd7dec774dbd30c5e13bd445f97"
},
"downloads": -1,
"filename": "pytranscript-0.3.0.tar.gz",
"has_sig": false,
"md5_digest": "d59a3eb30d52751640b2d90946d78649",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.12",
"size": 10330,
"upload_time": "2024-11-11T14:42:21",
"upload_time_iso_8601": "2024-11-11T14:42:21.664979Z",
"url": "https://files.pythonhosted.org/packages/e5/e5/8b1a728050e3c455b609b64a61bf81a819a9ca4c96d5c9b7f36ef2988d46/pytranscript-0.3.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-11 14:42:21",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "arnaud-ma",
"github_project": "pytranscript#readme",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "pytranscript"
}