minyt


- Name: minyt
- Version: 0.0.3
- Home page: https://github.com/franckalbinet/minyt
- Summary: Download audio from a YouTube video and use the Gemini LLM for cleaner and smarter transcriptions
- Upload time: 2025-07-21 08:20:43
- Author: Franck Albinet
- Requires Python: >=3.9
- License: Apache Software License 2.0
- Keywords: nbdev, jupyter, notebook, python
- Requirements: No requirements were recorded.

# minyt


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

[![PyPI
version](https://badge.fury.io/py/minyt.svg)](https://badge.fury.io/py/minyt)
[![License: Apache
2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Python
3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)

**minyt** (WIP) is a Python package that downloads the audio track of a
YouTube video and generates a transcript with Google’s Gemini AI. It
splits long audio files at natural silence points and transcribes the
resulting chunks in parallel to reduce total processing time.

## Features

- **YouTube Audio Download**: Extract audio from any YouTube video using
  `yt-dlp`
- **Smart Audio Splitting**: Automatically detect silence and split
  audio at natural break points
- **AI-Powered Transcription**: Use Google’s Gemini 2.0 Flash for
  accurate, context-aware transcriptions
- **Parallel Processing**: Process multiple audio chunks concurrently
  for faster results
- **Customizable**: Configure chunk sizes, silence detection, and
  transcription prompts
- **Clean Output**: Generate well-formatted transcripts ready for
  analysis

## Quick Start

### Installation

``` bash
pip install minyt
```

### Prerequisites

1.  **FFmpeg**: Required for audio processing

    ``` bash
    # macOS
    brew install ffmpeg

    # Ubuntu/Debian
    sudo apt update && sudo apt install ffmpeg

    # Windows
    # Download from https://ffmpeg.org/download.html
    ```

2.  **Google Gemini API Key**: Get your API key from [Google AI
    Studio](https://makersuite.google.com/app/apikey)

    ``` bash
    export GEMINI_API_KEY="your-api-key-here"
    ```
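
Both prerequisites can be checked from Python before starting a long download. The snippet below is a minimal sketch, not part of minyt: `check_prerequisites` is a hypothetical helper that only looks for an `ffmpeg` executable on `PATH` and a non-empty `GEMINI_API_KEY` variable.

```python
import os
import shutil


def check_prerequisites(env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    # FFmpeg must be discoverable on PATH for audio processing.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # The Gemini client needs an API key; blank values count as missing.
    if not env.get("GEMINI_API_KEY", "").strip():
        problems.append("GEMINI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(f"Missing prerequisite: {problem}")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.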

### Basic Usage

``` python
import asyncio
from pathlib import Path
from minyt.core import *

# Download audio from a YouTube video
video_id = "dQw4w9WgXcQ"  # Replace with your video ID
audio_file = download_audio(video_id, Path("_audio"))

# Detect silence and find optimal split points
_, silence_data = detect_silence(audio_file)
silence_ends = parse_silence_ends(silence_data)
total_duration = get_audio_duration(audio_file)
split_points = find_split_points(silence_ends, total_duration, chunk_len=600)

# Split audio into manageable chunks
chunks = split_audio(audio_file, split_points, dest_dir="_audio_chunks")

# Transcribe all chunks using Gemini AI
async def main():
    transcript = await transcribe_audio(
        chunks_dir="_audio_chunks",
        dest_file="_transcripts/transcript.txt",
        prompt="Please transcribe this audio file verbatim, maintaining speaker clarity and context."
    )
    print("Transcript saved to: _transcripts/transcript.txt")

asyncio.run(main())
```

## Detailed Usage

### Step 1: Download YouTube Audio

``` python
from minyt.core import download_audio
from pathlib import Path

# Download audio from a YouTube video
video_id = "your-video-id-here"
audio_file = download_audio(video_id, Path("downloads"))
print(f"Audio downloaded to: {audio_file}")
```
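
`download_audio` is shown here with a bare video ID. If you start from a full URL, the ID has to be extracted first. The following is a rough sketch using only the standard library; `extract_video_id` is a hypothetical helper (not provided by minyt) and handles only the most common URL shapes.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url_or_id: str) -> str:
    """Return the video ID from a common YouTube URL, or the input unchanged."""
    parsed = urlparse(url_or_id)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0]
    if host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            # Standard watch URLs carry the ID in the "v" query parameter.
            ids = parse_qs(parsed.query).get("v", [])
            if ids:
                return ids[0]
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0]
    # Not a recognised URL: assume the caller already passed a bare ID.
    return url_or_id
```

The result can then be passed to `download_audio` in place of the hard-coded `video_id`.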

### Step 2: Process Audio with Smart Splitting

``` python
from minyt.core import detect_silence, parse_silence_ends, get_audio_duration, find_split_points, split_audio

# Detect silence in the audio file
_, silence_data = detect_silence(audio_file)

# Parse silence end points
silence_ends = parse_silence_ends(silence_data)

# Find optimal split points (aiming for 10-minute chunks)
total_duration = get_audio_duration(audio_file)
split_points = find_split_points(silence_ends, total_duration, chunk_len=600)

# Split audio into chunks
chunks = split_audio(audio_file, split_points, dest_dir="audio_chunks")
print(f"Created {len(chunks)} audio chunks")
```
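
To make the idea behind silence-based splitting concrete, here is a simplified, standalone sketch. It is not minyt's implementation: it assumes the silence data is log text in the style of FFmpeg's `silencedetect` filter (lines containing `silence_end: <seconds>`), and `silence_ends_from_log` and `pick_split_points` are hypothetical names chosen for illustration. The greedy rule used here picks, for each target time, the closest silence end that lies after the previous cut.

```python
import re

# Matches the "silence_end: 12.5" part of a silencedetect-style log line.
SILENCE_END_RE = re.compile(r"silence_end:\s*([0-9]+(?:\.[0-9]+)?)")


def silence_ends_from_log(log_text):
    """Extract silence end times (in seconds) from silencedetect-style text."""
    return [float(m.group(1)) for m in SILENCE_END_RE.finditer(log_text)]


def pick_split_points(silence_ends, total_duration, chunk_len=600):
    """Greedily choose cut times close to every chunk_len seconds.

    Each cut is the silence end nearest to the current target that lies
    strictly after the previous cut and before the end of the audio.
    """
    points = []
    last = 0.0
    target = chunk_len
    while target < total_duration:
        candidates = [t for t in silence_ends if last < t < total_duration]
        if not candidates:
            break  # No silence left to cut at; keep the remainder whole.
        best = min(candidates, key=lambda t: abs(t - target))
        points.append(best)
        last = best
        target = best + chunk_len
    return points
```

Cutting at a silence end rather than at a fixed offset avoids splitting a word or sentence across two chunks, at the cost of chunks that are only approximately `chunk_len` seconds long.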

### Step 3: Transcribe with Gemini AI

``` python
import asyncio
from minyt.core import transcribe_audio

async def transcribe_video():
    transcript = await transcribe_audio(
        chunks_dir="audio_chunks",
        dest_file="transcripts/final_transcript.txt",
        model="gemini-2.0-flash-001",  # Default model
        max_concurrent=3,  # Process 3 chunks simultaneously
        prompt="Please transcribe this audio accurately, preserving speaker names and technical terms."
    )
    return transcript

# Run transcription
transcript = asyncio.run(transcribe_video())
print("Transcription completed!")
```
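
The `max_concurrent` setting caps how many chunks are in flight at once. A common way to get that behaviour with `asyncio` is a semaphore, sketched below. This is an illustration of the general pattern under that assumption, not a description of how minyt implements it; `run_bounded` and `fake_transcribe` are hypothetical names, and the worker only simulates an API call.

```python
import asyncio


async def run_bounded(items, worker, max_concurrent=3):
    """Run worker(item) for each item, at most max_concurrent at a time.

    Results are returned in the same order as the input items.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item):
        # Only max_concurrent coroutines can hold the semaphore at once.
        async with semaphore:
            return await worker(item)

    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_transcribe(chunk_name):
    # Stand-in for a real API call: wait briefly, return placeholder text.
    await asyncio.sleep(0.01)
    return f"text for {chunk_name}"


if __name__ == "__main__":
    chunks = ["chunk_000.mp3", "chunk_001.mp3", "chunk_002.mp3"]
    print(asyncio.run(run_bounded(chunks, fake_transcribe, max_concurrent=2)))
```

Because `asyncio.gather` preserves input order, the partial transcripts can be concatenated directly even though chunks finish at different times.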

## Configuration

### Environment Variables

``` bash
# Required
export GEMINI_API_KEY="your-gemini-api-key"

# Optional: Configure logging level
export LOG_LEVEL="INFO"
```
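
How minyt reads `LOG_LEVEL` internally is not documented here. If you want your own script to honour the same variable, one possible approach with the standard `logging` module is sketched below; `resolve_log_level` is a hypothetical helper, and falling back to `INFO` for unknown values is an assumption.

```python
import logging
import os


def resolve_log_level(env=None, default="INFO"):
    """Map a LOG_LEVEL-style value to a numeric logging level."""
    env = os.environ if env is None else env
    name = env.get("LOG_LEVEL", default).strip().upper()
    level = logging.getLevelName(name)
    # getLevelName returns an int for known names and a string otherwise.
    return level if isinstance(level, int) else logging.INFO


logging.basicConfig(level=resolve_log_level())
logging.getLogger(__name__).info("Logging configured")
```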

### Customization Options

``` python
# Silence detection (defaults: -30 dB threshold, 0.5 s duration)
_, silence_data = detect_silence(audio_file)

# Custom chunk size (in seconds)
split_points = find_split_points(silence_ends, total_duration, chunk_len=300)  # 5-minute chunks

# Custom transcription settings
# Note: `await` must run inside an async function (or a notebook cell)
transcript = await transcribe_audio(
    chunks_dir="chunks",
    dest_file="output.txt",
    model="gemini-2.0-flash-001",  # Gemini model name (this is the default)
    max_concurrent=5,  # Transcribe up to 5 chunks at once
    prompt="Custom transcription instructions here..."
)
```

## Development

### Install in Development Mode

``` bash
# Clone the repository
git clone https://github.com/franckalbinet/minyt.git
cd minyt

# Install in development mode
pip install -e .

# Make changes in the nbs/ directory
# ...

# Compile changes to apply to minyt package
nbdev_prepare
```

### Dependencies

- `fastcore`: Core utilities
- `google-genai`: Google Gemini AI client
- `yt-dlp`: YouTube video downloader
- `ffmpeg-python`: Audio processing
- `tqdm`: Progress bars
- `rich`: Enhanced console output

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
For major changes, please open an issue first to discuss what you would
like to change.

## License

This project is licensed under the Apache License 2.0 - see the
[LICENSE](LICENSE) file for details.

## Acknowledgments

- [yt-dlp](https://github.com/yt-dlp/yt-dlp) for YouTube video
  downloading
- [Google Gemini](https://ai.google.dev/) for AI-powered transcription
- [FFmpeg](https://ffmpeg.org/) for audio processing capabilities

## Support

If you encounter any issues or have questions:

1.  Check the [documentation](https://franckalbinet.github.io/minyt/)
2.  Open an [issue](https://github.com/franckalbinet/minyt/issues)
3.  Contact the maintainer: franckalbinet@gmail.com

------------------------------------------------------------------------

            
