easytranscribe


- Name: easytranscribe
- Version: 0.1.2
- Home page: https://github.com/akhshyganesh/easytranscribe
- Summary: Easy speech-to-text transcription from audio files or live microphone input using Whisper.
- Upload time: 2025-07-13 16:35:14
- Maintainer: None
- Docs URL: None
- Author: akhshyganesh
- Requires Python: >=3.8
- License: MIT License. Copyright (c) 2025 Akhshy Ganesh. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- Keywords: speech-to-text, whisper, transcription, audio, ai
- Requirements: openai-whisper, sounddevice, numpy
- Travis CI: none
- Coveralls test coverage: none

# EasyTranscribe

A simple Python-based voice assistant that captures speech from your microphone or from a recorded file, detects silence, and transcribes the spoken words to text using OpenAI Whisper. It is easy to extend for integration with LLMs such as Ollama or Gemma.

## Features

- Real-time microphone audio capture
- Automatic silence detection and recording stop
- Speech-to-text transcription using Whisper
- Comprehensive transcription logging with detailed metrics
- Easy integration with other AI models

## Installation

1. Clone the repository:
   ```bash
   git clone https://github.com/akhshyganesh/easytranscribe.git
   cd easytranscribe
   ```

2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

## Usage

Run the main script:
```bash
python main.py
```
Speak into your microphone. The assistant will automatically stop recording after a few seconds of silence and transcribe your speech.

# easytranscribe

[![PyPI version](https://badge.fury.io/py/easytranscribe.svg)](https://badge.fury.io/py/easytranscribe)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

Easy speech-to-text transcription from audio files or live microphone input using OpenAI's Whisper.

## ✨ Features

- 🎤 **Live microphone transcription** with automatic silence detection
- 📁 **Audio file transcription** supporting multiple formats
- 📊 **Automatic logging** with timestamps and performance metrics
- 🔧 **Simple CLI interface** for quick usage
- 🐍 **Easy Python API** for integration into your projects
- 📈 **Log analysis tools** to view transcription history and statistics

## 🚀 Quick Start

### Installation

```bash
pip install easytranscribe
```

### Python API

**Live microphone transcription:**
```python
from easytranscribe import capture_and_transcribe

# Start live transcription (speak, then pause; recording stops on silence)
text = capture_and_transcribe(model_name="base")
print(f"You said: {text}")
```

**Audio file transcription:**
```python
from easytranscribe import transcribe_audio_file

# Transcribe an audio file
text = transcribe_audio_file("path/to/audio.wav", model_name="base")
print(f"Transcription: {text}")
```
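
To transcribe several files in one go, the file API above can be wrapped in a small loop. The following is a minimal sketch, not part of the package: `transcribe_directory` and `AUDIO_EXTENSIONS` are hypothetical names, and the extension list is an illustrative guess rather than the set of formats the package supports. The transcription function is passed in as an argument so the helper can be tried with a stub before Whisper is installed.

```python
from pathlib import Path
from typing import Callable, Dict

# Extensions treated as audio here; an illustrative assumption,
# not a list taken from the package.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac"}


def transcribe_directory(
    directory: str,
    transcribe: Callable[..., str],
    model_name: str = "base",
) -> Dict[str, str]:
    """Transcribe every audio file directly inside `directory`.

    `transcribe` is expected to accept (path, model_name=...) like
    `transcribe_audio_file` in the example above.
    Returns a mapping of file name to transcribed text.
    """
    results: Dict[str, str] = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            results[path.name] = transcribe(str(path), model_name=model_name)
    return results


# Usage with the real package (requires easytranscribe and Whisper):
#   from easytranscribe import transcribe_audio_file
#   texts = transcribe_directory("recordings", transcribe_audio_file)
```

Each call may reload or reuse the Whisper model depending on how the package caches it, so treat this as a convenience wrapper rather than an optimized batch path.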

**View transcription logs:**
```python
from easytranscribe import view_logs

# View today's logs with statistics
logs = view_logs(date="today", stats=True)
print(f"Total entries: {logs['total_count']}")
```

### Command Line Interface

**Live transcription:**
```bash
easytranscribe live --model base
```

**File transcription:**
```bash
easytranscribe file path/to/audio.wav --model base
```

**View logs:**
```bash
# View today's logs
easytranscribe logs --date today --stats

# View last 10 entries
easytranscribe logs --tail 10

# List available log dates
easytranscribe logs --list-dates
```
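
If you prefer to drive the CLI from a script rather than import the Python API, the `file` subcommand shown above can be called through `subprocess`. This is a sketch under assumptions: `build_file_command` and `run_file_transcription` are hypothetical helpers, and the README does not specify what the CLI prints, so the captured output may contain more than the transcript.

```python
import subprocess
from typing import List


def build_file_command(audio_path: str, model: str = "base") -> List[str]:
    """Build the argument list for `easytranscribe file <path> --model <model>`."""
    return ["easytranscribe", "file", audio_path, "--model", model]


def run_file_transcription(audio_path: str, model: str = "base") -> str:
    """Run the CLI and return whatever it writes to stdout.

    Assumes the `easytranscribe` executable is on PATH.
    """
    completed = subprocess.run(
        build_file_command(audio_path, model),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout.strip()
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.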

## 📋 Available Whisper Models

| Model | Parameters | Speed | Accuracy | Use Case |
|-------|------------|-------|----------|----------|
| `tiny` | 39 M | Fastest | Good | Real-time, low resource |
| `base` | 74 M | Fast | Better | Balanced performance |
| `small` | 244 M | Medium | Higher | Higher accuracy |
| `medium` | 769 M | Slow | Very Good | Professional use |
| `large` | 1550 M | Slowest | Best | Maximum accuracy |
| `turbo` | 809 M | Fast | Excellent | Best balance (default) |
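
One way to use this table in code is to pick the largest model that fits a size budget. The sketch below simply copies the second-column figures from the table; `MODEL_SIZES` and `largest_model_within` are hypothetical names and not part of the package.

```python
# Second-column figures from the table above, keyed by model name.
MODEL_SIZES = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "medium": 769,
    "turbo": 809,
    "large": 1550,
}


def largest_model_within(budget: int) -> str:
    """Return the name of the largest model whose figure is <= budget."""
    candidates = [(size, name) for name, size in MODEL_SIZES.items() if size <= budget]
    if not candidates:
        raise ValueError(f"no model fits within a budget of {budget}")
    # max() compares the size first, so the largest fitting model wins.
    return max(candidates)[1]
```

The result can then be passed as `model_name` to the API calls or as `--model` to the CLI.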

## 🔧 Configuration

### Audio Settings (Live Recording)

The package automatically handles:
- ✅ Silence detection (3 seconds of silence stops recording)
- ✅ Minimum recording time (2 seconds)
- ✅ Audio level monitoring
- ✅ Automatic microphone input
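
The stop rule described above (3 seconds of silence, after at least 2 seconds of recording) can be illustrated on an in-memory signal. This is a minimal sketch, not the package's implementation: the RMS threshold and the 0.1-second analysis window are assumptions, since the README does not state them, and the real code may, for example, wait for speech before it starts counting silence.

```python
from typing import Optional

import numpy as np

SILENCE_SECONDS = 3.0        # from the list above
MIN_RECORDING_SECONDS = 2.0  # from the list above
SILENCE_THRESHOLD = 0.01     # assumed RMS level; not stated in the README
CHUNK_SECONDS = 0.1          # assumed analysis window


def find_stop_time(samples: np.ndarray, sample_rate: int) -> Optional[float]:
    """Return the time (in seconds) at which recording would stop, or None.

    `samples` is a 1-D float array. The signal is scanned in fixed-size
    chunks; a chunk counts as silent when its RMS is below the threshold.
    """
    chunk = max(1, round(sample_rate * CHUNK_SECONDS))
    needed_silent_chunks = round(SILENCE_SECONDS / CHUNK_SECONDS)
    silent_chunks = 0
    for start in range(0, len(samples) - chunk + 1, chunk):
        window = samples[start:start + chunk]
        rms = float(np.sqrt(np.mean(np.square(window))))
        # Count consecutive silent chunks; any loud chunk resets the run.
        silent_chunks = silent_chunks + 1 if rms < SILENCE_THRESHOLD else 0
        elapsed = (start + chunk) / sample_rate
        if elapsed >= MIN_RECORDING_SECONDS and silent_chunks >= needed_silent_chunks:
            return elapsed
    return None
```

For a signal with one second of sound followed by silence, the function reports a stop three seconds after the sound ends. In a live setting the same check would run on chunks as they arrive from the microphone rather than on a finished array.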

### Logging

Transcriptions are automatically logged to `logs/transcription_YYYY-MM-DD.log` with:
- 📅 Timestamp
- 🤖 Model used
- ⏱️ Processing time
- 🎵 Audio duration (for live recording)
- 📝 Transcribed text
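
The file naming scheme above maps directly onto a date. The sketch below shows how such a path could be derived and what one entry might look like; `log_path_for` and `format_entry` are hypothetical helpers, and the pipe-separated line layout is an assumption for illustration, since the README lists the fields but not their on-disk format.

```python
from datetime import date, datetime
from pathlib import Path
from typing import Optional


def log_path_for(day: date, log_dir: str = "logs") -> Path:
    """Return logs/transcription_YYYY-MM-DD.log for the given day."""
    return Path(log_dir) / f"transcription_{day.isoformat()}.log"


def format_entry(
    timestamp: datetime,
    model: str,
    processing_seconds: float,
    text: str,
    audio_seconds: Optional[float] = None,
) -> str:
    """Format one log line (hypothetical layout, not the package's own)."""
    fields = [
        timestamp.isoformat(timespec="seconds"),
        f"model={model}",
        f"processing={processing_seconds:.2f}s",
    ]
    # Audio duration is only recorded for live recordings.
    if audio_seconds is not None:
        fields.append(f"audio={audio_seconds:.2f}s")
    fields.append(f"text={text}")
    return " | ".join(fields)
```

To inspect real entries, use the `view_logs` function or the `easytranscribe logs` command rather than parsing the files with an assumed layout.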

## 🛠️ Development

### Install from Source

```bash
git clone https://github.com/akhshyganesh/easytranscribe.git
cd easytranscribe
pip install -e .
```

### Run Tests

```bash
python test/test_integration.py
```

## 📄 Requirements

- Python 3.8+
- OpenAI Whisper
- sounddevice (for microphone input)
- numpy

## 📖 Documentation

For comprehensive documentation, examples, and API reference, visit:

**🌐 [EasyTranscribe Documentation](https://akhshyganesh.github.io/easytranscribe/)**

The documentation includes:
- 🚀 [Quick Start Guide](https://akhshyganesh.github.io/easytranscribe/quickstart/)
- 💻 [CLI Usage](https://akhshyganesh.github.io/easytranscribe/cli/)
- 🐍 [Python API](https://akhshyganesh.github.io/easytranscribe/api/)
- 📝 [Examples](https://akhshyganesh.github.io/easytranscribe/examples/)
- ⚙️ [Configuration](https://akhshyganesh.github.io/easytranscribe/configuration/)
- 🔧 [Advanced Usage](https://akhshyganesh.github.io/easytranscribe/advanced/)

## 🤝 Contributing

Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## 📜 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [OpenAI Whisper](https://github.com/openai/whisper) for the amazing speech recognition model
- [sounddevice](https://github.com/spatialaudio/python-sounddevice) for microphone input handling

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/akhshyganesh/easytranscribe",
    "name": "easytranscribe",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "speech-to-text, whisper, transcription, audio, ai",
    "author": "akhshyganesh",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/ee/75/138cad053714aeb83d4df60287c6ace86e5b59705e0dd3b2123c0635c122/easytranscribe-0.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License\n        \n        Copyright (c) 2025 Akhshy Ganesh\n        \n        Permission is hereby granted, free of charge, to any person obtaining a copy\n        of this software and associated documentation files (the \"Software\"), to deal\n        in the Software without restriction, including without limitation the rights\n        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n        copies of the Software, and to permit persons to whom the Software is\n        furnished to do so, subject to the following conditions:\n        \n        The above copyright notice and this permission notice shall be included in all\n        copies or substantial portions of the Software.\n        \n        THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\n        SOFTWARE.\n        ",
    "summary": "Easy speech-to-text transcription from audio files or live microphone input using Whisper.",
    "version": "0.1.2",
    "project_urls": {
        "Bug Reports": "https://github.com/akhshyganesh/easytranscribe/issues",
        "Homepage": "https://github.com/akhshyganesh/easytranscribe",
        "Source Code": "https://github.com/akhshyganesh/easytranscribe"
    },
    "split_keywords": [
        "speech-to-text",
        " whisper",
        " transcription",
        " audio",
        " ai"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "edf944f85d0178f9ec05cc8d909a4b77e03fb9bc7787c3533ed16a54adf21376",
                "md5": "c29fadd726137a960dbc5a2bb06275a4",
                "sha256": "a3572e28b159fafaea66d106f7a3749c87d01920853ff66a30dcb143f1ba7165"
            },
            "downloads": -1,
            "filename": "easytranscribe-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "c29fadd726137a960dbc5a2bb06275a4",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 11936,
            "upload_time": "2025-07-13T16:35:12",
            "upload_time_iso_8601": "2025-07-13T16:35:12.787126Z",
            "url": "https://files.pythonhosted.org/packages/ed/f9/44f85d0178f9ec05cc8d909a4b77e03fb9bc7787c3533ed16a54adf21376/easytranscribe-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "ee75138cad053714aeb83d4df60287c6ace86e5b59705e0dd3b2123c0635c122",
                "md5": "695e88ccca188393bceea237dceac885",
                "sha256": "73d6677d46cfacb8bfac4147faca0c882753492e1194678a621939d6cb0eb30f"
            },
            "downloads": -1,
            "filename": "easytranscribe-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "695e88ccca188393bceea237dceac885",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 57629,
            "upload_time": "2025-07-13T16:35:14",
            "upload_time_iso_8601": "2025-07-13T16:35:14.563111Z",
            "url": "https://files.pythonhosted.org/packages/ee/75/138cad053714aeb83d4df60287c6ace86e5b59705e0dd3b2123c0635c122/easytranscribe-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-13 16:35:14",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "akhshyganesh",
    "github_project": "easytranscribe",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "openai-whisper",
            "specs": [
                [
                    ">=",
                    "20240930"
                ]
            ]
        },
        {
            "name": "sounddevice",
            "specs": [
                [
                    ">=",
                    "0.4.6"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.21.0"
                ],
                [
                    "<",
                    "2.3"
                ]
            ]
        }
    ],
    "lcname": "easytranscribe"
}
        