# echolex

- **Name:** echolex
- **Version:** 0.1.2
- **Home page:** https://github.com/ramonfigueiredo/echolex
- **Summary:** A CLI tool for audio transcription using OpenAI's Whisper model
- **Upload time:** 2025-10-07 05:37:36
- **Author:** Ramon Figueiredo
- **Requires Python:** >=3.8
- **License:** Apache-2.0
- **Keywords:** whisper, audio, transcription, speech-to-text, stt, openai, ai, cli
- **Requirements:** openai-whisper, ffmpeg-python, certifi, torch, tqdm, numpy, setuptools, openai, google-generativeai, pytest, pytest-cov
<p align="center">
  <img src="images/echolex_log.png" alt="EchoLex Logo" width="250">
</p>

[![PyPI version](https://badge.fury.io/py/echolex.svg)](https://badge.fury.io/py/echolex)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

EchoLex is a CLI tool that transcribes audio to text using OpenAI's Whisper speech-to-text model.

The name “EchoLex” combines “Echo” — the voice or sound we capture — with “Lex,” drawn from lexicon, meaning words or language.
Together they reflect the tool’s purpose: transforming spoken echoes into written words with accuracy and clarity.

## Features

- Transcribe single audio files or batch process multiple files
- Support for multiple audio formats (m4a, mp3, wav, flac, ogg, etc.)
- Multiple output formats: plain text, JSON with timestamps, and SRT subtitles
- AI-powered summarization with ChatGPT or Google Gemini
- Audio file information extraction
- Configurable Whisper model sizes (tiny, base, small, medium, large)
- Automatic audio file detection in `audio_files/` directory
- Organized output in `transcripts/` directory
- Built-in dependency checking
- SSL certificate handling for model downloads

## Installation

### Prerequisites
- Python 3.8+ (3.12 recommended)
- FFmpeg for audio processing:
  - macOS: `brew install ffmpeg`
  - Ubuntu/Debian: `sudo apt-get install ffmpeg`
  - Windows: Download from [ffmpeg.org](https://ffmpeg.org/download.html)

### Install from PyPI (Recommended)

```bash
# Install EchoLex
pip install echolex

# Install with summarization support (ChatGPT/Gemini)
pip install "echolex[summarize]"  # quoted so zsh doesn't glob the brackets

# Verify installation
echolex --help
```

### Install from Source

```bash
# Clone the repository
git clone https://github.com/ramonfigueiredo/echolex.git
cd echolex

# Option 1: Quick setup script
chmod +x setup.sh
./setup.sh

# Option 2: Manual installation
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install --upgrade pip setuptools wheel
pip install -r requirements.txt
```

### Troubleshooting Installation Issues

If you encounter installation errors:

1. **For virtual environments:**
```bash
# Install packages one by one
pip install --upgrade pip setuptools wheel
pip install certifi
pip install ffmpeg-python
pip install openai-whisper
```

2. **Use the simplified script (no certifi required):**
```bash
python transcribe_simple.py audio_file.m4a
```

## Project Structure

```
echolex/
├── audio_files/              # Place your audio files here
├── transcripts/              # Transcribed files will be saved here
├── echolex.py                # EchoLex CLI tool
├── test_echolex.py           # Unit tests
├── setup.sh                  # Automated setup script
├── requirements.txt          # Python dependencies
└── README.md                 # This file
```

## Quick Start Tutorial

### 1. Install EchoLex

```bash
# Install from PyPI
pip install echolex

# Or install with AI summarization support
pip install "echolex[summarize]"

# Verify installation
echolex --version
```

### 2. Install FFmpeg (Required)

```bash
# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt-get install ffmpeg

# Windows: Download from https://ffmpeg.org/download.html
```

### 3. Transcribe Your First Audio File

```bash
# Transcribe a single file
echolex transcribe meeting.m4a

# The transcript will be saved to transcripts/meeting_transcript.txt
```

### 4. Try Different Options

```bash
# Use a larger model for better accuracy
echolex transcribe meeting.m4a --model medium

# Generate multiple output formats
echolex transcribe meeting.m4a --output txt json srt

# Get audio file information
echolex info meeting.m4a
```

### 5. AI-Powered Summarization (Optional)

```bash
# Install summarization support if not already installed
pip install "echolex[summarize]"

# Set up your API key
export OPENAI_API_KEY="your-api-key-here"

# Transcribe and summarize with ChatGPT
echolex transcribe meeting.m4a --summarize chatgpt

# Or use Google Gemini
export GEMINI_API_KEY="your-api-key-here"
echolex transcribe meeting.m4a --summarize gemini
```

### 6. Batch Process Multiple Files

```bash
# Create audio_files directory
mkdir -p audio_files

# Copy your audio files
cp *.m4a audio_files/

# Process all files at once
echolex batch audio_files/*.m4a

# With summarization
echolex batch audio_files/*.m4a --summarize chatgpt
```

## Usage

### Getting Help

```bash
# Show main help
echolex --help
echolex -h

# Show version
echolex --version
echolex -v

# Show help for specific command
echolex transcribe --help
echolex batch --help
echolex info --help
echolex check --help
```

**Note:** If you installed from source, use `python echolex.py` instead of `echolex`.

### Commands

EchoLex provides four main commands:

#### 1. Transcribe a Single File

```bash
echolex transcribe audio_file.m4a
```

With specific options:
```bash
echolex transcribe audio_file.m4a --model medium --output txt json srt
```

Specify output directory:
```bash
echolex transcribe audio_file.m4a --output-dir custom_output
```

Available options:
- `--model`: Model size (tiny, base, small, medium, large)
- `--language`: Language code (e.g., 'en', 'es')
- `--output`: Output formats (txt, json, srt)
- `--output-dir`: Output directory
- `--device`: Device to use (cuda or cpu; omit to auto-detect)
- `--verbose`: Show verbose output
- `--quiet`: Don't show transcript preview
- `-s, --summarize`: Generate AI summary (chatgpt or gemini)

#### 2. Batch Process Multiple Files

Process all audio files matching a pattern:
```bash
echolex batch *.m4a *.mp3
```

With custom settings:
```bash
echolex batch *.m4a --model small --output txt json srt --output-dir results
```

Available options:
- `--model`: Model size (tiny, base, small, medium, large)
- `--output`: Output formats (txt, json, srt)
- `--output-dir`: Output directory
- `--verbose`: Show verbose output
- `-s, --summarize`: Generate AI summaries (chatgpt or gemini)

#### 3. Get Audio File Information

Display detailed information about an audio file:
```bash
echolex info audio_file.m4a
```

#### 4. Check Dependencies

Verify that all required dependencies are installed:
```bash
echolex check
```
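A dependency check like `echolex check` can be sketched with the standard library alone. This is an illustrative helper, not EchoLex's actual implementation; the module and binary names are the ones the README lists as prerequisites.

```python
import importlib.util
import shutil


def check_dependencies(modules=("whisper",), binaries=("ffmpeg",)):
    """Report which Python modules and system binaries are available."""
    report = {}
    for mod in modules:
        # find_spec returns None when the module cannot be imported
        report[f"module:{mod}"] = importlib.util.find_spec(mod) is not None
    for exe in binaries:
        # shutil.which returns None when the executable is not on PATH
        report[f"binary:{exe}"] = shutil.which(exe) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_dependencies().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```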

### Command-Line Help

Every command supports `--help` or `-h` for detailed usage information:

```bash
# Main help menu
echolex --help

# Command-specific help
echolex transcribe -h
echolex batch -h
echolex info -h
echolex check -h
```

**Example help output:**
```
usage: echolex [-h] {transcribe,batch,info,check} ...

Audio transcription tool using OpenAI Whisper

positional arguments:
  {transcribe,batch,info,check}
                        Available commands
    transcribe          Transcribe a single audio file
    batch               Batch transcribe multiple audio files
    info                Display audio file information
    check               Check system dependencies

options:
  -h, --help            show this help message and exit

Examples:
  # Transcribe a single file
  echolex transcribe audio.m4a

  # Transcribe with specific model
  echolex transcribe audio.m4a --model medium

  # Batch transcribe multiple files
  echolex batch *.m4a *.mp3

  # Get audio file information
  echolex info audio.m4a

  # Check dependencies
  echolex check
```

### Model Options

Available models (speed vs. accuracy trade-off):
- `tiny`: Fastest, least accurate (~39 MB)
- `base`: Good balance; the default (~74 MB)
- `small`: Better accuracy (~244 MB)
- `medium`: Even better accuracy (~769 MB)
- `large`: Best accuracy, slowest (~1550 MB)

### AI Summarization

EchoLex can generate concise summaries of transcripts using ChatGPT or Google Gemini.

#### Setup

**Install summarization dependencies:**
```bash
pip install "echolex[summarize]"
```

**Set up API keys:**
```bash
# For ChatGPT (OpenAI)
export OPENAI_API_KEY="your-api-key-here"

# For Gemini (Google)
export GEMINI_API_KEY="your-api-key-here"
```

To make API keys permanent, add them to your shell profile (`~/.bashrc`, `~/.zshrc`, etc.):
```bash
echo 'export OPENAI_API_KEY="your-api-key-here"' >> ~/.zshrc
```
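Reading the key at runtime is a one-liner with `os.environ`; a summarizer would typically fail fast when the key is missing. This is a sketch, not EchoLex's actual code:

```python
import os


def get_api_key(provider: str) -> str:
    """Look up the provider's API key in the environment, failing loudly."""
    var = {"chatgpt": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}[provider]
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"Set {var} before using --summarize {provider}")
    return key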

**Get API Keys:**
- OpenAI: https://platform.openai.com/api-keys
- Google Gemini: https://makersuite.google.com/app/apikey

#### Usage

**Summarize with ChatGPT:**
```bash
echolex transcribe audio.m4a --summarize chatgpt
# or use short form
echolex transcribe audio.m4a -s chatgpt
```

**Summarize with Gemini:**
```bash
echolex transcribe audio.m4a --summarize gemini
# or use short form
echolex transcribe audio.m4a -s gemini
```

**Batch summarization:**
```bash
echolex batch *.m4a --summarize chatgpt
```

#### Default Models

- **ChatGPT**: `gpt-4o-mini` (fast and cost-effective)
- **Gemini**: `gemini-1.5-flash` (with automatic fallback to `gemini-1.5-pro` and `gemini-pro`)
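The documented fallback behaviour — try `gemini-1.5-flash`, then `gemini-1.5-pro`, then `gemini-pro` — amounts to a loop over candidate model names. A sketch with a pluggable `generate` callable (the function name and signature are illustrative, not EchoLex's API):

```python
GEMINI_FALLBACKS = ["gemini-1.5-flash", "gemini-1.5-pro", "gemini-pro"]


def summarize_with_fallback(text, generate, models=GEMINI_FALLBACKS):
    """Try each model in order; return (model, summary) for the first success."""
    last_error = None
    for model in models:
        try:
            return model, generate(model, text)
        except Exception as exc:  # a real client would catch its specific errors
            last_error = exc
    raise RuntimeError(f"All models failed: {last_error}")
```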

#### Customization Options

You can customize the summarization with these options:

- `--summary-model`: Specify a different AI model
- `--summary-system-message`: Customize the system prompt
- `--summary-user-message`: Customize the user prompt (use `{text}` for transcript)
- `--summary-temperature`: Control creativity (0.0-2.0, default: 0.7)
- `--summary-max-tokens`: Set maximum summary length (default: 500)
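The `{text}` placeholder in `--summary-user-message` behaves like Python string formatting: the transcript is substituted into the template before the prompt is sent. An illustrative sketch:

```python
def build_prompt(user_message: str, transcript: str) -> str:
    """Insert the transcript into the user prompt template."""
    return user_message.format(text=transcript)


prompt = build_prompt(
    "Create action items from this meeting:\n\n{text}",
    "Alice will send the report by Friday.",
)
```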

**Examples:**

```bash
# Use GPT-4o with longer summary
echolex transcribe audio.m4a -s chatgpt \
  --summary-model gpt-4o \
  --summary-max-tokens 1000

# Custom prompt for meeting summaries
echolex transcribe meeting.m4a -s chatgpt \
  --summary-system-message "You are an expert meeting summarizer" \
  --summary-user-message "Create action items from this meeting:\n\n{text}"

# More creative summaries with Gemini
echolex transcribe audio.m4a -s gemini \
  --summary-temperature 1.2

# Batch with custom settings
echolex batch *.m4a -s chatgpt \
  --summary-model gpt-4o-mini \
  --summary-max-tokens 800
```

#### Output

Summaries are saved and shown in four places:
- Separate text file: `*_summary.txt`
- Summary details JSON: `*_summary.json` (includes summary, provider, parameters, and timestamp)
- Included in the transcript JSON: `*_transcript.json` (with `summary` and `summary_provider` fields)
- Displayed in the console after transcription

## Output Files

All transcripts are saved to the `transcripts/` directory by default. For each audio file, the tool can create:

- `*_transcript.txt`: Plain text transcript
- `*_summary.txt`: AI-generated summary (when using `--summarize`)
- `*_summary.json`: Summary details with provider and parameters (when using `--summarize`)
- `*_transcript.json`: Detailed JSON with timestamps, segments, and metadata
- `*_transcript.srt`: SRT subtitle file for video synchronization
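SRT is a plain-text format: numbered cues with `HH:MM:SS,mmm` start/end times. Converting Whisper-style segments (dicts with `start`, `end`, and `text` keys) can be sketched as follows; EchoLex's own writer may differ.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 3.5 -> '00:00:03,500'."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render a list of {'start', 'end', 'text'} segments as an SRT body."""
    cues = []
    for i, seg in enumerate(segments, start=1):
        cues.append(
            f"{i}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(cues)
```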

### JSON Output Format
The JSON output includes:
- Complete transcript text
- Timestamped segments
- Detected language
- Processing timestamp
- Source audio file path
- Summary and summary provider (when using `--summarize`)

### Summary JSON Format
The `*_summary.json` file includes:
- **summary**: The generated summary text
- **provider**: The AI provider used (chatgpt or gemini)
- **parameters**: All parameters used for generation:
  - model: The specific AI model (e.g., gpt-4o-mini, gemini-1.5-flash)
  - system_message: The system prompt used
  - user_message: The user prompt template
  - temperature: The temperature setting
  - max_tokens: The maximum token limit
- **generated_at**: ISO timestamp of when the summary was created
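Based on the field list above, a writer for `*_summary.json` could look like this. The structure follows the documented fields; the helper name and example values are illustrative.

```python
import json
from datetime import datetime, timezone


def summary_payload(summary, provider, model, system_message,
                    user_message, temperature, max_tokens):
    """Assemble the documented *_summary.json structure."""
    return {
        "summary": summary,
        "provider": provider,
        "parameters": {
            "model": model,
            "system_message": system_message,
            "user_message": user_message,
            "temperature": temperature,
            "max_tokens": max_tokens,
        },
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }


payload = summary_payload(
    "Key points...", "chatgpt", "gpt-4o-mini",
    "You are a helpful assistant that summarizes transcripts.",
    "Summarize:\n\n{text}", 0.7, 500,
)
text = json.dumps(payload, indent=2)
```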

## Requirements

- Python 3.8+
- FFmpeg
- 4-8GB RAM (depending on model size)
- Disk space for Whisper models:
  - Tiny: ~39 MB
  - Base: ~74 MB
  - Small: ~244 MB
  - Medium: ~769 MB
  - Large: ~1550 MB

## Performance Notes

- First run will download the selected Whisper model automatically
- Processing time depends on audio length and model size
- Approximate processing speeds on modern CPUs:
  - Tiny: ~10-15x real-time
  - Base: ~5-8x real-time
  - Small: ~3-5x real-time
  - Medium: ~2-3x real-time
  - Large: ~1-2x real-time
- GPU acceleration available with CUDA-enabled PyTorch (significantly faster)
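The real-time factors above give a quick wall-clock estimate: divide the audio duration by the factor. A small helper, using the midpoints of the rough CPU ranges quoted above:

```python
# Approximate real-time speedups on a modern CPU (midpoints of the ranges above)
SPEEDUPS = {"tiny": 12.5, "base": 6.5, "small": 4.0, "medium": 2.5, "large": 1.5}


def estimate_minutes(audio_minutes: float, model: str = "base") -> float:
    """Rough CPU processing time in minutes for a given model size."""
    return audio_minutes / SPEEDUPS[model]


# A 60-minute recording with the base model takes roughly 60 / 6.5 ≈ 9 minutes.
```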

## Troubleshooting

### Common Issues and Solutions

#### 1. SSL Certificate Errors
```bash
# EchoLex includes SSL certificate handling
# If you still get SSL errors, ensure certifi is installed:
pip install certifi
```

#### 2. Virtual Environment Installation Issues
```bash
# If pip install fails in virtual environment:
pip install --upgrade pip setuptools wheel
pip install certifi ffmpeg-python
pip install openai-whisper

# Or create a fresh environment:
deactivate
python3 -m venv venv_new
source venv_new/bin/activate
pip install openai-whisper ffmpeg-python
```

#### 3. Module Import Errors
```bash
# EchoLex will auto-install Whisper if missing
# For other missing modules:
pip install <missing_module_name>
```

#### 4. Missing Audio Files
EchoLex automatically checks for audio files in:
- Current directory
- `audio_files/` folder
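The lookup order described above — current directory first, then the `audio_files/` folder — can be sketched with `pathlib` (hypothetical helper; EchoLex's resolution logic may differ):

```python
from pathlib import Path
from typing import Optional


def resolve_audio(name: str, search_dirs=(".", "audio_files")) -> Optional[Path]:
    """Return the first existing path for `name` across the search directories."""
    for d in search_dirs:
        candidate = Path(d) / name
        if candidate.is_file():
            return candidate
    return None
```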

#### 5. Memory Issues
- Use smaller models: `--model tiny` or `--model base`
- Process shorter audio segments
- Close other applications to free up RAM

#### 6. FFmpeg Not Found
```bash
# Check dependencies
echolex check

# Install FFmpeg
# macOS:
brew install ffmpeg

# Ubuntu/Debian:
sudo apt-get install ffmpeg
```

## Example Workflow

### Using PyPI Installation (Recommended)

#### 1. Install and Setup
```bash
# Install EchoLex with summarization support
pip install "echolex[summarize]"

# Verify installation
echolex --version
echolex check

# Install FFmpeg if needed
brew install ffmpeg  # macOS
```

#### 2. Prepare Audio Files
```bash
# Create directories
mkdir -p audio_files transcripts

# Place your audio files
cp *.m4a audio_files/
```

#### 3. Transcribe
```bash
# Transcribe a file
echolex transcribe meeting.m4a

# With custom options
echolex transcribe meeting.m4a --model medium --output txt json srt

# With AI summary
export OPENAI_API_KEY="your-api-key"
echolex transcribe meeting.m4a -s chatgpt

# Get help
echolex transcribe --help
```

### Using Source Installation

#### 1. First Time Setup
```bash
# Clone the project
git clone https://github.com/ramonfigueiredo/echolex.git
cd echolex

# Run setup (recommended)
chmod +x setup.sh
./setup.sh

# OR create virtual environment manually
python3 -m venv venv
source venv/bin/activate
pip install openai-whisper ffmpeg-python
```

#### 2. Transcribe
```bash
# Use python command when installed from source
python echolex.py transcribe meeting.m4a

# With custom options
python echolex.py transcribe meeting.m4a --model medium --output txt json srt

# Get help
python echolex.py transcribe --help
```

### Review Output (Both Methods)

```bash
# View the transcript
cat transcripts/meeting_transcript.txt

# View the AI summary (if generated)
cat transcripts/meeting_summary.txt

# Open JSON for detailed segments
open transcripts/meeting_transcript.json

# View summary details
cat transcripts/meeting_summary.json

# Check processing summary (for batch jobs)
cat transcripts/batch_transcription_summary.json
```

## Testing

EchoLex includes a comprehensive test suite with unit tests covering all functionality.

### Run All Tests

**Using unittest (built-in, no extra dependencies):**
```bash
python3 -m unittest test_echolex -v
```

**Using pytest (enhanced output, requires pytest):**
```bash
# Install pytest (optional)
pip install pytest

# Run tests
pytest test_echolex.py -v

# Run with more detailed output
pytest test_echolex.py -v -s

# Run with coverage (requires pytest-cov)
pip install pytest-cov
pytest test_echolex.py --cov=echolex --cov-report=html
```

### Run Specific Test Classes

**Using unittest:**
```bash
# Test AudioProcessor
python3 -m unittest test_echolex.TestAudioProcessor -v

# Test AudioTranscriber
python3 -m unittest test_echolex.TestAudioTranscriber -v

# Test CLI commands
python3 -m unittest test_echolex.TestCommandTranscribe -v
python3 -m unittest test_echolex.TestCommandBatch -v
```

**Using pytest:**
```bash
# Test by class name
pytest test_echolex.py::TestAudioProcessor -v
pytest test_echolex.py::TestAudioTranscriber -v

# Test by keyword
pytest test_echolex.py -k "AudioProcessor" -v
pytest test_echolex.py -k "batch" -v
```

### Run Individual Tests

**Using unittest:**
```bash
python3 -m unittest test_echolex.TestAudioTranscriber.test_save_results_srt -v
```

**Using pytest:**
```bash
pytest test_echolex.py::TestAudioTranscriber::test_save_results_srt -v
```

### Test Coverage

The test suite includes:
- **AudioProcessor tests** - Dependency checking and audio file analysis
- **AudioTranscriber tests** - Model loading, transcription, and output generation
- **Helper function tests** - File finding and path resolution
- **Command tests** - All CLI commands (transcribe, batch, info, check)
- **Main CLI tests** - Argument parsing and command routing

All tests use mocking to avoid external dependencies and ensure fast, isolated testing.

## Publishing to PyPI

### Prerequisites

1. Install build tools:
```bash
pip install --upgrade build twine
```

2. Set up PyPI credentials in `~/.pypirc`:
```ini
[distutils]
index-servers =
    pypi
    testpypi

[pypi]
username = __token__
password = pypi-YOUR_PYPI_TOKEN_HERE

[testpypi]
repository = https://test.pypi.org/legacy/
username = __token__
password = pypi-YOUR_TESTPYPI_TOKEN_HERE
```

Get tokens at:
- PyPI: https://pypi.org/manage/account/token/
- TestPyPI: https://test.pypi.org/manage/account/token/

### Publish to TestPyPI (for testing)

```bash
# Using the publish script
./pypi_publish.sh test

# Or manually
python3 -m build
python3 -m twine upload --repository testpypi dist/*
```

**Test the installation:**
```bash
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple echolex
```

### Publish to Production PyPI

```bash
# Update version in VERSION.in first
echo "0.1.2" > VERSION.in

# Using the publish script (recommended)
./pypi_publish.sh prod

# Or manually
python3 -m build
python3 -m twine check dist/*
python3 -m twine upload dist/*
```

**After publishing:**
```bash
# Create git tag
git tag v0.1.2
git push origin v0.1.2
```

### Publish Script Usage

The `pypi_publish.sh` script automates the entire process:

- **Test mode**: `./pypi_publish.sh test` - Publishes to TestPyPI
- **Production mode**: `./pypi_publish.sh prod` - Publishes to PyPI

The script will:
1. Read version from VERSION.in
2. Clean previous builds
3. Run all tests
4. Build the package
5. Check distribution with twine
6. Upload to the selected repository
7. Display installation and verification instructions

## License

Apache License, Version 2.0

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/ramonfigueiredo/echolex",
    "name": "echolex",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "whisper, audio, transcription, speech-to-text, stt, openai, ai, cli",
    "author": "Ramon Figueiredo",
    "author_email": "Ramon Figueiredo <author@example.com>",
    "download_url": "https://files.pythonhosted.org/packages/a8/2f/9d6d9d7c804f2dc5566c0738a86b7671f51a8434bee2053b1325f8a46f31/echolex-0.1.2.tar.gz",
    "platform": null,
    "description": "<p align=\"center\">\n  <img src=\"images/echolex_log.png\" alt=\"EchoLex Logo\" width=\"250\">\n</p>\n\n[![PyPI version](https://badge.fury.io/py/echolex.svg)](https://badge.fury.io/py/echolex)\n[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n\nEchoLex is a CLI tool for audio transcription using OpenAI's Whisper model for speech-to-text conversion.\n\nThe name \u201cEchoLex\u201d combines \u201cEcho\u201d \u2014 the voice or sound we capture \u2014 with \u201cLex,\u201d drawn from lexicon, meaning words or language. \nTogether, EchoLex reflects the tool\u2019s purpose: transforming spoken echoes into written words with accuracy and clarity.\n\n## Features\n\n- Transcribe single audio files or batch process multiple files\n- Support for multiple audio formats (m4a, mp3, wav, flac, ogg, etc.)\n- Multiple output formats: plain text, JSON with timestamps, and SRT subtitles\n- AI-powered summarization with ChatGPT or Google Gemini\n- Audio file information extraction\n- Configurable Whisper model sizes (tiny, base, small, medium, large)\n- Automatic audio file detection in `audio_files/` directory\n- Organized output in `transcripts/` directory\n- Built-in dependency checking\n- SSL certificate handling for model downloads\n\n## Installation\n\n### Prerequisites\n- Python 3.8+ (3.12 recommended)\n- FFmpeg for audio processing:\n  - macOS: `brew install ffmpeg`\n  - Ubuntu/Debian: `sudo apt-get install ffmpeg`\n  - Windows: Download from [ffmpeg.org](https://ffmpeg.org/download.html)\n\n### Install from PyPI (Recommended)\n\n```bash\n# Install EchoLex\npip install echolex\n\n# Install with summarization support (ChatGPT/Gemini)\npip install echolex[summarize]\n\n# Verify installation\necholex --help\n```\n\n### Install from Source\n\n```bash\n# Clone the repository\ngit clone 
https://github.com/ramonfigueiredo/echolex.git\ncd echolex\n\n# Option 1: Quick setup script\nchmod +x setup.sh\n./setup.sh\n\n# Option 2: Manual installation\npython3 -m venv venv\nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate\npip install --upgrade pip setuptools wheel\npip install -r requirements.txt\n```\n\n### Troubleshooting Installation Issues\n\nIf you encounter installation errors:\n\n1. **For virtual environments:**\n```bash\n# Install packages one by one\npip install --upgrade pip setuptools wheel\npip install certifi\npip install ffmpeg-python\npip install openai-whisper\n```\n\n2. **Use the simplified script (no certifi required):**\n```bash\npython transcribe_simple.py audio_file.m4a\n```\n\n## Project Structure\n\n```\necholex/\n\u251c\u2500\u2500 audio_files/              # Place your audio files here\n\u251c\u2500\u2500 transcripts/              # Transcribed files will be saved here\n\u251c\u2500\u2500 echolex.py                # EchoLex CLI tool\n\u251c\u2500\u2500 test_echolex.py           # Unit tests\n\u251c\u2500\u2500 setup.sh                  # Automated setup script\n\u251c\u2500\u2500 requirements.txt          # Python dependencies\n\u2514\u2500\u2500 README.md                 # This file\n```\n\n## Quick Start Tutorial\n\n### 1. Install EchoLex\n\n```bash\n# Install from PyPI\npip install echolex\n\n# Or install with AI summarization support\npip install echolex[summarize]\n\n# Verify installation\necholex --version\n```\n\n### 2. Install FFmpeg (Required)\n\n```bash\n# macOS\nbrew install ffmpeg\n\n# Ubuntu/Debian\nsudo apt-get install ffmpeg\n\n# Windows: Download from https://ffmpeg.org/download.html\n```\n\n### 3. Transcribe Your First Audio File\n\n```bash\n# Transcribe a single file\necholex transcribe meeting.m4a\n\n# The transcript will be saved to transcripts/audio_transcript.txt\n```\n\n### 4. 
Try Different Options\n\n```bash\n# Use a larger model for better accuracy\necholex transcribe meeting.m4a --model medium\n\n# Generate multiple output formats\necholex transcribe meeting.m4a --output txt json srt\n\n# Get audio file information\necholex info meeting.m4a\n```\n\n### 5. AI-Powered Summarization (Optional)\n\n```bash\n# Install summarization support if not already installed\npip install echolex[summarize]\n\n# Set up your API key\nexport OPENAI_API_KEY=\"your-api-key-here\"\n\n# Transcribe and summarize with ChatGPT\necholex transcribe meeting.m4a --summarize chatgpt\n\n# Or use Google Gemini\nexport GEMINI_API_KEY=\"your-api-key-here\"\necholex transcribe meeting.m4a --summarize gemini\n```\n\n### 6. Batch Process Multiple Files\n\n```bash\n# Create audio_files directory\nmkdir -p audio_files\n\n# Copy your audio files\ncp *.m4a audio_files/\n\n# Process all files at once\necholex batch audio_files/*.m4a\n\n# With summarization\necholex batch audio_files/*.m4a --summarize chatgpt\n```\n\n## Usage\n\n### Getting Help\n\n```bash\n# Show main help\necholex --help\necholex -h\n\n# Show version\necholex --version\necholex -v\n\n# Show help for specific command\necholex transcribe --help\necholex batch --help\necholex info --help\necholex check --help\n```\n\n**Note:** If you installed from source, use `python echolex.py` instead of `echolex`.\n\n### Commands\n\nEchoLex provides four main commands:\n\n#### 1. 
Transcribe a Single File\n\n```bash\necholex transcribe audio_file.m4a\n```\n\nWith specific options:\n```bash\necholex transcribe audio_file.m4a --model medium --output txt json srt\n```\n\nSpecify output directory:\n```bash\necholex transcribe audio_file.m4a --output-dir custom_output\n```\n\nAvailable options:\n- `--model`: Model size (tiny, base, small, medium, large)\n- `--language`: Language code (e.g., 'en', 'es')\n- `--output`: Output formats (txt, json, srt)\n- `--output-dir`: Output directory\n- `--device`: Device to use (cuda, cpu, or None for auto)\n- `--verbose`: Show verbose output\n- `--quiet`: Don't show transcript preview\n- `-s, --summarize`: Generate AI summary (chatgpt or gemini)\n\n#### 2. Batch Process Multiple Files\n\nProcess all audio files matching a pattern:\n```bash\necholex batch *.m4a *.mp3\n```\n\nWith custom settings:\n```bash\necholex batch *.m4a --model small --output txt json srt --output-dir results\n```\n\nAvailable options:\n- `--model`: Model size (tiny, base, small, medium, large)\n- `--output`: Output formats (txt, json, srt)\n- `--output-dir`: Output directory\n- `--verbose`: Show verbose output\n- `-s, --summarize`: Generate AI summaries (chatgpt or gemini)\n\n#### 3. Get Audio File Information\n\nDisplay detailed information about an audio file:\n```bash\necholex info audio_file.m4a\n```\n\n#### 4. 
Check Dependencies\n\nVerify that all required dependencies are installed:\n```bash\necholex check\n```\n\n### Command-Line Help\n\nEvery command supports `--help` or `-h` for detailed usage information:\n\n```bash\n# Main help menu\necholex --help\n\n# Command-specific help\necholex transcribe -h\necholex batch -h\necholex info -h\necholex check -h\n```\n\n**Example help output:**\n```\nusage: echolex [-h] {transcribe,batch,info,check} ...\n\nAudio transcription tool using OpenAI Whisper\n\npositional arguments:\n  {transcribe,batch,info,check}\n                        Available commands\n    transcribe          Transcribe a single audio file\n    batch               Batch transcribe multiple audio files\n    info                Display audio file information\n    check               Check system dependencies\n\noptions:\n  -h, --help            show this help message and exit\n\nExamples:\n  # Transcribe a single file\n  echolex transcribe audio.m4a\n\n  # Transcribe with specific model\n  echolex transcribe audio.m4a --model medium\n\n  # Batch transcribe multiple files\n  echolex batch *.m4a *.mp3\n\n  # Get audio file information\n  echolex info audio.m4a\n\n  # Check dependencies\n  echolex check\n```\n\n### Model Options\n\nAvailable models (speed vs accuracy tradeoff):\n- `tiny`: Fastest, least accurate (~39 MB)\n- `base`: Good balance - default (~74 MB)\n- `small`: Better accuracy (~244 MB)\n- `medium`: Even better accuracy (~769 MB)\n- `large`: Best accuracy, slowest (~1550 MB)\n\n### AI Summarization\n\nEchoLex can generate concise summaries of transcripts using ChatGPT or Google Gemini.\n\n#### Setup\n\n**Install summarization dependencies:**\n```bash\npip install echolex[summarize]\n```\n\n**Set up API keys:**\n```bash\n# For ChatGPT (OpenAI)\nexport OPENAI_API_KEY=\"your-api-key-here\"\n\n# For Gemini (Google)\nexport GEMINI_API_KEY=\"your-api-key-here\"\n```\n\nTo make API keys permanent, add them to your shell profile (`~/.bashrc`, `~/.zshrc`, 
etc.):\n```bash\necho 'export OPENAI_API_KEY=\"your-api-key-here\"' >> ~/.zshrc\n```\n\n**Get API Keys:**\n- OpenAI: https://platform.openai.com/api-keys\n- Google Gemini: https://makersuite.google.com/app/apikey\n\n#### Usage\n\n**Summarize with ChatGPT:**\n```bash\necholex transcribe audio.m4a --summarize chatgpt\n# or use short form\necholex transcribe audio.m4a -s chatgpt\n```\n\n**Summarize with Gemini:**\n```bash\necholex transcribe audio.m4a --summarize gemini\n# or use short form\necholex transcribe audio.m4a -s gemini\n```\n\n**Batch summarization:**\n```bash\necholex batch *.m4a --summarize chatgpt\n```\n\n#### Default Models\n\n- **ChatGPT**: `gpt-4o-mini` (fast and cost-effective)\n- **Gemini**: `gemini-1.5-flash` (with automatic fallback to `gemini-1.5-pro` and `gemini-pro`)\n\n#### Customization Options\n\nYou can customize the summarization with these options:\n\n- `--summary-model`: Specify a different AI model\n- `--summary-system-message`: Customize the system prompt\n- `--summary-user-message`: Customize the user prompt (use `{text}` for transcript)\n- `--summary-temperature`: Control creativity (0.0-2.0, default: 0.7)\n- `--summary-max-tokens`: Set maximum summary length (default: 500)\n\n**Examples:**\n\n```bash\n# Use GPT-4o with longer summary\necholex transcribe audio.m4a -s chatgpt \\\n  --summary-model gpt-4o \\\n  --summary-max-tokens 1000\n\n# Custom prompt for meeting summaries\necholex transcribe meeting.m4a -s chatgpt \\\n  --summary-system-message \"You are an expert meeting summarizer\" \\\n  --summary-user-message \"Create action items from this meeting:\\n\\n{text}\"\n\n# More creative summaries with Gemini\necholex transcribe audio.m4a -s gemini \\\n  --summary-temperature 1.2\n\n# Batch with custom settings\necholex batch *.m4a -s chatgpt \\\n  --summary-model gpt-4o-mini \\\n  --summary-max-tokens 800\n```\n\n#### Output\n\nSummaries are saved in three ways:\n- Separate text file: `*_summary.txt`\n- Summary details JSON: 
`*_summary.json` (includes summary, provider, parameters, and timestamp)\n- Included in transcript JSON: `*_transcript.json` (with `summary` and `summary_provider` fields)\n- Displayed in console after transcription\n\n## Output Files\n\nAll transcripts are saved to the `transcripts/` directory by default. For each audio file, the tool can create:\n\n- `*_transcript.txt`: Plain text transcript\n- `*_summary.txt`: AI-generated summary (when using `--summarize`)\n- `*_summary.json`: Summary details with provider and parameters (when using `--summarize`)\n- `*_transcript.json`: Detailed JSON with timestamps, segments, and metadata\n- `*_transcript.srt`: SRT subtitle file for video synchronization\n\n### JSON Output Format\nThe JSON output includes:\n- Complete transcript text\n- Timestamped segments\n- Detected language\n- Processing timestamp\n- Source audio file path\n- Summary and summary provider (when using `--summarize`)\n\n### Summary JSON Format\nThe `*_summary.json` file includes:\n- **summary**: The generated summary text\n- **provider**: The AI provider used (chatgpt or gemini)\n- **parameters**: All parameters used for generation:\n  - model: The specific AI model (e.g., gpt-4o-mini, gemini-1.5-flash)\n  - system_message: The system prompt used\n  - user_message: The user prompt template\n  - temperature: The temperature setting\n  - max_tokens: The maximum token limit\n- **generated_at**: ISO timestamp of when the summary was created\n\n## Requirements\n\n- Python 3.8+\n- FFmpeg\n- 4-8GB RAM (depending on model size)\n- Disk space for Whisper models:\n  - Tiny: ~39 MB\n  - Base: ~74 MB\n  - Small: ~244 MB\n  - Medium: ~769 MB\n  - Large: ~1550 MB\n\n## Performance Notes\n\n- First run will download the selected Whisper model automatically\n- Processing time depends on audio length and model size\n- Approximate processing speeds on modern CPUs:\n  - Tiny: ~10-15x real-time\n  - Base: ~5-8x real-time\n  - Small: ~3-5x real-time\n  - Medium: ~2-3x 
  - Large: ~1-2x real-time
- GPU acceleration is available with CUDA-enabled PyTorch (significantly faster)

## Troubleshooting

### Common Issues and Solutions

#### 1. SSL Certificate Errors
```bash
# EchoLex includes SSL certificate handling
# If you still get SSL errors, ensure certifi is installed:
pip install certifi
```

#### 2. Virtual Environment Installation Issues
```bash
# If pip install fails in a virtual environment:
pip install --upgrade pip setuptools wheel
pip install certifi ffmpeg-python
pip install openai-whisper

# Or create a fresh environment:
deactivate
python3 -m venv venv_new
source venv_new/bin/activate
pip install openai-whisper ffmpeg-python
```

#### 3. Module Import Errors
```bash
# EchoLex will auto-install Whisper if it is missing
# For other missing modules:
pip install [missing_module_name]
```

#### 4. Missing Audio Files
EchoLex automatically checks for audio files in:
- The current directory
- The `audio_files/` folder

#### 5. Memory Issues
- Use smaller models: `--model tiny` or `--model base`
- Process shorter audio segments
- Close other applications to free up RAM

#### 6. FFmpeg Not Found
```bash
# Check dependencies
echolex check

# Install FFmpeg
# macOS:
brew install ffmpeg

# Ubuntu/Debian:
sudo apt-get install ffmpeg
```

## Example Workflow

### Using PyPI Installation (Recommended)

#### 1. Install and Setup
```bash
# Install EchoLex with summarization support
pip install "echolex[summarize]"

# Verify the installation
echolex --version
echolex check

# Install FFmpeg if needed
brew install ffmpeg  # macOS
```

#### 2. Prepare Audio Files
```bash
# Create directories
mkdir -p audio_files transcripts

# Place your audio files
cp *.m4a audio_files/
```

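To preview which files a batch run would pick up, you can list the audio files yourself. A rough sketch using the formats named in this README (EchoLex's own detection logic may cover more extensions):

```python
from pathlib import Path

# Extensions mentioned in this README; the tool itself may accept more
AUDIO_EXTS = {".m4a", ".mp3", ".wav", ".flac", ".ogg"}

def find_audio_files(root="audio_files"):
    """Return audio files under root, sorted for a stable batch order."""
    return sorted(p for p in Path(root).iterdir()
                  if p.is_file() and p.suffix.lower() in AUDIO_EXTS)
```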
#### 3. Transcribe
```bash
# Transcribe a file
echolex transcribe meeting.m4a

# With custom options
echolex transcribe meeting.m4a --model medium --output txt json srt

# With AI summary
export OPENAI_API_KEY="your-api-key"
echolex transcribe meeting.m4a -s chatgpt

# Get help
echolex transcribe --help
```

### Using Source Installation

#### 1. First Time Setup
```bash
# Clone the project
git clone https://github.com/ramonfigueiredo/echolex.git
cd echolex

# Run setup (recommended)
chmod +x setup.sh
./setup.sh

# OR create a virtual environment manually
python3 -m venv venv
source venv/bin/activate
pip install openai-whisper ffmpeg-python
```

#### 2. Transcribe
```bash
# Use the python command when installed from source
python echolex.py transcribe meeting.m4a

# With custom options
python echolex.py transcribe meeting.m4a --model medium --output txt json srt

# Get help
python echolex.py transcribe --help
```

### Review Output (Both Methods)

```bash
# View the transcript
cat transcripts/audio_transcript.txt

# View the AI summary (if generated)
cat transcripts/audio_summary.txt

# Open the JSON for detailed segments
open transcripts/audio_transcript.json

# View the summary details
cat transcripts/audio_summary.json

# Check the processing summary (for batch jobs)
cat transcripts/batch_transcription_summary.json
```

## Testing

EchoLex includes a comprehensive test suite with unit tests covering all functionality.

### Run All Tests

**Using unittest (built-in, no extra dependencies):**
```bash
python3 -m unittest test_echolex -v
```

**Using pytest (enhanced output, requires pytest):**
```bash
# Install pytest (optional)
pip install pytest

# Run the tests
pytest test_echolex.py -v

# Run with more detailed output
pytest test_echolex.py -v -s

# Run with coverage (requires pytest-cov)
pip install pytest-cov
pytest test_echolex.py --cov=echolex --cov-report=html
```

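A suite like this typically stubs out the Whisper model so tests run without a model download or GPU. A minimal self-contained sketch of that mocking style (a hypothetical test, not one from `test_echolex.py`):

```python
import unittest
from unittest.mock import MagicMock

class TestTranscriberSketch(unittest.TestCase):
    def test_transcribe_collects_text(self):
        # Stand-in for a loaded Whisper model; avoids any real model download
        model = MagicMock()
        model.transcribe.return_value = {
            "text": "hello world",
            "segments": [{"start": 0.0, "end": 1.2, "text": "hello world"}],
        }

        result = model.transcribe("audio_files/sample.m4a")

        self.assertEqual(result["text"], "hello world")
        model.transcribe.assert_called_once_with("audio_files/sample.m4a")
```

Because the model is mocked, the test exercises only the surrounding logic, which keeps the suite fast and isolated.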
### Run Specific Test Classes

**Using unittest:**
```bash
# Test AudioProcessor
python3 -m unittest test_echolex.TestAudioProcessor -v

# Test AudioTranscriber
python3 -m unittest test_echolex.TestAudioTranscriber -v

# Test CLI commands
python3 -m unittest test_echolex.TestCommandTranscribe -v
python3 -m unittest test_echolex.TestCommandBatch -v
```

**Using pytest:**
```bash
# Test by class name
pytest test_echolex.py::TestAudioProcessor -v
pytest test_echolex.py::TestAudioTranscriber -v

# Test by keyword
pytest test_echolex.py -k "AudioProcessor" -v
pytest test_echolex.py -k "batch" -v
```

### Run Individual Tests

**Using unittest:**
```bash
python3 -m unittest test_echolex.TestAudioTranscriber.test_save_results_srt -v
```

**Using pytest:**
```bash
pytest test_echolex.py::TestAudioTranscriber::test_save_results_srt -v
```

### Test Coverage

The test suite includes:
- **AudioProcessor tests** - Dependency checking and audio file analysis
- **AudioTranscriber tests** - Model loading, transcription, and output generation
- **Helper function tests** - File finding and path resolution
- **Command tests** - All CLI commands (transcribe, batch, info, check)
- **Main CLI tests** - Argument parsing and command routing

All tests use mocking to avoid external dependencies and to ensure fast, isolated testing.

## Publishing to PyPI

### Prerequisites

1. Install build tools:
```bash
pip install --upgrade build twine
```

2. Set up PyPI credentials in `~/.pypirc`:
```ini
[distutils]
index-servers =
    pypi
    testpypi

[pypi]
username = __token__
password = pypi-YOUR_PYPI_TOKEN_HERE

[testpypi]
repository = https://test.pypi.org/legacy/
username = __token__
password = pypi-YOUR_TESTPYPI_TOKEN_HERE
```

Get tokens at:
- PyPI: https://pypi.org/manage/account/token/
- TestPyPI: https://test.pypi.org/manage/account/token/

### Publish to TestPyPI (for testing)

```bash
# Using the publish script
./pypi_publish.sh test

# Or manually
python3 -m build
python3 -m twine upload --repository testpypi dist/*
```

**Test the installation:**
```bash
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple echolex
```

### Publish to Production PyPI

```bash
# Update the version in VERSION.in first
echo "0.1.2" > VERSION.in

# Using the publish script (recommended)
./pypi_publish.sh prod

# Or manually
python3 -m build
python3 -m twine check dist/*
python3 -m twine upload dist/*
```

**After publishing:**
```bash
# Create a git tag
git tag v0.1.2
git push origin v0.1.2
```

### Publish Script Usage

The `pypi_publish.sh` script automates the entire process:

- **Test mode**: `./pypi_publish.sh test` - Publishes to TestPyPI
- **Production mode**: `./pypi_publish.sh prod` - Publishes to PyPI

The script will:
1. Read the version from VERSION.in
2. Clean previous builds
3. Run all tests
4. Build the package
5. Check the distribution with twine
6. Upload to the selected repository
7. Display installation and verification instructions

## License

Apache License, Version 2.0