audiojudge

- Name: audiojudge
- Version: 0.1.1
- Summary: A simple package for audio comparison using large language models
- Author: Woody Gan <woodygan@usc.edu>
- Upload time: 2025-07-15 00:57:01
- Requires Python: >=3.10
- License: MIT
- Keywords: ai, audio, comparison, gemini, llm, openai, pronunciation, speech
# AudioJudge 🎵

A Python wrapper for audio comparison and evaluation using a Large Audio Model as Judge (i.e., LAM-as-a-Judge or AudioJudge) with support for in-context learning and flexible audio concatenation strategies.

## Features

- **Multi-Model Support**: Works with OpenAI GPT-4o Audio and Google Gemini models (GPT-4o-audio family, Gemini-1.5/2.0/2.5-flash families)
- **Flexible Audio Comparison**: Support for both pairwise and pointwise audio evaluation
- **In-Context Learning**: Provide examples to improve model performance
- **Audio Concatenation**: Multiple strategies for combining audio files
- **Smart Caching**: Built-in API response caching to reduce costs and latency

## Installation

```bash
pip install audiojudge  # Requires Python >= 3.10
```

## Quick Start

```python
from audiojudge import AudioJudge

# Initialize with API keys
judge = AudioJudge(
    openai_api_key="your-openai-key",
    google_api_key="your-google-key"
)

# Simple pairwise comparison
result = judge.judge_audio(
    audio1_path="audio1.wav",
    audio2_path="audio2.wav",
    system_prompt="Compare these two audio clips for quality.",
    model="gpt-4o-audio-preview"
)

print(result["response"])
```

### Quick Demo
- [AudioJudge with Speaker Identification Demo](examples/audiojudge_huggingface_demo.ipynb)

## Configuration

### Environment Variables

Set your API keys as environment variables:

```bash
export OPENAI_API_KEY="your-openai-key"
export GOOGLE_API_KEY="your-google-key"
export EVAL_CACHE_DIR=".audio_cache"  # Optional
export EVAL_DISABLE_CACHE="false"     # Optional
```

### AudioJudge Parameters

```python
judge = AudioJudge(
    openai_api_key=None,           # OpenAI API key (optional if env var set)
    google_api_key=None,           # Google API key (optional if env var set)
    temp_dir="temp_audio",         # Directory for temporary concatenated audio files
    signal_folder="signal_audios", # Directory for TTS signal files used in audio concatenation
                                   # (default signal files are included in the package;
                                   # a TTS model generates new ones if needed)
    cache_dir=None,                # API Cache directory (default: .eval_cache)
    cache_expire_seconds=2592000,  # Cache expiration (30 days)
    disable_cache=False            # Disable caching
)
```

## Core Methods

### 1. Pairwise Audio Comparison

### 1.1. Pairwise Comparison without Instruction Audio

Compare two audio files and get a model response directly:

```python
result = judge.judge_audio(
    audio1_path="speaker1.wav",
    audio2_path="speaker2.wav",
    system_prompt="Which speaker sounds more professional?",  # Define the evaluation criteria at the beginning
    user_prompt="Analyze both speakers and provide your assessment.",  # Optional specific instructions at the end
    model="gpt-4o-audio-preview",
    temperature=0.1,  # 0.0 is not supported by some API backends
    max_tokens=500    # Maximum response length
)

if result["success"]:
    print(f"Model response: {result['response']}")
else:
    print(f"Error: {result['error']}")
```

### 1.2. Pairwise Comparison with Instruction Audio

For scenarios where both audio clips are responses to the same instruction (e.g., comparing two speech-in speech-out systems):

```python
result = judge.judge_audio(
    audio1_path="system_a_response.wav",  # Response from system A
    audio2_path="system_b_response.wav",  # Response from system B
    instruction_path="original_instruction.wav",  # The instruction both systems responded to
    system_prompt="Compare which response better follows the given instruction.",
    model="gpt-4o-audio-preview"
)

print(f"Better response: {result['response']}")
```

### 2. Pointwise Audio Evaluation

Evaluate a single audio file:

```python
result = judge.judge_audio_pointwise(
    audio_path="speech.wav",
    system_prompt="Rate the speech quality from 1-10.",
    model="gpt-4o-audio-preview"
)

print(f"Quality rating: {result['response']}")
```

## In-Context Learning

Improve model performance by providing examples:

### Pairwise Examples

```python
from audiojudge.utils import AudioExample

# Create examples
examples = [
    AudioExample(
        audio1_path="example1_good.wav",
        audio2_path="example1_bad.wav",
        output="Audio 1 is better quality with clearer speech."
        # Optional: instruction_path="instruction1.wav"  # For instruction-based evaluation
    ),
    AudioExample(
        audio1_path="example2_noisy.wav",
        audio2_path="example2_clean.wav",
        output="Audio 2 is better due to less background noise."
    )
]

# Use examples in evaluation
result = judge.judge_audio(
    audio1_path="test1.wav",
    audio2_path="test2.wav",
    system_prompt="Compare audio quality and choose the better one.",
    examples=examples,
    model="gpt-4o-audio-preview"
)
```

### Pointwise Examples

```python
from audiojudge.utils import AudioExamplePointwise

examples = [
    AudioExamplePointwise(
        audio_path="high_quality.wav",
        output="9/10 - Excellent clarity and no background noise"
    ),
    AudioExamplePointwise(
        audio_path="medium_quality.wav",
        output="6/10 - Acceptable quality with minor distortions"
    )
]

result = judge.judge_audio_pointwise(
    audio_path="test_audio.wav",
    system_prompt="Rate the audio quality from 1-10 with explanation.",
    examples=examples,
    model="gpt-4o-audio-preview"
)
```

## Audio Concatenation Methods

Control how audio files are combined for model input:

### Available Methods

**For Pairwise Evaluation:**
1. **`no_concatenation`**: Keep all audio files separate
2. **`pair_example_concatenation`**: Concatenate each example pair
3. **`examples_concatenation`**: Concatenate all examples into one file
4. **`test_concatenation`**: Concatenate test audio pair
5. **`examples_and_test_concatenation`** (default): Concatenate all examples and the test audio pair; the most effective prompting strategy in our experiments

**For Pointwise Evaluation:**
1. **`no_concatenation`** (default): Keep all audio files separate
2. **`examples_concatenation`**: Concatenate all examples into one file
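
Conceptually, concatenation joins the audio frames of the input files in order (AudioJudge also inserts spoken signal audio between segments, which is omitted here). A rough stdlib sketch of the idea, not the package's actual implementation:

```python
import wave

def concatenate_wavs(paths: list[str], out_path: str) -> None:
    """Join WAV files with identical parameters into a single file."""
    # Take channel count, sample width, and frame rate from the first file
    with wave.open(paths[0], "rb") as first:
        params = first.getparams()
    with wave.open(out_path, "wb") as out:
        out.setparams(params)
        # Append each file's frames back to back
        for path in paths:
            with wave.open(path, "rb") as w:
                out.writeframes(w.readframes(w.getnframes()))
```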

### Example Usage

```python
# Pairwise: Keep everything separate
result = judge.judge_audio(
    audio1_path="test1.wav",
    audio2_path="test2.wav",
    system_prompt="Compare these audio clips.",
    concatenation_method="no_concatenation"
)

# Pairwise: Concatenate all for better context (recommended)
result = judge.judge_audio(
    audio1_path="test1.wav",
    audio2_path="test2.wav",
    system_prompt="Compare these audio clips.",
    examples=examples,
    concatenation_method="examples_and_test_concatenation"
)

# Pointwise: With example concatenation
result = judge.judge_audio_pointwise(
    audio_path="test.wav",
    system_prompt="Rate the audio quality from 1-10.",
    examples=pointwise_examples,
    concatenation_method="examples_concatenation"
)
```

## Instruction Audio

Use audio files as instructions for more complex tasks:

### With Examples

```python
# Examples with instruction audio
examples = [
    AudioExample(
        audio1_path="example1.wav",
        audio2_path="example2.wav",
        instruction_path="instruction_example.wav",
        output="Audio 1 follows the instruction better."
    )
]

result = judge.judge_audio(
    audio1_path="test1.wav",
    audio2_path="test2.wav",
    instruction_path="instruction.wav",
    system_prompt="Follow the audio instruction to evaluate these clips.",
    examples=examples,
    model="gpt-4o-audio-preview"
)
```

## Supported Models

### OpenAI Models
- `gpt-4o-audio-preview` (recommended)
- `gpt-4o-mini-audio-preview`

### Google Models
- `gemini-1.5-flash`
- `gemini-2.0-flash`
- `gemini-2.5-flash`

```python
# Using different models
result_gpt = judge.judge_audio(
    audio1_path="test1.wav",
    audio2_path="test2.wav",
    system_prompt="Compare quality.",
    model="gpt-4o-audio-preview"
)

result_gemini = judge.judge_audio(
    audio1_path="test1.wav",
    audio2_path="test2.wav",
    system_prompt="Compare quality.",
    model="gemini-2.0-flash"
)
```

## Caching

AudioJudge includes intelligent caching to reduce API costs and improve performance:
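
Cached responses are keyed on the request inputs, so an identical request is served from disk instead of hitting the API again. A simplified sketch of how such a key can be derived (the package's actual key scheme may differ; `cache_key` is illustrative, not part of the API):

```python
import hashlib
import json

def cache_key(audio_bytes: bytes, system_prompt: str,
              model: str, temperature: float = 0.1) -> str:
    """Derive a stable cache key from the request inputs."""
    h = hashlib.sha256(audio_bytes)
    # Serialize the remaining parameters deterministically
    h.update(json.dumps(
        {"system_prompt": system_prompt, "model": model, "temperature": temperature},
        sort_keys=True,
    ).encode("utf-8"))
    return h.hexdigest()
```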

### Cache Management

```python
# Clear entire cache
judge.clear_cache()

# Clear only failed (None) responses
valid_entries = judge.clear_none_cache()
print(f"Kept {valid_entries} valid cache entries")

# Get cache statistics
stats = judge.get_cache_stats()
print(f"Cache entries: {stats['total_entries']}")
```

### Cache Configuration

```python
# Disable caching
judge = AudioJudge(disable_cache=True)

# Custom cache directory and expiration
judge = AudioJudge(
    cache_dir="my_audio_cache",
    cache_expire_seconds=86400  # 1 day
)
```

## Advanced Usage

### Error Handling

```python
result = judge.judge_audio(
    audio1_path="test1.wav",
    audio2_path="test2.wav",
    system_prompt="Compare these audio clips."
)

if result["success"]:
    response = result["response"]
    model_used = result["model"]
    print(f"Success with {model_used}: {response}")
else:
    error_message = result["error"]
    print(f"Evaluation failed: {error_message}")
```

### Temperature and Token Control

```python
# Near-deterministic output (0.0 is not supported by some APIs)
result = judge.judge_audio(
    audio1_path="test1.wav",
    audio2_path="test2.wav",
    system_prompt="Compare quality.",
    temperature=0.000001,
    max_tokens=100
)

# More creative output
result = judge.judge_audio(
    audio1_path="test1.wav",
    audio2_path="test2.wav",
    system_prompt="Describe these audio clips creatively.",
    temperature=0.8,
    max_tokens=500
)
```

## Best Practices

### 1. System Prompt Design

```python
# Good: Specific and clear
system_prompt = """
You are an audio quality expert. Compare two audio clips and determine which has:
1. Better speech clarity
2. Less background noise  
3. More natural sound

Respond with: "Audio 1" or "Audio 2" followed by your reasoning.
"""

# Avoid: Vague instructions
system_prompt = "Which audio is better?"
```

### 2. Example Selection

```python
# Use diverse, representative examples
examples = [
    AudioExample(
        audio1_path="clear.wav", 
        audio2_path="muffled.wav", 
        output="Audio 1 - clearer speech"
    ),
    AudioExample(
        audio1_path="noisy.wav", 
        audio2_path="clean.wav", 
        output="Audio 2 - less background noise"
    ),
    AudioExample(
        audio1_path="fast.wav", 
        audio2_path="normal.wav", 
        output="Audio 2 - better pacing"
    )
]
```

### 3. Concatenation Strategy

- Use `no_concatenation` for simple cases or when preserving individual audio quality is crucial
- Use `examples_and_test_concatenation` when you have examples (recommended for best performance)
- Consider model context limits when choosing strategies

### 4. Model Selection

- **GPT-4o Audio**: Best for complex reasoning and detailed analysis
- **Gemini 2.0+**: Good for general comparisons, potentially faster and more cost-effective
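
Model routing follows the name prefix: `gpt-*` models go to the OpenAI API and `gemini-*` models to the Google API. A hypothetical dispatch sketch (`provider_for` is not part of the package API):

```python
def provider_for(model: str) -> str:
    """Map a model name to its API provider (illustrative only)."""
    if model.startswith("gpt-"):
        return "openai"
    if model.startswith("gemini-"):
        return "google"
    raise ValueError(f"Unsupported model: {model}")
```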

## Research and Experiments

This package is based on research in audio evaluation using large audio models. The experimental code and evaluation scripts used in our research are available in the [`experiments/`](https://github.com/Woodygan/AudioJudge/tree/main/experiments) folder for reproducing our results.

### Example Usage

Additional usage examples can be found in the [`examples/`](https://github.com/Woodygan/AudioJudge/tree/main/examples) folder, which wraps some of our experiments into the package for demonstration:

- **[`examples/audiojudge_usage.py`](https://github.com/Woodygan/AudioJudge/tree/main/examples/audiojudge_usage.py)**: Pairwise comparison without instruction
  - Datasets: somos, thaimos, tmhintq, pronunciation, speed, speaker evaluations
- **[`examples/audiojudge_usage_with_instruction.py`](https://github.com/Woodygan/AudioJudge/tree/main/examples/audiojudge_usage_with_instruction.py)**: Pairwise comparison with instruction audio
  - Datasets: System-level comparisons including ChatbotArena and SpeakBench
- **[`examples/audiojudge_usage_pointwise.py`](https://github.com/Woodygan/AudioJudge/tree/main/examples/audiojudge_usage_pointwise.py)**: Pointwise evaluation
  - Datasets: somos, thaimos, and tmhintq

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Support

For issues and questions:
- GitHub Issues: [Create an issue](https://github.com/woodygan/audiojudge/issues)

            
