whisper-eval-serbian

Name	whisper-eval-serbian
Version	0.0.32
home_page	https://aida.guru
Summary	An evaluation framework for Serbian Whisper models.
upload_time	2025-08-16 16:55:22
maintainer	Dejan Čugalj
docs_url	None
author	Dejan Čugalj
requires_python	<4.0,>=3.10
license	Apache-2.0
keywords	tts asr serbian audio ai
requirements	No requirements were recorded.

# AiDA Whisper Evaluation Framework (Serbian)


[An evaluation framework for Serbian Whisper models.](https://aida.guru)


# Whisper Evaluator 🎤

A simple, modular framework to evaluate fine-tuned Whisper models in Python notebooks.

This library runs evaluations on any dataset from the Hugging Face Hub, driven by a single configuration dictionary. It computes WER, CER, BLEU, and ROUGE and automatically logs all results to a file.
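To make the first two metrics concrete, here is a minimal, standard-library sketch of how WER and CER are commonly defined: the Levenshtein edit distance between reference and prediction, divided by the reference length in words or characters. The helper names `edit_distance`, `wer`, and `cer` are illustrative assumptions, not this library's API, and the library may normalize text or aggregate scores differently.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    # prev[j] is the distance between the ref prefix processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty sequence takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, prediction: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, prediction.split()) / len(ref_words)


def cer(reference: str, prediction: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, prediction) / len(reference)


# Hypothetical example pair, not taken from any dataset.
reference = "dobar dan svima"
prediction = "dobar dan svim"
print(f"WER: {wer(reference, prediction):.3f}")
print(f"CER: {cer(reference, prediction):.3f}")
```

One wrong word out of three gives a WER of about 0.333, while the single missing character out of fifteen gives a CER of about 0.067, which is why the two metrics are usually reported together.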

## Installation

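You can install the library directly from GitHub to get the latest updates and features. Make sure you have `git` installed on your system, and note that `your-username` in the URL below is a placeholder. Released versions are also published on PyPI under the name `whisper-eval-serbian`.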

```bash
pip install git+https://github.com/your-username/whisper-evaluator.git
```

## Quickstart

Using the library in a Google Colab or Jupyter Notebook is straightforward.

```python
from whisper_evaluator import Evaluator
import json

# 1. Define your evaluation configuration
config = {
    "model_args": {
        "name_or_path": "openai/whisper-large-v2", # Your fine-tuned model ID
        "device": "cuda"
    },
    "task_args": {
        "dataset_name": "mozilla-foundation/common_voice_11_0",
        "dataset_subset": "sr", # Serbian language
        "dataset_split": "test[:20]", # Use the first 20 samples for a quick demo
        "audio_column": "audio",
        "text_column": "sentence"
    }
}

# 2. Initialize the evaluator
evaluator = Evaluator(config=config)

# 3. Run the evaluation (logs to 'evaluation_log.txt' by default)
detailed_results, metrics = evaluator.run()

# 4. Analyze the results
print("\n--- Final Metrics ---")
# Pretty print the metrics dictionary
print(json.dumps(metrics, indent=2))

print("\n--- Sample of evaluation details ---")
# Print the first 3 results from the list
for i, result in enumerate(detailed_results[:3]):
    print(f"\n--- Example {i+1} ---")
    print(f"Reference:  {result['reference']}")
    print(f"Prediction: {result['prediction']}")
```
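Beyond printing the first few rows, it is often useful to pull out only the rows where the prediction differs from the reference. The sketch below assumes nothing about `detailed_results` except the `reference` and `prediction` keys used in the Quickstart; the sample rows, `normalize`, and `find_mismatches` are hypothetical and exist only for illustration.

```python
import string

# Hypothetical stand-in for the list returned by evaluator.run(); only the
# "reference" and "prediction" keys shown in the Quickstart are assumed.
detailed_results = [
    {"reference": "Dobar dan.", "prediction": "dobar dan"},
    {"reference": "Kako ste danas?", "prediction": "Kako si danas?"},
    {"reference": "Hvala lepo.", "prediction": "Hvala lepo."},
]


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())


def find_mismatches(results):
    """Return (index, reference, prediction) for rows that differ after normalization."""
    return [
        (i, row["reference"], row["prediction"])
        for i, row in enumerate(results)
        if normalize(row["reference"]) != normalize(row["prediction"])
    ]


mismatches = find_mismatches(detailed_results)
exact_match_rate = 1 - len(mismatches) / len(detailed_results)
print(f"Exact match after normalization: {exact_match_rate:.0%}")
for i, ref, pred in mismatches:
    print(f"[{i}] {ref!r} -> {pred!r}")
```

Normalizing case and punctuation before comparing keeps purely cosmetic differences out of the mismatch list, so the remaining rows point at genuine recognition errors.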

# Project Setup

Follow these steps to set up the **AiDA-Whisper-Eval** project.

---

### Using Conda

#### 1. Create a new Conda environment

```bash
conda create --name aida python=3.12 -y
conda activate aida
```

#### 2. Install Poetry

```bash
pip install poetry
```

#### 3. Install project dependencies

Navigate to the project's root directory and run:

```bash
poetry install
```

---

### Using Plain Python

#### 1. Create and activate a virtual environment

```bash
python -m venv venv

# On Linux/macOS
source venv/bin/activate

# On Windows
.\venv\Scripts\activate
```

#### 2. Upgrade pip and install Poetry

```bash
pip install --upgrade pip
pip install poetry
```

#### 3. Install project dependencies

From the project's root directory, run:

```bash
poetry install
```

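#### 4. Set up pre-commit hooks

Install `pre-commit` if it is not already available in the environment:

```bash
pip install pre-commit
```

Then register the Git hooks:

```bash
poetry run pre-commit install
```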

---

### Verifying Installation

Verify the installation by running the test suite:

```bash
# On Linux/macOS
make test

# On Windows
poetry run pytest
```
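If the tests fail to start, one thing worth ruling out is the interpreter version. The package metadata declares `requires_python` as `<4.0,>=3.10`; the following sketch checks the running interpreter against those bounds. The function name `python_is_supported` is an illustrative assumption and not part of the project.

```python
import sys

# Bounds taken from the package metadata: requires_python "<4.0,>=3.10".
MIN_VERSION = (3, 10)
MAX_VERSION_EXCLUSIVE = (4, 0)


def python_is_supported(version=None) -> bool:
    """Return True if the (major, minor) pair falls in the range [3.10, 4.0)."""
    if version is None:
        version = sys.version_info[:2]
    return MIN_VERSION <= tuple(version[:2]) < MAX_VERSION_EXCLUSIVE


current = ".".join(map(str, sys.version_info[:3]))
status = "supported" if python_is_supported() else "NOT supported"
print(f"Python {current} is {status} by whisper-eval-serbian")
```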

Your setup is complete!

Raw data

{
    "_id": null,
    "home_page": "https://aida.guru",
    "name": "whisper-eval-serbian",
    "maintainer": "Dejan \u010cugalj",
    "docs_url": null,
    "requires_python": "<4.0,>=3.10",
    "maintainer_email": "dejan@textintellect.com",
    "keywords": "TTS, ASR, Serbian, Audio, AI",
    "author": "Dejan \u010cugalj",
    "author_email": "dejan@textintellect.com",
    "download_url": "https://files.pythonhosted.org/packages/b5/6c/decdea2d3010a168ac0705cabb0c8bcaf10cbc87d10c3711d73305374b85/whisper_eval_serbian-0.0.32.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "An evaluation framework for Serbian Whisper models.",
    "version": "0.0.32",
    "project_urls": {
        "Documentation": "https://aida.guru/docs",
        "Homepage": "https://aida.guru",
        "Repository": "https://github.com/DeanChugall/AiDA.git"
    },
    "split_keywords": [
        "tts",
        " asr",
        " serbian",
        " audio",
        " ai"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f677550b286c0d54fa99ef3b54395b82fa2eb4eff0cf3c0842a8d20e334752eb",
                "md5": "56823a4979a39a37f6f4fc0910afddc1",
                "sha256": "472a20c8596cf62964690839373144103f6dbe663ba558239d302824690e2f06"
            },
            "downloads": -1,
            "filename": "whisper_eval_serbian-0.0.32-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "56823a4979a39a37f6f4fc0910afddc1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.10",
            "size": 13869,
            "upload_time": "2025-08-16T16:55:20",
            "upload_time_iso_8601": "2025-08-16T16:55:20.852495Z",
            "url": "https://files.pythonhosted.org/packages/f6/77/550b286c0d54fa99ef3b54395b82fa2eb4eff0cf3c0842a8d20e334752eb/whisper_eval_serbian-0.0.32-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b56cdecdea2d3010a168ac0705cabb0c8bcaf10cbc87d10c3711d73305374b85",
                "md5": "cd68b377ff1188b331c3e98d1dbcedbe",
                "sha256": "0f0607f280c6d13e3c14cecf45022d4f029cf7d0eacc657f687d071c1aa851b1"
            },
            "downloads": -1,
            "filename": "whisper_eval_serbian-0.0.32.tar.gz",
            "has_sig": false,
            "md5_digest": "cd68b377ff1188b331c3e98d1dbcedbe",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.10",
            "size": 13385,
            "upload_time": "2025-08-16T16:55:22",
            "upload_time_iso_8601": "2025-08-16T16:55:22.471869Z",
            "url": "https://files.pythonhosted.org/packages/b5/6c/decdea2d3010a168ac0705cabb0c8bcaf10cbc87d10c3711d73305374b85/whisper_eval_serbian-0.0.32.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-16 16:55:22",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "DeanChugall",
    "github_project": "AiDA",
    "github_not_found": true,
    "lcname": "whisper-eval-serbian"
}
