# elpis

- **Name**: elpis
- **Version**: 0.2.2
- **Home page**: https://github.com/CoEDL/elpis_lib
- **Summary**: A library to perform automatic speech recognition with huggingface transformers.
- **Author**: Harry Keightley
- **Requires Python**: >=3.10,<4.0
- **Keywords**: elpis, huggingface, ASR, automatic speech recognition, CoEDL
- **Upload time**: 2023-10-18 06:11:15
- **Requirements**: No requirements were recorded.
# Elpis Core Library

The Core Elpis Library, providing a quick API to [:hugs: transformers](https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&sort=downloads)
for automatic speech recognition.

You can use the library to:

- Perform standalone inference using a pretrained HuggingFace Transformers (HFT) model.
- Fine-tune a pretrained ASR model on your own dataset.
- Generate text and Elan files from inference results for further analysis.

## Documentation

Documentation for the library can be found [here](https://coedl.github.io/elpis_lib/index.html).

## Dependencies

While we try to be as machine-independent as possible, there are some dependencies
you should be aware of when using this library:

- Processing datasets (`elpis.datasets.processing`) requires `librosa`, which
  depends on having `libsndfile` installed on your computer. If you're using
  elpis within a Docker container, you may have to install `libsndfile`
  manually.
- Transcription (`elpis.transcriber.transcribe`) requires `ffmpeg` if the
  audio you're attempting to transcribe needs to be resampled before it can
  be used. The default sample rate we assume is 16kHz (a sanity-check sketch
  follows this list).
- The preprocessing flow (`elpis.datasets.preprocessing`) is free of external
  dependencies.
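
If you want to verify these dependencies up front, here is a minimal sanity-check sketch. This is not part of the elpis API, and `input.wav` / `input_16k.wav` are hypothetical example paths:

```python
# Sanity-check sketch for the external dependencies listed above.
import shutil

import librosa          # pulls in soundfile, which wraps libsndfile
import soundfile as sf

# Importing soundfile raises an OSError when libsndfile is missing,
# so reaching this point means libsndfile is installed.
print("libsndfile OK")

# ffmpeg is only needed when your audio must be resampled for transcription.
if shutil.which("ffmpeg") is None:
    # Without ffmpeg, you can resample to the assumed 16 kHz yourself:
    samples, _ = librosa.load("input.wav", sr=16_000)  # hypothetical input
    sf.write("input_16k.wav", samples, 16_000)
```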

## Installation

You can install the elpis library with:
`pip3 install elpis`

Note that elpis requires Python 3.10 or newer.

## Usage

Below are some typical usage examples.

### Standalone Inference

```python
from pathlib import Path

from elpis.transcriber.results import build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe

# Perform inference
asr = build_pipeline(pretrained_location="facebook/wav2vec2-base-960h")
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr) # Timed, per-word annotation data

result = build_text(annotations) # Combine annotations to extract all text
print(result)

# Build output files
output_dir = Path("output")  # Directory in which to write results
output_dir.mkdir(exist_ok=True, parents=True)
text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(result)
```

### Fine-tuning a Pretrained Model on a Local Dataset

```python
from pathlib import Path
from typing import List

from elpis.datasets import Dataset
from elpis.datasets.dataset import CleaningOptions
from elpis.datasets.preprocessing import process_batch
from elpis.models import ElanOptions, ElanTierSelector
from elpis.trainer.job import TrainingJob, TrainingOptions
from elpis.trainer.trainer import train
from elpis.transcriber.results import build_elan, build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe

files: List[Path] = [...] # A list of paths to the files to include.

dataset = Dataset(
    name="dataset",
    files=files,
    cleaning_options=CleaningOptions(), # Default cleaning options
    # Elan data extraction info; required if the dataset includes .eaf files.
    elan_options=ElanOptions(
        selection_mechanism=ElanTierSelector.NAME, selection_value="Phrase"
    ),
)

# Setup
tmp_path = Path("...")  # A working directory for generated files

dataset_dir = tmp_path / "dataset"
model_dir = tmp_path / "model"
output_dir = tmp_path / "output"

# Make all directories
for directory in dataset_dir, model_dir, output_dir:
    directory.mkdir(exist_ok=True, parents=True)

# Preprocessing
batches = dataset.to_batches()
for batch in batches:
    process_batch(batch, dataset_dir)

# Train the model
job = TrainingJob(
    model_name="some_model",
    dataset_name="some_dataset",
    options=TrainingOptions(epochs=2, learning_rate=0.001),
    base_model="facebook/wav2vec2-base-960h"
)
train(
    job=job,
    output_dir=model_dir,
    dataset_dir=dataset_dir,
)

# Perform inference with pipeline
asr = build_pipeline(
    pretrained_location=str(model_dir.absolute()),
)
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr)

# Build output files
text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(build_text(annotations))

elan_file = output_dir / "test.eaf"
eaf = build_elan(annotations)
eaf.to_file(str(elan_file))

print('voila ;)')
```

            
