# Elpis Core Library
The Core Elpis Library, providing a quick API to [:hugs: transformers](https://huggingface.co/models?pipeline_tag=automatic-speech-recognition&sort=downloads)
for automatic speech recognition.
You can use the library to:
- Perform standalone inference using a pretrained Hugging Face Transformers (HFT) model.
- Fine-tune a pretrained ASR model on your own dataset.
- Generate text and Elan files from inference results for further analysis.
## Documentation
Documentation for the library can be found [here](https://coedl.github.io/elpis_lib/index.html).
## Dependencies
While we try to be as machine-independent as possible, there are some dependencies
you should be aware of when using this library:
- Processing datasets (`elpis.datasets.processing`) requires `librosa`, which
depends on having `libsndfile` installed on your computer. If you're using
elpis within a docker container, you may have to manually install
`libsndfile`.
- Transcription (`elpis.transcription.transcribe`) requires `ffmpeg` if the
  audio you're attempting to transcribe needs to be resampled before it can
  be used. The default sample rate we assume is 16 kHz.
- The preprocessing flow (`elpis.datasets.preprocessing`) is free of external
dependencies.
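Since transcription assumes 16 kHz input by default, it can be handy to check a file's sample rate before transcribing. Below is a minimal sketch using only the Python standard library; the `needs_resampling` helper is our own illustration, not part of the elpis API:

```python
import wave
from pathlib import Path

TARGET_SAMPLE_RATE = 16_000  # The default rate elpis assumes


def needs_resampling(audio_path: Path, target_rate: int = TARGET_SAMPLE_RATE) -> bool:
    """Return True if a .wav file's sample rate differs from the target."""
    with wave.open(str(audio_path), "rb") as wav_file:
        return wav_file.getframerate() != target_rate
```

If this returns `True`, resample the file (e.g. with `ffmpeg -i in.wav -ar 16000 out.wav`) before passing it to the transcriber.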
## Installation
You can install the elpis library with:
`pip3 install elpis`
## Usage
Below are some typical usage examples.
### Standalone Inference
```python
from pathlib import Path
from elpis.transcriber.results import build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe
# Perform inference
asr = build_pipeline(pretrained_location="facebook/wav2vec2-base-960h")
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr)  # Timed, per-word annotation data

result = build_text(annotations)  # Combine annotations to extract all text
print(result)

# Build output files
output_dir = Path("output")  # Directory to write results into
output_dir.mkdir(exist_ok=True, parents=True)
text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(result)
```
### Fine-tuning a Pretrained Model on Local Dataset
```python
from pathlib import Path
from typing import List
from elpis.datasets import Dataset
from elpis.datasets.dataset import CleaningOptions
from elpis.datasets.preprocessing import process_batch
from elpis.models import ElanOptions, ElanTierSelector
from elpis.trainer.job import TrainingJob, TrainingOptions
from elpis.trainer.trainer import train
from elpis.transcriber.results import build_elan, build_text
from elpis.transcriber.transcribe import build_pipeline, transcribe
files: List[Path] = [...] # A list of paths to the files to include.
dataset = Dataset(
    name="dataset",
    files=files,
    cleaning_options=CleaningOptions(),  # Default cleaning options
    # Elan data extraction info: required if the dataset includes .eaf files.
    elan_options=ElanOptions(
        selection_mechanism=ElanTierSelector.NAME, selection_value="Phrase"
    ),
)

# Setup
tmp_path = Path('...')

dataset_dir = tmp_path / "dataset"
model_dir = tmp_path / "model"
output_dir = tmp_path / "output"

# Make all directories
for directory in dataset_dir, model_dir, output_dir:
    directory.mkdir(exist_ok=True, parents=True)

# Preprocessing
batches = dataset.to_batches()
for batch in batches:
    process_batch(batch, dataset_dir)

# Train the model
job = TrainingJob(
    model_name="some_model",
    dataset_name="some_dataset",
    options=TrainingOptions(epochs=2, learning_rate=0.001),
    base_model="facebook/wav2vec2-base-960h",
)
train(
    job=job,
    output_dir=model_dir,
    dataset_dir=dataset_dir,
)

# Perform inference with the fine-tuned pipeline
asr = build_pipeline(
    pretrained_location=str(model_dir.absolute()),
)
audio = Path("<to_some_audio_file.wav>")
annotations = transcribe(audio, asr)

# Build output files
text_file = output_dir / "test.txt"
with open(text_file, "w") as output_file:
    output_file.write(build_text(annotations))

elan_file = output_dir / "test.eaf"
eaf = build_elan(annotations)
eaf.to_file(str(elan_file))
print('voila ;)')
```