audiotranser


Name: audiotranser
Version: 0.10
Home page: https://github.com/hansalemaos/audiotranser
Summary: Transcribes audio files
Upload time: 2023-08-06 08:39:51
Author: Johannes Fischer
License: MIT
Keywords: audio, transcribe
Requirements: No requirements were recorded.
            
# Transcribes audio files 

## pip install audiotranser 

#### Tested against Windows 10 / Python 3.10 / Anaconda 

Uses the models from https://huggingface.co/ggerganov/whisper.cpp/tree/main

```python
"""
    Args:
        inputfile: path to the input audio file
        small_large: model size ("small" or "large")
        blas: use the BLAS library for faster decoding
        silence_threshold: silence threshold (the example value of -30 suggests a level in dBFS rather than milliseconds); silence removal is skipped if 0 or None
        min_silence_len: minimum silence length in milliseconds
        keep_silence: length of silence to keep after silence removal
        threads: number of threads to use
        processors: number of processors to use
        offset_t: time offset in milliseconds
        offset_n: segment index offset
        duration: duration of audio to process in milliseconds
        max_context: maximum number of text context tokens to store
        max_len: maximum segment length in characters
        best_of: number of best candidates to keep
        beam_size: beam size for beam search
        word_thold: word timestamp probability threshold
        entropy_thold: entropy threshold for decoder failure
        logprob_thold: log probability threshold for decoder failure
        speed_up: speed up audio by x2 (reduced accuracy)
        translate: translate from the source language to English
        diarize: stereo audio diarization
        language: spoken language ('auto' for auto-detect)

    Returns:
        A pandas DataFrame with the results of the inference, or the path to the output CSV file if pd.read_csv fails.
"""

from audiotranser import transcribe_audio
df = transcribe_audio(
    inputfile=r"C:\untitled.wav",
    small_large="large",
    blas=True,
    silence_threshold=-30,  # ignored if == 0 or None
    min_silence_len=500,  # ignored if silence_threshold == 0 or None
    keep_silence=1000,  # ignored if silence_threshold == 0 or None
    threads=3,  # number of threads to use during computation
    processors=1,  # number of processors to use during computation
    offset_t=0,  # time offset in milliseconds
    offset_n=0,  # segment index offset
    duration=0,  # duration of audio to process in milliseconds
    max_context=-1,  # maximum number of text context tokens to store
    max_len=0,  # maximum segment length in characters
    best_of=5,  # number of best candidates to keep
    beam_size=-1,  # beam size for beam search
    word_thold=0.01,  # word timestamp probability threshold
    entropy_thold=2.40,  # entropy threshold for decoder failure
    logprob_thold=-1.00,  # log probability threshold for decoder failure
    speed_up=True,  # speed up audio by x2 (reduced accuracy)
    translate=False,  # translate from the source language to English
    diarize=False,  # stereo audio diarization
    language="en",  # spoken language ('auto' for auto-detect)
)
print(df)
```
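
Because the return value can be either a DataFrame or a path to a CSV file, calling code may want to normalize it before use. The following is a minimal sketch, not part of audiotranser: `rows_from_result` is a hypothetical helper name, and it assumes the fallback CSV is comma-separated, UTF-8 encoded, and has a header row, none of which this README states.

```python
import csv
from pathlib import Path
from typing import Any, Dict, List


def rows_from_result(result: Any) -> List[Dict[str, Any]]:
    """Turn the return value of transcribe_audio into a list of row dicts.

    Hypothetical helper, not part of audiotranser.
    """
    if isinstance(result, (str, Path)):
        # Fallback case: a path to the CSV file was returned.
        # Assumption: comma-separated, UTF-8, with a header row.
        with open(result, newline="", encoding="utf-8") as fh:
            return list(csv.DictReader(fh))
    # Otherwise assume a pandas DataFrame and use its to_dict method.
    return result.to_dict(orient="records")
```

With this in place, `rows_from_result(transcribe_audio(...))` yields one dict per row in either case. Note that values read from the CSV fallback are strings, whereas DataFrame values keep their pandas types.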

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/hansalemaos/audiotranser",
    "name": "audiotranser",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "audio,Transcribe",
    "author": "Johannes Fischer",
    "author_email": "aulasparticularesdealemaosp@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/29/95/9de2149217988d09d7836d3aa2f695f02ae85df1e003e671361083fece2b/audiotranser-0.10.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Transcribes audio files",
    "version": "0.10",
    "project_urls": {
        "Homepage": "https://github.com/hansalemaos/audiotranser"
    },
    "split_keywords": [
        "audio",
        "transcribe"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "380f8c0a2ec09dd91caf445112310574ddd0e217598ccff9a166ff4c66ed37e1",
                "md5": "36dbed10f37d5af710339c97600a26c9",
                "sha256": "5e6d51355d5086f44ce3f2b0a43faffb7af58a2c3de06a960d21a6862dd7d765"
            },
            "downloads": -1,
            "filename": "audiotranser-0.10-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "36dbed10f37d5af710339c97600a26c9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 14182474,
            "upload_time": "2023-08-06T08:39:37",
            "upload_time_iso_8601": "2023-08-06T08:39:37.692337Z",
            "url": "https://files.pythonhosted.org/packages/38/0f/8c0a2ec09dd91caf445112310574ddd0e217598ccff9a166ff4c66ed37e1/audiotranser-0.10-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "29959de2149217988d09d7836d3aa2f695f02ae85df1e003e671361083fece2b",
                "md5": "258564d7a0a32b48ab05b29dd42089d3",
                "sha256": "f60c1f2b32d281365efbcb1ee8a01f2788eb61f6b4e004e6ffb659952a2b4253"
            },
            "downloads": -1,
            "filename": "audiotranser-0.10.tar.gz",
            "has_sig": false,
            "md5_digest": "258564d7a0a32b48ab05b29dd42089d3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 14035543,
            "upload_time": "2023-08-06T08:39:51",
            "upload_time_iso_8601": "2023-08-06T08:39:51.735011Z",
            "url": "https://files.pythonhosted.org/packages/29/95/9de2149217988d09d7836d3aa2f695f02ae85df1e003e671361083fece2b/audiotranser-0.10.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-08-06 08:39:51",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "hansalemaos",
    "github_project": "audiotranser",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "audiotranser"
}
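
The `digests` entries in the raw data can be used to check a downloaded file before installing it. The sketch below is one way to do that with the standard library; `sha256_of` and `matches_expected` are hypothetical helper names, and the expected value is the `sha256` listed for `audiotranser-0.10-py3-none-any.whl`.

```python
import hashlib

# sha256 digest listed above for audiotranser-0.10-py3-none-any.whl
EXPECTED_SHA256 = "5e6d51355d5086f44ce3f2b0a43faffb7af58a2c3de06a960d21a6862dd7d765"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected hex string (case-insensitive)."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_expected` with the path to the downloaded wheel returns `True` only if the file's digest equals the listed value. For the sdist, pass the `sha256` from the second `urls` entry as `expected`.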
        