auditok


:Name: auditok
:Version: 0.3.0
:Home page: http://github.com/amsehili/auditok/
:Summary: A module for Audio/Acoustic Activity Detection
:Upload time: 2024-11-01 10:14:37
:Author: Amine Sehili
:License: MIT

.. image:: https://raw.githubusercontent.com/amsehili/auditok/f2e212068b6d5bfb7bf3932bc3a9cad01e03759d/doc/figures/auditok-logo.png
    :align: center

.. image:: https://github.com/amsehili/auditok/actions/workflows/ci.yml/badge.svg
    :target: https://github.com/amsehili/auditok/actions/workflows/ci.yml/
    :alt: Build Status

.. image:: https://codecov.io/github/amsehili/auditok/graph/badge.svg?token=0rwAqYBdkf
    :target: https://codecov.io/github/amsehili/auditok

.. image:: https://readthedocs.org/projects/auditok/badge/?version=latest
    :target: http://auditok.readthedocs.org/en/latest/?badge=latest
    :alt: Documentation Status

``auditok`` is an **Audio Activity Detection** tool that processes online data
(from an audio device or standard input) and audio files. It can be used via the command line or through its API.

Full documentation is available on `Read the Docs <https://auditok.readthedocs.io/en/latest/>`_.

Installation
------------

``auditok`` requires Python 3.7 or higher.

To install the latest stable version, use pip:

.. code:: bash

    pip install auditok

To install the latest development version from GitHub:

.. code:: bash

    pip install git+https://github.com/amsehili/auditok

Alternatively, clone the repository and install it manually:

.. code:: bash

    git clone https://github.com/amsehili/auditok.git
    cd auditok
    pip install .

Basic example
-------------

Here's a simple example of using ``auditok`` to detect audio events:

.. code:: python

    import auditok

    # `split` returns a generator of AudioRegion objects
    audio_events = auditok.split(
        "audio.wav",
        min_dur=0.2,     # Minimum duration of a valid audio event in seconds
        max_dur=4,       # Maximum duration of an event
        max_silence=0.3, # Maximum tolerated silence duration within an event
        energy_threshold=55 # Detection threshold
    )

    for i, r in enumerate(audio_events):
        # AudioRegions returned by `split` have defined 'start' and 'end' attributes
        print(f"Event {i}: {r.start:.3f}s -- {r.end:.3f}s")

        # Play the audio event
        r.play(progress_bar=True)

        # Save the event with start and end times in the filename
        filename = r.save("event_{start:.3f}-{end:.3f}.wav")
        print(f"Event saved as: {filename}")

Example output:

.. code:: bash

    Event 0: 0.700s -- 1.400s
    Event saved as: event_0.700-1.400.wav
    Event 1: 3.800s -- 4.500s
    Event saved as: event_3.800-4.500.wav
    Event 2: 8.750s -- 9.950s
    Event saved as: event_8.750-9.950.wav
    Event 3: 11.700s -- 12.400s
    Event saved as: event_11.700-12.400.wav
    Event 4: 15.050s -- 15.850s
    Event saved as: event_15.050-15.850.wav
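
To make the roles of ``min_dur``, ``max_dur`` and ``max_silence`` more concrete,
here is a simplified sketch that groups per-frame activity flags into events.
It is not ``auditok``'s tokenizer: ``group_active_frames`` and its parameters
are hypothetical, durations are counted in frames rather than seconds, and
trailing silence is always dropped.

.. code:: python

    def group_active_frames(active, min_len, max_len, max_gap):
        """Group active frames into (start, end) events, end exclusive.

        Hypothetical helper for illustration only (not part of auditok).
        """
        events = []
        start = None        # first frame of the current event, if any
        last_active = None  # most recent active frame of the current event

        def close_event():
            end = last_active + 1       # drop trailing inactive frames
            if end - start >= min_len:  # discard events that are too short
                events.append((start, end))

        for i, is_active in enumerate(active):
            if start is None:
                if is_active:
                    start = last_active = i
                continue
            if is_active:
                last_active = i
            gap = i - last_active   # inactive frames since last activity
            length = i - start + 1  # frames consumed by the current event
            if gap > max_gap or length >= max_len:
                close_event()
                start = last_active = None
        if start is not None:
            close_event()
        return events

    flags = [0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1]
    print(group_active_frames(flags, min_len=2, max_len=5, max_gap=1))
    # [(1, 5), (11, 16), (16, 18)]

The isolated active frame at index 8 is discarded because it is shorter than
``min_len``, and the long run starting at index 11 is cut once it reaches
``max_len`` frames.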

Split and plot
--------------

Visualize the audio signal with detected events:

.. code:: python

    import auditok
    region = auditok.load("audio.wav") # Returns an AudioRegion object
    regions = region.split_and_plot(...) # Or simply use `region.splitp()`

Example output:

.. image:: https://raw.githubusercontent.com/amsehili/auditok/f2e212068b6d5bfb7bf3932bc3a9cad01e03759d/doc/figures/example_1.png

Split an audio stream and re-join (glue) audio events with silence
------------------------------------------------------------------

The following code detects audio events within an audio stream, then inserts
1 second of silence between them to create an audio stream with pauses:

.. code:: python

    # Create a 1-second silent audio region
    # Audio parameters must match the original stream
    from auditok import split, make_silence
    silence = make_silence(duration=1,
                           sampling_rate=16000,
                           sample_width=2,
                           channels=1)
    events = split("audio.wav")
    audio_with_pauses = silence.join(events)

Alternatively, use ``split_and_join_with_silence``:

.. code:: python

    from auditok import split_and_join_with_silence
    audio_with_pauses = split_and_join_with_silence(silence_duration=1, input="audio.wav")
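
As a rough picture of why the silence parameters must match the stream, the
sketch below does the same gluing on raw bytes. It assumes signed PCM, where
zero bytes are silence. ``make_silence_bytes`` is a hypothetical helper and is
not how ``auditok`` implements ``make_silence``.

.. code:: python

    def make_silence_bytes(duration, sampling_rate, sample_width, channels):
        """Return `duration` seconds of silence as raw signed-PCM bytes.

        Hypothetical helper for illustration only (not part of auditok).
        """
        n_samples = int(duration * sampling_rate)
        return b"\x00" * (n_samples * sample_width * channels)

    # 16 kHz, 16-bit, mono: one second of audio is 32000 bytes
    silence = make_silence_bytes(1, 16000, 2, 1)

    # Two fake events of 0.5 s and 0.25 s using the same audio parameters
    events = [b"\x01\x00" * 8000, b"\x02\x00" * 4000]

    # bytes.join places the separator between consecutive items only
    joined = silence.join(events)
    print(len(joined) / (16000 * 2 * 1))  # 1.75 (seconds)

If the silence were generated with a different sampling rate, sample width or
channel count, its byte length would no longer correspond to one second of the
stream it is inserted into.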

Export an ``AudioRegion`` as a ``numpy`` array
----------------------------------------------

.. code:: python

    from auditok import load, AudioRegion
    audio = load("audio.wav") # or use `AudioRegion.load("audio.wav")`
    x = audio.numpy()
    assert x.shape[0] == audio.channels
    assert x.shape[1] == len(audio)
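
The assertions above imply a ``(channels, samples)`` layout. As a rough
illustration of that layout without ``numpy``, the sketch below de-interleaves
raw PCM bytes using only the standard library. It assumes 16-bit little-endian
interleaved samples. ``deinterleave`` is a hypothetical helper, not part of
``auditok``.

.. code:: python

    import struct

    def deinterleave(pcm_bytes, channels):
        """Split interleaved 16-bit PCM into one list of samples per channel.

        Hypothetical helper for illustration only (not part of auditok).
        """
        n_values = len(pcm_bytes) // 2
        values = struct.unpack(f"<{n_values}h", pcm_bytes[:2 * n_values])
        # Channel c owns every `channels`-th value, starting at offset c
        return [list(values[c::channels]) for c in range(channels)]

    # Three stereo frames: (left, right) = (1, -1), (2, -2), (3, -3)
    stereo = struct.pack("<6h", 1, -1, 2, -2, 3, -3)
    x = deinterleave(stereo, channels=2)
    print(x)  # [[1, 2, 3], [-1, -2, -3]]

Here ``len(x)`` plays the role of ``x.shape[0]`` (channels) and ``len(x[0])``
the role of ``x.shape[1]`` (samples per channel).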


Limitations
-----------

The detection algorithm is based on audio signal energy. While it performs well
in low-noise environments (e.g., podcasts, language lessons, or quiet recordings),
performance may drop in noisy settings. Additionally, the algorithm does not
distinguish between speech and other sounds, so it is not suitable for Voice
Activity Detection in multi-sound environments.
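
The sketch below illustrates this limitation with a minimal energy detector.
It is an assumption-laden illustration rather than ``auditok``'s
implementation: energy is taken as ``20 * log10(RMS)`` of each frame, and
``frame_energy_db`` and ``active_frames`` are hypothetical names.

.. code:: python

    import math

    def frame_energy_db(samples):
        """Log energy of one frame: 20 * log10(RMS), floored to avoid log(0)."""
        rms = math.sqrt(sum(s * s for s in samples) / len(samples))
        return 20 * math.log10(max(rms, 1e-10))

    def active_frames(samples, frame_len, threshold_db):
        """Return one flag per full frame: True if its energy >= threshold."""
        n_frames = len(samples) // frame_len
        return [
            frame_energy_db(samples[k * frame_len:(k + 1) * frame_len]) >= threshold_db
            for k in range(n_frames)
        ]

    quiet = [10, -10] * 4       # RMS 10, i.e. 20 dB
    speech = [1000, -1000] * 4  # RMS 1000, i.e. 60 dB
    noise = [800, -800] * 4     # RMS 800, i.e. about 58 dB

    flags = active_frames(quiet + speech + noise, frame_len=8, threshold_db=55)
    print(flags)  # [False, True, True]

The loud noise frame passes the threshold just like the speech frame, because
energy alone carries no information about what kind of sound produced it.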

License
-------

MIT.

            
