audalign

Name: audalign
Version: 1.2.4
Home page: http://github.com/benfmiller/audalign
Summary: Audio Alignment and Recognition in Python
Upload time: 2024-01-08 04:51:22
Maintainer: Ben Miller
Author: Ben Miller
License: MIT
Keywords: python, audio, align, alignment, fingerprinting, music
# Audalign

Package for processing and aligning audio files using audio fingerprinting, cross-correlation, cross-correlation with spectrograms, or visual alignment techniques.

![gif of audalign aligning](audalign.gif)

This package offers tools to align many recordings of the same event. It has two main purposes: to accurately align recordings, and to process the audio files prior to alignment. All main functions are accessed through the `audalign.__init__` module. The recognizers themselves are objects in the recognizer directory, which in turn have configurations in the config directories.

Alignments are primarily accomplished with fingerprinting; where fingerprinting fails, correlation, correlation with spectrograms, and visual alignment techniques can be used to get a closer result. After an initial alignment is found, it can be passed to `fine_align`, which finds smaller alignments relative to the main one.

---

Each alignment technique offers different degrees of adjustment for accuracy. Fingerprinting parameters can generally be set to give consistent results using its config's `set_accuracy` method. Visual alignment has many parameters and requires case-by-case adjustment. Correlation parameters focus on the sample rate and scipy's `find_peaks`.

[Noisereduce](https://timsainburg.com/noise-reduction-python.html) is very useful for this application, and a wrapper is implemented for ease of use. Uniformly leveling with `uniform_level_file` prior to noise reduction boosts quiet but important sound features.
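The idea behind uniform leveling can be sketched in a few lines of plain Python (an illustration of the concept, not audalign's actual implementation): scale each fixed-width window of samples so its peak matches a common target level, which boosts quiet passages relative to loud ones.

```python
# Hypothetical sketch of uniform leveling: normalize each window's peak
# to a shared target so quiet sections are amplified.

def uniform_level(samples, window, target=1.0):
    """Scale each `window`-sample chunk so its peak equals `target`."""
    leveled = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        peak = max(abs(s) for s in chunk)
        gain = target / peak if peak > 0 else 1.0
        leveled.extend(s * gain for s in chunk)
    return leveled

# A quiet window followed by a loud one: after leveling, both peak at the target.
leveled = uniform_level([0.1, -0.1, 0.1, 0.8, -0.8, 0.8], window=3)
```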

Alignment and recognition results are returned as a dictionary. If an output directory is given, silence is placed before all target files so that they are automatically aligned and written to the output directory, along with an audio file containing the combined sum. A `rankings` key is included in each alignment and recognition result. This helps determine the strength of the alignment, but is not definitive proof. Values range from 1 to 10.
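The "silence before each target file" step can be illustrated with a small sketch (an assumption about the approach, not audalign's exact code): given each file's start offset relative to the earliest file, prepend that many seconds of silence so all files share a common timeline.

```python
# Hypothetical illustration: prepend leading silence so files line up.

def pad_with_silence(tracks, offsets_s, sample_rate):
    """Prepend `offsets_s[name]` seconds of silence to each track."""
    aligned = {}
    for name, samples in tracks.items():
        pad = [0.0] * round(offsets_s[name] * sample_rate)
        aligned[name] = pad + list(samples)
    return aligned

tracks = {"a.wav": [0.5, 0.5], "b.wav": [0.5, 0.5]}
# b.wav starts 0.25 s later, so at 8 Hz it gains 2 samples of leading silence
aligned = pad_with_silence(tracks, {"a.wav": 0.0, "b.wav": 0.25}, sample_rate=8)
```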

---

All formats that ffmpeg or libav support are supported here.

All fingerprints are stored in memory in the `FingerprintRecognizer` and must be saved to disk with the `save_fingerprinted_files` method in order to persist them.

Regular file recognition can also be done with Audalign, similar to [dejavu](https://github.com/worldveil/dejavu).

For more details on implementation and results, see the [wiki](https://github.com/benfmiller/audalign/wiki)!

## Installation

Install from PyPI:

Don't forget to install ffmpeg/avlib (see "Getting ffmpeg set up" below)!

```bash
pip install audalign
```

OR

```bash
git clone https://github.com/benfmiller/audalign.git
cd audalign/
pip install .
```

OR

Download and extract audalign, then run

```bash
pip install .
```

in the extracted directory.

## Recognizers

There are currently four included recognizers, each with their own config objects.

```python
import audalign as ad

fingerprint_rec = ad.FingerprintRecognizer()
correlation_rec = ad.CorrelationRecognizer()
cor_spec_rec = ad.CorrelationSpectrogramRecognizer()
visual_rec = ad.VisualRecognizer()

fingerprint_rec.config.set_accuracy(3)
# recognizer.config.some_item
```

For more info about the configuration objects, check out the [wiki](https://github.com/benfmiller/audalign/wiki) or the config objects themselves; they are well commented.

Recognizers are then passed to recognize and align functions.

```python
results = ad.align("target/folder/", recognizer=fingerprint_rec)
results = ad.align("target/folder/", recognizer=correlation_rec)
results = ad.align("target/folder/", recognizer=cor_spec_rec)
results = ad.align("target/folder/", recognizer=visual_rec)
results = ad.recognize("target/file1", "target/file2", recognizer=fingerprint_rec)
results = ad.recognize("target/file1", "target/folder", recognizer=fingerprint_rec)
# or
results = ad.target_align(
    "target/files",
    "target/folder/",
    destination_path="write/alignments/to/folder",
    recognizer=fingerprint_rec
)
# or
results = ad.align_files(
    "target/file1",
    "target/file2",
    destination_path="write/alignments/to/folder",
    recognizer=correlation_rec
)

# results can then be sent to fine_align
fine_results = ad.fine_align(
    results,
    recognizer=cor_spec_rec,
)
```

Correlation is more precise than fingerprinting and always returns a best alignment, whereas fingerprinting can return no alignment at all. `max_lags` is very important for fine aligning. `locality` can be very useful for all alignments and recognitions.
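The correlation idea can be sketched in pure Python (a minimal illustration, not audalign's implementation): slide one signal against the other within a `max_lags` window and pick the lag with the highest dot product. Note that this always yields a "best" lag, even for unrelated signals, which is why correlation never fails outright the way fingerprinting can.

```python
# Hypothetical sketch of correlation-based alignment: brute-force search
# for the lag that maximizes the dot product between the two signals.

def best_lag(a, b, max_lags):
    """Return the shift of `b` (in samples) that best matches `a`."""
    def score(lag):
        return sum(a[i] * b[i - lag] for i in range(len(a))
                   if 0 <= i - lag < len(b))
    return max(range(-max_lags, max_lags + 1), key=score)

a = [0, 0, 0, 1, 2, 1, 0, 0]
b = [0, 1, 2, 1, 0, 0, 0, 0]  # same pulse, 2 samples earlier
print(best_lag(a, b, max_lags=4))  # → 2
```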

## Other Functions

```python
# wrapper for timsainb/noisereduce
ad.remove_noise_file(
    "target/file",
    5,  # noise start in seconds
    20,  # noise end in seconds
    "destination/file",
    alt_noise_filepath="different/sound/file",
    prop_decrease=0.5,  # to reduce noise by half
)

ad.remove_noise_directory(
    "target/directory/",
    "noise/file",
    5,  # noise start in seconds
    20,  # noise end in seconds
    "destination/directory",
    prop_decrease=0.5,  # to reduce noise by half
)

ad.uniform_level_file(
    "target/file",
    "destination",
    mode="normalize",
    width=5,
)

ad.plot("file.wav")  # Plots spectrogram with peaks overlaid
ad.convert_audio_file("audio.wav", "audio.mp3")  # Also converts video files to audio files
ad.get_metadata("file.wav")  # Returns metadata from ffmpeg/avlib
```

You can easily recalculate the alignment shifts from previous results using `recalc_shifts`, then write those shifts using `write_shifts_from_results`. `write_shifts_from_results` also lets you use different source files for the alignments.

```python
recalculated_results = ad.recalc_shifts(older_results)
ad.write_shifts_from_results(recalculated_results, "destination", "source_files_folder_or_file_list")
```

## Fingerprinting

Fingerprinting is only used in the `FingerprintRecognizer` object. Alignments are not independent, so fingerprints created before alignments will be used for the alignment. The exception is fine aligning, where new fingerprints are always created.

Running recognitions will fingerprint all files in the recognition that are not already fingerprinted.

```python
fingerprint_rec = ad.FingerprintRecognizer()

fingerprint_rec.fingerprint_file("test_file.wav")

# or

fingerprint_rec.fingerprint_directory("audio/directory")
```
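The caching behavior described above can be sketched as follows (an illustration of the idea, not audalign's internals): keep fingerprints keyed by filename and only compute them for files not seen before, so repeated recognitions reuse earlier work.

```python
# Hypothetical sketch of fingerprint reuse: compute once per file, then
# serve subsequent requests from the in-memory store.

class FingerprintCache:
    def __init__(self):
        self.fingerprints = {}  # filename -> fingerprint data
        self.computes = 0       # how many expensive computations ran

    def fingerprint(self, filename):
        if filename not in self.fingerprints:
            self.computes += 1
            # stand-in for the real (expensive) fingerprinting step
            self.fingerprints[filename] = f"fingerprint-of-{filename}"
        return self.fingerprints[filename]

cache = FingerprintCache()
cache.fingerprint("a.wav")  # computed
cache.fingerprint("a.wav")  # reused, not recomputed
```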

Fingerprints are stored in `fingerprint_rec` and can be saved with

```python
fingerprint_rec.save_fingerprinted_files("save_file.json") # or .pickle
# or loaded with
fingerprint_rec.load_fingerprinted_files("save_file.json") # or .pickle
```

## Resources and Tools

For more tools to align audio and video files, see [forart/HyMPS's collection of alignment resources](https://github.com/forart/HyMPS/blob/main/A_Tools.md#alignmentsynch-).

[forart/HyMPS](https://github.com/forart/HyMPS/tree/main) also has many other audio/video resources.


## Getting ffmpeg set up

You can use **ffmpeg or libav**.

Mac (using [homebrew](http://brew.sh)):

```bash
# ffmpeg
brew install ffmpeg --with-libvorbis --with-sdl2 --with-theora

####    OR    #####

# libav
brew install libav --with-libvorbis --with-sdl --with-theora
```

Linux (using apt):

```bash
# ffmpeg
apt-get install ffmpeg libavcodec-extra

####    OR    #####

# libav
apt-get install libav-tools libavcodec-extra
```

Windows:

1. Download and extract ffmpeg from [Windows binaries provided here](https://ffmpeg.org/download.html).
2. Add the ffmpeg `/bin` folder to your PATH environment variable

OR

1. Download and extract libav from [Windows binaries provided here](http://builds.libav.org/windows/).
2. Add the libav `/bin` folder to your PATH environment variable

            
