:toc:
:toc-placement!:
= wav2vec
toc::[]
== Introduction
`wav2vec` is a Python script and package for converting waveform files (WAV or AIFF) to vector graphics (SVG or PostScript). Use cases include using an audio waveform as an element in a graphic design or including a waveform in a document.
Note:: This project is completely unrelated to the https://arxiv.org/abs/1904.05862[wav2vec speech recognition model] (which was published after this tool).
== Features
* Portable: runs on Python 2.7+ and Python 3 and does not depend on any third-party packages.
** Python 3.13 removed several standard-library modules this tool relies on (see https://peps.python.org/pep-0594/), but they are still available on PyPI (as `standard-aifc` and `standard-sndhdr`) and will be installed automatically if you install `wav2vec` with uv or pip.
* Supported PCM input file formats:
** 8-bit signed AIFF
** 8-bit unsigned WAV
** 16-bit signed WAV and AIFF
** 32-bit signed WAV and AIFF
** Floating point WAV files are not supported because they are not yet supported by the Python `wave` module (https://github.com/cristoper/wav2vec/issues/5)
* Input file format is automatically detected and handled (the file name/extension is unimportant)
* Output file formats:
** Scalable Vector Graphics (SVG)
** PostScript
** Comma-Separated Values (CSV)
* Easy to write a custom output formatter
* Options to scale the output data
* Can process input files in chunks so large files can be processed with minimal memory
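
The automatic format detection listed above works because WAV and AIFF files identify themselves in their first 12 bytes, so the file name never needs to be consulted. The sketch below illustrates that idea only; it is not the tool's own code (which relies on the `sndhdr` module), and the `sniff_container` name is hypothetical.

[source, python]
----
import io


def sniff_container(stream):
    """Return 'wav', 'aiff', or None based on the first 12 bytes.

    Hypothetical helper for illustration; not part of wav2vec.
    """
    header = stream.read(12)
    if len(header) < 12:
        return None
    # WAV: a RIFF container whose form type is WAVE.
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # AIFF (or compressed AIFF-C): an IFF FORM container.
    if header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "aiff"
    return None


# The extension plays no part: only the leading bytes are inspected.
fake_wav = io.BytesIO(b"RIFF\x00\x00\x00\x00WAVEfmt ")
print(sniff_container(fake_wav))  # wav
----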
== Install
=== From PyPI with PIP
The easiest way to install `wav2vec` is from the Python Package Index, using either `uv` or `pip`:
[source, sh]
----
$ uv tool install wav2vec
----
or
[source, sh]
----
$ pip install wav2vec
----
Depending on your system, in order to install in the Python 3 path, you may have to use `pip3` instead of `pip`.
=== From git repo
Alternatively, clone the git repository:
[source, sh]
----
$ git clone https://github.com/cristoper/wav2vec.git
$ cd wav2vec
----
If you are running Python >=3.13 then you'll also need to install the `aifc` and `sndhdr` modules:
[source, sh]
----
$ pip install standard-aifc standard-sndhdr
----
or on Debian:
[source, sh]
----
$ apt install python3-standard-aifc python3-standard-sndhdr
----
Now you can run `wav2vec.py` directly:
[source, sh]
----
$ python wav2vec.py -h
----
Or install the package with `pip`:
[source, sh]
----
$ pip install .
$ wav2vec -h
----
== Usage
Once the package is installed (see above), the command can be invoked as `wav2vec`. It takes an input file and writes the result (SVG by default) to stdout:
[source, sh]
----
$ wav2vec filename.wav > filename.svg
----
Run `wav2vec -h` to get a usage summary:
----
usage: wav2vec [-h] [--format {PostScript,SVG,CSV}] [--width WIDTH]
               [--height HEIGHT] [--stream BS] [--downtoss N]
               [--log {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
               filename

Convert WAV and AIFF files to vector (SVG, PostScript, CSV) graphics.

positional arguments:
  filename              The WAV file to read

optional arguments:
  -h, --help            show this help message and exit
  --format {PostScript,SVG,CSV}, -f {PostScript,SVG,CSV}
                        The output format, one of: SVG, CSV, PostScript.
                        Default is SVG.
  --width WIDTH         Maximum width of generated SVG (graphic will be scaled
                        down to this size in px)
  --height HEIGHT       Maximum height of generated SVG (graphic will be
                        scaled down to this size in px). Note that this scales
                        according to the highest possible amplitude (given the
                        sample bit depth), not the highest amplitude that
                        actually occurs in the data.
  --stream BS           Stream the input file size in chunks (of BS number of
                        frames at a time) and process/format each chunk
                        separately. Useful for conserving memory when
                        processing large files, but note that multi-channel
                        paths will be split up into BS-sized chunks. By
                        default BS=0, which causes the entire file to be read
                        into memory before processing.
  --downtoss N          Downsample by keeping only 1 out of every N samples.
  --log {DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Set the logging level.

The output is sent to stdout.
----
=== Options
==== Output format
The `--format` flag sets the output format. `wav2vec` includes three formatters: `SVG` (default if no `--format` is given), `PostScript`, and `CSV`.
[source, sh]
----
$ wav2vec filename.wav --format PostScript > output.ps
----
==== Scale output
Use the `--width` and `--height` options to scale the output so that its maximum bounds are equal to or less than the values following the flags. In SVG these values are pixels ("user units"); in PostScript the values are interpreted as pts (1/72 of an inch). By default (if the flags are not given), the width is set to 1000 and the height to 500.
[source, sh]
----
$ wav2vec filename.wav --width 500 --height 350 > output.svg
----
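
As the help text notes, height is scaled against the largest amplitude the bit depth can represent, not the loudest sample in the file. A rough sketch of that arithmetic follows. It assumes signed samples and a waveform centred vertically so that full scale maps to half the height; `scale_sample` is a hypothetical name, and the tool's actual formula may differ.

[source, python]
----
def scale_sample(sample, bit_depth, height):
    """Map a signed PCM sample to a vertical offset within +/- height/2.

    Hypothetical helper: assumes full scale maps to half the output height.
    """
    full_scale = 2 ** (bit_depth - 1)  # 32768 for 16-bit audio
    return sample * (height / 2.0) / full_scale


# A half-scale 16-bit sample lands a quarter of the height from the centre.
print(scale_sample(16384, 16, 500))   # 125.0
print(scale_sample(-32768, 16, 500))  # -250.0
----

Under this assumption a quiet recording stays small in the output however large `--height` is, because the divisor depends only on the bit depth.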
==== Stream input file
By default, `wav2vec` reads the entire input file into memory and then streams the output to stdout as it processes it. Passing the `--stream` flag will cause `wav2vec` to process the input file in chunks. This can be useful if the input file is very big and won't fit into available memory. The `--stream` flag requires one argument, the number of frames to read and process at a time (each frame includes one sample from each channel). A value of around 1024 seems to work well.
[source, sh]
----
$ wav2vec filename.aiff --stream 1024 > output.svg
----
Note that using the `--stream` flag on files with multiple channels will result in non-continuous paths in the output (because channel data is interleaved in WAV/AIFF files).
Note also that converting very large audio files to SVG may not be practical: most SVG editors will not handle paths with hundreds of thousands or millions of points well.
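
To make the chunked reading concrete, here is a minimal sketch using only the standard library's `wave` module. It shows the general technique of reading a fixed number of frames at a time, not `wav2vec`'s own implementation; `iter_chunks` is a hypothetical name, and the in-memory file exists only to keep the example self-contained.

[source, python]
----
import io
import wave


def iter_chunks(wav_file, block_size):
    """Yield raw frame bytes, at most block_size frames at a time."""
    while True:
        frames = wav_file.readframes(block_size)
        if not frames:
            break
        yield frames


# Build a tiny mono 16-bit WAV in memory (2500 silent frames).
buf = io.BytesIO()
with wave.open(buf, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(8000)
    writer.writeframes(b"\x00\x00" * 2500)
buf.seek(0)

with wave.open(buf, "rb") as reader:
    # Two bytes per frame here, so divide to get frames per chunk.
    chunk_sizes = [len(chunk) // 2 for chunk in iter_chunks(reader, 1024)]

print(chunk_sizes)  # [1024, 1024, 452]
----

Only one chunk is held in memory at a time, which is why memory use stays flat regardless of file length.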
==== Downsampling
The `--downtoss N` flag will keep only 1 out of every N samples. This is a brutal form of downsampling which will clobber high frequencies and add aliasing noise. It's best to instead downsample in your waveform recorder/editor before processing (or in your drawing program after processing).
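
The effect is easy to reproduce with a list slice. The sketch below is an illustration of naive decimation, not the tool's code; it assumes the first sample of each group of N is the one kept, and the function name merely echoes the flag.

[source, python]
----
def downtoss(samples, n):
    """Keep only the first of every n samples (no low-pass filtering)."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return samples[::n]


print(downtoss([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 3))  # [0, 3, 6, 9]

# Aliasing in miniature: the fastest possible oscillation collapses
# into a constant when every second sample is thrown away.
print(downtoss([1, -1] * 4, 2))  # [1, 1, 1, 1]
----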
=== API
You can also `import wav2vec` in order to convert wave files to the supported output formats in your own Python scripts. The package provides two main classes: `WavDecoder` and the abstract `Formatter` (and the concrete implementations: `SVGFormatter`, `PSFormatter`, and `CSVFormatter`). The documentation is currently contained in the source files; look at link:./wav2vec/main.py[main.py] for an example of usage.
The `WavDecoder` class wraps the standard library's `wave` and `aifc` modules and provides an easy way to read and decode WAV/AIFF files. Use it as a context manager to ensure `close()` is called. Use it as an iterator to process all frames:
[source, python]
----
>>> from wav2vec import WavDecoder  # assumes the class is exported at package level
>>> wd = WavDecoder('filename')
>>> with wd as data:
...     for frames in data:
...         print(frames)
----
See link:./wav2vec/WavDecoder.py[wav2vec/WavDecoder.py].
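
For a sense of what decoding involves, the sketch below turns raw WAV frame bytes into signed integers with the standard `struct` module. It is an illustration under stated assumptions, not the `WavDecoder` implementation: it covers little-endian WAV data only (AIFF is big-endian), and `decode_wav_frames` is a hypothetical name.

[source, python]
----
import struct


def decode_wav_frames(raw, sample_width):
    """Decode little-endian WAV PCM bytes into a flat list of signed ints."""
    if sample_width == 1:
        # 8-bit WAV samples are unsigned; re-centre them around zero.
        return [b - 128 for b in bytearray(raw)]
    codes = {2: "h", 4: "i"}  # struct codes for 16-bit and 32-bit signed
    if sample_width not in codes:
        raise ValueError("unsupported sample width: %d" % sample_width)
    count = len(raw) // sample_width
    return list(struct.unpack("<%d%s" % (count, codes[sample_width]), raw))


print(decode_wav_frames(b"\x00\x80\xff", 1))              # [-128, 0, 127]
print(decode_wav_frames(b"\x00\x00\xff\x7f\x00\x80", 2))  # [0, 32767, -32768]
----

For multi-channel files the returned list is interleaved, so with two channels `samples[0::2]` and `samples[1::2]` would separate left from right.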
The `Formatter` class is an abstract base class which defines the interface for all formatters which output WAV data in textual formats. Each concrete subclass of `Formatter` takes a `WavDecoder` object in its constructor, which is responsible for reading/decoding data from a WAV or AIFF file.
The `output()` method will stream output to a file (stdout by default), but the entire output string can be captured using the `__str__()` method.
[source, python]
----
>>> from wav2vec import WavDecoder, SVGFormatter  # assumes package-level exports
>>> wd = WavDecoder("filename")
>>> svgformatter = SVGFormatter(wd)
>>> svgformatter.output()  # outputs SVG to stdout
>>> svg_str = str(svgformatter)  # get SVG as a string
----
See link:./wav2vec/formatter/[the formatter package].
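
Conceptually, a formatter turns decoded samples into text. The sketch below shows one plausible way to emit an SVG path from a list of signed samples. It is not how `SVGFormatter` is written; the `svg_path` function, its parameters, and the styling attributes are all assumptions made for illustration.

[source, python]
----
def svg_path(samples, width, height, full_scale):
    """Build an SVG <path> element from signed samples (hypothetical helper)."""
    mid = height / 2.0
    # Spread the samples evenly across the requested width.
    step = width / float(max(len(samples) - 1, 1))
    points = []
    for i, sample in enumerate(samples):
        x = i * step
        y = mid - sample * mid / full_scale  # SVG's y axis points down
        points.append("%g,%g" % (x, y))
    return '<path d="M%s" fill="none" stroke="black"/>' % " L".join(points)


print(svg_path([0, 16384, -16384, 0], 300, 100, 32768))
# <path d="M0,50 L100,25 L200,75 L300,50" fill="none" stroke="black"/>
----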
=== Examples
==== SVG
Here's what the link:tests/valfiles/snd/test-16-stereo.wav[tests/valfiles/snd/test-16-stereo.wav] file looks like in Audacity:
image::./readme_imgs/audacity.png[]
We can convert it to an SVG and then open it in Inkscape:
[source, sh]
----
$ wav2vec tests/valfiles/snd/test-16-stereo.wav > test.svg
$ inkscape test.svg
----
image::./readme_imgs/inkscape.png[]
Then we can use Inkscape to non-destructively add filters and path effects and otherwise incorporate the waveform into a design:
image::./readme_imgs/output.png[]
==== PostScript
To convert to PostScript instead of SVG:
[source, sh]
----
$ wav2vec tests/valfiles/snd/test-16-stereo.wav -f PostScript > test.ps
$ ps2pdf test.ps
$ evince test.pdf
----
The above uses the Ghostscript `ps2pdf` tool to convert the resulting PostScript file to PDF and then opens it in the evince PDF reader (shown in the screenshot below). You could instead open `test.ps` directly in a PostScript viewer (or send it to a printer/plotter, or embed it in a LaTeX document, etc).
image::./readme_imgs/evince.png[]
==== CSV
`wav2vec` also comes with a CSV formatter, which is useful to get WAV data into a spreadsheet:
[source, sh]
----
$ wav2vec tests/valfiles/snd/test-16-stereo.wav -f CSV --height 0 > test.csv
$ libreoffice test.csv
----
Note the `--height 0` option which prevents `wav2vec` from scaling the raw PCM values.
== Hacking
=== Run tests
To run unit and validation tests (requires Python 3):
[source, sh]
----
$ python -m unittest discover
----
=== Write custom formatter
Creating a custom formatter is simply a matter of subclassing `Formatter` and overriding the five abstract methods it defines. Use the included SVGFormatter, PSFormatter, or CSVFormatter as a template (see link:./wav2vec/formatter/formatters.py[wav2vec/formatter/formatters.py]).
== Issues
Please feel free to use the GitHub issue tracker as a support forum for any questions, suggestions, bug reports, or feature requests. Thanks! https://github.com/cristoper/wav2vec/issues
== See also
- http://www.audacityteam.org/[Audacity] is a good Free audio recorder and waveform editor.
- https://inkscape.org/en/[Inkscape] is a Free SVG-based drawing program.
- https://www.ghostscript.com/[Ghostscript] is a Free PostScript interpreter which can distill to PDF.
- https://github.com/afreiday/php-waveform-svg[php-waveform-svg] is a PHP script for converting mp3->wav->svg. (It looks simple, but I haven't tried it.)
Raw data
{
"_id": null,
"home_page": "https://github.com/cristoper/wav2vec",
"name": "wav2vec",
"maintainer": null,
"docs_url": null,
"requires_python": ">=2.7",
"maintainer_email": null,
"keywords": "audio graphics svg postscript data",
"author": "Chris Burkhardt",
"author_email": "dev@orangenoiseproduction.com",
"download_url": "https://files.pythonhosted.org/packages/5a/c3/3e50a4e3c59d2ae28c4e1d53d6e5b4c7a864fdcf816326140dadccc0a4f1/wav2vec-1.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "WTFPL",
"summary": "A Python package to convert waveform files (WAV or AIFF) to vector graphics (SVG, PostScript, or CVS)",
"version": "1.1.0",
"project_urls": {
"Homepage": "https://github.com/cristoper/wav2vec"
},
"split_keywords": [
"audio",
"graphics",
"svg",
"postscript",
"data"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "b2db192bec4f0598ff35f05c1137e3e0f09036ff22d3d8cd6bcfd5ce8eef2f64",
"md5": "e69a0a0bf1fdce0ee6187fa347ccabe6",
"sha256": "21d43d719452dca7af5d8a0452bcadf985caff87e62b122d4f6a12f77e4917b6"
},
"downloads": -1,
"filename": "wav2vec-1.1.0-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "e69a0a0bf1fdce0ee6187fa347ccabe6",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": ">=2.7",
"size": 14643,
"upload_time": "2025-09-06T05:06:08",
"upload_time_iso_8601": "2025-09-06T05:06:08.838789Z",
"url": "https://files.pythonhosted.org/packages/b2/db/192bec4f0598ff35f05c1137e3e0f09036ff22d3d8cd6bcfd5ce8eef2f64/wav2vec-1.1.0-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "5ac33e50a4e3c59d2ae28c4e1d53d6e5b4c7a864fdcf816326140dadccc0a4f1",
"md5": "41655972369eb2e2b7642782319a64ca",
"sha256": "e852f581e87c7e702da742482cff8137ab914172a38c047051aa46f0fd1d697c"
},
"downloads": -1,
"filename": "wav2vec-1.1.0.tar.gz",
"has_sig": false,
"md5_digest": "41655972369eb2e2b7642782319a64ca",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=2.7",
"size": 19679,
"upload_time": "2025-09-06T05:06:10",
"upload_time_iso_8601": "2025-09-06T05:06:10.054786Z",
"url": "https://files.pythonhosted.org/packages/5a/c3/3e50a4e3c59d2ae28c4e1d53d6e5b4c7a864fdcf816326140dadccc0a4f1/wav2vec-1.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-06 05:06:10",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "cristoper",
"github_project": "wav2vec",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "wav2vec"
}