worldvocoder


Name: worldvocoder
Version: 0.0.5
Home page: https://github.com/javanasse/Python-WORLD
Summary: Python implementation of WORLD vocoder.
Upload time: 2023-07-18 18:53:51
Author: JulianArmandVanasse
Requires Python: >=3.7
Requirements: numpy, scipy, numba, cython, simpleaudio, matplotlib

# PYTHON WORLD VOCODER:
*************************************

This is a line-by-line implementation of the WORLD vocoder (Matlab, C++) in Python. The package metadata requires Python 3.7 or later.

# INSTALLATION
*********************

```
pip install worldvocoder
```

# EXAMPLE
**************

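```python
import worldvocoder as wv
import soundfile as sf  # not in worldvocoder's requirements; install separately
import librosa  # not in worldvocoder's requirements; install separately

# read audio; sf.read returns multichannel audio as (frames, channels)
audio, sample_rate = sf.read("some_file.wav")
# librosa.to_mono expects (channels, frames), so transpose first
audio = librosa.to_mono(audio.T)

# initialize vocoder
vocoder = wv.World()

# encode audio
dat = vocoder.encode(sample_rate, audio, f0_method='harvest')
```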

Here, ```sample_rate``` is the sampling frequency in Hz and ```audio``` is the speech or singing signal.

```dat``` is a dictionary that contains the pitch, magnitude spectrum, and aperiodicity.
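
To see what the dictionary holds without relying on particular key names, a small helper like the one below can list each entry with its type and shape. This is only a sketch: ```summarize``` is a hypothetical helper, not part of worldvocoder, and it makes no assumption about which keys ```encode``` returns.

```python
def summarize(dat):
    """Return (key, type name, shape) for each entry of a dictionary.

    Hypothetical helper for inspection only; it is not part of worldvocoder.
    """
    rows = []
    for key, value in dat.items():
        # NumPy arrays expose .shape; fall back to len() for plain sequences.
        shape = getattr(value, "shape", None)
        if shape is None and hasattr(value, "__len__") and not isinstance(value, str):
            shape = (len(value),)
        rows.append((key, type(value).__name__, shape))
    return rows
```

Calling ```for row in summarize(dat): print(row)``` after ```encode``` prints one line per entry.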

We can scale the pitch:

```python
dat = vocoder.scale_pitch(dat, 1.5)
```

Be careful when you scale the pitch, because F0 has an upper limit and a lower limit.
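
As an illustration of why the limits matter, the sketch below scales a plain list of F0 values and clips the result to a floor and a ceiling. It is not the library's implementation: ```scale_f0``` is a hypothetical function, the defaults of 71 Hz and 800 Hz are assumed from the reference WORLD implementation and may differ in worldvocoder, and unvoiced frames are assumed to be marked with 0.

```python
def scale_f0(f0, factor, f0_floor=71.0, f0_ceil=800.0):
    """Scale an F0 contour and clip voiced frames to [f0_floor, f0_ceil].

    Hypothetical sketch: the default limits are assumptions, and frames
    with F0 <= 0 are treated as unvoiced and left at 0.
    """
    scaled = []
    for value in f0:
        if value <= 0.0:
            scaled.append(0.0)  # unvoiced frame stays unvoiced
        else:
            scaled.append(min(max(value * factor, f0_floor), f0_ceil))
    return scaled
```

With a factor of 1.5, a 600 Hz frame would land at 900 Hz and is clipped to the ceiling, so large factors flatten the top of the contour.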

We can make speech faster or slower:

```python
dat = vocoder.scale_duration(dat, 2)
```
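
To make the idea concrete, the sketch below stretches a per-frame contour by linear interpolation. It is an assumption-laden illustration, not how ```scale_duration``` is implemented: ```stretch_contour``` is a hypothetical function, and ```factor``` is taken here to be a duration multiplier (2 doubles the number of frames).

```python
def stretch_contour(values, factor):
    """Resample a per-frame contour to round(len(values) * factor) frames.

    Hypothetical sketch using linear interpolation between neighbouring
    frames; `factor` is assumed to multiply the duration.
    """
    n_in = len(values)
    if n_in == 0:
        return []
    n_out = max(1, round(n_in * factor))
    out = []
    for j in range(n_out):
        # Position of output frame j on the input frame axis, clamped to the end.
        pos = min(j / factor, n_in - 1)
        i0 = int(pos)
        i1 = min(i0 + 1, n_in - 1)
        frac = pos - i0
        out.append((1.0 - frac) * values[i0] + frac * values[i1])
    return out
```

Interpolating an F0 contour this way would blur voiced and unvoiced boundaries, so a real implementation needs to handle the voicing decision separately.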

To resynthesize the audio:

```python
dat = vocoder.decode(dat)
output = dat["out"]
```
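
To save the result without extra dependencies, the standard-library ```wave``` module is enough. The sketch below assumes that ```output``` is a mono sequence of floats in the range -1 to 1, which the original text does not state; ```write_wav_16bit``` is a hypothetical helper, not part of worldvocoder.

```python
import struct
import wave


def write_wav_16bit(path, samples, sample_rate):
    """Write mono float samples to a 16-bit PCM WAV file.

    Hypothetical helper; assumes samples are floats nominally in [-1, 1]
    and clips anything outside that range.
    """
    frames = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16-bit samples
        wav_file.setframerate(int(sample_rate))
        wav_file.writeframes(bytes(frames))
```

For example, ```write_wav_16bit("resynth.wav", output, sample_rate)``` writes the resynthesized signal at the original sampling frequency.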

To use the d4c_requiem analysis and requiem_synthesis from WORLD version 0.2.2, pass ```is_requiem=True```:

```python
# requiem analysis
dat = vocoder.encode(sample_rate, audio, f0_method='harvest', is_requiem=True)
```

To extract log-filterbank, MCEP-40, and VAE-12 features as described in the paper `Using a Manifold Vocoder for Spectral Voice and Style Conversion`, see ```test/spectralFeatures.py```. Extracting VAE-12 requires Keras 2.2.4 and TensorFlow 1.14.0.
See also the [speech samples](https://tuanad121.github.io/samples/2019-09-15-Manifold/).

# NOTE:
**********

* The vocoder uses pitch-synchronous analysis: the size of each window is determined by the fundamental frequency ```F0```. The window centers are equally spaced, ```frame_period``` ms apart.

* The Fourier transform size (```fft_size```) is determined automatically from the sampling frequency and the lowest F0 value, ```f0_floor```.
To specify your own ```fft_size```, set ```f0_floor = 3.0 * fs / fft_size```.
Decreasing ```fft_size``` raises ```f0_floor```, and a high ```f0_floor``` may degrade the analysis of low-pitched (for example, male) voices.
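
The relation between ```fft_size``` and ```f0_floor``` can be checked numerically with a short sketch. Both function names are hypothetical. The first applies the formula given above; the second assumes the power-of-two rule used by the reference C++ WORLD implementation, which worldvocoder may or may not follow exactly.

```python
import math


def f0_floor_for_fft_size(fs, fft_size):
    """Lowest analyzable F0 (Hz) for a chosen FFT size, per the note above."""
    return 3.0 * fs / fft_size


def fft_size_for_f0_floor(fs, f0_floor):
    """Smallest power-of-two FFT size for a given F0 floor.

    Assumed rule, taken from the reference C++ WORLD implementation.
    """
    return 2 ** (1 + int(math.log2(3.0 * fs / f0_floor + 1)))


# Halving the FFT size doubles the F0 floor.
print(f0_floor_for_fft_size(48000, 2048))
print(f0_floor_for_fft_size(48000, 1024))
```

At 48 kHz, an FFT size of 1024 already puts the floor above 140 Hz, which is higher than the F0 of many male voices.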


# CITATION:

Dinh, T., Kain, A., & Tjaden, K. (2019). Using a manifold vocoder for spectral voice and style conversion. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2019-September, 1388-1392.

            

Raw data

* Name: worldvocoder, version 0.0.5, uploaded 2023-07-18 18:53:51
* Author: JulianArmandVanasse
* Requires Python: >=3.7
* Home page: https://github.com/javanasse/Python-WORLD
* Project URLs: Download https://github.com/javanasse/Python-WORLD/archive/refs/tags/v0.tar.gz, Homepage https://github.com/javanasse/worldvocoder
* Repository: GitHub, user javanasse, project Python-WORLD
* CI flags: github_actions true, travis_ci false, coveralls false
* Pinned requirements: numpy==1.24.3, scipy==1.10.1, numba==0.57.0, cython==0.29.35, simpleaudio==1.0.2, matplotlib==3.7.1

Release files:

* worldvocoder-0.0.5-py3-none-any.whl (bdist_wheel, py3, 41207 bytes, uploaded 2023-07-18T18:53:49)
  sha256: df6b147d0e2d45d26ab0c5e52a44154d65a5b2ff8d29d2b3593ebecd3e518879
  https://files.pythonhosted.org/packages/c1/54/de9ac193992ba965bed8cc611a56d0327919d09b9569bc5290886974334e/worldvocoder-0.0.5-py3-none-any.whl
* worldvocoder-0.0.5.tar.gz (sdist, source, 31456 bytes, uploaded 2023-07-18T18:53:51)
  sha256: 9c2748c6bc0be1df04e4a7675805966c8981ce81b863d9b90cb8764a7ad03176
  https://files.pythonhosted.org/packages/d4/e4/8336dffb1a26e61d3558a8b9c8120538121089dbd978dbff7806de301d52/worldvocoder-0.0.5.tar.gz