==========
whispercpp
==========

*Pybind11 bindings for* `whisper.cpp <https://github.com/ggerganov/whisper.cpp.git>`_

Quickstart
~~~~~~~~~~

Install with pip:

.. code-block:: bash

    pip install whispercpp

To use the latest version, install from source:

.. code-block:: bash

    pip install git+https://github.com/aarnphm/whispercpp.git

For local setup, initialize all submodules:

.. code-block:: bash

    git submodule update --init --recursive

Build the wheel:

.. code-block:: bash

    # Option 1: using pypa/build
    python3 -m build -w

    # Option 2: using bazel
    ./tools/bazel build //:whispercpp_wheel

Install the wheel:

.. code-block:: bash

    # Option 1: via pypa/build
    pip install dist/*.whl

    # Option 2: using bazel
    pip install $(./tools/bazel info bazel-bin)/*.whl

The binding provides a ``Whisper`` class:

.. code-block:: python

    from whispercpp import Whisper

    w = Whisper.from_pretrained("tiny.en")

Currently, the inference API is provided via ``transcribe``:

.. code-block:: python

    import numpy as np

    w.transcribe(np.ones((1, 16000)))

You can use `ffmpeg <https://github.com/kkroening/ffmpeg-python>`_ or `librosa <https://librosa.org/doc/main/index.html>`_
to load audio files into a Numpy array, then pass it to ``transcribe``:

.. code-block:: python

    import ffmpeg
    import numpy as np

    sample_rate = 16000  # whisper.cpp works on 16 kHz mono audio

    try:
        # Decode to raw 16-bit little-endian PCM, mono, written to stdout
        y, _ = (
            ffmpeg.input("/path/to/audio.wav", threads=0)
            .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sample_rate)
            .run(
                cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True
            )
        )
    except ffmpeg.Error as e:
        raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

    # Scale int16 samples to float32 in [-1.0, 1.0)
    arr = np.frombuffer(y, np.int16).flatten().astype(np.float32) / 32768.0

    w.transcribe(arr)

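
If the audio is already a 16-bit PCM WAV file, the same int16-to-float32 conversion can be sketched with only the standard library ``wave`` module and Numpy. This is a minimal sketch, not part of ``whispercpp``: the helper name ``load_wav_as_float32`` is hypothetical, and it assumes mono, 16 kHz input and rejects anything else rather than resampling.

```python
import wave

import numpy as np

SAMPLE_RATE = 16000  # assumption: 16 kHz mono input, as in the examples above


def load_wav_as_float32(path: str) -> np.ndarray:
    """Read a mono 16-bit PCM WAV file into a float32 array in [-1.0, 1.0)."""
    with wave.open(path, "rb") as f:
        if f.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if f.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        if f.getframerate() != SAMPLE_RATE:
            raise ValueError(f"expected {SAMPLE_RATE} Hz audio")
        raw = f.readframes(f.getnframes())
    # WAV stores little-endian int16; scale the same way as the ffmpeg example
    return np.frombuffer(raw, dtype="<i2").astype(np.float32) / 32768.0
```

The returned array can then be passed to ``w.transcribe`` in place of ``arr``.
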
The Pybind11 bindings support all of the features of whisper.cpp.

The binding can also be used via ``api``:

.. code-block:: python

    from whispercpp import api

    ctx = api.Context.from_file("/path/to/saved_weight.bin")
    params = api.Params()

    ctx.full(arr, params)

Development
~~~~~~~~~~~

See `DEVELOPMENT.md <https://github.com/aarnphm/whispercpp/blob/main/DEVELOPMENT.md>`_

APIs
~~~~

``Whisper``
------------

1. ``Whisper.from_pretrained(model_name: str) -> Whisper``

   Load a pre-trained model from the local cache, downloading and caching it first if needed.

   .. code-block:: python

       w = Whisper.from_pretrained("tiny.en")

   The model is saved to ``$XDG_DATA_HOME/whispercpp``, or to ``~/.local/share/whispercpp`` if the environment variable is
   not set.

2. ``Whisper.transcribe(arr: NDArray[np.float32], num_proc: int = 1)``

   Run transcription on a given Numpy array. This calls ``full`` from ``whisper.cpp``. If ``num_proc`` is greater than 1,
   it uses ``full_parallel`` instead.

   .. code-block:: python

       w.transcribe(np.ones((1, 16000)))

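
The cache location described for ``from_pretrained`` follows the usual XDG lookup, which can be sketched as below. This is an illustrative sketch, not the library's actual code: the helper name ``default_cache_dir`` is hypothetical, and only the two paths come from the description above. Treating an empty ``XDG_DATA_HOME`` as unset is an assumption taken from the XDG specification.

```python
import os
from pathlib import Path


def default_cache_dir() -> Path:
    """Return $XDG_DATA_HOME/whispercpp, or ~/.local/share/whispercpp if unset."""
    xdg_data_home = os.environ.get("XDG_DATA_HOME")
    if xdg_data_home:
        base = Path(xdg_data_home)
    else:
        # Fallback used when the variable is unset (or empty, by assumption)
        base = Path.home() / ".local" / "share"
    return base / "whispercpp"


print(default_cache_dir())
```

Setting ``XDG_DATA_HOME`` before running the snippet changes the printed directory accordingly.
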
``api``
-------

``api`` is a direct binding to ``whisper.cpp``, with an API similar to `whisper-rs <https://github.com/tazz4843/whisper-rs>`_.

1. ``api.Context``

   This class is a wrapper around ``whisper_context``.

   .. code-block:: python

       from whispercpp import api

       ctx = api.Context.from_file("/path/to/saved_weight.bin")

   .. note::

      The context can also be accessed from the ``Whisper`` class via ``w.context``.

2. ``api.Params``

   This class is a wrapper around ``whisper_params``.

   .. code-block:: python

       from whispercpp import api

       params = api.Params()

   .. note::

      The params can also be accessed from the ``Whisper`` class via ``w.params``.

Why not?
~~~~~~~~

* `whispercpp.py <https://github.com/stlukey/whispercpp.py>`_. There are a few key differences here:

  * It provides Cython bindings. From a UX standpoint, it achieves the same goal as ``whispercpp``; the difference is that ``whispercpp`` uses Pybind11 instead. Feel free to use it if you prefer Cython over Pybind11. Note that ``whispercpp.py`` and ``whispercpp`` are mutually exclusive, as both use the ``whispercpp`` namespace.

  * ``whispercpp`` doesn't pollute your ``$HOME`` directory; instead, it follows the `XDG Base Directory Specification <https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html>`_ for saved weights.

* Why not use ``cdll`` and ``ctypes`` and be done with it?

  * This is also valid, but it requires a lot of hacking and is pretty slow compared to Cython and Pybind11.

Raw data
{
"_id": null,
"home_page": "https://github.com/aarnphm/whispercpp_py",
"name": "whispercpp-py",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "Aaron Pham",
"author_email": "aarnphm@bentoml.com",
"download_url": "",
"platform": null,
"bugtrack_url": null,
"license": "Apache 2.0",
"summary": "",
"version": "0.0.23",
"project_urls": {
"Homepage": "https://github.com/aarnphm/whispercpp_py"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "d78a024e3b897083bd002821137e94800ee755361769d9ed597c4180cf37b9f0",
"md5": "713c6d47c1be6bbf65e8b9845a314392",
"sha256": "4ecfa2bff8bbc02b33d9882b99448e7825cb044a9aeb88d7603b70a3b26e34ca"
},
"downloads": -1,
"filename": "whispercpp_py-0.0.23-cp310-cp310-macosx_10_9_x86_64.whl",
"has_sig": false,
"md5_digest": "713c6d47c1be6bbf65e8b9845a314392",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": null,
"size": 1342950,
"upload_time": "2023-06-01T15:41:10",
"upload_time_iso_8601": "2023-06-01T15:41:10.389182Z",
"url": "https://files.pythonhosted.org/packages/d7/8a/024e3b897083bd002821137e94800ee755361769d9ed597c4180cf37b9f0/whispercpp_py-0.0.23-cp310-cp310-macosx_10_9_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "71a29415df160e2952a2b2d1d4bfa3486d1cc73f99b219ce1fef8d021ba7769c",
"md5": "f06e3668f78110744cbbc5fb74f3ca2e",
"sha256": "f0e8776fbc5881f6d0e3ba4706e5a92a32fe05a3a72ead2c6d920070a46994b8"
},
"downloads": -1,
"filename": "whispercpp_py-0.0.23-cp310-cp310-manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "f06e3668f78110744cbbc5fb74f3ca2e",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": null,
"size": 1501548,
"upload_time": "2023-06-01T15:29:43",
"upload_time_iso_8601": "2023-06-01T15:29:43.245356Z",
"url": "https://files.pythonhosted.org/packages/71/a2/9415df160e2952a2b2d1d4bfa3486d1cc73f99b219ce1fef8d021ba7769c/whispercpp_py-0.0.23-cp310-cp310-manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7491b553056bde8edb677ef7d066a42b949647169f73071ed13fe88791ad00fc",
"md5": "0a3765ccfb3eb3ffe06f351ae6a06c42",
"sha256": "175c6355a0916ccae426075ff1f36ee0bfff111ff3c10b3804e2d1dc3e89722d"
},
"downloads": -1,
"filename": "whispercpp_py-0.0.23-cp311-cp311-manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "0a3765ccfb3eb3ffe06f351ae6a06c42",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": null,
"size": 1501248,
"upload_time": "2023-06-01T15:29:50",
"upload_time_iso_8601": "2023-06-01T15:29:50.170035Z",
"url": "https://files.pythonhosted.org/packages/74/91/b553056bde8edb677ef7d066a42b949647169f73071ed13fe88791ad00fc/whispercpp_py-0.0.23-cp311-cp311-manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d3d1d703eb6d56a491b9612c95ebbe83a78ed7b8c5f38d99982430d993bbfb6f",
"md5": "599f06d2903301706f56173288d28e34",
"sha256": "e865753313eb24a747f11b39c01b59991d1121a8653f6cdea2d99cfda0910398"
},
"downloads": -1,
"filename": "whispercpp_py-0.0.23-cp38-cp38-macosx_10_9_x86_64.whl",
"has_sig": false,
"md5_digest": "599f06d2903301706f56173288d28e34",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": null,
"size": 1342419,
"upload_time": "2023-06-01T15:44:30",
"upload_time_iso_8601": "2023-06-01T15:44:30.912051Z",
"url": "https://files.pythonhosted.org/packages/d3/d1/d703eb6d56a491b9612c95ebbe83a78ed7b8c5f38d99982430d993bbfb6f/whispercpp_py-0.0.23-cp38-cp38-macosx_10_9_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ce526ee2602c6490805c423e68ff3929704e1f8d2021472b5eeccb30f2675e2d",
"md5": "ade32f63ffe04eea8f09ed3608d7c895",
"sha256": "f84059a9da39a5475e98184c3b3b672585598fc1182176a8b876a06a2a30fa28"
},
"downloads": -1,
"filename": "whispercpp_py-0.0.23-cp38-cp38-manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "ade32f63ffe04eea8f09ed3608d7c895",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": null,
"size": 1500100,
"upload_time": "2023-06-01T15:29:46",
"upload_time_iso_8601": "2023-06-01T15:29:46.435957Z",
"url": "https://files.pythonhosted.org/packages/ce/52/6ee2602c6490805c423e68ff3929704e1f8d2021472b5eeccb30f2675e2d/whispercpp_py-0.0.23-cp38-cp38-manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "bf688cf4171d069602f111d542d72eaad6d2b5b64ae12008c37507385ae571cc",
"md5": "ebf48bcaabf34a5dec06cb1c5e37053c",
"sha256": "31cffc6f69b38579b358609c5c45cc366e478481676090351a43dceff4efa14b"
},
"downloads": -1,
"filename": "whispercpp_py-0.0.23-cp39-cp39-manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "ebf48bcaabf34a5dec06cb1c5e37053c",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": null,
"size": 1501002,
"upload_time": "2023-06-01T15:29:42",
"upload_time_iso_8601": "2023-06-01T15:29:42.552506Z",
"url": "https://files.pythonhosted.org/packages/bf/68/8cf4171d069602f111d542d72eaad6d2b5b64ae12008c37507385ae571cc/whispercpp_py-0.0.23-cp39-cp39-manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-06-01 15:41:10",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "aarnphm",
"github_project": "whispercpp_py",
"github_not_found": true,
"lcname": "whispercpp-py"
}