pvleopard

- Name: pvleopard
- Version: 2.0.2
- Home page: https://github.com/Picovoice/leopard
- Summary: Leopard Speech-to-Text Engine.
- Upload time: 2024-02-06 01:30:24
- Author: Picovoice
- Requires Python: >=3.7
- Keywords: speech-to-text, speech recognition, voice recognition, asr, automatic speech recognition

# Leopard Binding for Python

## Leopard Speech-to-Text Engine

Made in Vancouver, Canada by [Picovoice](https://picovoice.ai)

Leopard is an on-device speech-to-text engine. Leopard is:

- Private: all voice processing runs locally.
- [Accurate](https://picovoice.ai/docs/benchmark/stt/)
- [Compact and Computationally-Efficient](https://github.com/Picovoice/speech-to-text-benchmark#rtf)
- Cross-Platform:
  - Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64)
  - Android and iOS
  - Chrome, Safari, Firefox, and Edge
  - Raspberry Pi (5, 4, 3) and NVIDIA Jetson Nano

## Compatibility

- Python 3.7+
- Runs on Linux (x86_64), macOS (x86_64, arm64), Windows (x86_64), Raspberry Pi (4, 3), and NVIDIA Jetson Nano.

## Installation

```console
pip3 install pvleopard
```

## AccessKey

Leopard requires a valid Picovoice `AccessKey` at initialization. The `AccessKey` acts as your credentials when using Leopard SDKs.
You can get your `AccessKey` for free. Keep your `AccessKey` secret.
Sign up or log in to [Picovoice Console](https://console.picovoice.ai/) to get your `AccessKey`.
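
One common way to keep the `AccessKey` out of source code is to read it from an environment variable. The sketch below is only an illustration: the variable name `PICOVOICE_ACCESS_KEY` and the helper `load_access_key` are assumptions made for this example, not part of the Leopard SDK.

```python
import os

# Hypothetical variable name chosen for this example; use whatever fits your setup.
ACCESS_KEY_ENV_VAR = "PICOVOICE_ACCESS_KEY"


def load_access_key(env=os.environ):
    """Return the AccessKey found in `env`, or raise if it is missing or blank."""
    access_key = env.get(ACCESS_KEY_ENV_VAR, "").strip()
    if not access_key:
        raise RuntimeError(
            f"Set the {ACCESS_KEY_ENV_VAR} environment variable to your AccessKey")
    return access_key
```

The returned value can then be passed as the `access_key` argument when the engine is created, so the key never appears in version control.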

## Usage

Create an instance of the engine and transcribe an audio file:

```python
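import pvleopard

# Create the engine; an AccessKey is required at initialization.
leopard = pvleopard.create(access_key='${ACCESS_KEY}')

# Transcribe the file; returns the full transcript and per-word metadata.
transcript, words = leopard.process_file('${AUDIO_FILE_PATH}')
print(transcript)

# Print the metadata attached to each transcribed word.
for word in words:
    print(
        "{word=\"%s\" start_sec=%.2f end_sec=%.2f confidence=%.2f speaker_tag=%d}"
        % (word.word, word.start_sec, word.end_sec, word.confidence, word.speaker_tag))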
```

Replace `${ACCESS_KEY}` with your own `AccessKey`, obtained from [Picovoice Console](https://console.picovoice.ai/), and
`${AUDIO_FILE_PATH}` with the path to an audio file.

When you are done, explicitly release the resources:
```python
leopard.delete()
```
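
To make sure `delete()` runs even when transcription raises, the call can be wrapped in a small context manager. This is a minimal sketch: `released` and `transcribe` are hypothetical helpers written for this example, and only `pvleopard.create`, `process_file`, and `delete` come from the usage shown above.

```python
from contextlib import contextmanager


@contextmanager
def released(engine):
    """Yield `engine`, then call its `delete()` even if the body raises."""
    try:
        yield engine
    finally:
        engine.delete()


def transcribe(access_key, audio_path):
    """Transcribe one file and always free the engine afterwards."""
    # Imported lazily so `released` can be reused (and tested) without the SDK.
    import pvleopard

    with released(pvleopard.create(access_key=access_key)) as leopard:
        # Returns the (transcript, words) pair produced by process_file.
        return leopard.process_file(audio_path)
```

Creating a new engine per call is simple but repeats the initialization work; for many files, keep one engine inside a single `with released(...)` block and call `process_file` repeatedly.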

### Language Model

The Leopard Python SDK comes preloaded with a default English language model (`.pv` file).
Default models for other supported languages can be found in the `lib/common` directory of the [Leopard GitHub repository](https://github.com/Picovoice/leopard).

Create custom language models using the [Picovoice Console](https://console.picovoice.ai/). Here you can train
language models with custom vocabulary and boost words in the existing vocabulary.

Pass in the `.pv` file via the `model_path` argument:
```python
leopard = pvleopard.create(
    access_key='${ACCESS_KEY}',
    model_path='${MODEL_FILE_PATH}')
```

### Word Metadata

Along with the transcript, Leopard returns metadata for each transcribed word. Available metadata items are:

- **Start Time:** Indicates when the word started in the transcribed audio. Value is in seconds.
- **End Time:** Indicates when the word ended in the transcribed audio. Value is in seconds.
- **Confidence:** Leopard's confidence that the transcribed word is accurate. It is a number within `[0, 1]`.
- **Speaker Tag:** If speaker diarization is enabled on initialization, the speaker tag is a non-negative integer identifying unique speakers, with `0` reserved for unknown speakers. If speaker diarization is not enabled, the value will always be `-1`.
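
The metadata fields above can be post-processed with plain Python. The sketch below flags low-confidence words and merges consecutive words from the same speaker into segments. The `Word` namedtuple is a hypothetical stand-in that mirrors the attribute names used in the usage example (`word`, `start_sec`, `end_sec`, `confidence`, `speaker_tag`); the helper functions and the `0.5` threshold are assumptions for illustration, not part of the SDK.

```python
from collections import namedtuple

# Hypothetical stand-in for the word objects returned by process_file.
Word = namedtuple(
    "Word", ["word", "start_sec", "end_sec", "confidence", "speaker_tag"])


def low_confidence(words, threshold=0.5):
    """Return the words whose confidence falls below `threshold`."""
    return [w for w in words if w.confidence < threshold]


def speaker_segments(words):
    """Merge consecutive words that share a speaker tag into segments."""
    segments = []
    for w in words:
        if segments and segments[-1]["speaker_tag"] == w.speaker_tag:
            # Same speaker as the previous word: extend the current segment.
            segments[-1]["text"] += " " + w.word
            segments[-1]["end_sec"] = w.end_sec
        else:
            # Speaker changed (or first word): start a new segment.
            segments.append({
                "speaker_tag": w.speaker_tag,
                "start_sec": w.start_sec,
                "end_sec": w.end_sec,
                "text": w.word,
            })
    return segments
```

When diarization is not enabled, every word carries the tag `-1`, so `speaker_segments` returns a single segment spanning the whole transcript.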

## Demos

[pvleoparddemo](https://pypi.org/project/pvleoparddemo/) provides command-line utilities for processing audio using
Leopard.

            
