ovos-voice-embeddings-plugin

Name	ovos-voice-embeddings-plugin
Version	0.0.0a2
home_page	https://github.com/TigreGotico/ovos-voice-embeddings-plugin
Summary	A voice recognition plugin for OVOS
upload_time	2024-10-25 20:37:25
maintainer	None
docs_url	None
author	jarbasai
requires_python	None
license	MIT
keywords	ovos openvoiceos plugin voice recognition
requirements	No requirements were recorded.

# VoiceEmbeddingsRecognitionPlugin

The `VoiceEmbeddingsRecognitionPlugin` is a plugin for recognizing and managing voice embeddings.

It uses [Resemblyzer](https://github.com/resemble-ai/Resemblyzer) to extract speaker embeddings and integrates with [ovos-chromadb-embeddings-plugin](https://github.com/TigreGotico/ovos-chromadb-embeddings-plugin) to store and retrieve them.

## Features

- **Voice Embeddings Extraction**: Converts audio data into voice embeddings using the `VoiceEncoder` from `resemblyzer`.
- **Voice Data Storage**: Stores and retrieves voice embeddings using `ChromaEmbeddingsDB`.
- **Voice Data Management**: Allows for adding, querying, and predicting voice embeddings associated with user IDs.
- **Supports Multiple Audio Formats**: Can handle audio data in various formats, including `wav` and `flac`.
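
To make the "predicting" step above concrete, here is a minimal sketch of how a query embedding can be matched against enrolled speakers by cosine similarity. It is an illustration only and not the plugin's actual implementation: the `cosine_similarity` and `closest_speaker` helpers are hypothetical, and the 3-dimensional toy vectors stand in for the much longer embeddings that Resemblyzer produces. The real lookup is delegated to `ChromaEmbeddingsDB`, whose distance metric may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_speaker(query, enrolled):
    """Return the (user_id, similarity) pair whose stored embedding best matches `query`."""
    return max(
        ((user_id, cosine_similarity(query, emb)) for user_id, emb in enrolled.items()),
        key=lambda pair: pair[1],
    )


# Toy vectors (assumption): real speaker embeddings have many more dimensions.
enrolled = {
    "user": [0.9, 0.1, 0.0],
    "donald": [0.1, 0.8, 0.3],
}
query = [0.2, 0.7, 0.4]

best_id, score = closest_speaker(query, enrolled)
print(best_id, round(score, 3))
```

The query vector points in nearly the same direction as the `"donald"` vector, so that user ID is returned as the best match.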

## Usage

Here is a quick example of how to use the `VoiceEmbeddingsRecognitionPlugin`:

```python
from ovos_voice_embeddings import VoiceEmbeddingsRecognitionPlugin
from resemblyzer import preprocess_wav
from speech_recognition import Recognizer, AudioFile
from ovos_chromadb_embeddings import ChromaEmbeddingsDB

# Voice embeddings are stored in a ChromaDB database at this path
db = ChromaEmbeddingsDB("./voice_db")
v = VoiceEmbeddingsRecognitionPlugin(db)

# Local example recordings; replace these with paths to your own audio files
a = "/home/miro/PycharmProjects/ovos-user-id/2609-156975-0001.flac"
b = "/home/miro/PycharmProjects/ovos-user-id/qCCWXoCURKY.mp3"
b2 = "/home/miro/PycharmProjects/ovos-user-id/4glfwiMXgwQ.mp3"

# Add a voice from a speech_recognition AudioData object
with AudioFile(a) as source:
    audio = Recognizer().record(source)
v.add_voice("user", audio)

# Add a second voice from a waveform array loaded by Resemblyzer
wav = preprocess_wav(b)
v.add_voice("donald", wav)

# Run predict and prompt on a third recording
wav = preprocess_wav(b2)
print(v.predict(wav))
print(v.prompt(wav))
```
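
The example passes both a `speech_recognition` `AudioData` object and a Resemblyzer waveform array to `add_voice`, so some conversion between the two has to happen along the way. The sketch below shows one way raw 16-bit PCM bytes could be turned into floating-point samples. It is an assumption for illustration, not the plugin's code: `pcm16_to_float` is a hypothetical helper, and the input is assumed to be little-endian 16-bit PCM such as the bytes returned by `AudioData.get_raw_data(convert_rate=16000, convert_width=2)`.

```python
import struct


def pcm16_to_float(raw: bytes) -> list:
    """Convert little-endian 16-bit PCM bytes to floats in [-1.0, 1.0)."""
    count = len(raw) // 2  # two bytes per sample; a trailing odd byte is ignored
    samples = struct.unpack("<%dh" % count, raw[: count * 2])
    return [s / 32768.0 for s in samples]


# Three hand-made samples: silence, half of positive full scale, negative full scale
raw = struct.pack("<3h", 0, 16384, -32768)
print(pcm16_to_float(raw))  # [0.0, 0.5, -1.0]
```

Dividing by 32768 maps the signed 16-bit range onto floats without clipping the most negative sample.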

Raw data

{
    "_id": null,
    "home_page": "https://github.com/TigreGotico/ovos-voice-embeddings-plugin",
    "name": "ovos-voice-embeddings-plugin",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "OVOS openvoiceos plugin voice recognition",
    "author": "jarbasai",
    "author_email": "jarbasai@mailfence.com",
    "download_url": "https://files.pythonhosted.org/packages/2a/95/ffc0ab294b79d4762ba4ecb38e7ab8b000235ec39151bd461ca73c1e85de/ovos-voice-embeddings-plugin-0.0.0a2.tar.gz",
    "platform": null,
    "description": "# VoiceEmbeddingsRecognitionPlugin\n\nThe `VoiceEmbeddingsRecognitionPlugin` is a plugin for recognizing and managing voice embeddings.\n\nIt uses [Resemblyzer](https://github.com/resemble-ai/Resemblyzer) to extract speaker embeddings and integrates with  [ovos-chromadb-embeddings-plugin](https://github.com/TigreGotico/ovos-chromadb-embeddings-plugin) for storing and retrieving voice embeddings. \n\n## Features\n\n- **Voice Embeddings Extraction**: Converts audio data into voice embeddings using the `VoiceEncoder` from `resemblyzer`.\n- **Voice Data Storage**: Stores and retrieves voice embeddings using `ChromaEmbeddingsDB`.\n- **Voice Data Management**: Allows for adding, querying, and predicting voice embeddings associated with user IDs.\n- **Supports Multiple Audio Formats**: Can handle audio data in various formats, including `wav` and `flac`.\n\n## Usage\n\nHere is a quick example of how to use the `VoiceEmbeddingsRecognitionPlugin`:\n\n```python\nfrom ovos_voice_embeddings import VoiceEmbeddingsRecognitionPlugin\nfrom resemblyzer import preprocess_wav\nfrom speech_recognition import Recognizer, AudioFile\nfrom ovos_chromadb_embeddings import ChromaEmbeddingsDB\n\ndb = ChromaEmbeddingsDB(\"./voice_db\")\nv = VoiceEmbeddingsRecognitionPlugin(db)\n\na = \"/home/miro/PycharmProjects/ovos-user-id/2609-156975-0001.flac\"\nb = \"/home/miro/PycharmProjects/ovos-user-id/qCCWXoCURKY.mp3\"\nb2 = \"/home/miro/PycharmProjects/ovos-user-id/4glfwiMXgwQ.mp3\"\n\nwith AudioFile(a) as source:\n    audio = Recognizer().record(source)\nv.add_voice(\"user\", audio)\n\nwav = preprocess_wav(b)\nv.add_voice(\"donald\", wav)\n\nwav = preprocess_wav(b2)\nprint(v.predict(wav))\nprint(v.prompt(wav))\n\n```\n\n\n\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A voice recognition plugin for OVOS",
    "version": "0.0.0a2",
    "project_urls": {
        "Homepage": "https://github.com/TigreGotico/ovos-voice-embeddings-plugin"
    },
    "split_keywords": [
        "ovos",
        "openvoiceos",
        "plugin",
        "voice",
        "recognition"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7cecee6a7e650ca7ccc20c16dbb95d5e5b66c4b77c226cb13c282c292ae6da81",
                "md5": "9018683b60946af92f065400895a2fa4",
                "sha256": "47289cb007642bca919791210ecec6c9cea4cf065d667a2d53cbbdff77e09509"
            },
            "downloads": -1,
            "filename": "ovos_voice_embeddings_plugin-0.0.0a2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9018683b60946af92f065400895a2fa4",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 3808,
            "upload_time": "2024-10-25T20:37:24",
            "upload_time_iso_8601": "2024-10-25T20:37:24.100378Z",
            "url": "https://files.pythonhosted.org/packages/7c/ec/ee6a7e650ca7ccc20c16dbb95d5e5b66c4b77c226cb13c282c292ae6da81/ovos_voice_embeddings_plugin-0.0.0a2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2a95ffc0ab294b79d4762ba4ecb38e7ab8b000235ec39151bd461ca73c1e85de",
                "md5": "a2856a813ec9c4c12073f1be4ebc138b",
                "sha256": "ba05b9e8bd7b734b9d2b7d606de3333c7e63aeda789ce4352669d8aedb512d47"
            },
            "downloads": -1,
            "filename": "ovos-voice-embeddings-plugin-0.0.0a2.tar.gz",
            "has_sig": false,
            "md5_digest": "a2856a813ec9c4c12073f1be4ebc138b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 3354,
            "upload_time": "2024-10-25T20:37:25",
            "upload_time_iso_8601": "2024-10-25T20:37:25.580542Z",
            "url": "https://files.pythonhosted.org/packages/2a/95/ffc0ab294b79d4762ba4ecb38e7ab8b000235ec39151bd461ca73c1e85de/ovos-voice-embeddings-plugin-0.0.0a2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-25 20:37:25",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "TigreGotico",
    "github_project": "ovos-voice-embeddings-plugin",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "ovos-voice-embeddings-plugin"
}
