# pylipsync

| Field | Value |
| --- | --- |
| Name | pylipsync |
| Version | 0.1.2 |
| Summary | A Python implementation of Hecomi's uLipSync for audio-based lip sync analysis. |
| Upload time | 2025-10-14 02:28:56 |
| Requires Python | >=3.9 |
| License | MIT |
| Keywords | lip-sync, audio, phoneme, mfcc, speech-analysis, animation |
| Requirements | numpy, scipy, librosa, scikit-learn |

A Python implementation of [Hecomi's uLipSync](https://github.com/hecomi/uLipSync) for audio-based lip sync analysis. This library analyzes audio and determines phoneme targets for lip synchronization in real-time applications.

## Installation

### Install from PyPI

```bash
pip install pylipsync
```

### Install from Local Clone

Alternatively, clone the repository and install:

```bash
git clone https://github.com/spava002/pyLipSync.git
cd pyLipSync
pip install -e .
```

## Quick Start

The library comes with built-in audio templates for common phonemes, so you can start using it immediately:

```python
import librosa as lb
from pylipsync import LipSync, CompareMethod

# Initialize LipSync - works out of the box with default templates
lipsync = LipSync(
    compare_method=CompareMethod.COSINE_SIMILARITY  # Options: L1_NORM, L2_NORM, COSINE_SIMILARITY
)

# Load your audio file
audio, sr = lb.load("path/to/your/audio.mp3", sr=None)

# Process audio and get phoneme segments
segments = lipsync.process_audio_segments(
    audio,
    sr,
    window_size_ms=64.0,  # Window size in milliseconds
    fps=60                # Frames per second for output
)

# Get the most prominent phoneme for each segment
for segment in segments:
    most_prominent_phoneme = segment.most_prominent_phoneme() if not segment.is_silence() else None
    print(f"({segment.start_time:.4f}-{segment.end_time:.4f})s | Most Prominent Phoneme: {most_prominent_phoneme}")
```
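The per-segment loop above prints one line per frame, which is often more detail than an animation needs. A minimal sketch of one way to merge consecutive segments with the same label into longer runs follows. `FakeSegment` and `collapse_runs` are hypothetical names, not part of pylipsync; the stand-in class only mimics the attributes used in the Quick Start (`start_time`, `end_time`, `is_silence()`, `most_prominent_phoneme()`), and it assumes the value returned by `most_prominent_phoneme()` can be compared with `==`.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Optional, Tuple


@dataclass
class FakeSegment:
    """Hypothetical stand-in exposing only the attributes used in the Quick Start."""
    start_time: float
    end_time: float
    phoneme: Optional[str]  # None marks silence in this stand-in

    def is_silence(self) -> bool:
        return self.phoneme is None

    def most_prominent_phoneme(self) -> Optional[str]:
        return self.phoneme


def collapse_runs(segments) -> List[Tuple[float, float, Optional[str]]]:
    """Merge consecutive segments that share a label into (start, end, label) runs."""
    def label(seg):
        return None if seg.is_silence() else seg.most_prominent_phoneme()

    runs = []
    for phoneme, group in groupby(segments, key=label):
        group = list(group)
        runs.append((group[0].start_time, group[-1].end_time, phoneme))
    return runs


demo = [
    FakeSegment(0.00, 0.25, None),
    FakeSegment(0.25, 0.50, "aa"),
    FakeSegment(0.50, 0.75, "aa"),
    FakeSegment(0.75, 1.00, "oh"),
]
runs = collapse_runs(demo)
for start, end, phoneme in runs:
    print(f"{start:.2f}-{end:.2f}s: {phoneme}")
```

With real output, the `segments` list returned by `process_audio_segments` would be passed to `collapse_runs` in place of `demo`, provided the segment objects behave as assumed here.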

## Default Phonemes

The library includes pre-configured phoneme templates for:
- `aa` - "A" sounds
- `ee` - "E" sounds
- `ih` - "I" sounds
- `oh` - "O" sounds
- `ou` - "U" sounds
- `silence` - silence/no speech

These templates are ready to use without any additional setup.

### Adding New Phonemes

To add phonemes (e.g., consonants like "th", "sh", "f"):

1. Create a folder that contains one subfolder per phoneme, each named after its phoneme (or extend the existing `audio/` folder):
   ```
   audio/
   ├── aa/
   ├── ee/
   ├── th/          # New phoneme!
   │   └── th_sound.mp3
   └── sh/          # Another new one!
       └── sh_sound.mp3
   ```

2. Add audio samples to each folder (`.mp3`, `.wav`, `.ogg`, `.flac`, etc.)

3. Use your custom templates:
   ```python
   lipsync = LipSync(
       audio_templates_path="/path/to/my_custom_audio" # Not necessary if expanding within the audio/ folder
   )
   ```

**Note:** The folder name becomes the phoneme identifier in the output.
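Before pointing `audio_templates_path` at a custom folder, it can help to check that every phoneme subfolder actually contains audio. The sketch below uses only the standard library and does not call pylipsync. `list_phoneme_samples` and `empty_phonemes` are hypothetical helper names, and the extension set is an assumption taken from the examples in step 2; the library may accept other formats.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# Assumed extension list, copied from the examples in step 2; the library may accept more.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg", ".flac"}


def list_phoneme_samples(templates_dir: str) -> Dict[str, List[str]]:
    """Map each phoneme folder name to the sorted audio file names inside it."""
    samples: Dict[str, List[str]] = {}
    for folder in sorted(Path(templates_dir).iterdir()):
        if not folder.is_dir():
            continue  # loose files at the top level are not phoneme folders
        samples[folder.name] = sorted(
            f.name for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() in AUDIO_EXTENSIONS
        )
    return samples


def empty_phonemes(samples: Dict[str, List[str]]) -> List[str]:
    """Return the phoneme folders that contain no audio samples."""
    return [name for name, files in samples.items() if not files]


# Build a throwaway layout to demonstrate the check.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "th").mkdir()
    (Path(root) / "th" / "th_sound.mp3").touch()
    (Path(root) / "sh").mkdir()          # no samples added yet
    (Path(root) / "notes.txt").touch()   # ignored: not a folder
    samples = list_phoneme_samples(root)

print(samples)
print("Folders without samples:", empty_phonemes(samples))
```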

## How It Works

1. **Template Loading**: The library loads pre-computed MFCC templates from `data/phonemes.json`
2. **Audio Processing**: Input audio is processed in overlapping windows using MFCC extraction
3. **Phoneme Matching**: Each segment is compared against all phoneme templates using the selected comparison method
4. **Target Calculation**: Returns normalized confidence scores (0-1) for each phoneme per segment
5. **Silence Detection**: Segments below the silence threshold have all phoneme targets set to 0
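Steps 3 to 5 can be illustrated with a small, dependency-free sketch. This is not the library's implementation: the function names, the `1 / (1 + distance)` mapping for the norm-based methods, the sum-to-one normalization, and the use of a volume value for the silence check are all assumptions made for illustration.

```python
import math
from typing import Dict, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def phoneme_targets(
    features: Sequence[float],
    templates: Dict[str, Sequence[float]],
    method: str = "cosine",
    volume: float = 1.0,
    silence_threshold: float = 0.01,  # assumed threshold on volume
) -> Dict[str, float]:
    """Score one feature vector against every template and normalize to 0-1."""
    # Step 5 (assumed form): quiet segments get all targets set to 0.
    if volume < silence_threshold:
        return {name: 0.0 for name in templates}

    # Step 3: compare against each template; higher raw score means closer match.
    raw = {}
    for name, template in templates.items():
        if method == "l1":
            raw[name] = 1.0 / (1.0 + l1_distance(features, template))
        elif method == "l2":
            raw[name] = 1.0 / (1.0 + l2_distance(features, template))
        elif method == "cosine":
            raw[name] = max(cosine_similarity(features, template), 0.0)
        else:
            raise ValueError(f"unknown method: {method}")

    # Step 4 (assumed form): divide by the total so the scores lie in 0-1.
    total = sum(raw.values())
    if total == 0.0:
        return {name: 0.0 for name in templates}
    return {name: score / total for name, score in raw.items()}


# Toy two-dimensional "MFCC" vectors, purely for demonstration.
templates = {"aa": [1.0, 0.0], "oh": [0.0, 1.0]}
cosine_targets = phoneme_targets([1.0, 0.0], templates, method="cosine")
l2_targets = phoneme_targets([1.0, 0.0], templates, method="l2")
silent_targets = phoneme_targets([1.0, 0.0], templates, volume=0.0)
print(cosine_targets, l2_targets, silent_targets)
```

Distances are turned into similarities here so that all three methods share the same "higher is better" convention before normalization; the real library may use a different mapping.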

## Credits

This is a Python implementation of [uLipSync](https://github.com/hecomi/uLipSync) by Hecomi.

## License

MIT License - see [LICENSE](LICENSE) file for details.

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "pylipsync",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "lip-sync, audio, phoneme, mfcc, speech-analysis, animation",
    "author": null,
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/a6/75/4b391d30329702c9f8c45320a2257188f0c1a20e5440f60658520c128f9f/pylipsync-0.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Python implementation of Hecomi's uLipSync for audio-based lip sync analysis.",
    "version": "0.1.2",
    "project_urls": {
        "Homepage": "https://github.com/spava002/pyLipSync"
    },
    "split_keywords": [
        "lip-sync",
        " audio",
        " phoneme",
        " mfcc",
        " speech-analysis",
        " animation"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "88dfb276ad5924f55d7f7e7caedff5074d31ea856160397aa664bc623c827da1",
                "md5": "4118c9b8dafabb9658ff09c8211c477f",
                "sha256": "8f4e4b97191a8fab7aedd43625e32c110fdffcd2cb8e4c3b63d87f7a07c9c076"
            },
            "downloads": -1,
            "filename": "pylipsync-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4118c9b8dafabb9658ff09c8211c477f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 63589,
            "upload_time": "2025-10-14T02:28:55",
            "upload_time_iso_8601": "2025-10-14T02:28:55.327395Z",
            "url": "https://files.pythonhosted.org/packages/88/df/b276ad5924f55d7f7e7caedff5074d31ea856160397aa664bc623c827da1/pylipsync-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a6754b391d30329702c9f8c45320a2257188f0c1a20e5440f60658520c128f9f",
                "md5": "f1e2286b34426d02b31622098290fa76",
                "sha256": "d96c5cb4bea781ca3db68eeaa3160c058580890fc62b3b60c2f543da28f41920"
            },
            "downloads": -1,
            "filename": "pylipsync-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "f1e2286b34426d02b31622098290fa76",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 66186,
            "upload_time": "2025-10-14T02:28:56",
            "upload_time_iso_8601": "2025-10-14T02:28:56.640575Z",
            "url": "https://files.pythonhosted.org/packages/a6/75/4b391d30329702c9f8c45320a2257188f0c1a20e5440f60658520c128f9f/pylipsync-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-14 02:28:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "spava002",
    "github_project": "pyLipSync",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.20.0"
                ],
                [
                    "<",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    ">=",
                    "1.7.0"
                ]
            ]
        },
        {
            "name": "librosa",
            "specs": [
                [
                    ">=",
                    "0.10.0"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ]
            ]
        }
    ],
    "lcname": "pylipsync"
}
        