# pylipsync
A Python implementation of [Hecomi's uLipSync](https://github.com/hecomi/uLipSync) for audio-based lip sync analysis. This library analyzes audio and determines phoneme targets for lip synchronization in real-time applications.
## Installation
### Install from PyPI
```bash
pip install pylipsync
```
### Install from Local Clone
Alternatively, clone the repository and install:
```bash
git clone https://github.com/spava002/pyLipSync.git
cd pyLipSync
pip install -e .
```
## Quick Start
The library comes with built-in audio templates for common phonemes, so you can start using it immediately:
```python
import librosa as lb
from pylipsync import LipSync, CompareMethod

# Initialize LipSync - works out of the box with default templates
lipsync = LipSync(
    compare_method=CompareMethod.COSINE_SIMILARITY  # Options: L1_NORM, L2_NORM, COSINE_SIMILARITY
)

# Load your audio file
audio, sr = lb.load("path/to/your/audio.mp3", sr=None)

# Process audio and get phoneme segments
segments = lipsync.process_audio_segments(
    audio,
    sr,
    window_size_ms=64.0,  # Window size in milliseconds
    fps=60                # Frames per second for output
)

# Get the most prominent phoneme for each segment
for segment in segments:
    most_prominent_phoneme = segment.most_prominent_phoneme() if not segment.is_silence() else None
    print(f"({segment.start_time:.4f}-{segment.end_time:.4f})s | Most Prominent Phoneme: {most_prominent_phoneme}")
```
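For animation it is often handier to work with runs of identical phonemes than with one entry per frame. The sketch below collapses consecutive segments that share a label into `(start, end, label)` tuples. It relies only on the segment members shown above (`start_time`, `end_time`, `is_silence()`, `most_prominent_phoneme()`); `FakeSegment` and `merge_runs` are hypothetical names made up for this illustration and are not part of pylipsync.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class FakeSegment:
    """Hypothetical stand-in exposing only the members used in the Quick Start."""
    start_time: float
    end_time: float
    phoneme: Optional[str]  # None marks silence in this stand-in

    def is_silence(self) -> bool:
        return self.phoneme is None

    def most_prominent_phoneme(self) -> Optional[str]:
        return self.phoneme


def merge_runs(segments: Iterable) -> List[Tuple[float, float, Optional[str]]]:
    """Collapse consecutive segments with the same label into (start, end, label) runs."""
    runs: List[Tuple[float, float, Optional[str]]] = []
    for seg in segments:
        label = None if seg.is_silence() else seg.most_prominent_phoneme()
        if runs and runs[-1][2] == label:
            start, _, _ = runs[-1]
            runs[-1] = (start, seg.end_time, label)  # extend the current run
        else:
            runs.append((seg.start_time, seg.end_time, label))
    return runs


demo = [
    FakeSegment(0.00, 0.25, "aa"),
    FakeSegment(0.25, 0.50, "aa"),
    FakeSegment(0.50, 0.75, None),
    FakeSegment(0.75, 1.00, "oh"),
]
for start, end, label in merge_runs(demo):
    print(f"{start:.2f}-{end:.2f}s: {label}")
```

With real output, pass the `segments` list returned by `process_audio_segments` to `merge_runs` in place of `demo`.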
## Default Phonemes
The library includes pre-configured phoneme templates for:
- `aa` - "A" sounds
- `ee` - "E" sounds
- `ih` - "I" sounds
- `oh` - "O" sounds
- `ou` - "U" sounds
- `silence` - silence/no speech
These templates are ready to use without any additional setup.
### Adding New Phonemes
To add more phonemes (e.g., consonants like "th", "sh", "f"):
1. Create a folder containing one subfolder per phoneme (or extend the existing `audio/` folder):
   ```
   audio/
   ├── aa/
   ├── ee/
   ├── th/           # New phoneme!
   │   └── th_sound.mp3
   └── sh/           # Another new one!
       └── sh_sound.mp3
   ```
2. Add audio samples to each folder (`.mp3`, `.wav`, `.ogg`, `.flac`, etc.)
3. Use your custom templates:
   ```python
   lipsync = LipSync(
       audio_templates_path="/path/to/my_custom_audio"  # Not necessary if expanding within the audio/ folder
   )
   ```
**Note:** The folder name becomes the phoneme identifier in the output.
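Before pointing `audio_templates_path` at a custom folder, it can help to check that every phoneme subfolder actually contains audio. The following is a minimal standard-library sketch, not a pylipsync feature: `list_phoneme_samples` and `empty_phonemes` are hypothetical helpers, and the extension set only covers the formats named in step 2. The demo builds a throwaway directory so the block runs on its own.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# Extensions named in step 2; the library may accept others (the list ends in "etc.").
AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg", ".flac"}


def list_phoneme_samples(root: str) -> Dict[str, List[str]]:
    """Map each phoneme folder name under `root` to its audio file names."""
    samples: Dict[str, List[str]] = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # only subfolders represent phonemes
        samples[folder.name] = sorted(
            f.name
            for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() in AUDIO_EXTENSIONS
        )
    return samples


def empty_phonemes(samples: Dict[str, List[str]]) -> List[str]:
    """Return the phoneme folders that contain no recognised audio files."""
    return [name for name, files in samples.items() if not files]


# Demo on a temporary layout: "th" has one sample, "sh" has none yet.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "th").mkdir()
    (Path(root) / "th" / "th_sound.mp3").touch()
    (Path(root) / "sh").mkdir()
    samples = list_phoneme_samples(root)

print(samples)
print("Folders without audio:", empty_phonemes(samples))
```

For a real check, call `list_phoneme_samples("/path/to/my_custom_audio")` and add samples to any folder reported as empty.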
## How It Works
1. **Template Loading**: The library loads pre-computed MFCC templates from `data/phonemes.json`
2. **Audio Processing**: Input audio is processed in overlapping windows using MFCC extraction
3. **Phoneme Matching**: Each segment is compared against all phoneme templates using the selected comparison method
4. **Target Calculation**: Returns normalized confidence scores (0-1) for each phoneme per segment
5. **Silence Detection**: Segments below the silence threshold have all phoneme targets set to 0
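The steps above can be sketched in a few lines of plain Python. This is an illustration of the general idea under stated assumptions, not pylipsync's implementation: the function names, the mapping from distance to similarity, the sum-to-one normalization, and the RMS-based silence check are all choices made for this example. MFCC extraction itself is left out, so feature vectors are passed in directly.

```python
import math
from typing import Dict, List, Sequence, Tuple


def window_bounds(num_samples: int, sr: int, window_size_ms: float, fps: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices, one analysis window per output frame."""
    window = int(sr * window_size_ms / 1000.0)  # window length in samples
    hop = sr / fps                              # samples between frame starts
    bounds = []
    frame = 0
    while int(frame * hop) + window <= num_samples:
        start = int(frame * hop)
        bounds.append((start, start + window))
        frame += 1
    return bounds


def similarity(a: Sequence[float], b: Sequence[float], method: str) -> float:
    """Similarity in [0, 1]; higher means closer (illustrative mappings)."""
    if method == "l1":
        return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)))
    if method == "l2":
        return 1.0 / (1.0 + math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))))
    if method == "cosine":
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        dot = sum(x * y for x, y in zip(a, b))
        return max(dot / norm, 0.0) if norm > 0 else 0.0
    raise ValueError(f"unknown method: {method}")


def phoneme_targets(
    features: Sequence[float],
    templates: Dict[str, Sequence[float]],
    method: str = "cosine",
    rms: float = 1.0,
    silence_threshold: float = 1e-3,  # assumed value, for illustration only
) -> Dict[str, float]:
    """Score one window against every template; quiet windows get all zeros."""
    if rms < silence_threshold:
        return {name: 0.0 for name in templates}
    raw = {name: similarity(features, tmpl, method) for name, tmpl in templates.items()}
    total = sum(raw.values())
    return {name: (value / total if total > 0 else 0.0) for name, value in raw.items()}


# Toy two-dimensional "templates" standing in for MFCC vectors.
templates = {"aa": [1.0, 0.0], "ee": [0.0, 1.0]}
print(phoneme_targets([1.0, 0.0], templates))
print(phoneme_targets([1.0, 0.0], templates, rms=0.0))
print(window_bounds(48000, 48000, 64.0, 60)[:2])
```

With the Quick Start settings and 48 kHz audio, a 64 ms window spans 3072 samples while 60 fps advances 800 samples per frame, which is why consecutive windows overlap.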
## Credits
This is a Python implementation of [uLipSync](https://github.com/hecomi/uLipSync) by Hecomi.
## License
MIT License - see [LICENSE](LICENSE) file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "pylipsync",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "lip-sync, audio, phoneme, mfcc, speech-analysis, animation",
"author": null,
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/a6/75/4b391d30329702c9f8c45320a2257188f0c1a20e5440f60658520c128f9f/pylipsync-0.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A Python implementation of Hecomi's uLipSync for audio-based lip sync analysis.",
"version": "0.1.2",
"project_urls": {
"Homepage": "https://github.com/spava002/pyLipSync"
},
"split_keywords": [
"lip-sync",
" audio",
" phoneme",
" mfcc",
" speech-analysis",
" animation"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "88dfb276ad5924f55d7f7e7caedff5074d31ea856160397aa664bc623c827da1",
"md5": "4118c9b8dafabb9658ff09c8211c477f",
"sha256": "8f4e4b97191a8fab7aedd43625e32c110fdffcd2cb8e4c3b63d87f7a07c9c076"
},
"downloads": -1,
"filename": "pylipsync-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4118c9b8dafabb9658ff09c8211c477f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 63589,
"upload_time": "2025-10-14T02:28:55",
"upload_time_iso_8601": "2025-10-14T02:28:55.327395Z",
"url": "https://files.pythonhosted.org/packages/88/df/b276ad5924f55d7f7e7caedff5074d31ea856160397aa664bc623c827da1/pylipsync-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "a6754b391d30329702c9f8c45320a2257188f0c1a20e5440f60658520c128f9f",
"md5": "f1e2286b34426d02b31622098290fa76",
"sha256": "d96c5cb4bea781ca3db68eeaa3160c058580890fc62b3b60c2f543da28f41920"
},
"downloads": -1,
"filename": "pylipsync-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "f1e2286b34426d02b31622098290fa76",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 66186,
"upload_time": "2025-10-14T02:28:56",
"upload_time_iso_8601": "2025-10-14T02:28:56.640575Z",
"url": "https://files.pythonhosted.org/packages/a6/75/4b391d30329702c9f8c45320a2257188f0c1a20e5440f60658520c128f9f/pylipsync-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-14 02:28:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "spava002",
"github_project": "pyLipSync",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "numpy",
"specs": [
[
">=",
"1.20.0"
],
[
"<",
"2.0.0"
]
]
},
{
"name": "scipy",
"specs": [
[
">=",
"1.7.0"
]
]
},
{
"name": "librosa",
"specs": [
[
">=",
"0.10.0"
]
]
},
{
"name": "scikit-learn",
"specs": [
[
">=",
"1.0.0"
]
]
}
],
"lcname": "pylipsync"
}