# PyLipExtractor
A Python package for robust lip frame extraction from videos using MediaPipe, designed specifically for visual speech recognition (VSR) and lip-reading tasks. It provides a streamlined, configurable process to convert raw video into ready-to-use lip sequences, ideal for deep learning model training.
## Features
* **Accurate Lip Landmark Detection:** Leverages MediaPipe Face Mesh for precise identification of 3D lip contours, ensuring high fidelity in extraction.
* **Configurable Lip Region Extraction:** Offers fine-grained control over the bounding box around detected lips, allowing for custom proportional margins and padding to capture the desired context.
* **Temporal Smoothing:** Implements a moving average filter on bounding box coordinates to ensure stable and consistent lip frame extraction across video sequences. **The smoothing window size is now configurable.**
* **Illumination Normalization (CLAHE):** Applies Contrast Limited Adaptive Histogram Equalization (CLAHE) to enhance contrast and normalize illumination, making extracted frames more robust to varying lighting conditions.
* **Optional Video Conversion (FFmpeg):** Can automatically convert various video formats (e.g., MPG) to MP4 internally using FFmpeg, enhancing compatibility and robustness with MediaPipe and PyAV. This can resolve issues with specific problematic video codecs.
* **Flexible Output & Quality Control:** Extracts processed lip frames as NumPy arrays (.npy format). Includes a configurable threshold (`MAX_PROBLEMATIC_FRAMES_PERCENTAGE`) to automatically reject video clips with too many unprocessable (black) frames, ensuring output data quality.
* **Debugging Visualizations:** Provides options to save intermediate frames with landmarks and bounding boxes, aiding in visual inspection and troubleshooting of the extraction process.
* **Efficient Video Handling:** Utilizes PyAV for robust and efficient video decoding.
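
The margin and smoothing features above can be pictured with a small standalone sketch. It is not PyLipExtractor's actual implementation: `expand_box`, `smooth_boxes`, and `margin_ratio` are hypothetical names, and a causal moving average over the most recent boxes is assumed. Only the default window size of 5 is taken from the usage example in this README.

```python
from collections import deque
from typing import Iterable, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def expand_box(box: Box, margin_ratio: float) -> Box:
    """Grow a box on every side by a fraction of its own width and height."""
    x_min, y_min, x_max, y_max = box
    dx = (x_max - x_min) * margin_ratio
    dy = (y_max - y_min) * margin_ratio
    return (x_min - dx, y_min - dy, x_max + dx, y_max + dy)


def smooth_boxes(boxes: Iterable[Box], window_size: int = 5) -> List[Box]:
    """Average each coordinate over the last `window_size` boxes seen so far."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    history = deque(maxlen=window_size)  # drops the oldest box automatically
    smoothed = []
    for box in boxes:
        history.append(box)
        n = len(history)
        smoothed.append(tuple(sum(b[i] for b in history) / n for i in range(4)))
    return smoothed


if __name__ == "__main__":
    # The third detection jumps to the right; smoothing damps the jump.
    raw = [(10, 20, 50, 40), (12, 20, 52, 40), (30, 20, 70, 40)]
    print(smooth_boxes(raw, window_size=3)[-1])
    print(expand_box((10, 20, 50, 40), margin_ratio=0.25))
```

A larger window gives a steadier crop but reacts more slowly to real head movement, which is the trade-off a configurable window size exposes.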
## Demo
https://github.com/user-attachments/assets/cfc7ce9d-dfc0-4a8c-9fc1-1a37299437bf
Original video by Tima Miroshnichenko
## Installation
You can easily install pylipextractor using pip directly from PyPI. Ensure you have a compatible Python version (3.8 or newer) installed.
```bash
pip install pylipextractor
```
### For Development (Optional)
If you plan to contribute to the project or need an editable installation, follow these steps:
```bash
# First, clone the repository
git clone https://github.com/MehradYaghoubi/pylipextractor.git
cd pylipextractor
# Install the package in editable mode along with its dependencies
pip install -e .
```
## Usage
See `example_usage.py` in the project root for a full demonstration of how to use the `LipExtractor` class to process a video and save the lip frames.
Example:
```python
from pathlib import Path
from pylipextractor.lip_extractor import LipExtractor
# Set your video path (e.g., ensure 'bbar8a.mpg' is in your current working directory or adjust path)
# You can use various formats like .mpg, .avi, .mp4, etc.
input_video_path = Path("your_video.mpg") # !!! IMPORTANT: CHANGE THIS TO YOUR VIDEO FILE NAME !!!
output_npy_directory = Path("./output_data")
output_npy_filename = input_video_path.stem + ".npy"
output_npy_path = output_npy_directory / output_npy_filename
# --- Configure LipExtractor settings (optional, defaults are from config.py) ---
# You can override any default setting like this:
LipExtractor.config.SAVE_DEBUG_FRAMES = True
LipExtractor.config.MAX_DEBUG_FRAMES = 75 # Limit debug frames saved
LipExtractor.config.APPLY_CLAHE = True # Ensure CLAHE is applied for contrast
LipExtractor.config.INCLUDE_LANDMARKS_ON_FINAL_OUTPUT = False # Don't draw landmarks on final output
# New: Enable optional MP4 conversion for input videos that are not already MP4.
# This is highly recommended for problematic formats like some .mpg files.
LipExtractor.config.CONVERT_TO_MP4_IF_NEEDED = True
LipExtractor.config.MP4_TEMP_DIR = Path("./temp_converted_mp4s") # Directory for temporary converted files
# New: Adjust the smoothing window size for bounding box stability (default is 5)
# LipExtractor.config.SMOOTHING_WINDOW_SIZE = 7
# New: Set the maximum percentage of problematic (e.g., black) frames allowed.
# If a video exceeds this threshold, it will be rejected as invalid.
LipExtractor.config.MAX_PROBLEMATIC_FRAMES_PERCENTAGE = 30.0 # Allow up to 30% problematic frames
# LipExtractor.config.MAX_FRAMES = 100 # Uncomment to limit the total number of frames processed
# Create an instance of the extractor
extractor = LipExtractor()
# Perform extraction
print(f"Starting extraction for {input_video_path.name}...")
extracted_frames = extractor.extract_lip_frames(input_video_path, output_npy_path=output_npy_path)
if extracted_frames is not None:
    print(f"Successfully extracted {extracted_frames.shape[0]} frames.")
    print(f"Frames saved to {output_npy_path}")
else:
    print("Extraction failed or the video clip was rejected (e.g., too many invalid frames or no faces detected).")
```
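
The rejection rule behind `MAX_PROBLEMATIC_FRAMES_PERCENTAGE` can be sketched in isolation. This is an illustration under stated assumptions, not the package's code: it assumes a "problematic" frame is one whose pixels are all zero, that a clip passes when the percentage is at or below the threshold, and it uses raw `bytes` buffers in place of NumPy arrays so that it runs with the standard library alone. The function names are hypothetical.

```python
from typing import Sequence


def problematic_percentage(frames: Sequence[bytes]) -> float:
    """Return the percentage of frames whose pixel buffer is entirely zero."""
    if not frames:
        return 100.0  # assumption: an empty clip counts as fully problematic
    black = sum(1 for frame in frames if not any(frame))
    return 100.0 * black / len(frames)


def is_clip_acceptable(frames: Sequence[bytes], max_percentage: float = 30.0) -> bool:
    """Accept the clip only if black frames stay within the allowed share."""
    return problematic_percentage(frames) <= max_percentage


if __name__ == "__main__":
    black_frame = bytes(16)          # 16 zero-valued pixels
    valid_frame = bytes([128]) * 16  # 16 mid-gray pixels
    clip = [black_frame] * 2 + [valid_frame] * 8
    print(problematic_percentage(clip), is_clip_acceptable(clip))
```

With NumPy arrays, the same check would typically test `not frame.any()` for each frame.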
To convert the extracted `.npy` file into individual image frames (e.g., PNGs), use the provided `save_npy_frames_to_images.py` utility script:
```bash
python save_npy_frames_to_images.py
```
## Dependencies
This project relies heavily on the following open-source libraries and tools:
* **opencv-python:** Essential for core image and video processing operations, including frame manipulation, resizing, and color space conversions.
* **numpy:** Fundamental for efficient numerical computations and handling multi-dimensional data arrays (like image frames).
* **mediapipe:** Utilized for its highly accurate and performant Face Mesh solution, enabling robust facial landmark detection for precise lip localization.
* **av (PyAV):** Provides efficient and reliable reading and writing of various video file formats.
* **Pillow:** A fork of the Python Imaging Library (PIL), often used implicitly by other libraries for image file handling.
* **FFmpeg (External Tool):** Required for the optional automatic video format conversion feature. It must be installed separately on your system and accessible via the system's PATH.
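
Since FFmpeg must be reachable through `PATH`, it can help to check for it before enabling `CONVERT_TO_MP4_IF_NEEDED`. The sketch below uses only the standard library. The helper names are hypothetical, and the command shown is one common way to re-encode a file to H.264 MP4; it is an assumption, not necessarily the command PyLipExtractor runs internally.

```python
import shutil
from pathlib import Path
from typing import List, Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def build_mp4_command(src: Path, dst_dir: Path) -> List[str]:
    """Build an ffmpeg argument list that re-encodes `src` into `dst_dir` as MP4."""
    dst = dst_dir / (src.stem + ".mp4")
    return [
        "ffmpeg",
        "-y",                   # overwrite the output file if it exists
        "-i", str(src),         # input file
        "-c:v", "libx264",      # encode video as H.264
        "-pix_fmt", "yuv420p",  # widely supported pixel format
        "-an",                  # drop audio, which lip frames do not need
        str(dst),
    ]


if __name__ == "__main__":
    ffmpeg_path = find_ffmpeg()
    if ffmpeg_path is None:
        print("FFmpeg not found on PATH; install it before enabling conversion.")
    else:
        print(f"FFmpeg found at {ffmpeg_path}")
        print(build_mp4_command(Path("your_video.mpg"), Path("./temp_converted_mp4s")))
```

The resulting list can be passed to `subprocess.run(..., check=True)` to perform the conversion.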
## Acknowledgements
I sincerely thank the developers and the vibrant open-source community behind all the libraries listed in the "Dependencies" section for their valuable work.
## Contributing
Contributions are highly welcome! If you encounter any bugs, have feature requests, or wish to contribute code, please feel free to:
* Open an Issue on our GitHub repository.
* Submit a Pull Request with your proposed changes.
## License
This project is licensed under the MIT License. See the [LICENSE](https://github.com/MehradYaghoubi/pylipextractor/blob/main/LICENSE) file for more details.
Raw data
{
"_id": null,
"home_page": null,
"name": "pylipextractor",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "lip-extraction, visual-speech-recognition, mediapipe, video-processing, lip-reading, ffmpeg, video-conversion",
"author": null,
"author_email": "Mehrad Yaghoubi <mehradyaghoubi96@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/45/6d/e9c7402b89364a3da977b0fd0bc3b943d080f0a76f16f1558e980c691d7f/pylipextractor-1.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A Python package for robust lip frame extraction from videos using MediaPipe, featuring enhanced stability, configurable options, and optional video format conversion for VSR tasks.",
"version": "1.1.0",
"project_urls": {
"Bug Tracker": "https://github.com/MehradYaghoubi/pylipextractor/issues",
"Homepage": "https://github.com/MehradYaghoubi/pylipextractor"
},
"split_keywords": [
"lip-extraction",
" visual-speech-recognition",
" mediapipe",
" video-processing",
" lip-reading",
" ffmpeg",
" video-conversion"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "fbf056f879fa7d77387179edb658b957d3eec28f0e57a70f863925d84eef6796",
"md5": "666e2593d2c11f655e3edad90d96d338",
"sha256": "a9b895f234bb1f8135d8d67cd9680032234bfb9a21c22d1db7fd06d2e835d09f"
},
"downloads": -1,
"filename": "pylipextractor-1.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "666e2593d2c11f655e3edad90d96d338",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 15326,
"upload_time": "2025-07-08T15:49:06",
"upload_time_iso_8601": "2025-07-08T15:49:06.591369Z",
"url": "https://files.pythonhosted.org/packages/fb/f0/56f879fa7d77387179edb658b957d3eec28f0e57a70f863925d84eef6796/pylipextractor-1.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "456de9c7402b89364a3da977b0fd0bc3b943d080f0a76f16f1558e980c691d7f",
"md5": "d0a556f8a640c9193f55462c28b78dfe",
"sha256": "99adbed41f436f4e8a40d86ad343a16aee18293253482aa4db8004a6509a2b68"
},
"downloads": -1,
"filename": "pylipextractor-1.1.0.tar.gz",
"has_sig": false,
"md5_digest": "d0a556f8a640c9193f55462c28b78dfe",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 17261,
"upload_time": "2025-07-08T15:49:08",
"upload_time_iso_8601": "2025-07-08T15:49:08.911123Z",
"url": "https://files.pythonhosted.org/packages/45/6d/e9c7402b89364a3da977b0fd0bc3b943d080f0a76f16f1558e980c691d7f/pylipextractor-1.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-08 15:49:08",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "MehradYaghoubi",
"github_project": "pylipextractor",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "pylipextractor"
}