pylipextractor


Name: pylipextractor
Version: 1.1.0
Summary: A Python package for robust lip frame extraction from videos using MediaPipe, featuring enhanced stability, configurable options, and optional video format conversion for VSR tasks.
Upload time: 2025-07-08 15:49:08
Requires Python: >=3.8
Keywords: lip-extraction, visual-speech-recognition, mediapipe, video-processing, lip-reading, ffmpeg, video-conversion
# PyLipExtractor

A Python package for robust lip frame extraction from videos using MediaPipe, designed specifically for visual speech recognition (VSR) and lip-reading tasks. It provides a streamlined, configurable process to convert raw video into ready-to-use lip sequences, ideal for deep learning model training.

## Features

* **Accurate Lip Landmark Detection:** Leverages MediaPipe Face Mesh for precise identification of 3D lip contours, ensuring high fidelity in extraction.
* **Configurable Lip Region Extraction:** Offers fine-grained control over the bounding box around detected lips, allowing for custom proportional margins and padding to capture the desired context.
* **Temporal Smoothing:** Implements a moving average filter on bounding box coordinates to ensure stable and consistent lip frame extraction across video sequences. **The smoothing window size is now configurable.**
* **Illumination Normalization (CLAHE):** Applies Contrast Limited Adaptive Histogram Equalization (CLAHE) to enhance contrast and normalize illumination, making the extracted frames more robust to varying lighting conditions.
* **Optional Video Conversion (FFmpeg):** Can automatically convert various video formats (e.g., MPG) to MP4 internally using FFmpeg, enhancing compatibility and robustness with MediaPipe and PyAV. This can resolve issues with specific problematic video codecs.
* **Flexible Output & Quality Control:** Extracts processed lip frames as NumPy arrays (.npy format). Includes a configurable threshold (`MAX_PROBLEMATIC_FRAMES_PERCENTAGE`) to automatically reject video clips with too many unprocessable (black) frames, ensuring output data quality.
* **Debugging Visualizations:** Provides options to save intermediate frames with landmarks and bounding boxes, aiding in visual inspection and troubleshooting of the extraction process.
* **Efficient Video Handling:** Utilizes PyAV for robust and efficient video decoding.
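
The temporal smoothing feature above can be pictured with a small sketch. This is not the package's own implementation: the function name `smooth_boxes`, the `(x_min, y_min, x_max, y_max)` box layout, and the choice of a trailing (rather than centered) window are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def smooth_boxes(boxes: Iterable[Box], window_size: int = 5) -> List[Box]:
    """Return a trailing moving average of each bounding box coordinate."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    history: Deque[Box] = deque(maxlen=window_size)  # keeps only the newest boxes
    smoothed: List[Box] = []
    for box in boxes:
        history.append(box)
        count = len(history)
        # zip(*history) groups the same coordinate from every box in the window.
        smoothed.append(tuple(sum(values) / count for values in zip(*history)))
    return smoothed
```

A larger window gives a steadier crop but reacts more slowly to real head movement, which is presumably why the window size is exposed as a setting.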

## Demo
https://github.com/user-attachments/assets/cfc7ce9d-dfc0-4a8c-9fc1-1a37299437bf

Original video by Tima Miroshnichenko

## Installation

You can easily install pylipextractor using pip directly from PyPI. Ensure you have a compatible Python version (3.8 or newer) installed.

```bash
pip install pylipextractor
```

### For Development (Optional)

If you plan to contribute to the project or need an editable installation, follow these steps:

```bash
# First, clone the repository
git clone https://github.com/MehradYaghoubi/pylipextractor.git
cd pylipextractor

# Install the package in editable mode along with its dependencies
pip install -e .
```

## Usage
See `example_usage.py` in the project root for a full demonstration of how to use the `LipExtractor` class to process a video and save the lip frames.

Example:
```python
from pathlib import Path
from pylipextractor.lip_extractor import LipExtractor

# Set your video path (the file must be in the current working directory, or use a full path).
# Various formats can be used, e.g. .mpg, .avi, .mp4.
input_video_path = Path("your_video.mpg")  # IMPORTANT: change this to your video file name
output_npy_directory = Path("./output_data")
output_npy_filename = input_video_path.stem + ".npy"
output_npy_path = output_npy_directory / output_npy_filename

# --- Configure LipExtractor settings (optional, defaults are from config.py) ---
# You can override any default setting like this:
LipExtractor.config.SAVE_DEBUG_FRAMES = True
LipExtractor.config.MAX_DEBUG_FRAMES = 75 # Limit debug frames saved
LipExtractor.config.APPLY_CLAHE = True   # Ensure CLAHE is applied for contrast
LipExtractor.config.INCLUDE_LANDMARKS_ON_FINAL_OUTPUT = False # Don't draw landmarks on final output

# New: Enable optional MP4 conversion for input videos that are not already MP4.
# This is highly recommended for problematic formats like some .mpg files.
LipExtractor.config.CONVERT_TO_MP4_IF_NEEDED = True
LipExtractor.config.MP4_TEMP_DIR = Path("./temp_converted_mp4s") # Directory for temporary converted files

# New: Adjust the smoothing window size for bounding box stability (default is 5)
# LipExtractor.config.SMOOTHING_WINDOW_SIZE = 7

# New: Set the maximum percentage of problematic (e.g., black) frames allowed.
# If a video exceeds this threshold, it will be rejected as invalid.
LipExtractor.config.MAX_PROBLEMATIC_FRAMES_PERCENTAGE = 30.0 # Allow up to 30% problematic frames
# LipExtractor.config.MAX_FRAMES = 100        # Uncomment to limit the total number of frames processed

# Create an instance of the extractor
extractor = LipExtractor()

# Perform extraction
print(f"Starting extraction for {input_video_path.name}...")
extracted_frames = extractor.extract_lip_frames(input_video_path, output_npy_path=output_npy_path)

if extracted_frames is not None:
    print(f"Successfully extracted {extracted_frames.shape[0]} frames.")
    print(f"Frames saved to {output_npy_path}")
else:
    print("Extraction failed or the video clip was rejected (e.g., too many invalid frames or no faces detected).")

```
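
The rejection rule controlled by `MAX_PROBLEMATIC_FRAMES_PERCENTAGE` can be sketched roughly as follows. The helper `is_clip_acceptable` and its parameters are hypothetical and not part of the package; the sketch only illustrates the comparison described in the comments above, assuming a clip is kept when its share of problematic frames is at or below the threshold.

```python
def is_clip_acceptable(
    problematic_frames: int,
    total_frames: int,
    max_problematic_percentage: float = 30.0,
) -> bool:
    """Return True if the share of problematic frames is within the limit."""
    if total_frames <= 0:
        # A clip with no frames has nothing usable, so treat it as rejected.
        return False
    percentage = 100.0 * problematic_frames / total_frames
    return percentage <= max_problematic_percentage
```

With a threshold of 30.0, a 100-frame clip containing 30 black frames would pass this check, while one containing 31 would not.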

To convert the extracted `.npy` file into individual image frames (e.g., PNGs), use the provided `save_npy_frames_to_images.py` utility script:
```bash
python save_npy_frames_to_images.py
```
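
For readers who prefer to do this step in their own code, a minimal sketch of such a conversion with NumPy and Pillow is shown here. It is not the contents of the utility script. It assumes the array is laid out as `(frames, height, width, 3)` or `(frames, height, width)` with pixel values in the 0 to 255 range, and the function name `save_frames_as_png` is made up for illustration. If the frames were stored in OpenCV's BGR channel order, the saved colors would appear swapped.

```python
from pathlib import Path

import numpy as np
from PIL import Image


def save_frames_as_png(npy_path: Path, output_dir: Path) -> int:
    """Write every frame of a .npy array as a numbered PNG; return the count."""
    frames = np.load(npy_path)
    output_dir.mkdir(parents=True, exist_ok=True)
    for index, frame in enumerate(frames):
        if frame.dtype != np.uint8:
            # Assumption: values are already on a 0-255 scale.
            frame = np.clip(frame, 0, 255).astype(np.uint8)
        Image.fromarray(frame).save(output_dir / f"frame_{index:04d}.png")
    return len(frames)
```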

## Dependencies

This project relies heavily on the following open-source libraries and tools:

* **opencv-python:** Essential for core image and video processing operations, including frame manipulation, resizing, and color space conversions.
* **numpy:** Fundamental for efficient numerical computations and handling multi-dimensional data arrays (like image frames).
* **mediapipe:** Utilized for its highly accurate and performant Face Mesh solution, enabling robust facial landmark detection for precise lip localization.
* **av (PyAV):** Provides efficient and reliable reading and writing of various video file formats.
* **Pillow:** A fork of the Python Imaging Library (PIL), often used implicitly by other libraries for image file handling.
* **FFmpeg (External Tool):** Required for the optional automatic video format conversion feature. It must be installed separately on your system and accessible via the system's PATH.
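
Because FFmpeg is an external tool, it can help to check that it is reachable before enabling conversion. The sketch below uses only the Python standard library. The helper names `build_ffmpeg_command` and `convert_to_mp4` are hypothetical, and the plain `ffmpeg -y -i input output` invocation is an assumption about a reasonable default, not the command the package runs internally.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(source: Path, target: Path) -> List[str]:
    """Build a basic FFmpeg call; -y overwrites an existing target file."""
    return ["ffmpeg", "-y", "-i", str(source), str(target)]


def convert_to_mp4(source: Path, target_dir: Path) -> Path:
    """Convert a video to MP4 in target_dir and return the new file's path."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("FFmpeg was not found on PATH; install it first.")
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / (source.stem + ".mp4")
    # check=True raises CalledProcessError if FFmpeg exits with an error.
    subprocess.run(build_ffmpeg_command(source, target), check=True)
    return target
```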

## Acknowledgements
I sincerely thank the developers and the vibrant open-source community behind all the libraries mentioned in the "Dependencies" section for their valuable work.

## Contributing
Contributions are highly welcome! If you encounter any bugs, have feature requests, or wish to contribute code, please feel free to:

* Open an Issue on our GitHub repository.
* Submit a Pull Request with your proposed changes.

## License
This project is licensed under the MIT License. See the [LICENSE](https://github.com/MehradYaghoubi/pylipextractor/blob/main/LICENSE) file for more details.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "pylipextractor",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "lip-extraction, visual-speech-recognition, mediapipe, video-processing, lip-reading, ffmpeg, video-conversion",
    "author": null,
    "author_email": "Mehrad Yaghoubi <mehradyaghoubi96@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/45/6d/e9c7402b89364a3da977b0fd0bc3b943d080f0a76f16f1558e980c691d7f/pylipextractor-1.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A Python package for robust lip frame extraction from videos using MediaPipe, featuring enhanced stability, configurable options, and optional video format conversion for VSR tasks.",
    "version": "1.1.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/MehradYaghoubi/pylipextractor/issues",
        "Homepage": "https://github.com/MehradYaghoubi/pylipextractor"
    },
    "split_keywords": [
        "lip-extraction",
        " visual-speech-recognition",
        " mediapipe",
        " video-processing",
        " lip-reading",
        " ffmpeg",
        " video-conversion"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "fbf056f879fa7d77387179edb658b957d3eec28f0e57a70f863925d84eef6796",
                "md5": "666e2593d2c11f655e3edad90d96d338",
                "sha256": "a9b895f234bb1f8135d8d67cd9680032234bfb9a21c22d1db7fd06d2e835d09f"
            },
            "downloads": -1,
            "filename": "pylipextractor-1.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "666e2593d2c11f655e3edad90d96d338",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 15326,
            "upload_time": "2025-07-08T15:49:06",
            "upload_time_iso_8601": "2025-07-08T15:49:06.591369Z",
            "url": "https://files.pythonhosted.org/packages/fb/f0/56f879fa7d77387179edb658b957d3eec28f0e57a70f863925d84eef6796/pylipextractor-1.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "456de9c7402b89364a3da977b0fd0bc3b943d080f0a76f16f1558e980c691d7f",
                "md5": "d0a556f8a640c9193f55462c28b78dfe",
                "sha256": "99adbed41f436f4e8a40d86ad343a16aee18293253482aa4db8004a6509a2b68"
            },
            "downloads": -1,
            "filename": "pylipextractor-1.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "d0a556f8a640c9193f55462c28b78dfe",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 17261,
            "upload_time": "2025-07-08T15:49:08",
            "upload_time_iso_8601": "2025-07-08T15:49:08.911123Z",
            "url": "https://files.pythonhosted.org/packages/45/6d/e9c7402b89364a3da977b0fd0bc3b943d080f0a76f16f1558e980c691d7f/pylipextractor-1.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-08 15:49:08",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "MehradYaghoubi",
    "github_project": "pylipextractor",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "pylipextractor"
}
        