# MarkMyMedia
[PyPI](https://pypi.org/project/MarkMyMedia/) | [License: MIT](https://opensource.org/licenses/MIT)
A fast, simple utility to visually stamp media files with their filenames, preparing them for multimodal LLM training and analysis.
## The Problem: Lost Context in Multimodal Sequences
When you feed a sequence of media files (e.g., `portal 2 mod.jpg`, `intro.mp3`, `my homework.mp4`) to a Large Language Model, the model sees a continuous stream of data. It lacks explicit, built-in separators or context about where one file ends and another begins, or what the original source of a particular frame or soundbite was.
This ambiguity makes it difficult to:
- Analyze which specific file triggered a response.
- Train the model on tasks that require knowledge of file boundaries.
- Debug model behavior on complex, mixed-media inputs.
## The Solution: Visibly Embedded Markers
**MarkMyMedia** solves this by "stamping" each file with its own name, creating an unambiguous visual marker directly within the data.
- **Images:** Get a clean text overlay with the filename.
- **Audio:** Is converted into a video with the filename displayed on a black background.
- **Videos:** Get a short, 0.5-second marker clip showing the filename prepended, without re-encoding the entire video.
This way, the context is never lost. The model "sees" the filename associated with the content that follows.
## Key Features
- **Multimodal Support:** Works out-of-the-box for images, audio, and video.
- **Blazing Fast:** Uses parallel processing to handle large datasets quickly.
- **Efficient Video Processing:** Prepends markers to videos **without re-encoding**, saving massive amounts of time and preserving original quality.
- **Flexible Usage:** Can be used as a simple command-line tool or as a Python library.
- **Recursive Search:** Point it at a directory and, with `-r`, it processes all nested media files.
- **Simple & Focused:** Does one job and does it well.
### How It Looks
**MarkMyMedia** provides clear, unambiguous markers for each file type.
#### 🖼️ Images
A clean, readable marker with the filename is embedded directly onto the image. This ensures that even in a long sequence, the source of each image is immediately visible.

*<p align="center">Example: A screenshot of a Discord message marked with its filename.</p>*
#### 🎧 Audio
Audio files are converted into a static video format. This clever workaround makes them visually identifiable in multimodal timelines and tools like Google AI Studio, where audio-only files might not provide visual cues. The entire audio track is preserved under a single, persistent frame showing its original filename.

*<p align="center">The result is a standard video file, making the audio's presence known visually.</p>*
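One common way to get this effect with plain FFmpeg is to loop a single pre-rendered frame for the length of the audio and stream-copy the audio track. The sketch below only builds the argument list. The helper name `build_audio_to_video_cmd` and the `frame.png` input (an image that already shows the filename) are hypothetical, and this is not necessarily the command MarkMyMedia runs internally.

```python
from pathlib import Path
from typing import List


def build_audio_to_video_cmd(frame: Path, audio: Path, output: Path) -> List[str]:
    """Build an ffmpeg command that shows one still frame for the whole audio track."""
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", str(frame),            # repeat the still image indefinitely
        "-i", str(audio),                          # second input: the original audio file
        "-c:v", "libx264", "-tune", "stillimage",  # encode the static video stream
        "-pix_fmt", "yuv420p",                     # widely compatible pixel format
        "-c:a", "copy",                            # stream-copy the audio, no re-encoding
        "-shortest",                               # stop when the audio ends
        str(output),
    ]


cmd = build_audio_to_video_cmd(Path("frame.png"), Path("intro.mp3"), Path("intro.mp4"))
print(" ".join(cmd))
```

Because the audio is copied rather than re-encoded, only the single-frame video stream costs any encoding time. Whether a given audio codec can be copied into an `.mp4` container depends on the installed FFmpeg version.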

#### 🎬 Video
A short, 0.5-second marker clip is prepended to the video. This process is nearly instant because it **avoids re-encoding** the entire file, preserving the original quality and saving significant time.

*<p align="center">The model sees the filename right before the video content begins.</p>*
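Prepending a clip without re-encoding is typically done with FFmpeg's concat demuxer combined with `-c copy`. The sketch below shows that general technique, not MarkMyMedia's confirmed internals. The helper names and the `marker.mp4` / `clips.txt` paths are hypothetical. Stream copying only works when the marker clip and the main video share the same codec parameters.

```python
from pathlib import Path
from typing import List, Sequence


def concat_list_text(clips: Sequence[Path]) -> str:
    """Render the text file read by ffmpeg's concat demuxer, one `file` line per clip."""
    lines = []
    for clip in clips:
        # Inside single quotes, a literal quote has to be written as '\''
        escaped = clip.as_posix().replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def build_concat_cmd(list_file: Path, output: Path) -> List[str]:
    """Build an ffmpeg command that joins the listed clips without re-encoding."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",  # read a clip list; allow arbitrary paths in it
        "-i", str(list_file),
        "-c", "copy",                  # copy all streams as-is
        str(output),
    ]


print(concat_list_text([Path("marker.mp4"), Path("my homework.mp4")]), end="")
print(" ".join(build_concat_cmd(Path("clips.txt"), Path("out.mp4"))))
```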
## Technical Constraints
1. This tool relies on **FFmpeg** for all audio and video operations. You must have `ffmpeg` and `ffprobe` installed and available in your system's PATH.
2. To achieve high speed by avoiding full re-encoding, `MarkMyMedia` relies on **stream copying**. This approach is extremely fast but requires input files to meet specific format criteria.
| Modality | Requirement | Reason & Details |
| :--- | :--- | :--- |
| **Video (`mark_video`)** | <ul><li>Video Codec: `h264` or `hevc`</li><li>Audio Codec: `aac` (if present)</li></ul> | **For preserving quality and speed.** Processing other codecs (like VP9 in `.webm`) will fail, as they cannot be directly concatenated in this workflow. |
| **Audio (`mark_audio`)** | <ul><li>Always outputs a `.mp4` video file.</li><li>Audio Format: `mp3`, `flac`, `aac`, `m4a`, `ogg` or `opus`</li></ul> | **To create a visual marker.** The original audio stream is copied losslessly into the new video container, ensuring no quality is lost. |
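Both constraints can be checked before starting a large batch. The following is a minimal sketch, assuming the stream information comes from the `streams` array that `ffprobe -show_streams -of json` prints. The function names are hypothetical and the codec sets are taken from the table above.

```python
import shutil
from typing import Dict, List, Sequence

VIDEO_CODECS = {"h264", "hevc"}  # accepted video codecs, per the table above
AUDIO_CODECS = {"aac"}           # accepted audio codec for video inputs


def missing_tools(tools: Sequence[str] = ("ffmpeg", "ffprobe")) -> List[str]:
    """Return the required executables that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def stream_copy_problems(streams: List[Dict[str, str]]) -> List[str]:
    """List the streams of a video file whose codec is outside the accepted sets."""
    problems = []
    for stream in streams:
        kind = stream.get("codec_type")
        codec = stream.get("codec_name")
        if kind == "video" and codec not in VIDEO_CODECS:
            problems.append(f"unsupported video codec: {codec}")
        elif kind == "audio" and codec not in AUDIO_CODECS:
            problems.append(f"unsupported audio codec: {codec}")
    return problems


print("missing tools:", missing_tools())
print(stream_copy_problems([{"codec_type": "video", "codec_name": "vp9"}]))
```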
## Installation
Install `MarkMyMedia` directly from PyPI:
```bash
pip install MarkMyMedia
```
## Usage
### As a Command-Line Tool (CLI)
The CLI is designed for batch processing entire directories.
**Mark all media in the current directory (output to `markered_modals/`):**
```bash
markmymedia
```
**Recursively process a dataset and specify an output folder:**
```bash
markmymedia ./my_dataset -r -o ./processed_data
```
**See all available options:**
`markmymedia --help`
```
usage: markmymedia [-h] [-r] [-o OUTPUT] [-j JOBS] [-p] [--version] [inputs ...]

Batch mark images, audio, and video with filename overlays.

positional arguments:
  inputs                Files or directories to process. If omitted, current directory is used.

options:
  -h, --help            show this help message and exit
  -r, --recursive       Recursively traverse directories.
  -o, --output OUTPUT   Base output directory (default: markered_modals).
  -j, --jobs JOBS       Number of worker threads to use per modality (default: number of CPUs).
  -p, --preserve-structure
                        Preserve the directory structure of input files in the output directory.
  --version             show program's version number and exit
```
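The `-r` and `-j` options map naturally onto recursive globbing and a thread pool. The sketch below illustrates that pattern and is not the tool's actual source. The extension sets and the `classify`, `discover`, and `process_all` names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Optional, Set

# Assumed extension sets; the tool's real lists may differ.
KINDS: Dict[str, Set[str]] = {
    "image": {".jpg", ".jpeg", ".png"},
    "audio": {".mp3", ".flac", ".aac", ".m4a", ".ogg", ".opus"},
    "video": {".mp4"},
}


def classify(path: Path) -> Optional[str]:
    """Return 'image', 'audio', or 'video' based on the file extension, else None."""
    ext = path.suffix.lower()
    for kind, extensions in KINDS.items():
        if ext in extensions:
            return kind
    return None


def discover(root: Path, recursive: bool = False) -> List[Path]:
    """Collect media files under root; descend into subdirectories only if asked."""
    pattern = "**/*" if recursive else "*"
    return sorted(p for p in root.glob(pattern) if p.is_file() and classify(p))


def process_all(paths: Iterable[Path], worker: Callable, jobs: Optional[int] = None) -> list:
    """Run worker on every path using a pool of `jobs` threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(worker, paths))


print(process_all([Path("cat.jpg"), Path("intro.mp3")], classify, jobs=2))
```

Threads suit this workload because each job mostly waits on an external FFmpeg process or on file I/O rather than on Python code.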
### As a Python Library
You can also use the core functions directly in your Python scripts for more granular control.
```python
from markmymedia import mark_image, mark_audio, mark_video

# Mark a single image
mark_image(
    input_path='data/cat.jpg',
    output_path='processed/cat_marked.jpg'
)

# Create a marked video from an audio file
mark_audio(
    input_path='data/intro.mp3',
    output_path='processed/intro.mp4'
)

# Prepend a marker to a video file
mark_video(
    input_path='data/dog_on_beach.mp4',
    output_path='processed/dog_on_beach.mp4',
    overlay_text="Some cool video!!",
)
```
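When calling these functions in a loop, you have to choose each output path yourself. The helper below is a hypothetical sketch of one way to do that: audio inputs get an `.mp4` suffix because `mark_audio` always produces a video, and passing `base` mirrors what the CLI's `--preserve-structure` option describes. The name `plan_output` and the extension set are assumptions, not part of the library.

```python
from pathlib import Path
from typing import Optional

# Audio formats listed in the constraints table
AUDIO_EXTS = {".mp3", ".flac", ".aac", ".m4a", ".ogg", ".opus"}


def plan_output(src: Path, out_dir: Path, base: Optional[Path] = None) -> Path:
    """Choose an output path for src inside out_dir.

    If base is given, the path of src relative to base is kept;
    otherwise only the file name is used.
    """
    rel = src.relative_to(base) if base is not None else Path(src.name)
    if src.suffix.lower() in AUDIO_EXTS:
        rel = rel.with_suffix(".mp4")  # audio is written out as a video file
    return out_dir / rel


print(plan_output(Path("data/intro.mp3"), Path("processed")))
print(plan_output(Path("data/a/cat.jpg"), Path("processed"), base=Path("data")))
```

With the package installed, the result can be passed on as `mark_audio(input_path=str(src), output_path=str(plan_output(src, out_dir)))`. Creating the parent directory first with `mkdir(parents=True, exist_ok=True)` is a safe default, since the README does not say whether the library creates it.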
## Contributing
Contributions are welcome! If you find a bug or have a feature request, please [open an issue](https://github.com/LaVashikk/MarkMyMedia-LLM/issues).
## License
This project is licensed under the MIT License. See the `LICENSE` file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "MarkMyMedia",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "multimodal, llm, ffmpeg, media, image-processing, video-processing, audio-processing, utilities",
"author": null,
"author_email": "laVashik <me@lavashik.dev>",
"download_url": "https://files.pythonhosted.org/packages/95/83/38219220741e6c08ef924064ac9298ee3fc1f1f9c4502285259acd088ed4/markmymedia-1.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A fast, simple utility to visually stamp media files with their filenames, preparing them for multimodal LLM training and analysis.",
"version": "1.0.0",
"project_urls": {
"Bug Tracker": "https://github.com/IaVashik/MarkMyMedia-LLM/issues",
"Homepage": "https://github.com/LaVashikk/MarkMyMedia-LLM"
},
"split_keywords": [
"multimodal",
" llm",
" ffmpeg",
" media",
" image-processing",
" video-processing",
" audio-processing",
" utilities"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "48077f17d244eff84e973d7c92f2d5311d44de5cd946cb799d1013a20a373b95",
"md5": "4a62cbd2099562cfbec9554bbf05fa66",
"sha256": "eafa037d6a3a50525a235cf60e2e1347bc5201b4ad18bc095089ae62709de578"
},
"downloads": -1,
"filename": "markmymedia-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4a62cbd2099562cfbec9554bbf05fa66",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 15585,
"upload_time": "2025-08-04T07:11:21",
"upload_time_iso_8601": "2025-08-04T07:11:21.785626Z",
"url": "https://files.pythonhosted.org/packages/48/07/7f17d244eff84e973d7c92f2d5311d44de5cd946cb799d1013a20a373b95/markmymedia-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "958338219220741e6c08ef924064ac9298ee3fc1f1f9c4502285259acd088ed4",
"md5": "dd90b8427f09156080ac40e5b697a017",
"sha256": "b03209ffd936f8ad5881e78b367f38d398817430246f57600178b30f34d0a247"
},
"downloads": -1,
"filename": "markmymedia-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "dd90b8427f09156080ac40e5b697a017",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 13027,
"upload_time": "2025-08-04T07:11:24",
"upload_time_iso_8601": "2025-08-04T07:11:24.207765Z",
"url": "https://files.pythonhosted.org/packages/95/83/38219220741e6c08ef924064ac9298ee3fc1f1f9c4502285259acd088ed4/markmymedia-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-04 07:11:24",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "IaVashik",
"github_project": "MarkMyMedia-LLM",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "markmymedia"
}