# DataGenKit
DataGenKit is a Python package for generating diverse, enriched image datasets from a small original dataset. It combines three augmentation families:
1. **Traditional Augmentation**: Flips, rotations, scaling, cropping, color jitter, etc., implemented via Albumentations.
2. **Neural Style Transfer (NST)**: Applies artistic/domain-specific textures from style images, implemented with PyTorch + pre-trained fast NST models.
3. **Patch Mixing**: Combines regions from different images (CutMix, MixUp) to boost structural diversity.
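
To make the patch-mixing family concrete, here is a minimal NumPy sketch of MixUp and CutMix on two same-sized images. The function names `mixup` and `cutmix` and their signatures are illustrative assumptions, not DataGenKit's actual API, and labels are ignored for brevity.

```python
import numpy as np


def mixup(img_a, img_b, lam):
    """Blend two same-shaped images pixel-wise: lam * a + (1 - lam) * b."""
    mixed = lam * img_a.astype(np.float32) + (1.0 - lam) * img_b.astype(np.float32)
    return mixed.astype(img_a.dtype)


def cutmix(img_a, img_b, lam, rng):
    """Paste a random rectangle from img_b into a copy of img_a.

    The rectangle targets (1 - lam) of the image area; it can be smaller
    when it is clipped at the border. Returns the mixed image and the
    fraction of pixels actually kept from img_a.
    """
    h, w = img_a.shape[:2]
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(h * cut_ratio), int(w * cut_ratio)
    # Random centre of the rectangle, then clip the box to the image bounds.
    cy, cx = int(rng.integers(0, h)), int(rng.integers(0, w))
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]
    kept = 1.0 - ((y2 - y1) * (x2 - x1)) / (h * w)
    return mixed, kept


# Hypothetical stand-in images: one black, one uniform grey.
rng = np.random.default_rng(0)
a = np.zeros((32, 32, 3), dtype=np.uint8)
b = np.full((32, 32, 3), 200, dtype=np.uint8)
blended = mixup(a, b, lam=0.75)
patched, kept = cutmix(a, b, lam=0.75, rng=rng)
```

In a training setup, `lam` is usually drawn from a Beta distribution per pair, and the same ratio (or `kept` for CutMix) is used to mix the labels.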
## Goals
- Produce lightweight, diverse datasets for small-data training scenarios.
- Allow custom combinations of techniques per batch.
## Features
- **Gradio-based UI**: An interactive interface for uploading base datasets and optional style images, choosing augmentation pipelines and parameters, and previewing generated samples in real time.
- **Python API & CLI**: For batch automation.
- **Export**: Writes standard dataset formats (COCO, ImageFolder, etc.).
- **Diversity Scoring**: LPIPS and FID metrics, with visual reports.
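
As a rough illustration of the FID side of diversity scoring, the sketch below computes the Fréchet distance between two sets of feature vectors using NumPy only. In a real FID computation the features would come from a pre-trained Inception network; here random vectors stand in for them, and `frechet_distance` is a hypothetical helper, not a function the package is known to expose.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (n_samples, dim) arrays."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # tr(sqrt(cov_a @ cov_b)) equals the sum of the square roots of the
    # product's eigenvalues, which are real and non-negative for covariances.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


# Stand-in features (assumption): 500 samples of 8-dimensional vectors.
rng = np.random.default_rng(0)
original = rng.normal(size=(500, 8))
shifted = original + 2.0  # same spread, mean moved by 2 in every dimension

same_score = frechet_distance(original, original)
shifted_score = frechet_distance(original, shifted)
```

Identical feature sets score approximately zero, while the shifted set scores close to the squared mean offset (8 dimensions times 2 squared, so about 32). LPIPS is a different kind of metric: it compares image pairs through a learned perceptual network rather than whole distributions.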
## Gradio Workflow Example
1. Upload the original images.
2. Select techniques (checklist) and set parameters (sliders for rotation, blend ratio, and style strength).
3. Preview the augmented images instantly.
4. Click "Generate & Download" to export the batch.
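
The workflow above could be wired up roughly as follows. This is a sketch under stated assumptions: `augment_preview` and `build_demo` are hypothetical names, only a single image is handled, style transfer is omitted, and the "mixup" option blends the image with its own mirror as a stand-in for a second dataset image. Only the Gradio components used (`Blocks`, `Image`, `CheckboxGroup`, `Slider`, `Button`) are part of the real library.

```python
import numpy as np


def augment_preview(image, techniques, quarter_turns, blend_ratio):
    """Apply the selected techniques to one H x W x C uint8 image."""
    if image is None:  # nothing uploaded yet
        return None
    out = np.asarray(image)
    if "flip" in techniques:
        out = out[:, ::-1]  # horizontal flip
    if "rotate" in techniques:
        out = np.rot90(out, k=int(quarter_turns))  # rotate in 90-degree steps
    if "mixup" in techniques:
        # Stand-in for MixUp: blend the image with its own mirrored copy.
        mirror = out[:, ::-1].astype(np.float32)
        out = blend_ratio * out.astype(np.float32) + (1.0 - blend_ratio) * mirror
    return np.ascontiguousarray(out).astype(np.uint8)


def build_demo():
    """Build the Gradio UI; Gradio is imported lazily so the function above works without it."""
    import gradio as gr

    with gr.Blocks() as demo:
        image_in = gr.Image(type="numpy", label="Original image")
        techniques = gr.CheckboxGroup(choices=["flip", "rotate", "mixup"], label="Techniques")
        quarter_turns = gr.Slider(minimum=0, maximum=3, value=1, step=1, label="Rotation (x 90 degrees)")
        blend_ratio = gr.Slider(minimum=0.0, maximum=1.0, value=0.5, label="Blend ratio")
        preview = gr.Image(label="Preview")
        gr.Button("Preview").click(
            fn=augment_preview,
            inputs=[image_in, techniques, quarter_turns, blend_ratio],
            outputs=preview,
        )
    return demo


# The core function can be exercised without starting the UI.
sample = np.arange(24, dtype=np.uint8).reshape(2, 4, 3)
flipped = augment_preview(sample, ["flip"], quarter_turns=0, blend_ratio=0.5)
rotated = augment_preview(sample, ["rotate"], quarter_turns=1, blend_ratio=0.5)
```

Calling `build_demo().launch()` would start a local Gradio server. Keeping the augmentation logic in a plain function separate from the UI also lets the same code back a Python API or CLI.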
Raw data
{
"_id": null,
"home_page": "https://github.com/Jayavardhan-7/DataGenKit-",
"name": "datagenkit",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "image augmentation, dataset generation, neural style transfer, cutmix, mixup, computer vision, deep learning",
"author": "Jayavardhan",
"author_email": "jayavardhanperala@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/d3/cf/04c4aea4d677fd88a9bd28b6134a71048996708403b89dcfd40de5cb4e21/datagenkit-0.1.tar.gz",
"platform": null,
"description": "# DataGenKit\r\n\r\nThis project aims to create a Python package for generating diverse and enriched image datasets from a small original dataset using three augmentation families:\r\n\r\n1. **Traditional Augmentation**: Flips, rotations, scaling, cropping, color jitter, etc., implemented via Albumentations.\r\n2. **Neural Style Transfer (NST)**: Applies artistic/domain-specific textures from style images, implemented with PyTorch + pre-trained fast NST models.\r\n3. **Patch Mixing**: Combines regions from different images (CutMix, MixUp) to boost structural diversity.\r\n\r\n## Goals\r\n\r\n- Produce lightweight, diverse datasets for small-data training scenarios.\r\n- Allow custom combinations of techniques per batch.\r\n\r\n## Features\r\n\r\n- **Gradio-based UI**: For interactive usage, allowing users to upload base datasets and optional style images, choose augmentation pipelines and parameters, and preview generated samples in real-time.\r\n- **Python API & CLI**: For batch automation.\r\n- **Export**: To standard dataset formats (COCO, ImageFolder, etc.).\r\n- **Diversity Scoring**: (LPIPS, FID) with visual reports.\r\n\r\n## Gradio Workflow Example\r\n\r\n1. User uploads original images.\r\n2. Selects techniques (checklist) and parameters (sliders for rotation, blend ratio, style strength).\r\n3. Previews augmented images instantly.\r\n4. Clicks \"Generate & Download\" to export the batch.\r\n",
"bugtrack_url": null,
"license": null,
"summary": "A Python package for generating diverse and enriched image datasets using traditional, neural style transfer, and patch mixing augmentations.",
"version": "0.1",
"project_urls": {
"Homepage": "https://github.com/Jayavardhan-7/DataGenKit-"
},
"split_keywords": [
"image augmentation",
" dataset generation",
" neural style transfer",
" cutmix",
" mixup",
" computer vision",
" deep learning"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "5dab00afb816695e0b97eeff93eb371bc64e12dcc2f6e3697222ef6a12ca6060",
"md5": "d6ed4a78875d7ed9ee070954e4eaf75c",
"sha256": "510fa1fc50bb6978a1ccfd3bb3be5b9de55748064e71730e3f9f58e5b6b3a53c"
},
"downloads": -1,
"filename": "datagenkit-0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d6ed4a78875d7ed9ee070954e4eaf75c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 8646,
"upload_time": "2025-08-14T18:29:42",
"upload_time_iso_8601": "2025-08-14T18:29:42.437032Z",
"url": "https://files.pythonhosted.org/packages/5d/ab/00afb816695e0b97eeff93eb371bc64e12dcc2f6e3697222ef6a12ca6060/datagenkit-0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d3cf04c4aea4d677fd88a9bd28b6134a71048996708403b89dcfd40de5cb4e21",
"md5": "4253029b64bf5e0b9b96a432447e80d3",
"sha256": "b7102c6a80bc81e9534eed4ef523272c6ea36163150a5db1ada5cd817ed25def"
},
"downloads": -1,
"filename": "datagenkit-0.1.tar.gz",
"has_sig": false,
"md5_digest": "4253029b64bf5e0b9b96a432447e80d3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 7727,
"upload_time": "2025-08-14T18:29:44",
"upload_time_iso_8601": "2025-08-14T18:29:44.809734Z",
"url": "https://files.pythonhosted.org/packages/d3/cf/04c4aea4d677fd88a9bd28b6134a71048996708403b89dcfd40de5cb4e21/datagenkit-0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-14 18:29:44",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Jayavardhan-7",
"github_project": "DataGenKit-",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "gradio",
"specs": []
},
{
"name": "albumentations",
"specs": []
},
{
"name": "torch",
"specs": []
},
{
"name": "torchvision",
"specs": []
},
{
"name": "scikit-image",
"specs": []
},
{
"name": "numpy",
"specs": []
},
{
"name": "opencv-python",
"specs": []
}
],
"lcname": "datagenkit"
}