vae-toolkit

Name	vae-toolkit
Version	0.1.0
home_page	None
Summary	Stable Diffusion VAE toolkit - image processing and model loading utilities
upload_time	2025-08-28 14:36:24
maintainer	None
docs_url	None
author	None
requires_python	>=3.8
license	MIT License Copyright (c) 2024 Yus314 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
keywords	vae stable-diffusion image-processing deep-learning diffusers

# VAE Toolkit

[![PyPI version](https://badge.fury.io/py/vae-toolkit.svg)](https://badge.fury.io/py/vae-toolkit)
[![Python Support](https://img.shields.io/pypi/pyversions/vae-toolkit.svg)](https://pypi.org/project/vae-toolkit/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A toolkit for working with Stable Diffusion VAE models. It provides image preprocessing utilities and model loading helpers.

## Features

- 🖼️ **Image Processing**: Efficient image preprocessing and tensor conversions optimized for VAE models
- 🚀 **Model Loading**: Easy loading of Stable Diffusion VAE models with automatic device selection
- ⚡ **Performance**: Built-in caching and optimized transforms for faster processing
- 🔧 **Flexible API**: Both high-level and low-level APIs for different use cases
- 🛡️ **Type Safety**: Full type hints for better IDE support and code reliability
- 🔐 **Secure**: No hardcoded tokens; authentication uses environment variables or an explicitly passed `token` argument

## Installation

```bash
pip install vae-toolkit
```

### Optional Dependencies

For development:
```bash
pip install "vae-toolkit[dev]"
```

For testing:
```bash
pip install "vae-toolkit[test]"
```

For all extras (quote the requirement so shells such as zsh do not expand the brackets):
```bash
pip install "vae-toolkit[all]"
```

## Quick Start

### Basic Image Processing

```python
from vae_toolkit import load_and_preprocess_image, tensor_to_pil

# Load and preprocess an image for VAE encoding
tensor, original_pil = load_and_preprocess_image("path/to/image.png", target_size=512)
print(f"Tensor shape: {tensor.shape}")  # [1, 3, 512, 512]
print(f"Value range: [{tensor.min():.2f}, {tensor.max():.2f}]")  # within [-1.00, 1.00]

# Convert tensor back to PIL image
reconstructed = tensor_to_pil(tensor)
reconstructed.save("reconstructed.png")
```

### Loading VAE Models

```python
from vae_toolkit import VAELoader

# Initialize the loader
loader = VAELoader()

# Load Stable Diffusion v1.5 VAE
vae, device = loader.load_sd_vae(
    model_name="sd15",  # or "sd14" for v1.4
    device="auto"        # automatically selects GPU/CPU
)

print(f"Model loaded on: {device}")
```

### Complete VAE Workflow

```python
import torch
from vae_toolkit import load_and_preprocess_image, VAELoader, tensor_to_pil

# Setup
loader = VAELoader()
vae, device = loader.load_sd_vae("sd14")

# Load and preprocess image
image_tensor, original = load_and_preprocess_image("input.jpg", target_size=512)
image_tensor = image_tensor.to(device)

# Encode to latent space
with torch.no_grad():
    latent = vae.encode(image_tensor).latent_dist.sample()
    print(f"Latent shape: {latent.shape}")  # [1, 4, 64, 64]

# Decode back to image
with torch.no_grad():
    decoded = vae.decode(latent).sample
    
# Save result
output_image = tensor_to_pil(decoded)
output_image.save("output.png")
```
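
The latent shape printed above follows from the Stable Diffusion v1.x VAE architecture, which downsamples each spatial dimension by a factor of 8 and produces 4 latent channels. The helper below is a sketch of that arithmetic only. `latent_shape` is a hypothetical name, not part of vae-toolkit, and rejecting sizes that are not multiples of 8 is a simplification made for this example.

```python
def latent_shape(batch: int, height: int, width: int,
                 channels: int = 4, factor: int = 8) -> tuple:
    """Expected latent shape for an SD v1.x VAE (assumed: 4 channels, 8x downsampling)."""
    if height % factor or width % factor:
        raise ValueError(f"height and width must be multiples of {factor}")
    return (batch, channels, height // factor, width // factor)


print(latent_shape(1, 512, 512))  # (1, 4, 64, 64)
print(latent_shape(1, 768, 768))  # (1, 4, 96, 96)
```

A 512x512 input therefore yields a 64x64 latent grid, which matches the shape shown in the workflow.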

### Using the ImageProcessor Class

```python
from vae_toolkit import ImageProcessor

# Create a processor with custom settings
processor = ImageProcessor(
    target_size=768,
    normalize_mean=(0.5, 0.5, 0.5),
    normalize_std=(0.5, 0.5, 0.5)
)

# Process multiple images with the same settings
image_paths = ["image1.png", "image2.png"]  # placeholder paths
for image_path in image_paths:
    tensor, original = processor.load_and_preprocess(image_path)
    # Process tensor...
```
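
With `normalize_mean` and `normalize_std` both set to 0.5, the usual per-channel normalization `(x - mean) / std` maps pixel values scaled to [0, 1] onto [-1, 1]. The sketch below works through that arithmetic in plain Python. It assumes the processor first scales 8-bit values to [0, 1], as torchvision's `ToTensor` does, and the function names are illustrative rather than part of the toolkit.

```python
def normalize(pixel: int, mean: float = 0.5, std: float = 0.5) -> float:
    """Map an 8-bit pixel value (0-255) to the normalized range."""
    scaled = pixel / 255.0          # [0, 255] -> [0, 1]
    return (scaled - mean) / std    # [0, 1]   -> [-1, 1] when mean = std = 0.5


def denormalize(value: float, mean: float = 0.5, std: float = 0.5) -> int:
    """Invert normalize(), clamping to the valid 8-bit range."""
    scaled = value * std + mean
    return max(0, min(255, round(scaled * 255.0)))


print(normalize(0), normalize(255))   # -1.0 1.0
print(denormalize(normalize(200)))    # 200
```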

## Authentication

To use models from Hugging Face Hub, set your token as an environment variable:

```bash
export HF_TOKEN="your_huggingface_token"
# or
export HUGGING_FACE_HUB_TOKEN="your_huggingface_token"
```
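
The same lookup can be sketched in Python with the standard library. Which variable vae-toolkit checks first is not stated here, so the precedence below (`HF_TOKEN` first) is an assumption, and `resolve_hf_token` is a hypothetical helper rather than a toolkit function.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty token found, or None if neither variable is set."""
    for name in ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return value
    return None


token = resolve_hf_token()
print("token found" if token else "no token set")
```

The resolved value could then be supplied through the `token` parameter of `load_sd_vae` listed in the API reference.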

## API Reference

### Image Processing Functions

#### `load_and_preprocess_image(image_path, target_size=512)`
Loads and preprocesses an image for VAE encoding.

**Parameters:**
- `image_path` (str | Path): Path to the input image
- `target_size` (int): Target size for the square output image

**Returns:**
- `tuple[torch.Tensor, PIL.Image]`: Preprocessed tensor and original PIL image

#### `tensor_to_pil(tensor)`
Converts a tensor to PIL Image format.

**Parameters:**
- `tensor` (torch.Tensor): Input tensor with shape [C, H, W] or [1, C, H, W]

**Returns:**
- `PIL.Image`: RGB PIL image

#### `pil_to_tensor(pil_image, target_size=None, normalize=True)`
Converts a PIL image to tensor format.

**Parameters:**
- `pil_image` (PIL.Image): Input PIL image
- `target_size` (int | None): Optional target size for resizing
- `normalize` (bool): Whether to normalize to [-1, 1] range

**Returns:**
- `torch.Tensor`: Tensor with shape [3, H, W]

### VAE Loader

#### `VAELoader`
Main class for loading and managing Stable Diffusion VAE models.

**Methods:**
- `load_sd_vae(model_name="sd14", device="auto", token=None, use_cache=True)`
  - Loads a Stable Diffusion VAE model
  - Returns: `tuple[AutoencoderKL, torch.device]`
  
- `get_optimal_device(preferred_device="auto")`
  - Determines the best available device
  - Returns: `torch.device`
  
- `clear_cache()`
  - Clears the model cache to free memory
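
The interaction between `use_cache` and `clear_cache()` can be pictured with a small dictionary-backed cache. This illustrates the general pattern only and is not the toolkit's actual implementation. `ModelCache`, its key layout, and the `factory` callback are all hypothetical.

```python
from typing import Any, Callable, Dict, Tuple


class ModelCache:
    """Toy cache keyed by (model_name, device); stands in for a loader's internal cache."""

    def __init__(self) -> None:
        self._models: Dict[Tuple[str, str], Any] = {}

    def get_or_load(self, model_name: str, device: str,
                    factory: Callable[[], Any], use_cache: bool = True) -> Any:
        key = (model_name, device)
        if use_cache and key in self._models:
            return self._models[key]      # cache hit: skip the expensive load
        model = factory()                 # cache miss or caching disabled: load afresh
        if use_cache:
            self._models[key] = model
        return model

    def clear(self) -> None:
        """Drop all references so the models can be garbage-collected."""
        self._models.clear()


cache = ModelCache()
first = cache.get_or_load("sd14", "cpu", factory=object)
second = cache.get_or_load("sd14", "cpu", factory=object)
print(first is second)   # True: the second call reused the cached object
cache.clear()
third = cache.get_or_load("sd14", "cpu", factory=object)
print(first is third)    # False: the cache was cleared, so a new object was built
```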

### Model Configuration

#### `get_model_config(model_name)`
Gets configuration for a specific model.

#### `list_available_models()`
Lists all available model identifiers.

#### `add_model_config(model_name, config)`
Adds a custom model configuration.

## Available Models

- `sd14`: Stable Diffusion v1.4 VAE
- `sd15`: Stable Diffusion v1.5 VAE

## Error Handling

The toolkit includes custom exceptions for better error handling:

```python
from vae_toolkit import ImageProcessingError, load_and_preprocess_image

try:
    tensor, _ = load_and_preprocess_image("invalid_path.jpg")
except ImageProcessingError as e:
    print(f"Failed to process image: {e}")
```

## Performance Tips

1. **Use caching**: The VAELoader caches models by default to avoid reloading
2. **Batch processing**: Process multiple images together when possible
3. **Device selection**: Use "auto" for automatic GPU/CPU selection
4. **Memory management**: Call `loader.clear_cache()` when switching between models
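
For the batch-processing tip, one simple approach is to split the list of image paths into fixed-size groups before preprocessing. The helper below is a minimal sketch in plain Python; `chunked` and the example file names are hypothetical and not part of vae-toolkit.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


paths = ["a.png", "b.png", "c.png", "d.png", "e.png"]
for batch in chunked(paths, batch_size=2):
    print(batch)
```

Each group can be preprocessed image by image, and the resulting `[1, 3, H, W]` tensors concatenated with `torch.cat(tensors, dim=0)` before a single `vae.encode` call, provided all images share the same `target_size`.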

## Requirements

- Python >= 3.8
- PyTorch >= 2.0.0
- torchvision >= 0.15.0
- Pillow >= 9.0.0
- numpy >= 1.20.0
- diffusers >= 0.20.0

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## Testing

Run tests with pytest:

```bash
# Install test dependencies
pip install "vae-toolkit[test]"

# Run tests
pytest

# Run with coverage
pytest --cov=vae_toolkit
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Citation

If you use this toolkit in your research, please cite:

```bibtex
@software{vae-toolkit,
  author = {Yus314},
  title = {VAE Toolkit: Stable Diffusion VAE utilities},
  year = {2024},
  url = {https://github.com/mdipcit/vae-toolkit}
}
```

## Acknowledgments

- Built on top of the amazing [diffusers](https://github.com/huggingface/diffusers) library
- Inspired by the Stable Diffusion community

## Support

For issues and questions, please use the [GitHub Issues](https://github.com/mdipcit/vae-toolkit/issues) page.

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "vae-toolkit",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "vae, stable-diffusion, image-processing, deep-learning, diffusers",
    "author": null,
    "author_email": "Yus314 <shizhaoyoujie@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/42/54/aab17e889352307099bd62b5bf416fa300f62ad82b026af31493f66ae104/vae_toolkit-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2024 Yus314  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
    "summary": "Stable Diffusion VAE toolkit - image processing and model loading utilities",
    "version": "0.1.0",
    "project_urls": {
        "Changelog": "https://github.com/mdipcit/vae-toolkit/blob/main/CHANGELOG.md",
        "Documentation": "https://vae-toolkit.readthedocs.io",
        "Homepage": "https://github.com/mdipcit/vae-toolkit",
        "Issues": "https://github.com/mdipcit/vae-toolkit/issues",
        "Repository": "https://github.com/mdipcit/vae-toolkit"
    },
    "split_keywords": [
        "vae",
        " stable-diffusion",
        " image-processing",
        " deep-learning",
        " diffusers"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c544b75846c931ec7ba932281f94e61733931f67ac96f932d86c16f61ca20152",
                "md5": "5836ee162f9315b4dd1d51803bfefa88",
                "sha256": "4534eb5ce4a35ed7534df10e0299151a6c2bbbcd52233699542f393e3763036d"
            },
            "downloads": -1,
            "filename": "vae_toolkit-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5836ee162f9315b4dd1d51803bfefa88",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 11867,
            "upload_time": "2025-08-28T14:36:22",
            "upload_time_iso_8601": "2025-08-28T14:36:22.342390Z",
            "url": "https://files.pythonhosted.org/packages/c5/44/b75846c931ec7ba932281f94e61733931f67ac96f932d86c16f61ca20152/vae_toolkit-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4254aab17e889352307099bd62b5bf416fa300f62ad82b026af31493f66ae104",
                "md5": "72874ac0e65e3cda15c2e5e82ddc5672",
                "sha256": "ad07060e70bf56d63f5fd2028115647c15e093f72c34e4fd220d3325b4b55435"
            },
            "downloads": -1,
            "filename": "vae_toolkit-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "72874ac0e65e3cda15c2e5e82ddc5672",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 19507,
            "upload_time": "2025-08-28T14:36:24",
            "upload_time_iso_8601": "2025-08-28T14:36:24.035666Z",
            "url": "https://files.pythonhosted.org/packages/42/54/aab17e889352307099bd62b5bf416fa300f62ad82b026af31493f66ae104/vae_toolkit-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-28 14:36:24",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "mdipcit",
    "github_project": "vae-toolkit",
    "github_not_found": true,
    "lcname": "vae-toolkit"
}
