| Field | Value |
|-------|-------|
| Name | medvae |
| Version | 0.1.4 |
| home_page | None |
| Summary | MedVAE is a family of six medical image autoencoders that can encode high-dimensional medical images into latent representations. |
| upload_time | 2025-07-16 01:51:45 |
| maintainer | None |
| docs_url | None |
| author | Stanford Machine Intelligence for Medical Imaging (MIMI) |
| requires_python | >=3.9 |
| license | MIT License Copyright (c) 2025 Stanford MIMI Lab Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
| keywords | deep learning, image compression, compression, efficiency, computer aided diagnosis, medical image analysis, autoencoders, representation learning, med-vae, medvae |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# MedVAE: Efficient Automated Interpretation of Medical Images with Large-Scale Generalizable Autoencoders
[Hugging Face](https://huggingface.co/stanfordmimi/MedVAE) | [PyPI](https://pypi.org/project/medvae/) | [arXiv](https://arxiv.org/abs/2502.14753) | [Video](https://www.youtube.com/watch?v=5zoxHz71ZgY) | [License](LICENSE)
This repository contains the official PyTorch implementation for [MedVAE: Efficient Automated Interpretation of Medical Images with Large-Scale Generalizable Autoencoders](https://arxiv.org/abs/2502.14753) (MIDL 2025; Best Oral Paper Award).
## 🫁 What is MedVAE?
MedVAE is a family of six large-scale, generalizable 2D and 3D variational autoencoders (VAEs) designed for medical imaging. It is trained on over one million medical images across multiple anatomical regions and modalities. MedVAE autoencoders encode medical images as downsized latent representations and decode latent representations back to high-resolution images. Across diverse tasks drawn from 20 medical image datasets, we demonstrate that using MedVAE latent representations in place of high-resolution images when training downstream models can lead to efficiency benefits (up to 70x improvement in throughput) while simultaneously preserving clinically relevant features.
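To make "downsized latent representations" concrete, the sketch below works through the arithmetic for one illustrative configuration; the downsampling factor and latent channel count are example values, not the exact settings of any particular MedVAE variant.

```python
# Purely illustrative arithmetic: how a per-dimension downsampling factor turns
# into latent size and per-image savings. The factor and channel count below
# are example values, not the configuration of any specific MedVAE model.
from math import prod

image_shape = (1, 512, 512)   # (channels, height, width) of a hypothetical 2D input
downsample_per_dim = 4        # example per-dimension downsampling factor
latent_channels = 1           # example number of latent channels

latent_shape = (
    latent_channels,
    image_shape[1] // downsample_per_dim,
    image_shape[2] // downsample_per_dim,
)

print(f"image {image_shape} -> latent {latent_shape}")
print(f"~{prod(image_shape) / prod(latent_shape):.0f}x fewer values per image")
```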
## ⚡️ Installation
To install MedVAE, you can simply run:
```bash
pip install medvae
```
For an editable installation, use the following commands to clone and install this repository.
```bash
git clone https://github.com/StanfordMIMI/MedVAE.git
cd MedVAE
pip install -e .[dev]
pre-commit install
pre-commit
```
## 🚀 Inference Instructions
```python
import torch
from medvae import MVAE

fpath = "documentation/data/mmg_data/isJV8hQ2hhJsvEP5rdQNiy.png"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load a 2D MedVAE model and apply its preprocessing transform to the image.
model = MVAE(model_name="medvae_4_3_2d", modality="xray").to(device)
img = model.apply_transform(fpath).to(device)

# Run in inference mode: no gradients are needed to compute the latent.
model.requires_grad_(False)
model.eval()

with torch.no_grad():
    latent = model(img)
```
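Because the point of MedVAE is to stand in for high-resolution images during downstream training, a common pattern is to cache latents to disk once and load them from a lightweight dataset afterwards. The sketch below illustrates this pattern; the directory layout and the `LatentDataset` class are our own illustration, not part of the medvae package.

```python
import os
import torch

# Cache the latent computed above so downstream training can skip the encoder.
os.makedirs("latents", exist_ok=True)
torch.save(latent.cpu(), "latents/isJV8hQ2hhJsvEP5rdQNiy.pt")

# A downstream model can then train directly on cached latents.
class LatentDataset(torch.utils.data.Dataset):
    def __init__(self, latent_paths, labels):
        self.latent_paths = latent_paths
        self.labels = labels

    def __len__(self):
        return len(self.latent_paths)

    def __getitem__(self, idx):
        return torch.load(self.latent_paths[idx]), self.labels[idx]
```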
We also developed an easy-to-use CLI inference tool for compressing your high-dimensional medical images into usable latents:
```bash
medvae_inference -i INPUT_FOLDER -o OUTPUT_FOLDER -model_name MED_VAE_MODEL -modality MODALITY
```
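For example, encoding a folder of 2D X-ray PNGs with the same model used in the Python snippet above might look like this (the input and output folder names are placeholders):

```bash
medvae_inference -i ./xray_pngs -o ./xray_latents -model_name medvae_4_3_2d -modality xray
```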
For more information, please check our [inference documentation](/documentation/inference.md) and [demo](documentation/demo.ipynb).
## 🔧 Finetuning Instructions
Easily finetune MedVAE on **your own dataset**! Follow the instructions below (requires Python 3.9 and cloning the repository).
Run the following commands depending on your finetuning scenario:
**Stage 1 (2D) Finetuning:**
```bash
medvae_finetune experiment=medvae_4x_1c_2d_finetuning
```
**Stage 2 (2D) Finetuning:**
```bash
medvae_finetune_s2 experiment=medvae_4x_1c_2d_s2_finetuning
```
**Stage 2 (3D) Finetuning:**
```bash
medvae_finetune experiment=medvae_4x_1c_3d_finetuning
```
This setup supports multi-GPU training and includes integration with Weights & Biases for experiment tracking.
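Since the finetuning entry points are Hydra-powered (see the acknowledgement at the end of this README), standard Hydra flags should be available for inspecting the composed configuration. A hedged sketch, assuming `medvae_finetune` behaves as a plain Hydra application; any specific override keys would need to be checked against the repository's configs:

```bash
# Print the fully composed config for an experiment without starting training.
# --cfg job is a standard Hydra flag; this assumes medvae_finetune exposes
# Hydra's usual command line.
medvae_finetune experiment=medvae_4x_1c_2d_finetuning --cfg job
```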
For detailed finetuning guidelines, see the [Finetuning Documentation](documentation/finetune.md).
To create classification models using downsized latent representations, refer to the [Classification Documentation](documentation/classification.md).
## 📎 Citation
If you find this repository useful for your work, please cite the following paper:
```bibtex
@misc{varma2025medvaeefficientautomatedinterpretation,
  title={MedVAE: Efficient Automated Interpretation of Medical Images with Large-Scale Generalizable Autoencoders},
  author={Maya Varma and Ashwin Kumar and Rogier van der Sluijs and Sophie Ostmeier and Louis Blankemeier and Pierre Chambon and Christian Bluethgen and Jip Prince and Curtis Langlotz and Akshay Chaudhari},
  year={2025},
  eprint={2502.14753},
  archivePrefix={arXiv},
  primaryClass={eess.IV},
  url={https://arxiv.org/abs/2502.14753},
}
```
This repository is powered by [Hydra](https://github.com/facebookresearch/hydra) and [HuggingFace Accelerate](https://github.com/huggingface/accelerate). Our implementation of MedVAE is inspired by prior work on diffusion models from [CompVis](https://github.com/CompVis/latent-diffusion) and [Stability AI](https://github.com/Stability-AI/stablediffusion).
Raw data
{
"_id": null,
"home_page": null,
"name": "medvae",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "deep learning, image compression, compression, efficiency, computer aided diagnosis, medical image analysis, autoencoders, representation learning, Med-VAE, medvae",
"author": "Stanford Machine Intelligence for Medical Imaging (MIMI)",
"author_email": "Maya Varma <mayavarma@cs.stanford.edu>, Ashwin Kumar <akkumar@stanford.edu>, Rogier van der Sluijs <sluijs@stanford.edu>",
"download_url": "https://files.pythonhosted.org/packages/81/09/e8bae10fb556aa409aa77accd94da7fcdbc726543783a9f657b270fbb00e/medvae-0.1.4.tar.gz",
"platform": null,
"description": "# MedVAE: Efficient Automated Interpretation of Medical Images with Large-Scale Generalizable Autoencoders\n\n[](https://huggingface.co/stanfordmimi/MedVAE)\u00a0\u00a0\u00a0\u00a0[](https://pypi.org/project/medvae/)\u00a0\u00a0\u00a0\u00a0[](https://arxiv.org/abs/2502.14753)\u00a0\u00a0\u00a0\u00a0[](https://www.youtube.com/watch?v=5zoxHz71ZgY)\u00a0\u00a0\u00a0\u00a0[](LICENSE)\n\nThis repository contains the official PyTorch implementation for [MedVAE: Efficient Automated Interpretation of Medical Images with Large-Scale Generalizable Autoencoders](https://arxiv.org/abs/2502.14753) (MIDL 2025; Best Oral Paper Award).\n\n<!--  -->\n\n## \ud83e\udec1 What is MedVAE?\n\nMedVAE is a family of six large-scale, generalizable 2D and 3D variational autoencoders (VAEs) designed for medical imaging. It is trained on over one million medical images across multiple anatomical regions and modalities. MedVAE autoencoders encode medical images as downsized latent representations and decode latent representations back to high-resolution images. Across diverse tasks obtained from 20 medical image datasets, we demonstrate that utilizing MedVAE latent representations in place of high-resolution images when training downstream models can lead to efficiency benefits (up to 70x improvement in throughput) while simultaneously preserving clinically-relevant features.\n\n## \u26a1\ufe0f Installation\n\nTo install MedVAE, you can simply run:\n\n```python\npip install medvae\n```\n\nFor an editable installation, use the following commands to clone and install this repository.\n\n```python\ngit clone https://github.com/StanfordMIMI/MedVAE.git\ncd MedVAE\npip install -e .[dev]\npre-commit install\npre-commit\n```\n\n## \ud83d\ude80 Inference Instructions\n\n```python\nimport torch\nfrom medvae import MVAE\n\nfpath = \"documentation/data/mmg_data/isJV8hQ2hhJsvEP5rdQNiy.png\"\ndevice = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n\nmodel = MVAE(model_name=\"medvae_4_3_2d\", modality=\"xray\").to(device)\nimg = model.apply_transform(fpath).to(device)\n\nmodel.requires_grad_(False)\nmodel.eval()\n\nwith torch.no_grad():\n latent = model(img)\n```\n\nWe also developed an easy-to-use CLI inference tool for compressing your high-dimensional medical images into usable latents:\n\n```python\nmedvae_inference -i INPUT_FOLDER -o OUTPUT_FOLDER -model_name MED_VAE_MODEL -modality MODALITY\n```\n\nFor more information, please check our [inference documentation](/documentation/inference.md) and [demo](documentation/demo.ipynb).\n\n## \ud83d\udd27 Finetuning Instructions\n\nEasily finetune MedVAE on **your own dataset**! 
Follow the instructions below (requires Python 3.9 and cloning the repository).\n\nRun the following commands depending on your finetuning scenario:\n\n**Stage 1 (2D) Finetuning**\n\n```bash\nmedvae_finetune experiment=medvae_4x_1c_2d_finetuning\n```\n\n**Stage 2 (2D) Finetuning:**\n\n```bash\nmedvae_finetune_s2 experiment=medvae_4x_1c_2d_s2_finetuning\n```\n\n**Stage 2 (3D) Finetuning:**\n\n```bash\nmedvae_finetune experiment=medvae_4x_1c_3d_finetuning\n```\n\nThis setup supports multi-GPU training and includes integration with Weights & Biases for experiment tracking.\n\nFor detailed finetuning guidelines, see the [Finetuning Documentation](documentation/finetune.md).\n\nTo create classification models using downsized latent representations, refer to the [Classification Documentation](documentation/classification.md).\n\n## \ud83d\udcce Citation\n\nIf you find this repository useful for your work, please cite the following paper:\n\n```bibtex\n@misc{varma2025medvaeefficientautomatedinterpretation,\n title={MedVAE: Efficient Automated Interpretation of Medical Images with Large-Scale Generalizable Autoencoders}, \n author={Maya Varma and Ashwin Kumar and Rogier van der Sluijs and Sophie Ostmeier and Louis Blankemeier and Pierre Chambon and Christian Bluethgen and Jip Prince and Curtis Langlotz and Akshay Chaudhari},\n year={2025},\n eprint={2502.14753},\n archivePrefix={arXiv},\n primaryClass={eess.IV},\n url={https://arxiv.org/abs/2502.14753}, \n}\n```\n\nThis repository is powered by [Hydra](https://github.com/facebookresearch/hydra) and [HuggingFace Accelerate](https://github.com/huggingface/accelerate). Our implementation of MedVAE is inspired by prior work on diffusion models from [CompVis](https://github.com/CompVis/latent-diffusion) and [Stability AI](https://github.com/Stability-AI/stablediffusion).\n",
"bugtrack_url": null,
"license": "MIT License Copyright (c) 2025 Stanford MIMI Lab Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
"summary": "MedVAE is a family of six medical image autoencoders that can encode high-dimensional medical images into latent representations.",
"version": "0.1.4",
"project_urls": {
"homepage": "https://github.com/StanfordMIMI/MedVAE",
"repository": "https://github.com/StanfordMIMI/MedVAE"
},
"split_keywords": [
"deep learning",
" image compression",
" compression",
" efficiency",
" computer aided diagnosis",
" medical image analysis",
" autoencoders",
" representation learning",
" med-vae",
" medvae"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "639be1ba11fdc1b72066aa7b11a8f90301a50f6cc9ac8acef60af4f9f280b0d7",
"md5": "f02d2debfb2758ebd07339d830ef3bd2",
"sha256": "8f168e51e137fa454651999ac29c815c872d590a11fdb718d5a449ae309031c8"
},
"downloads": -1,
"filename": "medvae-0.1.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f02d2debfb2758ebd07339d830ef3bd2",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 46979,
"upload_time": "2025-07-16T01:51:44",
"upload_time_iso_8601": "2025-07-16T01:51:44.453556Z",
"url": "https://files.pythonhosted.org/packages/63/9b/e1ba11fdc1b72066aa7b11a8f90301a50f6cc9ac8acef60af4f9f280b0d7/medvae-0.1.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "8109e8bae10fb556aa409aa77accd94da7fcdbc726543783a9f657b270fbb00e",
"md5": "79a95098a3045cf82b63b3c3fa95d709",
"sha256": "f2ba5d4eed5a57d2686a7d8b5c11b1ab71113811a81880f50902bc7f0bfe7f37"
},
"downloads": -1,
"filename": "medvae-0.1.4.tar.gz",
"has_sig": false,
"md5_digest": "79a95098a3045cf82b63b3c3fa95d709",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 38036,
"upload_time": "2025-07-16T01:51:45",
"upload_time_iso_8601": "2025-07-16T01:51:45.727745Z",
"url": "https://files.pythonhosted.org/packages/81/09/e8bae10fb556aa409aa77accd94da7fcdbc726543783a9f657b270fbb00e/medvae-0.1.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-16 01:51:45",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "StanfordMIMI",
"github_project": "MedVAE",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "medvae"
}