torch-mpo

Name: torch-mpo
Version: 0.1.0
Summary: PyTorch implementation of Matrix Product Operators for neural network compression
Author/Maintainer: Krzysztof Woś
Requires-Python: >=3.12
License: MIT
Keywords: deep-learning, model-compression, mpo, neural-networks, pytorch, tensor-decomposition, tensor-train
Upload time: 2025-09-19 16:32:37
# PyTorch Matrix Product Operators

A modern PyTorch implementation of Matrix Product Operators (MPO) for neural network compression, based on the paper "Compressing deep neural networks by matrix product operators" by Ze-Feng Gao et al.

## Overview

This library provides PyTorch implementations of tensor-train decomposed neural network layers that can significantly reduce the number of parameters in deep neural networks while maintaining accuracy.
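For intuition, a dense weight matrix with `M × N` entries (where `M = m_1 × ... × m_d` and `N = n_1 × ... × n_d`) is replaced by a chain of `d` small cores holding roughly `sum_k r_{k-1} * m_k * n_k * r_k` parameters — the standard tensor-train counting argument. For small ranks this is one to two orders of magnitude fewer parameters than the dense layer.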

## Features

- **TT-decomposed layers**: `TTLinear` and `TTConv2d` for compressed fully-connected and convolutional layers
- **Modern PyTorch**: Full compatibility with PyTorch 2.0+, type hints, device-agnostic
- **Pretrained model compression**: Convert existing PyTorch models to MPO format
- **Multiple architectures**: VGG-16/19, ResNet-18/34/50/101/152, and custom models
- **Automatic factorization**: Smart dimension factorization for optimal compression
- **Comprehensive examples**: MNIST, CIFAR-10, ImageNet training scripts
- **Analysis tools**: Compression ratio calculation, performance benchmarks

## Installation

```bash
# Clone the repository
git clone https://github.com/krzysztofwos/torch-mpo
cd torch-mpo

# Install with uv (recommended)
uv sync                  # Install base dependencies
uv sync --all-extras     # Install with all extras (dev, docs)

# Or install with pip (development mode)
pip install -e .
pip install -e ".[dev]"  # With development dependencies
```

## Quick Start

### Basic Usage

```python
import torch
from torch_mpo import TTLinear, TTConv2d

# Create a TT-decomposed linear layer
linear = TTLinear(
    in_features=1024,
    out_features=512,
    tt_ranks=8,  # Higher rank = better accuracy, more parameters
    bias=True
)

# Create a TT-decomposed convolutional layer
conv = TTConv2d(
    in_channels=128,
    out_channels=256,
    kernel_size=3,
    padding=1,
    tt_ranks=8
)

# Use them like standard PyTorch layers
x = torch.randn(32, 1024)
y = linear(x)  # Shape: [32, 512]

x = torch.randn(32, 128, 32, 32)
y = conv(x)  # Shape: [32, 256, 32, 32]
```

### Compress Existing Models

```python
from torch_mpo import compress_model
import torchvision.models as models

# Load a pretrained model (torchvision >= 0.13 uses the `weights` API)
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

# Compress it with MPO
compressed_model = compress_model(
    model,
    compression_ratio=0.1,  # Keep ~10% of parameters (~10x compression)
    compress_linear=True,   # Compress Linear layers
    compress_conv=True,     # Compress Conv2d layers
    verbose=True
)

# Fine-tune the compressed model
optimizer = torch.optim.Adam(compressed_model.parameters(), lr=1e-4)
# ... continue with training
```

### Use Pre-built Architectures

```python
from torch_mpo.models import vgg16_mpo, resnet50_mpo

# VGG-16 with MPO compression
model = vgg16_mpo(
    num_classes=10,
    tt_ranks_conv=8,   # TT-rank for conv layers
    tt_ranks_fc=16,    # TT-rank for FC layers
    compress_conv=True,
    compress_fc=True
)

# ResNet-50 with MPO compression
model = resnet50_mpo(
    num_classes=1000,
    tt_ranks_conv=16,
    tt_ranks_fc=32,
    use_mpo_conv=True,
    use_mpo_fc=True
)
```

## Examples

The `examples/` directory contains complete training scripts:

### MNIST with LeNet-5 MPO

```bash
python examples/mnist_lenet5_mpo.py --tt-rank 8 --epochs 10
```

### CIFAR-10 with VGG-16 MPO

```bash
python examples/cifar10_vgg16_mpo.py --tt-rank-conv 8 --tt-rank-fc 16 --epochs 20
```

### ImageNet with ResNet-50 MPO

```bash
python examples/imagenet_resnet50_mpo.py /path/to/imagenet \
    --tt-rank-conv 16 --tt-rank-fc 32 --epochs 90
```

### Compress Pretrained VGG

```bash
python examples/compress_vgg.py --model vgg16 --compression-ratio 0.1
```

## Performance Benchmarks

Run benchmarks to compare MPO layers with standard layers:

```bash
python benchmarks/benchmark_layers.py
```

### Typical Results

| Layer                  | Original Params | MPO Params (rank=8) | Compression | Inference Speedup |
| ---------------------- | --------------- | ------------------- | ----------- | ----------------- |
| Linear(4096, 4096)     | 16.8M           | 655K                | 25.6x       | 0.8x              |
| Conv2d(256, 512, 3)    | 1.2M            | 123K                | 9.7x        | 0.9x              |
| VGG-16 (full model)    | 138M            | 15M                 | 9.2x        | 0.85x             |
| ResNet-50 (full model) | 25.6M           | 8.2M                | 3.1x        | 0.95x             |

A speedup below 1.0x means the MPO layer runs slower than its dense counterpart; the gain is in parameter count and memory, not raw throughput.
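For a quick sanity check outside the bundled script, here is a minimal timing sketch using `torch.utils.benchmark`, assuming only the `TTLinear` signature shown earlier in this README:

```python
import torch
import torch.nn as nn
from torch.utils import benchmark

from torch_mpo import TTLinear

# Dense baseline vs. TT-decomposed layer of the same shape
dense = nn.Linear(4096, 4096)
tt = TTLinear(in_features=4096, out_features=4096, tt_ranks=8)

def n_params(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())

print(f"dense: {n_params(dense):,} params | TT: {n_params(tt):,} params")

x = torch.randn(32, 4096)
for label, layer in [("dense", dense), ("tt", tt)]:
    timer = benchmark.Timer(stmt="layer(x)", globals={"layer": layer, "x": x})
    print(label, timer.timeit(100))
```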

## Documentation

See the comprehensive tutorial in `docs/mpo_tutorial.md` covering:

- Mathematical foundations of TT decomposition
- How MPO compression works
- Implementation details
- Best practices and tips
- Advanced topics

## Key Concepts

### TT-Ranks

The `tt_ranks` parameter controls the trade-off between compression and accuracy:

- **Lower ranks** (4-8): High compression, some accuracy loss
- **Medium ranks** (8-16): Good balance
- **Higher ranks** (16-32): Less compression, minimal accuracy loss
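The standard TT counting rule makes this trade-off concrete. The following back-of-the-envelope sketch (plain Python, no library calls) tabulates parameter counts at several ranks for a 1024 → 512 layer; the factorizations are assumptions for illustration (1024 = 4×16×16 as in the next section, 512 = 8×8×8):

```python
from math import prod

def tt_param_count(inp_modes, out_modes, ranks):
    """Parameters in a TT-decomposed matrix: sum_k r_{k-1} * m_k * n_k * r_k."""
    return sum(
        ranks[k] * m * n * ranks[k + 1]
        for k, (m, n) in enumerate(zip(inp_modes, out_modes))
    )

inp_modes, out_modes = [4, 16, 16], [8, 8, 8]
dense = prod(inp_modes) * prod(out_modes)  # 524,288 dense weights
for r in (4, 8, 16, 32):
    tt = tt_param_count(inp_modes, out_modes, [1, r, r, 1])
    print(f"rank {r:>2}: {tt:>7,} params ({dense / tt:5.1f}x compression)")
```

Doubling the rank roughly quadruples the size of the middle cores, which is why compression falls off quickly at high ranks.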

### Automatic Factorization

The library automatically factorizes dimensions for optimal compression:

```python
# 1024 = 4 × 16 × 16 (automatic factorization)
layer = TTLinear(1024, 512, tt_ranks=8)
```
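The library's actual heuristic is internal. As a rough illustration of the idea, here is a hypothetical `balanced_factorization` helper that greedily spreads a dimension's prime factors into roughly equal modes; its output may differ from the split the library chooses:

```python
def balanced_factorization(n: int, num_factors: int = 3) -> list[int]:
    """Hypothetical sketch: split n into `num_factors` roughly
    balanced integer factors via its prime factorization."""
    # Collect the prime factors of n.
    primes, d = [], 2
    while d * d <= n:
        while n % d == 0:
            primes.append(d)
            n //= d
        d += 1
    if n > 1:
        primes.append(n)
    # Greedily assign the largest primes to the smallest running factor.
    factors = [1] * num_factors
    for p in sorted(primes, reverse=True):
        factors[factors.index(min(factors))] *= p
    return sorted(factors)

print(balanced_factorization(1024))  # [8, 8, 16]; the library may pick 4 x 16 x 16
```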

### Custom Factorization

You can also specify custom factorizations:

```python
layer = TTLinear(
    in_features=784,  # 28×28 MNIST
    out_features=256,
    inp_modes=[7, 4, 7, 4],  # 7×4×7×4 = 784
    out_modes=[4, 4, 4, 4],  # 4×4×4×4 = 256
    tt_ranks=[1, 8, 8, 8, 1]
)
```

### Initialization and Numerical Stability

Proper initialization is crucial for TT-decomposed layers to maintain stable gradients during training:

#### TTLinear Initialization

- Each core uses standard Xavier/Kaiming initialization
- No additional scaling is needed, as the decomposition itself naturally regularizes the weights

#### TTConv2d Initialization

- More complex due to spatial convolution followed by TT cores
- **Key insight**: Variance accumulates through both spatial conv and TT cores
- **Solution**: TT cores are scaled by `1/d^0.25` where `d` is the number of cores
- This empirically maintains output variance similar to standard Conv2d layers

Without proper initialization scaling, deep networks can experience:

- **Exploding activations**: Outputs growing exponentially through layers
- **Vanishing gradients**: Making training impossible
- **Poor convergence**: Model stuck at random performance

The library handles this automatically, but when implementing custom layers, careful attention to initialization is essential.
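For custom layers, here is a minimal sketch of the `1/d^0.25` scaling rule described above; it is illustrative only, not the library's exact initialization code:

```python
import torch

def init_tt_cores(inp_modes, out_modes, ranks):
    """Illustrative sketch of the 1/d**0.25 scaling rule; cores have
    shape (r_{k-1}, m_k, n_k, r_k)."""
    d = len(inp_modes)
    scale = d ** -0.25  # damp per-core variance so it does not compound
    cores = []
    for k, (m, n) in enumerate(zip(inp_modes, out_modes)):
        fan_in = ranks[k] * m              # Kaiming-style fan-in per core
        std = (2.0 / fan_in) ** 0.5
        core = torch.randn(ranks[k], m, n, ranks[k + 1]) * std * scale
        cores.append(torch.nn.Parameter(core))
    return cores

cores = init_tt_cores([4, 16, 16], [8, 8, 8], [1, 8, 8, 1])
print([tuple(c.shape) for c in cores])  # (1,4,8,8), (8,16,8,8), (8,16,8,1)
```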

## Contributing

Contributions are welcome. Please feel free to submit a Pull Request.

## Citation

If you use this code in your research, please cite both the original paper and this implementation:

### Original Paper

```bibtex
@article{gao2020compressing,
  title={Compressing deep neural networks by matrix product operators},
  author={Gao, Ze-Feng and Song, Chao and Wang, Lei and others},
  journal={Physical Review Research},
  volume={2},
  number={2},
  pages={023300},
  year={2020}
}
```

### This Implementation

```bibtex
@software{torch-mpo2024,
  title={torch-mpo: PyTorch Matrix Product Operators},
  author={Woś, Krzysztof},
  year={2024},
  url={https://github.com/krzysztofwos/torch-mpo},
  version={0.1.0}
}
```

## License

MIT License

            
