# <b>Antialiased CNNs</b> [[Project Page]](http://richzhang.github.io/antialiased-cnns/) [[Paper]](https://arxiv.org/abs/1904.11486) [[Talk]](https://www.youtube.com/watch?v=HjewNBZz00w)
<img src='https://richzhang.github.io/antialiased-cnns/resources/gifs2/video_00810.gif' align="right" width=300>
**Making Convolutional Networks Shift-Invariant Again** <br>
[Richard Zhang](https://richzhang.github.io/). In [ICML, 2019](https://arxiv.org/abs/1904.11486).
### Quick & easy start
Run `pip install antialiased-cnns`
```python
import antialiased_cnns
model = antialiased_cnns.resnet50(pretrained=True)
```
<!-- model.load_state_dict(torch.load('resnet50_lpf4-994b528f.pth.tar')['state_dict']) # load weights; download it beforehand from https://www.dropbox.com/s/zqsudi0oz5ym8w8/resnet50_lpf4-994b528f.pth.tar?dl=0 -->
<!-- Now you are antialiased! -->
If you have a model already and want to antialias and continue training, copy your old weights over:
``` python
import torchvision.models as models
old_model = models.resnet50(pretrained=True) # old (aliased) model
antialiased_cnns.copy_params_buffers(old_model, model) # copy the weights over
```
If you want to modify your own model, use the BlurPool layer. More information about our provided models and how to use BlurPool is below.
```python
import torch
import antialiased_cnns

C = 10 # example feature channel size
blurpool = antialiased_cnns.BlurPool(C, stride=2) # BlurPool layer; use to downsample a feature map
ex_tens = torch.randn(1, C, 128, 128) # random example feature map
print(blurpool(ex_tens).shape) # 1xCx64x64 tensor
```
**Updates**
* **(Oct 2020) Finetune** I now initialize the antialiased model with weights from the baseline model and finetune. Before, I was training from scratch. The results are better.
* **(Oct 2020) Additional models** We now have 23 total model variants. I added vgg, densenet, resnext, and wide resnet variants! The same conclusions hold.
* **(Sept 2020) Pip install** You can also now `pip install antialiased-cnns` and load models with the `pretrained=True` flag.
* **(Sept 2020) Kernel 4** I have added kernel size 4 experiments. When downsampling an even-sized feature map (e.g., 128x128-->64x64), this is actually the correct size to use to keep the indices from drifting.
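
One way to read the kernel-size remark is sketched below in plain Python. It assumes the blur kernels are normalized binomial filters (`[1, 2, 1]` for size 3, `[1, 3, 3, 1]` for size 4) and that a size-`k` kernel extends `(k - 1) // 2` samples to the left of the sampled index. Both are assumptions made for illustration, and `binomial_filter_1d` and `center_offset` are hypothetical helper names, not part of the package.

```python
from math import comb


def binomial_filter_1d(size):
    """Normalized binomial low-pass weights, e.g. size 3 -> [0.25, 0.5, 0.25]."""
    weights = [comb(size - 1, k) for k in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def center_offset(size):
    """Center of mass of the kernel relative to the sampled index.

    Assumes the kernel starts (size - 1) // 2 samples to the left
    of the sampled index (an assumption for this sketch).
    """
    weights = binomial_filter_1d(size)
    left = (size - 1) // 2
    return sum(w * (k - left) for k, w in enumerate(weights))


for size in (3, 4):
    print(size, binomial_filter_1d(size), center_offset(size))
```

Under these assumptions, the size-4 kernel is centered half a sample to the right of the sampled index, which is the midpoint of the two input samples that each stride-2 output replaces. The size-3 kernel is centered on the first sample of that pair.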
### Table of contents
1. [More information about antialiased models](#1-loading-an-antialiased-model)<br>
2. [Instructions for antialiasing your own model](#2-how-to-antialias-your-own-architecture), using the [`BlurPool`](antialiased_cnns/__init__.py) layer<br>
3. [ImageNet training and evaluation code](README_IMAGENET.md). Achieving better consistency, while maintaining or improving accuracy, is an open problem. Help improve the results!
## (0) Preliminaries
Pip install this package:

- `pip install antialiased-cnns`

Or clone this repository and install requirements (notably, PyTorch):
```bash
git clone https://github.com/adobe/antialiased-cnns.git
cd antialiased-cnns
pip install -r requirements.txt
```
## (1) Loading an antialiased model
The following loads a pretrained antialiased model, perhaps as a backbone for your application.
```python
import antialiased_cnns
model = antialiased_cnns.resnet50(pretrained=True, filter_size=4)
```
We also provide weights for antialiased `AlexNet`, `VGG16(bn)`, `Resnet18,34,50,101`, `Densenet121`, and `MobileNetv2` (see [example_usage.py](example_usage.py)).
## (2) How to antialias your own architecture
The `antialiased_cnns` module contains the `BlurPool` [class](antialiased_cnns/downsample.py), which does blur+subsampling. Run `pip install antialiased-cnns` or copy the `antialiased_cnns` subdirectory.
**Methodology** The methodology is simple -- first evaluate with stride 1, and then use our `BlurPool` layer to do antialiased downsampling. Make the following architectural changes.
```python
import torch.nn as nn
import antialiased_cnns

Cin, C = 64, 128 # example input/output channel counts

# MaxPool --> MaxBlurPool
baseline = nn.MaxPool2d(kernel_size=2, stride=2)
antialiased = [nn.MaxPool2d(kernel_size=2, stride=1),
               antialiased_cnns.BlurPool(C, stride=2)]

# Conv --> ConvBlurPool
baseline = [nn.Conv2d(Cin, C, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True)]
antialiased = [nn.Conv2d(Cin, C, kernel_size=3, stride=1, padding=1),
               nn.ReLU(inplace=True),
               antialiased_cnns.BlurPool(C, stride=2)]

# AvgPool --> BlurPool
baseline = nn.AvgPool2d(kernel_size=2, stride=2)
antialiased = antialiased_cnns.BlurPool(C, stride=2)
```
We assume the incoming tensor has `C` channels. Computing a layer at stride 1 instead of stride 2 adds memory and run-time. As such, we typically skip antialiasing at the highest resolution (early in the network) to prevent large increases.
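
The effect of the MaxPool --> MaxBlurPool change can be seen on a toy 1-D signal. The sketch below is plain Python with circular padding and a `[1, 2, 1] / 4` blur kernel, both chosen for illustration. The helper names are hypothetical, and this is not the package's implementation.

```python
def dense_max(x, k=2):
    """Max over a window of k samples, evaluated at stride 1 (circular wrap)."""
    n = len(x)
    return [max(x[(i + j) % n] for j in range(k)) for i in range(n)]


def blur(x, kernel=(1, 2, 1)):
    """Low-pass filter with a normalized kernel centered on each sample."""
    n, half, total = len(x), len(kernel) // 2, sum(kernel)
    return [sum(w * x[(i + j - half) % n] for j, w in enumerate(kernel)) / total
            for i in range(n)]


def max_pool(x):
    """Baseline: max, then subsample by 2."""
    return dense_max(x)[::2]


def max_blur_pool(x):
    """Antialiased: max at stride 1, blur, then subsample by 2."""
    return blur(dense_max(x))[::2]


signal = [0, 0, 1, 1, 0, 0, 1, 1]
shifted = signal[-1:] + signal[:-1]  # circular shift by one sample

print(max_pool(signal), max_pool(shifted))            # [0, 1, 0, 1] [1, 1, 1, 1]
print(max_blur_pool(signal), max_blur_pool(shifted))  # [0.5, 1.0, 0.5, 1.0] [0.75, 0.75, 0.75, 0.75]
```

In this toy case a one-sample shift changes the baseline outputs by up to 1, while the antialiased outputs change by 0.25.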
**Add antialiasing and then continue training** If you already trained a model, and then add antialiasing, you can fine-tune from that old model:
``` python
antialiased_cnns.copy_params_buffers(old_model, antialiased_model)
```
If this doesn't work, you can just copy the parameters (and not buffers). Adding antialiasing doesn't add any parameters, so the parameter lists are identical. (It does add buffers, so some heuristic is used to match the buffers, which may throw an error.)
``` python
antialiased_cnns.copy_params(old_model, antialiased_model)
```
<img src='https://richzhang.github.io/antialiased-cnns/resources/antialias_mod.jpg' width=800><br>
## (3) ImageNet Evaluation, Results, and Training code
We observe improvements in both **accuracy** (how often the image is classified correctly) and **consistency** (how often two shifts of the same image are classified the same).
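
As a rough illustration of the consistency metric, the snippet below computes the percentage of images whose predicted label agrees across two shifts. `consistency` is a hypothetical helper and the label lists are made-up toy data; the actual evaluation code is linked from the ImageNet README.

```python
def consistency(preds_shift_a, preds_shift_b):
    """Percentage of images whose predicted label is the same for two shifts."""
    if len(preds_shift_a) != len(preds_shift_b):
        raise ValueError("prediction lists must have the same length")
    agree = sum(a == b for a, b in zip(preds_shift_a, preds_shift_b))
    return 100.0 * agree / len(preds_shift_a)


# Toy predictions for four images under two different shifts (made-up labels).
preds_a = [3, 7, 7, 1]
preds_b = [3, 7, 2, 1]
print(consistency(preds_a, preds_b))  # 75.0
```

Note that this metric needs no ground-truth labels: two predictions can agree and both be wrong, which is why accuracy is reported separately.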
<img src='plots/plots2_acc.png' align="left" width=750>
<img src='plots/plots2_con.png' align="left" width=750>
| ACCURACY | Baseline | Antialiased | Delta |
| :------: | :------: | :-------: | :-------: |
| alexnet | 56.55 | 56.94 | +0.39 |
| vgg11 | 69.02 | 70.51 | +1.49 |
| vgg13 | 69.93 | 71.52 | +1.59 |
| vgg16 | 71.59 | 72.96 | +1.37 |
| vgg19 | 72.38 | 73.54 | +1.16 |
| vgg11_bn | 70.38 | 72.63 | +2.25 |
| vgg13_bn | 71.55 | 73.61 | +2.06 |
| vgg16_bn | 73.36 | 75.13 | +1.77 |
| vgg19_bn | 74.24 | 75.68 | +1.44 |
| resnet18 | 69.74 | 71.67 | +1.93 |
| resnet34 | 73.30 | 74.60 | +1.30 |
| resnet50 | 76.16 | 77.41 | +1.25 |
| resnet101 | 77.37 | 78.38 | +1.01 |
| resnet152 | 78.31 | 79.07 | +0.76 |
| resnext50_32x4d | 77.62 | 77.93 | +0.31 |
| resnext101_32x8d | 79.31 | 79.33 | +0.02 |
| wide_resnet50_2 | 78.47 | 78.70 | +0.23 |
| wide_resnet101_2 | 78.85 | 78.99 | +0.14 |
| densenet121 | 74.43 | 75.79 | +1.36 |
| densenet169 | 75.60 | 76.73 | +1.13 |
| densenet201 | 76.90 | 77.31 | +0.41 |
| densenet161 | 77.14 | 77.88 | +0.74 |
| mobilenet_v2 | 71.88 | 72.72 | +0.84 |

| CONSISTENCY | Baseline | Antialiased | Delta |
| :------: | :------: | :-------: | :-------: |
| alexnet | 78.18 | 83.31 | +5.13 |
| vgg11 | 86.58 | 90.09 | +3.51 |
| vgg13 | 86.92 | 90.31 | +3.39 |
| vgg16 | 88.52 | 90.91 | +2.39 |
| vgg19 | 89.17 | 91.08 | +1.91 |
| vgg11_bn | 87.16 | 90.67 | +3.51 |
| vgg13_bn | 88.03 | 91.09 | +3.06 |
| vgg16_bn | 89.24 | 91.58 | +2.34 |
| vgg19_bn | 89.59 | 91.60 | +2.01 |
| resnet18 | 85.11 | 88.36 | +3.25 |
| resnet34 | 87.56 | 89.77 | +2.21 |
| resnet50 | 89.20 | 91.32 | +2.12 |
| resnet101 | 89.81 | 91.97 | +2.16 |
| resnet152 | 90.92 | 92.42 | +1.50 |
| resnext50_32x4d | 90.17 | 91.48 | +1.31 |
| resnext101_32x8d | 91.33 | 92.67 | +1.34 |
| wide_resnet50_2 | 90.77 | 92.46 | +1.69 |
| wide_resnet101_2 | 90.93 | 92.10 | +1.17 |
| densenet121 | 88.81 | 90.35 | +1.54 |
| densenet169 | 89.68 | 90.61 | +0.93 |
| densenet201 | 90.36 | 91.32 | +0.96 |
| densenet161 | 90.82 | 91.66 | +0.84 |
| mobilenet_v2 | 86.50 | 87.73 | +1.23 |
To reduce clutter, extended results (different filter sizes) are [here](README_IMAGENET.md). Help improve the results!
## Licenses
<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/80x15.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.
All material is made available under the [Creative Commons BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode) license by Adobe Inc. You can **use, redistribute, and adapt** the material for **non-commercial purposes**, as long as you give appropriate credit by **citing our paper** and **indicating any changes** that you've made.
The repository builds off the PyTorch [examples repository](https://github.com/pytorch/examples) and torchvision [models repository](https://github.com/pytorch/vision/tree/master/torchvision/models). These are [BSD-style licensed](https://github.com/pytorch/examples/blob/master/LICENSE).
## Citation, contact
If you find this useful for your research, please consider citing this [bibtex](https://richzhang.github.io/index_files/bibtex_icml2019.txt). Please contact Richard Zhang \<rizhang at adobe dot com\> with any comments or feedback.