rotograd

Name: rotograd
Version: 0.1.6.0
Home page: https://github.com/adrianjav/rotograd
Summary: RotoGrad: Gradient Homogenization in Multitask Learning in Pytorch
Upload time: 2023-08-01 15:55:51
Author: Adrián Javaloy
Requires Python: >=3.7
License: MIT
Keywords: multitask learning, gradient alignment, gradient interference, negative transfer, pytorch, positive transfer, gradient conflict
# RotoGrad


[![Documentation](https://img.shields.io/badge/docs-stable-informational.svg)](https://rotograd.readthedocs.io/en/stable/index.html)
[![Package](https://img.shields.io/badge/pypi-rotograd-informational.svg)](https://pypi.org/project/rotograd/)
[![Paper](http://img.shields.io/badge/paper-arxiv.2103.02631-9cf.svg)](https://arxiv.org/abs/2103.02631)
[![License](https://img.shields.io/badge/license-MIT-yellow.svg)](https://github.com/adrianjav/rotograd/blob/main/LICENSE)

> A library for dynamic gradient homogenization for multitask learning in PyTorch

## Installation

Installing this library is as simple as running the following in your terminal:
```bash
pip install rotograd
```

The code has been tested with PyTorch 1.7.0, but it should work with most versions. Feel free to open an issue
if that is not the case.

## Overview

This is the official PyTorch implementation of RotoGrad, an algorithm that reduces negative transfer caused by
gradient conflicts on the shared parameters when the different tasks of a multitask learning system compete
for the shared resources.

Let's say you have a hard-parameter-sharing architecture with a `backbone` model shared across tasks, and
two different tasks you want to solve. These tasks take the output of the backbone, `z = backbone(x)`, and feed
it to a task-specific model (`head1` and `head2`) to obtain each task's predictions, that is,
`y1 = head1(z)` and `y2 = head2(z)`.
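
For concreteness, here is a minimal sketch of such a setup; the input dimension, the layer sizes, the value of `size_z`, and the two-layer heads below are illustrative assumptions, not requirements of the library.

```python
import torch.nn as nn

# Illustrative shapes: a shared encoder producing a representation of size `size_z`,
# followed by one small task-specific head per task.
size_z = 64
backbone = nn.Sequential(nn.Linear(10, 128), nn.ReLU(), nn.Linear(128, size_z))
head1 = nn.Sequential(nn.Linear(size_z, 64), nn.ReLU(), nn.Linear(64, 1))
head2 = nn.Sequential(nn.Linear(size_z, 64), nn.ReLU(), nn.Linear(64, 1))
```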

Then you can simply use RotateOnly, RotoGrad, or RotoGradNorm (RotateOnly + GradNorm) by putting all the parts together in a single model:

```python
from rotograd import RotoGrad
model = RotoGrad(backbone, [head1, head2], size_z, normalize_losses=True)
```

where you can recover the backbone and the i-th head simply by calling `model.backbone` and `model.heads[i]`. What's
more, you can obtain the end-to-end model for a single task (that is, backbone + head) by typing `model[i]`.
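
For instance, a single-task forward pass could look like the following sketch, which assumes the toy modules defined above:

```python
import torch

model.eval()
with torch.no_grad():
    x = torch.randn(32, 10)   # batch of inputs matching the toy backbone above
    z = model.backbone(x)     # shared representation
    y1_pred = model[0](x)     # end-to-end model for the first task: backbone + head1
```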

As discussed in the paper, it is advisable to have a smaller learning rate for the parameters of RotoGrad
and GradNorm. This is as simple as doing:

```python
from torch import optim

# One parameter group per submodule, plus a group with a smaller learning rate
# for RotoGrad's own parameters.
optimizer = optim.Adam(
    [{'params': m.parameters()} for m in [backbone, head1, head2]] +
    [{'params': model.parameters(), 'lr': learning_rate_rotograd}],
    lr=learning_rate_model)
```

Finally, we can train the model on all tasks using a simple step function:
```python
import rotograd

def step(x, y1, y2):
    model.train()
    
    optimizer.zero_grad()

    with rotograd.cached():  # Speeds up computation by caching RotoGrad's parameters
        pred1, pred2 = model(x)
        loss1, loss2 = loss_task1(pred1, y1), loss_task2(pred2, y2)
        model.backward([loss1, loss2])
    optimizer.step()
    
    return loss1, loss2
```
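
A minimal training loop on top of this step function might then look like the sketch below; the mean-squared-error losses and the synthetic batches are placeholders for your actual task losses and data loader.

```python
import torch
import torch.nn.functional as F

loss_task1 = F.mse_loss  # placeholder losses; substitute each task's real loss
loss_task2 = F.mse_loss

# Synthetic batches just to exercise the loop; replace with a real DataLoader.
batches = [(torch.randn(32, 10), torch.randn(32, 1), torch.randn(32, 1)) for _ in range(100)]

for epoch in range(10):
    for x, y1, y2 in batches:
        loss1, loss2 = step(x, y1, y2)
    # Report the losses of the last batch of the epoch
    print(f'epoch {epoch}: loss1={loss1.item():.3f}, loss2={loss2.item():.3f}')
```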

## Example

You can find a working example in the folder `example`. However, it requires some other dependencies to run (e.g., 
ignite and seaborn). The example shows how to use RotoGrad on one of the regression problems from the manuscript.

![image](_assets/toy.gif)

## Citing

Consider citing the following paper if you use RotoGrad:

```bibtex
@inproceedings{javaloy2022rotograd,
   title={RotoGrad: Gradient Homogenization in Multitask Learning},
   author={Adri{\'a}n Javaloy and Isabel Valera},
   booktitle={International Conference on Learning Representations},
   year={2022},
   url={https://openreview.net/forum?id=T8wHz4rnuGL}
}
```

            
