# momo-opt

- Name: momo-opt
- Version: 0.1.0
- Home page: https://github.com/fabian-sp/MoMo
- Summary: MoMo: Momentum Models for Adaptive Learning Rates
- Author: Fabian Schaipp
- Requires Python: >=3.8.0
- Upload time: 2023-05-13 08:51:19
- Requirements: none recorded
# MoMo

PyTorch implementation of the MoMo methods: adaptive learning rates for SGD with momentum (SGD-M) and for Adam.

## Installation

You can install the package with

```
pip install momo-opt
```

## Usage

Import the optimizers in Python with

``` python
from momo import Momo
opt = Momo(model.parameters(), lr=1)
```
or

``` python
from momo import MomoAdam
opt = MomoAdam(model.parameters(), lr=1e-2)
```

**Note that Momo needs access to the value of the batch loss.** 
In the ``.step()`` method, you need to pass either 
* the loss tensor (if `backward()` has already been called) via the argument `loss`,
* or a callable via the argument `closure` that computes the gradients and returns the loss. 

For example:

``` python
def compute_loss(output, labels):
    # compute the batch loss and backpropagate
    loss = criterion(output, labels)
    loss.backward()
    return loss

# in each training step, use:
closure = lambda: compute_loss(output, labels)
opt.step(closure=closure)
```
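Alternatively, if you have already called `backward()` yourself, you can pass the loss tensor directly. A minimal sketch of one training step (assuming `model`, `criterion`, `inputs`, and `labels` are defined as in a standard PyTorch loop):

``` python
# one training step, passing the already-backpropagated loss to Momo
opt.zero_grad()
output = model(inputs)
loss = criterion(output, labels)
loss.backward()
opt.step(loss=loss)  # Momo reads the batch-loss value from the `loss` argument
```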
**For more details, see [a full example script](example.py).**




## Examples

### ResNet110 for CIFAR100

<p float="left">
    <img src="png/cifar100_resnet110.png" width="320" />
    <img src="png/cifar100_resnet110_training.png" width="305" />
</p>

### ResNet20 for CIFAR10


<p float="left">
    <img src="png/cifar10_resnet20.png" width="320" />
    <img src="png/cifar10_resnet20_training.png" width="305" />
</p>


## Recommendations

In general, use Momo if you expect SGD-M to work well on your task, and MomoAdam if you expect Adam to work well.

* The options `lr` and `weight_decay` behave as in standard optimizers. Because Momo and MomoAdam adapt the learning rate automatically, you should get good performance without heavy tuning of `lr` or a schedule; a constant `lr` should work fine. For Momo, `lr=1` worked well in our experiments; for MomoAdam, `lr=1e-2` (or slightly smaller) should work well.

**One of the main goals of Momo optimizers is to reduce the tuning effort for the learning-rate schedule and get good performance for a wide range of learning rates.**

* For Momo, the argument `beta` is the momentum parameter; the default is `beta=0.9`. For MomoAdam, `(beta1, beta2)` play the same role as in Adam.

* The option `lb` is a lower bound on your loss function. In many cases, `lb=0` is a good enough estimate. If your loss converges to a large positive value (and you roughly know that value), set `lb` to it (or slightly smaller). 

* If you cannot estimate a lower bound before training, set `use_fstar=True`. This activates an online estimation of the lower bound; see the configuration sketch below.
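Putting these options together, a minimal configuration sketch. The keyword names `lr`, `weight_decay`, `beta`, `lb`, and `use_fstar` are taken from the notes above; check [the example script](example.py) for the exact signatures.

``` python
from momo import Momo, MomoAdam

# SGD-M-style setup: constant lr, known lower bound of zero
opt = Momo(model.parameters(), lr=1, beta=0.9, lb=0)

# Adam-style setup: no known lower bound, so estimate it online
# (the weight_decay value here is illustrative, not a recommendation)
opt = MomoAdam(model.parameters(), lr=1e-2, weight_decay=1e-4, use_fstar=True)
```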



            
