| Field | Value |
| --- | --- |
| Name | torch-cosine-annealing |
| Version | 0.1.3 |
| Summary | Cosine annealing learning rate scheduler for PyTorch based on SGDR |
| Upload time | 2024-06-10 07:14:25 |
| Home page | None |
| Project URL | https://github.com/SteveImmanuel/torch-cosine-annealing |
| Author | None |
| Maintainer | None |
| Docs URL | None |
| Requires Python | >=3.8 |
| License | MIT License, Copyright (c) 2024 Steve Immanuel |
| Keywords | cosine-annealing, sgdr, torch, scheduler |
| Requirements | No requirements were recorded. |
# Torch Cosine Annealing
![PyPI version](https://img.shields.io/pypi/v/torch-cosine-annealing)
![Build Status](https://img.shields.io/github/actions/workflow/status/SteveImmanuel/torch-cosine-annealing/test-and-deploy.yml)

Implementation of the cosine annealing scheduler introduced in the [SGDR](https://arxiv.org/abs/1608.03983) paper. Compared to the [original](https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingWarmRestarts.html) PyTorch implementation, it has the following additional features:
- Supports a linear warm-up/burn-in period
- Linear warm-up can be applied to the first cycle only
- Supports float values for the warmup period, cycle period ($T_0$), and cycle_mult ($T_{mult}$)
- Supports multiple learning rates, one per param group
- The scheduler can be updated by epoch or by step progress
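
For reference, within a single cycle the scheduler follows the cosine annealing curve defined in the SGDR paper,

$$\eta_t = \eta_{min} + \frac{1}{2}\left(\eta_{max} - \eta_{min}\right)\left(1 + \cos\left(\frac{T_{cur}}{T_i}\pi\right)\right),$$

where $T_i$ is the length of the current cycle and $T_{cur}$ is the progress within it. The warm-up period and `gamma` decay described in the Arguments section below modify this base curve.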
## Installation
```bash
pip install torch-cosine-annealing
```
## Quick Start
In the following examples, assume a standard PyTorch `model`, `optimizer`, and `dataloader` are already defined.
### Using `step` Strategy
```python
from torch_cosine_annealing import CosineAnnealingWithWarmRestarts

scheduler = CosineAnnealingWithWarmRestarts(
    optimizer,
    cycle_period=50,
    cycle_mult=1,
    warmup_period=5,
    min_lr=1e-7,
    gamma=1,
    strategy='step',
)

for epoch in range(100):
    for data in dataloader:
        # insert training logic here

        scheduler.step()
```
### Using `epoch` Strategy
```python
from torch_cosine_annealing import CosineAnnealingWithWarmRestarts

scheduler = CosineAnnealingWithWarmRestarts(
    optimizer,
    cycle_period=1,
    cycle_mult=1,
    warmup_period=0.1,
    min_lr=1e-8,
    gamma=1,
    strategy='epoch',
)

for epoch in range(100):
    for i, data in enumerate(dataloader):
        # insert training logic here

        # report overall training progress in (fractional) epochs
        scheduler.step((epoch * len(dataloader) + i + 1) / len(dataloader))
```
## Arguments
The `CosineAnnealingWithWarmRestarts` class has the following arguments:
- **optimizer** (`Optimizer`): PyTorch optimizer
- **cycle_period** (`Union[float, int]`): The period for the first cycle. If strategy is 'step', this is the number of steps in the first cycle. If strategy is 'epoch', this is the number of epochs in the first cycle.
- **cycle_mult** (`float`): The multiplier for the cycle period after each cycle. Defaults to 1.
- **warmup_period** (`Union[float, int]`): The period for warmup for each cycle. If strategy is 'step', this is the number of steps for the warmup. If strategy is 'epoch', this is the number of epochs for the warmup. Defaults to 0.
- **warmup_once** (`bool`): Whether to apply warmup only once, at the beginning of the first cycle. Only has an effect when warmup_period > 0. Defaults to False.
- **max_lr** (`Union[float, List[float]]`, optional): The maximum learning rate for the optimizer (eta_max). If omitted, the optimizer's existing learning rate is used. If a float is given, the learning rates of all param groups in the optimizer are overridden with this value. If a list is given, its length must match the number of param groups in the optimizer (see the sketch after this list). Defaults to None.
- **min_lr** (`float`, optional): The minimum learning rate for the optimizer (eta_min). Defaults to 1e-8.
- **gamma** (`float`, optional): The decay rate for the learning rate after each cycle. Defaults to 1.
- **strategy** (`str`, optional): Defines whether the cycle period and warmup period are treated as steps or epochs. Can be `step` or `epoch`. Note that if you use `epoch`, you need to pass the epoch progress each time you call `.step()`. Defaults to `step`.
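
As a concrete illustration of `max_lr` given as a list together with `warmup_once`, here is a minimal sketch; the toy `model` and `AdamW` optimizer are illustrative assumptions, not part of the package docs:
```python
import torch
from torch_cosine_annealing import CosineAnnealingWithWarmRestarts

# Hypothetical setup with two param groups (placeholder model and optimizer).
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(
    [{'params': [model.weight]}, {'params': [model.bias]}],
    lr=1e-3,
)

scheduler = CosineAnnealingWithWarmRestarts(
    optimizer,
    cycle_period=25,       # first cycle lasts 25 steps
    cycle_mult=2,          # each subsequent cycle is twice as long
    warmup_period=5,       # 5-step linear warm-up
    warmup_once=True,      # warm up only before the first cycle
    max_lr=[1e-3, 5e-4],   # one maximum learning rate per param group
    min_lr=1e-7,
    gamma=0.8,             # decay the maximum learning rate after each cycle
    strategy='step',
)
```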
## Use Cases
### Restart every 50 steps without warmup, no decay, constant restart period
```strategy='step', cycle_period=50, cycle_mult=1, max_lr=1e-3, min_lr=1e-7, warmup_period=0, gamma=1```
![Ex1](static/ex1.png)
### Restart every epoch without warmup, decay learning rate by 0.8 every restart, constant restart period
**Note**: In this example, one epoch consists of 50 steps.
```strategy='epoch', cycle_period=1, cycle_mult=1, max_lr=1e-3, min_lr=1e-7, warmup_period=0, gamma=0.8```
![Ex2](static/ex2.png)
### Restart every 50 steps with 5 steps warmup, no decay, constant restart period
```strategy='step', cycle_period=50, cycle_mult=1, max_lr=1e-3, min_lr=1e-7, warmup_period=5, gamma=1```
![Ex3](static/ex3.png)
### Restart every 2 epochs with 0.5 epoch warmup only on the first cycle, no decay, restart period multiplied by 1.5 every restart
**Note**: In this example, one epoch consists of 50 steps.
```strategy='epoch', cycle_period=2, cycle_mult=1.5, max_lr=1e-3, min_lr=1e-7, warmup_period=0.5, warmup_once=True, gamma=1```
![Ex4](static/ex4.png)
### Restart every 25 steps with 5 steps warmup only on the first cycle, decay learning rate by 0.8 and multiply restart period by 2 every restart, applied to multiple learning rates
```strategy='step', cycle_period=25, cycle_mult=2, max_lr=[1e-3, 5e-4], min_lr=1e-7, warmup_period=5, warmup_once=True, gamma=0.8```
![Ex5](static/ex5.png)
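
The figures above illustrate the resulting learning-rate schedules. A minimal sketch for recording such a trace yourself, where the dummy parameter, `SGD` optimizer, and 200-step loop are illustrative assumptions rather than part of the package:
```python
import torch
from torch_cosine_annealing import CosineAnnealingWithWarmRestarts

# Dummy single-parameter optimizer, used only to drive the scheduler.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=1e-3)
scheduler = CosineAnnealingWithWarmRestarts(
    optimizer,
    cycle_period=50,
    cycle_mult=1,
    warmup_period=5,
    max_lr=1e-3,
    min_lr=1e-7,
    gamma=1,
    strategy='step',
)

lrs = []
for _ in range(200):   # simulate 200 training steps
    # optimizer.step() would be called here in real training
    scheduler.step()
    lrs.append(optimizer.param_groups[0]['lr'])  # current learning rate

print(lrs[:10])  # inspect, or plot with matplotlib to reproduce a curve like the ones above
```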