torchzero

Name: torchzero
Version: 0.1.7
Summary: Modular optimization library for PyTorch.
Upload time: 2025-02-10 13:50:21
Requires Python: >=3.10
License: MIT License, Copyright (c) 2024 inikishev
Keywords: optimization, optimizers, torch, neural networks, zeroth order, second order
Requirements: none recorded
![example workflow](https://github.com/inikishev/torchzero/actions/workflows/tests.yml/badge.svg)

# torchzero

This is a work-in-progress optimizer library for PyTorch with composable zeroth-, first-, and second-order methods, quasi-Newton methods, gradient approximation, line searches, and a whole lot of other stuff.

Most optimizers are modular, meaning you can chain them like this:

```py
optimizer = torchzero.optim.Modular(model.parameters(), [*list of modules*])
```

For example, you might use `[ClipNorm(4), LR(1e-3), NesterovMomentum(0.9)]` for standard SGD with gradient clipping and Nesterov momentum. Move `ClipNorm` to the end to clip the update instead of the gradients. If you don't have access to gradients, add `RandomizedFDM()` at the beginning to approximate them via randomized finite differences. Add `Cautious()` to make the optimizer cautious.
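For instance, that chain could be assembled like this (a minimal sketch, assuming `model` is a `torch.nn.Module` and the modules are imported from `torchzero.modules` as described in **How to use**):

```py
import torchzero
from torchzero.modules import ClipNorm, LR, NesterovMomentum

# standard SGD with gradient norm clipping, a 1e-3 learning rate
# and Nesterov momentum; each module transforms the previous one's update
optimizer = torchzero.optim.Modular(
    model.parameters(),
    [ClipNorm(4), LR(1e-3), NesterovMomentum(0.9)],
)
```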

Each module takes the update produced by the previous module and transforms it. That way there is no need to reimplement things like Laplacian smoothing for every optimizer, and it is easy to experiment with grafting, interpolation between different optimizers, and perhaps weirder combinations like nested momentum.

# How to use

All modules are defined in `torchzero.modules`. You can generally mix and match them however you want. Some pre-made optimizers are available in `torchzero.optim`.

Some optimizers require a closure, which should look like this:

```py
def closure(backward=True):
    preds = model(inputs)
    loss = loss_fn(preds, targets)

    # Gradient-free methods always call the closure with backward=False.
    # If you can't call loss.backward(), you can remove the block below,
    # but keep the (unused) backward argument in the signature.
    if backward:
        optimizer.zero_grad()
        loss.backward()
    return loss

optimizer.step(closure)
```

This closure also works with all built-in PyTorch optimizers (including L-BFGS), with all optimizers in this library, and with most custom ones.
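For instance, PyTorch's built-in L-BFGS calls the closure with no arguments, so `backward` simply defaults to `True` (a minimal sketch reusing `model` and `closure` from above):

```py
import torch

# L-BFGS re-evaluates the closure several times per step
optimizer = torch.optim.LBFGS(model.parameters(), lr=1.0)
optimizer.step(closure)
```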

# Contents

Docs are available at [torchzero.readthedocs.io](https://torchzero.readthedocs.io/en/latest/). A preliminary list of all modules is available at <https://torchzero.readthedocs.io/en/latest/autoapi/torchzero/modules/index.html#classes>. Some of the implemented algorithms:

- SGD/Rprop/RMSProp/AdaGrad/Adam as composable modules, tested to exactly match the built-in PyTorch versions.
- Cautious Optimizers (<https://huggingface.co/papers/2411.16085>)
- Optimizer grafting (<https://openreview.net/forum?id=FpKgG31Z_i9>)
- Laplacian smoothing (<https://arxiv.org/abs/1806.06317>)
- Polyak momentum and Nesterov momentum
- Gradient norm and value clipping, gradient normalization
- Gradient centralization (<https://arxiv.org/abs/2004.01461>)
- Learning rate dropout (<https://pubmed.ncbi.nlm.nih.gov/35286266/>)
- Forward gradient (<https://arxiv.org/abs/2202.08587>)
- Gradient approximation via finite differences or randomized finite differences, which includes SPSA, RDSA, FDSA and Gaussian smoothing (<https://arxiv.org/abs/2211.13566v3>); see the sketch after this list
- Various line searches
- Exact Newton's method (with Levenberg-Marquardt regularization), Newton with a finite-difference Hessian approximation, and subspace finite-differences Newton
- Directional Newton via one additional forward pass

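As a sketch of how these pieces compose, the snippet below chains randomized finite differences with an Adam-style update. `RandomizedFDM` and `LR` are named earlier in this README, while the `Adam` module name and the constructor arguments are assumptions based on the list above:

```py
import torchzero
# Adam-as-a-module is assumed from "SGD/Rprop/RMSProp/AdaGrad/Adam as composable modules"
from torchzero.modules import RandomizedFDM, Adam, LR

opt = torchzero.optim.Modular(model.parameters(), [RandomizedFDM(), Adam(), LR(1e-2)])

# gradient approximation calls the closure several times with backward=False,
# so a closure (as in "How to use") is required
opt.step(closure)
```
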
All modules should be quite fast, especially on models with many parameter tensors, thanks to `_foreach` operations.

I am getting to the point where I can start focusing on good docs and tests. As of now, the code should be considered experimental, untested and subject to change, so feel free to use it, but be careful when relying on it for an actual project.

# Wrappers

### scipy.optimize.minimize wrapper

A wrapper around `scipy.optimize.minimize` with support for both the gradient and the Hessian via batched autograd.

```py
from torchzero.optim.wrappers.scipy import ScipyMinimize
opt = ScipyMinimize(model.parameters(), method = 'trust-krylov')
```

Use it as you would any other optimizer (make sure the closure accepts a `backward` argument, like the one from **How to use**). Note that it performs a full minimization on each step.
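A minimal usage sketch, with `closure` defined as in **How to use**:

```py
# each step() call performs a full scipy.optimize.minimize run
for _ in range(5):
    opt.step(closure)
```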

### Nevergrad wrapper

```py
import nevergrad as ng
from torchzero.optim.wrappers.nevergrad import NevergradOptimizer  # import path assumed, mirroring the scipy wrapper

opt = NevergradOptimizer(model.parameters(), ng.optimizers.NGOptBase, budget=1000)
```

Use it as you would any other optimizer (make sure the closure accepts a `backward` argument, like the one from **How to use**).

            
