Name | torchzero
Version | 0.1.7
home_page | None |
Summary | Modular optimization library for PyTorch. |
upload_time | 2025-02-10 13:50:21 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.10 |
license | MIT License
Copyright (c) 2024 inikishev
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
keywords | optimization, optimizers, torch, neural networks, zeroth order, second order
VCS |
bugtrack_url |
requirements | No requirements were recorded.
Travis-CI | No Travis.
coveralls test coverage | No coveralls.

# torchzero
This is a work-in-progress optimizer library for PyTorch with composable zeroth-, first-, and second-order methods, quasi-Newton methods, gradient approximation, line searches and a whole lot of other stuff.
Most optimizers are modular, meaning you can chain them like this:
```py
optimizer = torchzero.optim.Modular(model.parameters(), [*list of modules*])
```
For example, you might use `[ClipNorm(4), LR(1e-3), NesterovMomentum(0.9)]` for standard SGD with gradient clipping and Nesterov momentum. Move `ClipNorm` to the end to clip the update instead of the gradients. If you don't have access to gradients, add a `RandomizedFDM()` at the beginning to approximate them via randomized finite differences. Add `Cautious()` to make the optimizer cautious.
Each module takes the update produced by the previous module and transforms it. That way there is no need to reimplement things like Laplacian smoothing for every optimizer, and it is easy to experiment with grafting, interpolation between different optimizers, and perhaps some weirder combinations like nested momentum.
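For instance, here is a minimal sketch of the SGD-with-clipping chain above (module names are the ones used in this README; the `torchzero.modules` import path follows the **How to use** section below, and `model` is assumed to be your `torch.nn.Module`):
```py
import torchzero as tz
from torchzero.modules import ClipNorm, LR, NesterovMomentum

# SGD assembled from modules: clip the gradient norm to 4, scale by the
# learning rate, then apply Nesterov momentum to the resulting update.
optimizer = tz.optim.Modular(
    model.parameters(),
    [ClipNorm(4), LR(1e-3), NesterovMomentum(0.9)],
)
```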
# How to use
All modules are defined in `torchzero.modules`. You can generally mix and match them however you want. Some pre-made optimizers are available in `torchzero.optim`.
Some optimizers require a closure, which should look like this:
```py
def closure(backward = True):
    preds = model(inputs)
    loss = loss_fn(preds, targets)

    # Gradient-free methods always call the closure with backward=False,
    # so if you can't call loss.backward(), you can remove the block below,
    # but keep the (unused) backward argument.
    if backward:
        optimizer.zero_grad()
        loss.backward()
    return loss

optimizer.step(closure)
```
This closure also works with all built-in PyTorch optimizers, including LBFGS, with all optimizers in this library, and with most custom ones.
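As a quick sketch, the same closure can drive PyTorch's built-in LBFGS (with `model`, `inputs`, `targets` and `loss_fn` coming from your own code):
```py
import torch

# torch.optim.LBFGS calls closure() with no arguments, so the default
# backward=True kicks in and gradients are recomputed on every evaluation.
optimizer = torch.optim.LBFGS(model.parameters(), lr=1.0)
for _ in range(10):
    loss = optimizer.step(closure)
```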
# Contents
Docs are available at [torchzero.readthedocs.io](https://torchzero.readthedocs.io/en/latest/). A preliminary list of all modules is available here <https://torchzero.readthedocs.io/en/latest/autoapi/torchzero/modules/index.html#classes>. Some of the implemented algorithms:
- SGD/Rprop/RMSProp/AdaGrad/Adam as composable modules, tested to exactly match the built-in PyTorch versions.
- Cautious Optimizers (<https://huggingface.co/papers/2411.16085>)
- Optimizer grafting (<https://openreview.net/forum?id=FpKgG31Z_i9>)
- Laplacian smoothing (<https://arxiv.org/abs/1806.06317>)
- Polyak momentum, Nesterov momentum
- Gradient norm and value clipping, gradient normalization
- Gradient centralization (<https://arxiv.org/abs/2004.01461>)
- Learning rate dropout (<https://pubmed.ncbi.nlm.nih.gov/35286266/>)
- Forward gradient (<https://arxiv.org/abs/2202.08587>)
- Gradient approximation via finite differences or randomized finite differences, which includes SPSA, RDSA, FDSA and Gaussian smoothing (<https://arxiv.org/abs/2211.13566v3>); see the sketch after this list
- Various line searches
- Exact Newton's method (with Levenberg-Marquardt regularization), Newton's method with a finite-difference Hessian approximation, and subspace finite-differences Newton
- Directional Newton via one additional forward pass
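As an illustration of the gradient-approximation modules, here is a minimal zeroth-order sketch (module names follow this README; constructor arguments and the learning rate are assumptions, so check the docs for exact signatures):
```py
import torchzero as tz
from torchzero.modules import RandomizedFDM, LR

# Gradients are estimated with randomized finite differences, so the closure
# is only ever called with backward=False and never needs loss.backward().
optimizer = tz.optim.Modular(model.parameters(), [RandomizedFDM(), LR(1e-2)])

def closure(backward = True):
    preds = model(inputs)
    return loss_fn(preds, targets)

for _ in range(100):
    loss = optimizer.step(closure)
```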
All modules should be quite fast, especially on models with many parameter tensors, thanks to `_foreach` operations.
I am getting to the point where I can start focusing on good docs and tests. As of now, the code should be considered experimental, untested and subject to change, so feel free to try it, but be careful if using it for an actual project.
# Wrappers
### scipy.optimize.minimize wrapper
A `scipy.optimize.minimize` wrapper with support for both the gradient and the Hessian via batched autograd.
```py
from torchzero.optim.wrappers.scipy import ScipyMinimize
opt = ScipyMinimize(model.parameters(), method = 'trust-krylov')
```
Use it as you would any other optimizer (make sure the closure accepts a `backward` argument, like the one from **How to use**). Note that it performs a full minimization on each step.
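A step then looks like the sketch below, with a closure in the style of **How to use** (written here against the `opt` variable from the snippet above):
```py
def closure(backward = True):
    preds = model(inputs)
    loss = loss_fn(preds, targets)
    if backward:
        opt.zero_grad()
        loss.backward()
    return loss

# Each step runs a complete scipy.optimize.minimize call with method='trust-krylov'.
opt.step(closure)
```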
### Nevergrad wrapper
```py
import nevergrad as ng  # the wrapper takes a nevergrad optimizer class
from torchzero.optim.wrappers.nevergrad import NevergradOptimizer  # import path assumed, mirroring the scipy wrapper

opt = NevergradOptimizer(model.parameters(), ng.optimizers.NGOptBase, budget = 1000)
```
Use it as you would any other optimizer (make sure the closure accepts a `backward` argument, like the one from **How to use**).
Raw data
{
"_id": null,
"home_page": null,
"name": "torchzero",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "optimization, optimizers, torch, neural networks, zeroth order, second order",
"author": null,
"author_email": "Ivan Nikishev <nkshv2@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/30/13/f16f4c439480126bb1aeab5d8e743556ca0f116ad397396220dadbaa2952/torchzero-0.1.7.tar.gz",
"platform": null,
"description": "\n\n# torchzero\n\nThis is a work-in-progress optimizers library for pytorch with composable zeroth, first, second order and quasi newton methods, gradient approximation, line searches and a whole lot of other stuff.\n\nMost optimizers are modular, meaning you can chain them like this:\n\n```py\noptimizer = torchzero.optim.Modular(model.parameters(), [*list of modules*])`\n```\n\nFor example you might use `[ClipNorm(4), LR(1e-3), NesterovMomentum(0.9)]` for standard SGD with gradient clipping and nesterov momentum. Move `ClipNorm` to the end to clip the update instead of the gradients. If you don't have access to gradients, add a `RandomizedFDM()` at the beginning to approximate them via randomized finite differences. Add `Cautious()` to make the optimizer cautious.\n\nEach new module takes previous module update and works on it. That way there is no need to reimplement stuff like laplacian smoothing for all optimizers, and it is easy to experiment with grafting, interpolation between different optimizers, and perhaps some weirder combinations like nested momentum.\n\n# How to use\n\nAll modules are defined in `torchzero.modules`. You can generally mix and match them however you want. Some pre-made optimizers are available in `torchzero.optim`.\n\nSome optimizers require closure, which should look like this:\n\n```py\ndef closure(backward = True):\n preds = model(inputs)\n loss = loss_fn(preds, targets)\n\n # if you can't call loss.backward(), and instead use gradient-free methods,\n # they always call closure with backward=False.\n # so you can remove the part below, but keep the unused backward argument.\n if backward:\n optimizer.zero_grad()\n loss.backward()\n return loss\n\noptimizer.step(closure)\n```\n\nThis closure will also work with all built in pytorch optimizers, including LBFGS, all optimizers in this library, as well as most custom ones.\n\n# Contents\n\nDocs are available at [torchzero.readthedocs.io](https://torchzero.readthedocs.io/en/latest/). A preliminary list of all modules is available here <https://torchzero.readthedocs.io/en/latest/autoapi/torchzero/modules/index.html#classes>. Some of the implemented algorithms:\n\n- SGD/Rprop/RMSProp/AdaGrad/Adam as composable modules. They are also tested to exactly match built in pytorch versions.\n- Cautious Optimizers (<https://huggingface.co/papers/2411.16085>)\n- Optimizer grafting (<https://openreview.net/forum?id=FpKgG31Z_i9>)\n- Laplacian smoothing (<https://arxiv.org/abs/1806.06317>)\n- Polyak momentum, nesterov momentum\n- Gradient norm and value clipping, gradient normalization\n- Gradient centralization (<https://arxiv.org/abs/2004.01461>)\n- Learning rate droput (<https://pubmed.ncbi.nlm.nih.gov/35286266/>).\n- Forward gradient (<https://arxiv.org/abs/2202.08587>)\n- Gradient approximation via finite difference or randomized finite difference, which includes SPSA, RDSA, FDSA and Gaussian smoothing (<https://arxiv.org/abs/2211.13566v3>)\n- Various line searches\n- Exact Newton's method (with Levenberg-Marquardt regularization), newton with hessian approximation via finite difference, subspace finite differences newton.\n- Directional newton via one additional forward pass\n\nAll modules should be quite fast, especially on models with many different parameters, due to `_foreach` operations.\n\nI am getting to the point where I can start focusing on good docs and tests. 
As of now, the code should be considered experimental, untested and subject to change, so feel free but be careful if using this for actual project.\n\n# Wrappers\n\n### scipy.optimize.minimize wrapper\n\nscipy.optimize.minimize wrapper with support for both gradient and hessian via batched autograd\n\n```py\nfrom torchzero.optim.wrappers.scipy import ScipyMinimize\nopt = ScipyMinimize(model.parameters(), method = 'trust-krylov')\n```\n\nUse as any other optimizer (make sure closure accepts `backward` argument like one from **How to use**). Note that it performs full minimization on each step.\n\n### Nevergrad wrapper\n\n```py\nopt = NevergradOptimizer(bench.parameters(), ng.optimizers.NGOptBase, budget = 1000)\n```\n\nUse as any other optimizer (make sure closure accepts `backward` argument like one from **How to use**).\n",
"bugtrack_url": null,
"license": "MIT License\n \n Copyright (c) 2024 inikishev\n \n Permission is hereby granted, free of charge, to any person obtaining a copy\n of this software and associated documentation files (the \"Software\"), to deal\n in the Software without restriction, including without limitation the rights\n to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n copies of the Software, and to permit persons to whom the Software is\n furnished to do so, subject to the following conditions:\n \n The above copyright notice and this permission notice shall be included in all\n copies or substantial portions of the Software.\n \n THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\n SOFTWARE.\n ",
"summary": "Modular optimization library for PyTorch.",
"version": "0.1.7",
"project_urls": {
"Homepage": "https://github.com/inikishev/torchzero",
"Issues": "https://github.com/inikishev/torchzero/isses",
"Repository": "https://github.com/inikishev/torchzero"
},
"split_keywords": [
"optimization",
" optimizers",
" torch",
" neural networks",
" zeroth order",
" second order"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "550a5a34d1e59a70ae6ca88e68e668a9961840a594c17d23a1ff6e6cb49b9f2a",
"md5": "b4d9f7c4424edd8c9d903e8bf98f5090",
"sha256": "4360157b545f9f4bdce62d1d421ce787a977a573c61622b3008149f99d0d70af"
},
"downloads": -1,
"filename": "torchzero-0.1.7-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b4d9f7c4424edd8c9d903e8bf98f5090",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 139014,
"upload_time": "2025-02-10T13:50:11",
"upload_time_iso_8601": "2025-02-10T13:50:11.514686Z",
"url": "https://files.pythonhosted.org/packages/55/0a/5a34d1e59a70ae6ca88e68e668a9961840a594c17d23a1ff6e6cb49b9f2a/torchzero-0.1.7-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "3013f16f4c439480126bb1aeab5d8e743556ca0f116ad397396220dadbaa2952",
"md5": "a3bb5835554e8879ee486ce307115e07",
"sha256": "a34db10a2fa247a75b4d2b4f2bd90e7e039adea877ab6dac47b2c18a8f08f2eb"
},
"downloads": -1,
"filename": "torchzero-0.1.7.tar.gz",
"has_sig": false,
"md5_digest": "a3bb5835554e8879ee486ce307115e07",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 103837,
"upload_time": "2025-02-10T13:50:21",
"upload_time_iso_8601": "2025-02-10T13:50:21.183342Z",
"url": "https://files.pythonhosted.org/packages/30/13/f16f4c439480126bb1aeab5d8e743556ca0f116ad397396220dadbaa2952/torchzero-0.1.7.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-10 13:50:21",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "inikishev",
"github_project": "torchzero",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "torchzero"
}