# PyTorch Minimize
For the most up-to-date information on pytorch-minimize, see the docs site: [pytorch-minimize.readthedocs.io](https://pytorch-minimize.readthedocs.io/)
Pytorch-minimize is a collection of utilities for minimizing multivariate functions in PyTorch.
It is inspired heavily by SciPy's `optimize` module and MATLAB's [Optimization Toolbox](https://www.mathworks.com/products/optimization.html).
Unlike SciPy and MATLAB, which fall back on numerical approximations when derivative functions are not supplied, pytorch-minimize uses _real_ first- and second-order derivatives, computed seamlessly behind the scenes with autograd.
Both CPU and CUDA are supported.
__Author__: Reuben Feinman
__At a glance:__
```python
import torch
from torchmin import minimize
def rosen(x):
    return torch.sum(100*(x[..., 1:] - x[..., :-1]**2)**2
                     + (1 - x[..., :-1])**2)
# initial point
x0 = torch.tensor([1., 8.])
# Select from the following methods:
# ['bfgs', 'l-bfgs', 'cg', 'newton-cg', 'newton-exact',
# 'trust-ncg', 'trust-krylov', 'trust-exact', 'dogleg']
# BFGS
result = minimize(rosen, x0, method='bfgs')
# Newton Conjugate Gradient
result = minimize(rosen, x0, method='newton-cg')
# Newton Exact
result = minimize(rosen, x0, method='newton-exact')
```
__Solvers:__ BFGS, L-BFGS, Conjugate Gradient (CG), Newton Conjugate Gradient (NCG), Newton Exact, Dogleg, Trust-Region Exact, Trust-Region NCG, Trust-Region GLTR (Krylov)
__Examples:__ See the [Rosenbrock minimization notebook](https://github.com/rfeinman/pytorch-minimize/blob/master/examples/rosen_minimize.ipynb) for a demonstration of function minimization with a handful of different algorithms.
__Install:__

    git clone https://github.com/rfeinman/pytorch-minimize.git
    cd pytorch-minimize
    pip install -e .

## Motivation
Although PyTorch offers many routines for stochastic optimization, utilities for deterministic optimization are scarce; only L-BFGS is included in the `optim` package, and it's modified for mini-batch training.
MATLAB and SciPy are industry standards for deterministic optimization.
These libraries have a comprehensive set of routines; however, automatic differentiation is not supported.*
Therefore, the user must provide explicit 1st- and 2nd-order gradients (if they are known) or use finite-difference approximations.
The motivation for pytorch-minimize is to offer a set of tools for deterministic optimization with automatic gradients and GPU acceleration.
__
*MATLAB offers minimal autograd support via the Deep Learning Toolbox, but the integration is not seamless: data must be converted to "dlarray" structures, and only a [subset of functions](https://www.mathworks.com/help/deeplearning/ug/list-of-functions-with-dlarray-support.html) are supported.
Furthermore, derivatives must still be constructed and provided as function handles.
Pytorch-minimize uses autograd to compute derivatives behind the scenes, so all you provide is an objective function.
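
To make "all you provide is an objective function" concrete, here is a minimal sketch of how exact first- and second-order derivatives can be obtained from an objective with plain PyTorch autograd. It uses only `torch.autograd` calls and is an illustration of the idea, not the library's internal code.

```python
import torch

def rosen(x):
    # Rosenbrock function, same definition as in the "At a glance" example
    return torch.sum(100 * (x[..., 1:] - x[..., :-1] ** 2) ** 2
                     + (1 - x[..., :-1]) ** 2)

x = torch.tensor([1.0, 8.0], dtype=torch.float64, requires_grad=True)

# Exact gradient from a single reverse-mode pass
(grad,) = torch.autograd.grad(rosen(x), x)

# Exact Hessian, again derived from the objective alone
hess = torch.autograd.functional.hessian(rosen, x.detach())

print(grad)  # gradient at the initial point
print(hess)  # 2x2 Hessian at the initial point
```

No derivative code was written by hand here, and no finite differences were taken.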
## Library
The pytorch-minimize library includes solvers for general-purpose function minimization (unconstrained & constrained), as well as for nonlinear least squares problems.
### 1. Unconstrained Minimizers
The following solvers are available for _unconstrained_ minimization:
- __BFGS/L-BFGS.__ BFGS is a canonical quasi-Newton method for unconstrained optimization. I've implemented both the standard BFGS and the "limited memory" L-BFGS. For smaller scale problems where memory is not a concern, BFGS should be significantly faster than L-BFGS (especially on CUDA) since it avoids Python for loops and instead uses pure torch.
- __Conjugate Gradient (CG).__ The conjugate gradient algorithm is a generalization of linear conjugate gradient to nonlinear optimization problems. Pytorch-minimize includes an implementation of the Polak-Ribière CG algorithm described in Nocedal & Wright (2006) chapter 5.2.
- __Newton Conjugate Gradient (NCG).__ The Newton-Raphson method is a staple of unconstrained optimization. Although computing full Hessian matrices with PyTorch's reverse-mode automatic differentiation can be costly, computing Hessian-vector products is cheap, and it also saves a lot of memory. The Conjugate Gradient (CG) variant of Newton's method is an effective solution for unconstrained minimization with Hessian-vector products. I've implemented a lightweight NewtonCG minimizer that uses HVP for the linear inverse sub-problems.
- __Newton Exact.__ In some cases, we may prefer a more precise variant of the Newton-Raphson method at the cost of additional complexity. I've also implemented an "exact" variant of Newton's method that computes the full Hessian matrix and uses Cholesky factorization for linear inverse sub-problems. When Cholesky fails (i.e., the Hessian is not positive definite), the solver resorts to one of two options as specified by the user: 1) take the steepest descent direction (default), or 2) solve the linear system with LU factorization.
- __Trust-Region Newton Conjugate Gradient.__ Description coming soon.
- __Trust-Region Newton Generalized Lanczos (Krylov).__ Description coming soon.
- __Trust-Region Exact.__ Description coming soon.
- __Dogleg.__ Description coming soon.
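
The Hessian-vector product idea behind the Newton CG solver can be sketched with plain autograd. The helpers `hvp` and `cg_solve` below are hypothetical names written for this illustration, not functions from the library; the sketch assumes the Hessian is positive definite at the chosen point and leaves out the line search and negative-curvature handling that a real solver needs.

```python
import torch

def rosen(x):
    return torch.sum(100 * (x[..., 1:] - x[..., :-1] ** 2) ** 2
                     + (1 - x[..., :-1]) ** 2)

def hvp(f, x, v):
    """Hessian-vector product H(x) @ v via double backward (no full Hessian)."""
    x = x.detach().requires_grad_(True)
    (g,) = torch.autograd.grad(f(x), x, create_graph=True)
    (hv,) = torch.autograd.grad(g, x, grad_outputs=v)
    return hv

def cg_solve(matvec, b, max_iter=20, tol=1e-10):
    """Linear conjugate gradient for A d = b, using only products A @ p."""
    d = torch.zeros_like(b)
    r = b.clone()              # residual b - A d for d = 0
    p = r.clone()
    rs = r.dot(r)
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / p.dot(Ap)
        d = d + alpha * p
        r = r - alpha * Ap
        rs_new = r.dot(r)
        if rs_new.sqrt() < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return d

# A point where the Rosenbrock Hessian is positive definite
x = torch.tensor([1.2, 1.2], dtype=torch.float64)
xg = x.clone().requires_grad_(True)
(g,) = torch.autograd.grad(rosen(xg), xg)

# Newton direction: solve H d = -g without ever forming H
d = cg_solve(lambda p: hvp(rosen, x, p), -g)
print(d)
```

Each CG iteration costs one extra backward pass, and memory stays linear in the number of parameters.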
To access the unconstrained minimizer interface, use the following import statement:

    from torchmin import minimize

Use the argument `method` to specify which of the aforementioned solvers should be applied.
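
The Cholesky fallback described for Newton Exact above can be sketched as follows. This is a simplified illustration, not the library's implementation: `newton_direction` is a hypothetical helper, only the steepest-descent fallback is shown, and no line search is performed.

```python
import torch

def rosen(x):
    return torch.sum(100 * (x[..., 1:] - x[..., :-1] ** 2) ** 2
                     + (1 - x[..., :-1]) ** 2)

def newton_direction(f, x):
    """Return (direction, used_newton) at x for objective f."""
    xg = x.detach().requires_grad_(True)
    (g,) = torch.autograd.grad(f(xg), xg)
    H = torch.autograd.functional.hessian(f, x.detach())
    # cholesky_ex reports failure through `info` instead of raising
    L, info = torch.linalg.cholesky_ex(H)
    if info.item() == 0:
        # Hessian is positive definite: solve H d = -g with the factor L
        d = torch.cholesky_solve(-g.unsqueeze(1), L).squeeze(1)
        return d, True
    # Hessian is not positive definite: fall back to steepest descent
    return -g, False

# Positive definite Hessian: Newton step is used
d_pd, ok_pd = newton_direction(rosen, torch.tensor([1.2, 1.2], dtype=torch.float64))

# Indefinite Hessian (the "At a glance" initial point): fallback is used
d_ind, ok_ind = newton_direction(rosen, torch.tensor([1.0, 8.0], dtype=torch.float64))

print(ok_pd, d_pd)
print(ok_ind, d_ind)
```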
### 2. Constrained Minimizers
The following solvers are available for _constrained_ minimization:
- __Trust-Region Constrained Algorithm.__ Pytorch-minimize includes a single constrained minimization routine based on SciPy's 'trust-constr' method. The algorithm accepts generalized nonlinear constraints and variable boundaries via the "constr" and "bounds" arguments. For equality constrained problems, it is an implementation of the Byrd-Omojokun Trust-Region SQP method. When inequality constraints are imposed, the trust-region interior point method is used. NOTE: The current trust-region constrained minimizer is not a custom implementation, but rather a wrapper for SciPy's `optimize.minimize` routine. It uses autograd behind the scenes to build jacobian & hessian callables before invoking scipy. Inputs and objectives should use torch tensors like other pytorch-minimize routines. CUDA is supported but not recommended; data will be moved back-and-forth between GPU/CPU.
To access the constrained minimizer interface, use the following import statement:

    from torchmin import minimize_constr

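
To illustrate the wrapper pattern described in the NOTE above, the sketch below builds jacobian and hessian callables with autograd and passes them to SciPy's `trust-constr` method directly. It does not use `minimize_constr` itself; the helper names (`to_tensor`, `fun`, `jac`, `hess`, `con_fun`, `con_jac`) and the toy problem are assumptions made for this example.

```python
import numpy as np
import torch
from scipy.optimize import NonlinearConstraint, minimize as scipy_minimize

target = torch.tensor([2.0, 1.0], dtype=torch.float64)

def objective(x):
    # squared distance to the target point
    return torch.sum((x - target) ** 2)

def constraint(x):
    # squared norm; the solution is constrained to the unit ball
    return torch.sum(x ** 2)

def to_tensor(x_np):
    return torch.tensor(x_np, dtype=torch.float64)

# NumPy-facing callables built from autograd
def fun(x_np):
    return objective(to_tensor(x_np)).item()

def jac(x_np):
    x = to_tensor(x_np).requires_grad_(True)
    (g,) = torch.autograd.grad(objective(x), x)
    return g.numpy()

def hess(x_np):
    return torch.autograd.functional.hessian(objective, to_tensor(x_np)).numpy()

def con_fun(x_np):
    return constraint(to_tensor(x_np)).item()

def con_jac(x_np):
    x = to_tensor(x_np).requires_grad_(True)
    (g,) = torch.autograd.grad(constraint(x), x)
    return g.numpy().reshape(1, -1)

unit_ball = NonlinearConstraint(con_fun, -np.inf, 1.0, jac=con_jac)
res = scipy_minimize(fun, np.array([0.5, 0.0]), method='trust-constr',
                     jac=jac, hess=hess, constraints=[unit_ball])
print(res.x)  # should lie close to target / ||target||
```

Every call crosses the torch/NumPy boundary, which is why running this pattern on CUDA tensors means repeated GPU/CPU transfers.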
### 3. Nonlinear Least Squares
The library also includes specialized solvers for nonlinear least squares problems.
These solvers revolve around the Gauss-Newton method, a modification of Newton's method tailored to the least-squares setting.
The least squares interface can be imported as follows:

    from torchmin import least_squares

The least_squares function is heavily motivated by scipy's `optimize.least_squares`.
Much of the scipy code was borrowed directly (all rights reserved) and ported from numpy to torch.
Rather than have the user provide a jacobian function, in the new interface, jacobian-vector products are computed behind the scenes with autograd.
At the moment, only the Trust Region Reflective ("trf") method is implemented, and bounds are not yet supported.
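
As a rough illustration of the Gauss-Newton idea with autograd-supplied derivatives, the sketch below fits a two-parameter exponential model. It is plain Gauss-Newton, not the Trust Region Reflective method, and for readability it forms the full Jacobian rather than working matrix-free; the model, data, and starting point are made up for this example. The last lines show that a jacobian-vector product is available without building the Jacobian.

```python
import torch

# Synthetic, noise-free data from y = 2 * exp(-t)
t = torch.linspace(0.0, 1.0, 5, dtype=torch.float64)
y = 2.0 * torch.exp(-1.0 * t)

def residuals(p):
    # residual vector of the model p[0] * exp(p[1] * t)
    return p[0] * torch.exp(p[1] * t) - y

p = torch.tensor([1.5, -0.5], dtype=torch.float64)
for _ in range(10):
    r = residuals(p)
    J = torch.autograd.functional.jacobian(residuals, p)   # shape (5, 2)
    # Gauss-Newton step: least-squares solution of J @ step = -r
    step = torch.linalg.lstsq(J, -r.unsqueeze(1)).solution.squeeze(1)
    p = p + step

print(p)  # fitted parameters

# Jacobian-vector product computed by autograd, compared with J @ v
v = torch.tensor([1.0, -1.0], dtype=torch.float64)
_, jv = torch.autograd.functional.jvp(residuals, p, v)
J_final = torch.autograd.functional.jacobian(residuals, p)
print(jv, J_final @ v)
```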
## Examples
The [Rosenbrock minimization tutorial](https://github.com/rfeinman/pytorch-minimize/blob/master/examples/rosen_minimize.ipynb) demonstrates how to use pytorch-minimize to find the minimum of a scalar-valued function of multiple variables using various optimization strategies.
In addition, the [SciPy benchmark](https://github.com/rfeinman/pytorch-minimize/blob/master/examples/scipy_benchmark.py) provides a comparison of pytorch-minimize solvers to their analogous solvers from the `scipy.optimize` library.
For those transitioning from scipy, this script will help get a feel for the design of the current library.
Unlike scipy, jacobian and hessian functions need not be provided to pytorch-minimize solvers, and numerical approximations are never used.
For constrained optimization, the [adversarial examples tutorial](https://github.com/rfeinman/pytorch-minimize/blob/master/examples/constrained_optimization_adversarial_examples.ipynb) demonstrates how to use the trust-region constrained routine to generate an optimal adversarial perturbation given a constraint on the perturbation norm.
## Optimizer API
As an alternative to the functional API, pytorch-minimize also includes an "optimizer" API based on the `torch.optim.Optimizer` class.
To access the optimizer class, import as follows:

    from torchmin import Minimizer

Raw data
{
"_id": null,
"home_page": "https://pytorch-minimize.readthedocs.io",
"name": "pytorch-minimize",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "",
"author": "Reuben Feinman",
"author_email": "reuben.feinman@nyu.edu",
"download_url": "https://files.pythonhosted.org/packages/7e/b0/fb7b39a3417ea301ddb92b9c40986111e7c3e27cef2ecb5dd4a027323248/pytorch-minimize-0.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT Licence",
"summary": "Newton and Quasi-Newton optimization with PyTorch",
"version": "0.0.2",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "86bcf987d01e4dc8f5e6e99e941a4ded",
"sha256": "49fe138be0be76764ec4f35c6ec8c63e7252642dde0523e517e140f55fc1fc48"
},
"downloads": -1,
"filename": "pytorch_minimize-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "86bcf987d01e4dc8f5e6e99e941a4ded",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 59610,
"upload_time": "2022-09-27T22:16:13",
"upload_time_iso_8601": "2022-09-27T22:16:13.841955Z",
"url": "https://files.pythonhosted.org/packages/a0/61/752f76c30fb1e1b31f930896a7eb0ecaa6415465ad1055b33ef02a865215/pytorch_minimize-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"md5": "beaf61e1e90164e044c2cc3e0a5e514f",
"sha256": "32dfe29759d8c07ca4b740793ec9578e50d1a7c41782e53cff1318d9d9154756"
},
"downloads": -1,
"filename": "pytorch-minimize-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "beaf61e1e90164e044c2cc3e0a5e514f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 46881,
"upload_time": "2022-09-27T22:16:15",
"upload_time_iso_8601": "2022-09-27T22:16:15.194023Z",
"url": "https://files.pythonhosted.org/packages/7e/b0/fb7b39a3417ea301ddb92b9c40986111e7c3e27cef2ecb5dd4a027323248/pytorch-minimize-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-09-27 22:16:15",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "pytorch-minimize"
}