quad-torch


Name: quad-torch
Version: 0.2.0
Summary: An implementation of PSGD-QUAD optimizer in PyTorch.
Upload time: 2025-08-24 17:16:50
Author: Evan Walters, Omead Pooladzandi, Xi-Lin Li
Requires Python: >=3.10
Keywords: python, machine learning, optimization, pytorch
Repository: https://github.com/evanatyourservice/quad_torch
Requirements: No requirements were recorded.

# PSGD-QUAD
An implementation of PSGD-QUAD for PyTorch.


```python
import torch
from quad_torch import QUAD

model = torch.nn.Linear(10, 10)
optimizer = QUAD(
    model.parameters(),
    lr=0.001,
    lr_style="adam",  # "adam", "mu-p", or None
    momentum=0.95,
    weight_decay=0.1,
    preconditioner_lr=0.7,
    max_size_dense=8192,
    max_skew_dense=1.0,
    normalize_grads=False,
    dtype=torch.bfloat16,
)
```
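
Once constructed, the optimizer can presumably be driven like any other PyTorch optimizer. The sketch below assumes `QUAD` follows the standard `torch.optim.Optimizer` interface (`zero_grad()` and `step()`), which this README does not state explicitly. The `train_steps` helper, the random data, and the `torch.optim.SGD` fallback are illustrative assumptions, not part of the package; the fallback only keeps the loop runnable when `quad_torch` is not installed.

```python
import torch

try:
    from quad_torch import QUAD  # the optimizer from this package
except ImportError:
    QUAD = None  # a stand-in optimizer is used below


def train_steps(model, optimizer, inputs, targets, num_steps=5):
    """Run a few standard optimization steps and return the loss at each step."""
    losses = []
    for _ in range(num_steps):
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


torch.manual_seed(0)
model = torch.nn.Linear(10, 10)

if QUAD is not None:
    # Same arguments as the construction example above.
    optimizer = QUAD(
        model.parameters(),
        lr=0.001,
        lr_style="adam",
        momentum=0.95,
        weight_decay=0.1,
        preconditioner_lr=0.7,
        max_size_dense=8192,
        max_skew_dense=1.0,
        normalize_grads=False,
        dtype=torch.bfloat16,
    )
else:
    # Stand-in only (assumption): lets the loop run without quad_torch.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001)

inputs = torch.randn(32, 10)   # made-up regression data
targets = torch.randn(32, 10)
losses = train_steps(model, optimizer, inputs, targets)
```

The loop itself is optimizer-agnostic, so swapping `QUAD` in or out only changes the construction step.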

`lr_style` selects how the learning rate is scaled: `"adam"` for Adam-style scaling, `"mu-p"` for mu-P scaling based on
`sqrt(G.shape[-2])`, or `None` for PSGD scaling of RMS=1.0.
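
To make the two quantities named above concrete, here is a small dependency-free sketch. The helpers `rms` and `mu_p_quantity` are hypothetical names introduced for illustration; they only compute the root mean square of a set of values and `sqrt(G.shape[-2])` for a given shape. How the optimizer applies these quantities internally (for example, whether it multiplies or divides by the factor) is not stated in this README and is not modeled here.

```python
import math


def rms(values):
    """Root mean square of a flat sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def mu_p_quantity(shape):
    """Square root of the second-to-last dimension, i.e. sqrt(G.shape[-2])."""
    return math.sqrt(shape[-2])


print(rms([1.0, -1.0, 1.0, -1.0]))  # 1.0: every entry has magnitude 1
print(mu_p_quantity((256, 128)))    # 16.0 for a gradient of shape (256, 128)
```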


## Resources

Xi-Lin Li's repo: https://github.com/lixilinx/psgd_torch

PSGD papers and resources, as listed in Xi-Lin's repo:

1) Xi-Lin Li. Preconditioned stochastic gradient descent, [arXiv:1512.04202](https://arxiv.org/abs/1512.04202), 2015. (General ideas of PSGD, preconditioner fitting losses and Kronecker product preconditioners.)
2) Xi-Lin Li. Preconditioner on matrix Lie group for SGD, [arXiv:1809.10232](https://arxiv.org/abs/1809.10232), 2018. (Focus on preconditioners with the affine Lie group.)
3) Xi-Lin Li. Black box Lie group preconditioners for SGD, [arXiv:2211.04422](https://arxiv.org/abs/2211.04422), 2022. (Mainly about the LRA preconditioner. See [these supplementary materials](https://drive.google.com/file/d/1CTNx1q67_py87jn-0OI-vSLcsM1K7VsM/view) for detailed math derivations.)
4) Xi-Lin Li. Stochastic Hessian fittings on Lie groups, [arXiv:2402.11858](https://arxiv.org/abs/2402.11858), 2024. (Theoretical work on the efficiency of PSGD. The Hessian fitting problem is shown to be strongly convex on the set ${\rm GL}(n, \mathbb{R})/R_{\rm polar}$.)
5) Omead Pooladzandi, Xi-Lin Li. Curvature-informed SGD via general purpose Lie-group preconditioners, [arXiv:2402.04553](https://arxiv.org/abs/2402.04553), 2024. (Plenty of benchmark results and analyses for PSGD vs. other optimizers.)


## License

[![CC BY 4.0][cc-by-image]][cc-by]

This work is licensed under a [Creative Commons Attribution 4.0 International License][cc-by].

2024 Evan Walters, Omead Pooladzandi, Xi-Lin Li


[cc-by]: http://creativecommons.org/licenses/by/4.0/
[cc-by-image]: https://licensebuttons.net/l/by/4.0/88x31.png
[cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg


            
