# MLorc - Momentum Low-Rank Compression for Memory-Efficient LLM Fine-tuning
Unofficial implementation of "MLorc: Momentum Low-rank Compression for Large Language Model Adaptation"
This repository introduces **MLorc (Momentum Low-rank Compression)**, a memory-efficient paradigm that substantially reduces the memory footprint of full-parameter fine-tuning for large language models. Based on the paper "[MLorc: Momentum Low-rank Compression for Large Language Model Adaptation](https://arxiv.org/abs/2506.01897)", this method offers a compelling alternative to existing memory-efficient techniques.
<img width="1385" height="469" alt="image" src="https://github.com/user-attachments/assets/7bcab5ec-beaf-4d1a-b115-81ab1a7d4b18" />
---
### How MLorc Works
MLorc's core innovation lies in its approach to **momentum compression and reconstruction**:
* **Direct Momentum Compression:** At each optimization step, MLorc compresses the first- and second-order momentum into low-rank factors using **Randomized SVD (RSVD)** and reconstructs them when the update is applied.
* **Adaptive Second-Order Momentum Handling:** To keep the reconstructed second-order momentum non-negative and stable, MLorc applies ReLU during reconstruction and adaptively adds a small constant to the zero entries that the ReLU introduces (see the sketch below).
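
A minimal sketch of this compress/reconstruct cycle, using `torch.svd_lowrank` as the RSVD routine (the function names and the `eps` handling here are illustrative, not the exact implementation in this repository):

```python
import torch

def compress(moment: torch.Tensor, rank: int = 4):
    """Compress a 2-D momentum matrix into low-rank factors via randomized SVD."""
    U, S, V = torch.svd_lowrank(moment, q=rank, niter=2)
    # Store two thin factors, (m x r) and (n x r), instead of the full (m x n) matrix.
    return U * S, V

def reconstruct(US: torch.Tensor, V: torch.Tensor, nonneg: bool = False, eps: float = 1e-8):
    """Rebuild the momentum matrix; keep the second moment non-negative."""
    m = US @ V.T
    if nonneg:
        # ReLU clamps negative reconstruction error; a small constant is then
        # added to the zeros the clamp introduces, for numerical stability.
        m = torch.relu(m)
        m = torch.where(m == 0, torch.full_like(m, eps), m)
    return m
```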
---
### Key Advantages of MLorc
MLorc is broadly applicable to any momentum-based optimizer (e.g., Adam, Lion) and delivers superior performance:
* **State-of-the-Art Performance:** Empirically, MLorc consistently **outperforms other memory-efficient methods like LoRA and GaLore** in terms of validation accuracy. It can even match or **exceed the performance of full fine-tuning** with a small rank (e.g., `rank=4`).
* **Memory and Time Efficiency:** It maintains **comparable memory efficiency to LoRA** while demonstrating **improved time efficiency compared to GaLore**.
* **Theoretical Guarantees:** MLorc offers a **theoretical guarantee for convergence**, matching the convergence rate of the original Lion optimizer under reasonable assumptions.
<img width="1403" height="602" alt="image" src="https://github.com/user-attachments/assets/ad76a8ab-966d-4121-b010-28a2ddb6e28d" />
---
### Included MLorc-Integrated Optimizers
This repository integrates MLorc into six momentum-based optimizers, each with additional enhancements for improved performance and stability (a usage sketch follows the list):
1. **`MLorc_AdamW`**: AdamW with MLorc compression, featuring:
* **Fused Backward Pass**
* **[Gradient Descent with Adaptive Momentum Scaling (Grams)](https://github.com/Gunale0926/Grams)**: For better performance and faster convergence.
* **[`atan2` smoothing & scaling](https://github.com/lucidrains/adam-atan2-pytorch)**: A robust replacement for `eps` (no tuning required), which also incorporates gradient clipping. (If enabled, `eps` is ignored.)
   * **[OrthoGrad](https://github.com/LucasPrietoAl/grokking-at-the-edge-of-numerical-stability)**: Prevents "naïve loss minimization" (NLM), which can lead to overfitting, by removing the gradient component parallel to the weight, thus improving generalization (see the projection sketch after this list).
2. **`MLorc_Prodigy`**:
* **Same Features as `MLorc_AdamW`**
* Incorporates MLorc with the [**Prodigy adaptive method**](https://github.com/konstmish/prodigy) and its associated features.
3. **`MLorc_Lion`**: Lion with MLorc compression, featuring:
* **Fused Backward Pass**
* **OrthoGrad**
   * **[`use_cautious`](https://github.com/kyleliang919/C-Optim)**: Use the cautious variant of Lion.
   * **`clip_threshold`**: Clips the per-parameter gradient norm, as proposed in **[Lions and Muons: Optimization via Stochastic Frank-Wolfe](https://arxiv.org/abs/2506.04192)**, to make Lion more stable (default: 5.0, from the paper).
4. **`MLorc_DAdapt_Lion`**:
* **Same Features as `MLorc_Lion`**
   * Integrates MLorc with the [**D-Adaptation**](https://github.com/facebookresearch/dadaptation) adaptive method for **Lion**, and includes the `slice_p` feature (from Prodigy).
5. **`MLorc_Adopt`**:
* **Same Features as `MLorc_AdamW`**
* Implements the method of **[ADOPT: Modified Adam Can Converge with Any β_2 with the Optimal Rate](https://arxiv.org/abs/2411.02853)**.
6. **`MLorc_CAME`**:
* **Same Features as `MLorc_AdamW`**
* The first moment (momentum) is compressed using the low-rank factorization from MLorc, while the adaptive pre-conditioning and confidence-guided updates are from **[CAME: Confidence-guided Adaptive Memory Efficient Optimization](https://arxiv.org/abs/2307.02047)**.
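
A hypothetical usage sketch follows. The import path, optimizer name, and constructor arguments (`rank` in particular) are assumptions drawn from the feature list above; check the package itself for the actual API, especially if the fused backward pass changes how `step()` is invoked:

```python
import torch
from mlorc_optim import MLorc_AdamW  # import path is a guess; adjust to the installed package

model = torch.nn.Linear(1024, 1024)
# `rank` would control the low-rank compression of the momentum states
# (the paper reports strong results with ranks as small as 4).
optimizer = MLorc_AdamW(model.parameters(), lr=1e-4, rank=4)  # argument names are assumptions

for _ in range(10):
    loss = model(torch.randn(8, 1024)).pow(2).mean()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```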
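
For reference, the OrthoGrad projection mentioned in the feature lists amounts to removing the component of the gradient that points along the weight. A small sketch, with the final norm-rescaling step included as an assumption about the reference implementation:

```python
import torch

def orthogonalize_gradient(weight: torch.Tensor, grad: torch.Tensor) -> torch.Tensor:
    """Drop the gradient component parallel to the weight (OrthoGrad-style projection)."""
    w, g = weight.flatten(), grad.flatten()
    # Projection coefficient of g onto w.
    coeff = torch.dot(w, g) / (torch.dot(w, w) + 1e-30)
    g_orth = g - coeff * w
    # Rescale to the original gradient norm (assumption about the reference implementation).
    g_orth = g_orth * (g.norm() / (g_orth.norm() + 1e-30))
    return g_orth.view_as(grad)
```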