AdapterLoRa


| Field           | Value |
| --------------- | ----- |
| Name            | AdapterLoRa |
| Version         | 2.0.0 |
| Home page       | https://github.com/youness-elbrag/AdapterLoRa/ |
| Summary         | A tool for adapting larger Transformer-based models and quantization, built on top of the LoRa and LoRa-Torch libraries. |
| Upload time     | 2023-08-26 15:15:26 |
| Docs URL        | None |
| Author          | Youness EL BRAG |
| Requires Python | >=3.7 |
| Keywords        | quantization, adapterllm, peft |
| Requirements    | No requirements were recorded. |
| Travis-CI       | No Travis. |
| Coveralls       | No coveralls. |

# Adapter-LoRa for Quantization

<div align="center">
  <img src="assets/LoRa.png" alt="LoRa-Logo" width="200">

[![Made With Love](https://img.shields.io/badge/Made%20With-Love-orange.svg)](https://github.com/youness-elbrag/AdapterLoRa/)
[![GitHub issues](https://img.shields.io/github/issues/youness-elbrag/AdapterLoRa)](https://github.com/youness-elbrag/AdapterLoRa/issues) 
[![GitHub forks](https://img.shields.io/github/forks/youness-elbrag/AdapterLoRa)](https://github.com/youness-elbrag/AdapterLoRa/network) 
[![GitHub stars](https://img.shields.io/github/stars/youness-elbrag/AdapterLoRa)](https://github.com/youness-elbrag/AdapterLoRa/stargazers) [![GitHub license](https://img.shields.io/github/license/youness-elbrag/AdapterLoRa)](https://github.com/youness-elbrag/AdapterLoRa/blob/master/LICENSE)
</div>


## Comparative Features of "loralib" and "loratorch" Implementations

**How the ``loralib`` and ``loratorch`` approaches differ**

``loralib`` and ``loratorch`` implement LoRA in different ways. Taking `nn.Linear` as an example, the underlying computations are as follows:

1. For ``loralib``,
   $h = x W_0^\top + \frac{\alpha}{r} x(BA)^\top,$

where $x\in\mathbb{R}^{k\times n}$ is the input matrix, $W_0\in\mathbb{R}^{m\times n}$ is the pre-trained weight matrix, $r$ is the predefined LoRA rank, $B\in\mathbb{R}^{m\times r}$ and $A\in \mathbb{R}^{r\times n}$ are the LoRA matrices, and $\alpha$ is a hyper-parameter.

2. For ``loratorch``,
   $h = x (W_0 + \frac{\alpha}{r} BA)^\top.$
   
``loralib`` computes $xW_0^\top$ and $x(BA)^\top$ separately and then sums the results, whereas ``loratorch`` first merges the pre-trained weight $W_0$ with its LoRA weight $BA$ and then computes the output with a single call to ``nn.Linear.forward()``. For linear layers the two approaches give the same result. For non-linear or more complex layers, however, it is not guaranteed that $L(x, W_0)+L(x, BA) = L(x, W_0+BA)$, which makes it difficult to extend LoRA to such layers with ``loralib``. Merging the weights first, as ``loratorch`` does, is more general and extensible: call ``merge_lora_param()`` in ``loratorch`` to merge the weights, then call ``forward()`` on the original layer to compute the result. With ``loratorch``, you can easily apply LoRA to any type of layer in ``torch.nn``.
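
The difference between the two formulas can be checked numerically. The snippet below is a minimal sketch that uses plain PyTorch tensors rather than either library; the shapes, the value of `alpha`, and the toy `squared_weight_layer` function are illustrative assumptions, not part of ``loralib`` or ``loratorch``.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
k, n, m, r, alpha = 3, 5, 4, 2, 8.0  # batch, in_features, out_features, rank, scaling

x = torch.randn(k, n, dtype=torch.float64)
W0 = torch.randn(m, n, dtype=torch.float64)  # stands in for the pre-trained weight
B = torch.randn(m, r, dtype=torch.float64)   # LoRA matrices
A = torch.randn(r, n, dtype=torch.float64)
delta = (alpha / r) * (B @ A)                # scaled low-rank update

# loralib style: two separate products, summed afterwards
h_separate = F.linear(x, W0) + F.linear(x, delta)
# loratorch style: merge the weights first, then one ordinary forward pass
h_merged = F.linear(x, W0 + delta)
linear_match = torch.allclose(h_separate, h_merged)


def squared_weight_layer(inp, weight):
    """Hypothetical layer that is NOT linear in its weight: L(x, W) = x (W * W)^T."""
    return F.linear(inp, weight * weight)


toy_separate = squared_weight_layer(x, W0) + squared_weight_layer(x, delta)
toy_merged = squared_weight_layer(x, W0 + delta)
toy_match = torch.allclose(toy_separate, toy_merged)

print("linear layer agrees:", linear_match)
print("toy non-linear layer agrees:", toy_match)
```

For the linear case the two results agree up to floating-point error. For the toy layer they differ because of the cross term between $W_0$ and the update, which is the situation where merging the weights first is the safer choice.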

## Supported Layers

| Layer                     | ``loralib``    | ``loratorch``  | Example                                            |
| ------------------------- |:--------------:|:--------------:| -------------------------------------------------- |
| ``nn.Linear``             | ✓              | ✓              | [linear.ipynb](https://github.com/Baijiong-Lin/LoRA-Torch/blob/main/examples/linear.ipynb)            |
| ``nn.Embedding``          | ✓              | ✓              | [embedding.ipynb](https://github.com/Baijiong-Lin/LoRA-Torch/blob/main/examples/embedding.ipynb)      |
| ``nn.Conv1d``             | ✓              | ✓              |                                                    |
| ``nn.Conv2d``             | ✓              | ✓              |                                                    |
| ``nn.Conv3d``             | ✓              | ✓              |                                                    |
| ``nn.MultiheadAttention`` | ✘              | ✓              |                                                    |
| ``MergedLinear``          | ✓ (Error)      | ✓              | [mergedlinear.ipynb](https://github.com/Baijiong-Lin/LoRA-Torch/blob/main/examples/mergedlinear.ipynb) |
| $\cdots$                  | hard to extend | easy to extend |                                                    |

*We compare the results of ``loralib`` and ``loratorch``  in [examples](./examples) to demonstrate the correctness of the implementation in ``loratorch``.*

## Quick Start

**Using ``AdapterLoRa``**

1. Install ``LoRA-Torch`` and ``AdapterLoRa``.
   
   ```bash
   pip install git+https://github.com/Baijiong-Lin/LoRA-Torch
   ```

   ```bash
   pip install AdapterLoRa
   ```

### Using the AdapterLoRa Tool

```python
import torch
import torch.nn as nn
from core.Quantized import AdapterLoRa

model = nn.TransformerEncoderLayer(d_model=512, nhead=8)

adapter_model = AdapterLoRa(model, method="LoRa", Rank=4)

# Register the layers that should use AdapterLoRa with add_layer():
# the self-attention block and the two feed-forward linear layers.
adapter_model.add_layer("self_attn")
adapter_model.add_layer("linear1")
adapter_model.add_layer("linear2")

# Reconstruct the quantized model
adapter_model.reconstruct_model()

# Apply the LoRA method
model = adapter_model.implement_lora(verbose=True)
# Total trainable parameters before LoRA: 3176960
# Total trainable parameters after LoRA: 24576

# This sets requires_grad to False for all parameters without the string "lora_" in their names

# Training loop (`dataloader` is a placeholder for your own DataLoader)
model.train()
for batch in dataloader:
    ...  # forward pass, loss, backward pass, and optimizer step go here
```
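
The figure of 24576 trainable parameters reported above is consistent with rank-4 LoRA matrices on three weight matrices. The arithmetic below is a sketch, not output from ``AdapterLoRa``: `lora_param_count` is a hypothetical helper, `dim_feedforward=2048` is the PyTorch default for `nn.TransformerEncoderLayer`, and the assumption that exactly one 512 x 512 attention projection is adapted is an inference from the total rather than something the source states.

```python
def lora_param_count(out_features: int, in_features: int, rank: int) -> int:
    """Parameters in B (out x r) plus A (r x in) for one adapted weight matrix."""
    return rank * (out_features + in_features)


d_model, dim_feedforward, rank = 512, 2048, 4

linear1 = lora_param_count(dim_feedforward, d_model, rank)  # layer mapping 512 -> 2048
linear2 = lora_param_count(d_model, dim_feedforward, rank)  # layer mapping 2048 -> 512
# Assumption: a single 512 x 512 projection inside self_attn receives LoRA matrices
attn = lora_param_count(d_model, d_model, rank)

total = linear1 + linear2 + attn
print(linear1, linear2, attn, total)
```

Each adapted matrix adds only `rank * (out_features + in_features)` parameters, which is why the trainable count drops so sharply compared with the full weight matrices.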
### Saving Model Weights

* Save the LoRA model (only the LoRA matrices will be saved).

```python
import torch
import loralib as lora

checkpoint_path = "ckpt_lora.pt"
# ===== Before =====
# torch.save(model.state_dict(), checkpoint_path)
# ===== After =====
torch.save(lora.lora_state_dict(model), checkpoint_path)
```
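
To make the "only the LoRA matrices" idea concrete, here is a simplified sketch of freezing and filtering by parameter name. It is not the code of ``loralib`` or ``AdapterLoRa``: `TinyLoRALinear` is a hypothetical stand-in module, and the only convention borrowed from the text above is that LoRA parameters carry the string `"lora_"` in their names.

```python
import torch
import torch.nn as nn


class TinyLoRALinear(nn.Module):
    """Hypothetical stand-in for a LoRA-adapted linear layer (illustration only)."""

    def __init__(self, in_features=8, out_features=8, rank=2):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.lora_A = nn.Parameter(torch.randn(rank, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))


model = TinyLoRALinear()

# Freeze every parameter whose name does not contain "lora_"
for name, param in model.named_parameters():
    param.requires_grad = "lora_" in name

trainable = [name for name, param in model.named_parameters() if param.requires_grad]

# Keep only the LoRA entries when building the checkpoint dictionary
lora_only = {key: value for key, value in model.state_dict().items() if "lora_" in key}

print("trainable:", trainable)
print("saved keys:", sorted(lora_only))
```

The resulting dictionary can be passed to `torch.save` in place of the full `state_dict()`, which keeps the checkpoint small because the frozen pre-trained weight is left out.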

### Loading the Pre-Trained Model 

* Load the LoRA model (the pre-trained checkpoint must be loaded first).

```python
import torch

# Load the pre-trained checkpoint first
model.load_state_dict(torch.load('ckpt_pretrained.pt'), strict=False)
# Then load the LoRA checkpoint
model.load_state_dict(torch.load('ckpt_lora.pt'), strict=False)
```
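
The two-step load relies on `strict=False`, which lets `load_state_dict` accept a checkpoint that covers only part of the model. The sketch below imitates the two checkpoints with in-memory dictionaries instead of files; `TinyLoRALinear` is a hypothetical stand-in module and the key names are assumptions for illustration.

```python
import torch
import torch.nn as nn


class TinyLoRALinear(nn.Module):
    """Hypothetical stand-in for a LoRA-adapted linear layer (illustration only)."""

    def __init__(self, in_features=8, out_features=8, rank=2):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.lora_A = nn.Parameter(torch.randn(rank, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))


torch.manual_seed(0)
trained = TinyLoRALinear()

# Two partial checkpoints: the pre-trained weight, and the LoRA matrices
pretrained_ckpt = {"weight": trained.weight.detach().clone()}
lora_ckpt = {k: v.clone() for k, v in trained.state_dict().items() if "lora_" in k}

fresh = TinyLoRALinear()
first = fresh.load_state_dict(pretrained_ckpt, strict=False)  # LoRA keys are missing here
second = fresh.load_state_dict(lora_ckpt, strict=False)       # base weight is missing here

restored = all(
    torch.equal(a, b)
    for a, b in zip(trained.state_dict().values(), fresh.state_dict().values())
)
print(first.missing_keys, second.missing_keys, restored)
```

Each call reports the keys it did not find in `missing_keys` instead of raising an error, and after both calls every tensor in the fresh model matches the trained one.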


- <img src="assets/rocket.gif" width="32" height="32"/> Quantized Model <img src="assets/rocket.gif" width="32" height="32"/>

- <img src="assets/time.gif" width="32" height="32"/> Time to Train <img src="assets/time.gif" width="32" height="32"/>

- <img src="assets/money.gif" width="32" height="32"/> Cost to Train <img src="assets/money.gif" width="32" height="32"/>


## What's in it for you?

For each of the three pillars above, we are sharing our codebase and insights to:
- Help you leverage Transformer-based models for your machine's needs and challenges

- Boost reproducibility efforts, which are becoming increasingly difficult with Transformers

We provide ready-to-use tools for quantizing models:

- Fine-tuning Transformer-based models on your proprietary dataset via PEFT methods such as LoRA and QLoRA

- Performing hyperparameter optimization to get the maximum performance out of these models

## What's the best way to use this repository?

Go to the directory for the Transformer-based model you are interested in and open its `README.md`. We have included details about the LLMs, followed by performance results on open-source datasets!

## Roadmap

Our plan is to perform these experiments on all the Transformer-based models below. To that end, this is a tentative roadmap of the LLMs that we aim to cover:

- [x] TransformerEncoder
- [x] TransformerDecoder
- [x] Vision-Transformer
- [x] minGPT 
- [x] OpenAI GPT-2 
- [ ] Inflection Pi **In Progress**

## Correspondence

## Contributor

``AdapterLoRa`` is developed and maintained by 
**Youness EL BRAG** ([Email](mailto:younsselbrag@gmail.com) | [LinkedIn](https://www.linkedin.com/in/youness-el-brag-b13628203/))





            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/youness-elbrag/AdapterLoRa/",
    "name": "AdapterLoRa",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "Quantization,AdapterLLM,PEFT",
    "author": "Youness EL BRAG",
    "author_email": "younsselbrag@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/af/39/6921f288b74d1ae8ddfd7f4b81ab290f75e029935bb00f73b9a700dd3cf1/AdapterLoRa-2.0.0.tar.gz",
    "platform": null,
    "description": "(README text, identical to the project description above)",
    "bugtrack_url": null,
    "license": "",
    "summary": "A Tool for adaptation Larger Transfomer-Based model and Quantization built top on libraries LoRa and LoRa-Torch.",
    "version": "2.0.0",
    "project_urls": {
        "Homepage": "https://github.com/youness-elbrag/AdapterLoRa/"
    },
    "split_keywords": [
        "quantization",
        "adapterllm",
        "peft"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3a4753a777a92eab68ab8b7519de5a36734477871a631c08d52d0338251bc16f",
                "md5": "63a12c4418a413c1e3147e68d6743157",
                "sha256": "112d00ae96f710d2c900cbe3c2223ed349e3619aa75dbf01d334726e2225596b"
            },
            "downloads": -1,
            "filename": "AdapterLoRa-2.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "63a12c4418a413c1e3147e68d6743157",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 7247,
            "upload_time": "2023-08-26T15:15:24",
            "upload_time_iso_8601": "2023-08-26T15:15:24.908734Z",
            "url": "https://files.pythonhosted.org/packages/3a/47/53a777a92eab68ab8b7519de5a36734477871a631c08d52d0338251bc16f/AdapterLoRa-2.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "af396921f288b74d1ae8ddfd7f4b81ab290f75e029935bb00f73b9a700dd3cf1",
                "md5": "f15f45fc74f01f8b4743808453a17768",
                "sha256": "142f3f17d480b4541a95dfae133a590c25e5212152fa92adf48f77f4be558a12"
            },
            "downloads": -1,
            "filename": "AdapterLoRa-2.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "f15f45fc74f01f8b4743808453a17768",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 7316,
            "upload_time": "2023-08-26T15:15:26",
            "upload_time_iso_8601": "2023-08-26T15:15:26.443612Z",
            "url": "https://files.pythonhosted.org/packages/af/39/6921f288b74d1ae8ddfd7f4b81ab290f75e029935bb00f73b9a700dd3cf1/AdapterLoRa-2.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-08-26 15:15:26",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "youness-elbrag",
    "github_project": "AdapterLoRa",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "adapterlora"
}
        