# Deltatuner
Deltatuner is an extension of [Peft](https://github.com/huggingface/peft) that improves LLM fine-tuning speed through multiple optimizations, including leveraging the compact model constructor [DE-NAS](https://github.com/intel/e2eAIOK/tree/main/e2eAIOK/DeNas) to construct/modify compact delta layers in a hardware-aware, train-free way, and adding new delta-tuning algorithms.
## Introduction
<p align="center">
<img width="90%" src="./doc/deltatuner.png">
</p>
### Key Components
- Supported parameter-efficient fine-tuning algorithms
  - [LoRA](https://arxiv.org/pdf/2106.09685.pdf) algorithm
  - Scaling and Shifting ([SSF](https://arxiv.org/abs/2210.08823)) algorithm: scales and shifts the deep features of a pre-trained model with far fewer parameters, approaching the performance of full fine-tuning
  - More algorithms (e.g., AdaLoRA) are work in progress
- DE-NAS: automatically constructs compact and optimal delta layers in a train-free, hardware-aware mode (more details [here](https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Multi-Model-Hardware-Aware-Train-Free-Neural-Architecture-Search/post/1479863)); a minimal illustrative sketch follows this list
  - Step 1: generate the search space for delta layers
  - Step 2: the search algorithm populates delta layers for the LM
  - Step 3: a train-free score evaluates the LM with adaptive delta layers
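
To make the three steps concrete, here is a minimal, hypothetical sketch (not the actual DE-NAS implementation): it samples per-layer LoRA ranks from a search space, filters candidates with a parameter budget as a stand-in for hardware awareness, and ranks them with a toy train-free proxy score. All constants and the scoring function are illustrative assumptions only.

```python
# Hypothetical sketch of the three DE-NAS steps; not the deltatuner code.
import random

NUM_LAYERS = 32                      # assumed number of transformer blocks
RANK_CHOICES = [0, 4, 8, 16]         # step 1: search space of per-layer delta ranks
HIDDEN_SIZE = 4096                   # assumed hidden dimension
PARAM_BUDGET = 2_000_000             # hardware-aware constraint (assumed)

def sample_structure():
    """Step 2: populate one candidate delta structure for the LM."""
    return [random.choice(RANK_CHOICES) for _ in range(NUM_LAYERS)]

def param_count(structure):
    # each LoRA adapter adds roughly 2 * hidden_size * rank parameters per layer
    return sum(2 * HIDDEN_SIZE * r for r in structure)

def train_free_score(structure):
    """Step 3: stand-in for a train-free proxy score (the real DE-NAS score
    differs); here a toy heuristic that rewards spread-out capacity."""
    return sum(r ** 0.5 for r in structure)

best, best_score = None, float("-inf")
for _ in range(200):                           # fixed search budget
    candidate = sample_structure()
    if param_count(candidate) > PARAM_BUDGET:  # respect the hardware budget
        continue
    score = train_free_score(candidate)
    if score > best_score:
        best, best_score = candidate, score

print("best per-layer delta ranks:", best)
```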
### Features
- Easy to use: delivered as a pip package; only a few lines need to be added to the original code
- Auto-tuning: automatically selects the best algorithm and delta structure for the model being fine-tuned
### Values
- Saves compute: reduces the compute power and time required to fine-tune a model by shrinking the trainable parameter size and memory footprint.
- Preserves accuracy: delivers the same accuracy, with no regression.
## Get Started
### Installation
- Install the Python package:
```shell
pip install e2eAIOK-deltatuner
```
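
Optionally, verify the install by importing the module (the import name `deltatuner` matches the example below):
```shell
python -c "import deltatuner"
```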
### Fast Fine-tuning on Base models
Below is an example of optimizing the [MPT](https://huggingface.co/mosaicml/mpt-7b) model by adding a few lines that enable the Deltatuner optimizations. It uses DE-NAS inside Deltatuner to turn an LLM with LoRA layers into an LLM with compact LoRA layers, improving the fine-tuning process through peak-memory reduction and time speedup.
```diff
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
+ from deltatuner import deltatuner, deltatuner_args
# import model from huggingface
model_id = "mosaicml/mpt-7b"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# adding the lora components with peft
config = LoraConfig()
lora_model = get_peft_model(model, config)
# deltatuner optimizes the model with the best lora layer configuration
+ deltatuning_args = deltatuner_args.DeltaTunerArguments()
+ deltatuner_model = deltatuner.optimize(model=lora_model, tokenizer=tokenizer, deltatuning_args=deltatuning_args)
...
```
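
The returned `deltatuner_model` can then be trained like any other PEFT model. Below is a hedged sketch of one way to continue, assuming a user-prepared tokenized dataset `train_dataset` (not defined in the snippet above) and a standard `transformers.Trainer` setup:

```python
# Hypothetical continuation of the snippet above: fine-tune the optimized model
# with the standard Hugging Face Trainer. `deltatuner_model` and `tokenizer`
# come from the snippet; `train_dataset` is an assumed tokenized dataset.
from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling

training_args = TrainingArguments(
    output_dir="./mpt-7b-deltatuner",    # placeholder output path
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = Trainer(
    model=deltatuner_model,              # model returned by deltatuner.optimize
    args=training_args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```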
### API reference
In the example above, `deltatuner.optimize` is a Python function that applies the Deltatuner-supported optimization algorithms to the model.
```python
def optimize(model, tokenizer, algo: str="auto", deltatuning_args: DeltaTunerArguments=None, **kwargs) -> DeltaTunerModel:
'''
Parameters:
        - model - a PreTrainedModel or LoraModel. Specifies which model should be optimized
        - tokenizer - a tokenizer for text preprocessing
        - deltatuning_args (optional) – the deltatuner configuration
            - deltatuning_args.denas - whether to use DE-NAS in the optimization (default: True)
            - deltatuning_args.algo - specifies the adapter algorithm (default: "auto")
                - "auto" – if the input model is MPT, the algorithm is ssf; otherwise it is lora
                - "lora" – use the lora algorithm
                - "ssf" – use the ssf algorithm
            - deltatuning_args.best_model_structure - specifies a pre-searched best delta structure so the model can be initialized directly without searching
        - kwargs - used to initialize deltatuning_args through key=value pairs, e.g. algo="lora"
    Return
        DeltaTunerModel - a wrapper of the model that combines the original properties/functions with the advanced properties/functions provided by deltatuner
'''
```
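
For example, the adapter algorithm can be selected explicitly instead of relying on `"auto"`. A minimal sketch, assuming `lora_model` and `tokenizer` from the fine-tuning example above:

```python
from deltatuner import deltatuner, deltatuner_args

# Explicit configuration through DeltaTunerArguments
args = deltatuner_args.DeltaTunerArguments()
args.algo = "lora"        # force the lora algorithm instead of "auto"
args.denas = True         # keep the DE-NAS delta-structure search enabled
opt_model = deltatuner.optimize(model=lora_model, tokenizer=tokenizer, deltatuning_args=args)

# Equivalent shortcut: kwargs are used to initialize deltatuning_args
opt_model = deltatuner.optimize(model=lora_model, tokenizer=tokenizer, algo="lora")
```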
### Detailed examples
Please refer to [example page](https://github.com/intel/e2eAIOK/tree/main/example) for more use cases on fine-tuning other LLMs with the help of DeltaTuner.
## Model supported matrix
We have uploaded the searched best delta structures to the [conf dir](https://github.com/intel/e2eAIOK/tree/main/e2eAIOK/deltatuner/deltatuner/conf/best_structure), so users can fine-tune directly with a pre-searched structure by passing `DeltaTunerArguments.best_model_structure` to the `deltatuner.optimize` function.
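
For example, a hedged sketch of using a pre-searched structure (the file path below is a placeholder; pick the file from the conf dir that matches your model):

```python
from deltatuner import deltatuner, deltatuner_args

# Skip the DE-NAS search by pointing at a pre-searched structure downloaded
# from the conf dir (placeholder path; adjust to your model).
args = deltatuner_args.DeltaTunerArguments()
args.best_model_structure = "path/to/best_structure_for_your_model"
deltatuner_model = deltatuner.optimize(model=lora_model, tokenizer=tokenizer, deltatuning_args=args)
```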
### Causal Language Modeling
| Model | LoRA | SSF |
|--------------| ---- | ---- |
| GPT-2 | ✅ | |
| GPT-J | ✅ | ✅ |
| Bloom | ✅ | ✅ |
| OPT | ✅ | ✅ |
| GPT-Neo | ✅ | ✅ |
| Falcon | ✅ | ✅ |
| LLaMA | ✅ | ✅ |
| MPT | ✅ | ✅ |