e2eAIOK-deltatuner

Name: e2eAIOK-deltatuner
Version: 1.1.10
Home page: https://github.com/intel/e2eAIOK/
Summary: Intel extension for peft with PyTorch and DENAS
Upload time: 2023-12-04 14:21:54
Author: Intel AIA
Requires Python: >=3.7
License: Apache-2.0
Keywords: deep learning, LLM, fine-tuning, PyTorch, peft, LoRA, NAS
# Deltatuner
Deltatuner is an extension for [Peft](https://github.com/huggingface/peft) that improves LLM fine-tuning speed through multiple optimizations: it leverages the compact model constructor [DE-NAS](https://github.com/intel/e2eAIOK/tree/main/e2eAIOK/DeNas) to construct/modify compact delta layers in a hardware-aware, train-free way, and it adds new delta-tuning algorithms.

## Introduction
<p align="center">
  <img width="90%" src="./doc/deltatuner.png">
</p>

### Key Components
- Supported parameter-efficient fine-tuning algorithms
  - [LoRA](https://arxiv.org/pdf/2106.09685.pdf) algorithm
  - Scaling and Shifting ([SSF](https://arxiv.org/abs/2210.08823)) algorithm: scales and shifts the deep features of a pre-trained model with far fewer parameters to approach the performance of full fine-tuning
  - More algorithms (e.g., AdaLoRA) are work in progress
- DE-NAS: automatically constructs compact and optimal delta layers in a train-free and hardware-aware way (more details [here](https://community.intel.com/t5/Blogs/Tech-Innovation/Artificial-Intelligence-AI/Multi-Model-Hardware-Aware-Train-Free-Neural-Architecture-Search/post/1479863)); a conceptual sketch follows this list
  - Step 1: Generate the search space for the delta layers
  - Step 2: A search algorithm populates the delta layers for the LM
  - Step 3: A train-free score evaluates the LM with the adaptive delta layers
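
The snippet below is a conceptual, self-contained sketch of that three-step flow. It is not the deltatuner implementation: the search space and the train-free scoring rule here are deliberately toy placeholders used only to illustrate the idea.

```python
import random

# Step 1 (toy): a search space describing candidate delta-layer structures
search_space = {"rank": [4, 8, 16], "num_delta_layers": [8, 16, 32]}

def train_free_score(structure):
    # Step 3 (toy): a real train-free score uses proxies such as expressivity
    # or saliency; here we simply prefer structures with fewer parameters
    return -(structure["rank"] * structure["num_delta_layers"])

def denas_search(search_space, population=20, iterations=10):
    """Pick a compact delta-layer structure without running any training."""
    best_structure, best_score = None, float("-inf")
    for _ in range(iterations):
        # Step 2: the search algorithm proposes candidate structures
        candidates = [
            {key: random.choice(values) for key, values in search_space.items()}
            for _ in range(population)
        ]
        for structure in candidates:
            score = train_free_score(structure)
            if score > best_score:
                best_structure, best_score = structure, score
    return best_structure

print(denas_search(search_space))
```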

### Features
- Easy to use: shipped as a pip package; only a few lines of code need to be added to the original fine-tuning script
- Auto-tuning: automatically selects the best algorithm and delta structure for the model being fine-tuned

### Values
- Save compute: reduces the compute and time required to fine-tune a model by shrinking the trainable parameter size and memory footprint.
- Preserve accuracy: ensures the same accuracy, with no regression.

## Get Started

### Installation
- Install the Python package:
```shell
pip install e2eAIOK-deltatuner
```
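
To confirm the installation, a quick smoke test is to import the two modules used in the fine-tuning example below and construct a default configuration object:

```python
# Smoke test: import the modules used in the fine-tuning example below
from deltatuner import deltatuner, deltatuner_args

# DeltaTunerArguments holds the deltatuner configuration (see the API reference)
print(deltatuner_args.DeltaTunerArguments())
```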

### Fast Fine-tuning on Base models
Below is an example of optimizing the [MPT](https://huggingface.co/mosaicml/mpt-7b) model by adding a few lines to enable the deltatuner optimizations. It uses DE-NAS in deltatuner to turn an LLM with LoRA layers into an LLM with compact LoRA layers, efficiently improving the LLM fine-tuning process in terms of peak memory reduction and time speedup.

```diff
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
+ from deltatuner import deltatuner, deltatuner_args

# import model from huggingface
model_id = "mosaicml/mpt-7b"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# add the LoRA components with peft
config = LoraConfig()
lora_model = get_peft_model(model, config) 
# deltatuner optimizes the model with the best LoRA layer configuration
+ deltatuning_args = deltatuner_args.DeltaTunerArguments()
+ deltatuner_model = deltatuner.optimize(model=lora_model, tokenizer=tokenizer, deltatuning_args=deltatuning_args)
...
```
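
After the optimization step, the returned model can be fine-tuned as usual. The following is a minimal sketch that assumes the `DeltaTunerModel` wrapper is trained like any other Peft/Transformers model with the standard `transformers` `Trainer`; `train_dataset` is a placeholder for your own tokenized dataset.

```python
from transformers import Trainer, TrainingArguments

# Continue from the snippet above and train the optimized model as usual.
# `train_dataset` is a placeholder for your own tokenized dataset.
training_args = TrainingArguments(
    output_dir="./mpt-7b-deltatuner",
    per_device_train_batch_size=1,
    num_train_epochs=1,
    logging_steps=10,
)
trainer = Trainer(
    model=deltatuner_model,
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()
```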

### API reference
In the example above, `deltatuner.optimize` is a Python function that applies the deltatuner-supported optimization algorithms to the model.
```python
def optimize(model, tokenizer, algo: str="auto", deltatuning_args: DeltaTunerArguments=None, **kwargs) -> DeltaTunerModel:
    '''
    Parameters:
        - model - a PreTrainedModel or LoraModel; specifies the model to be optimized
        - tokenizer - a tokenizer used to preprocess text
        - deltatuning_args (optional) - the deltatuner configuration
          - deltatuning_args.denas - whether to use DE-NAS in the optimization (default: True)
          - deltatuning_args.algo - specifies the adapter algorithm (default: "auto")
            - "auto" - if the input model is MPT, the algorithm is ssf; otherwise it is lora
            - "lora" - use the LoRA algorithm
            - "ssf" - use the SSF algorithm
          - deltatuning_args.best_model_structure - specifies a pre-searched best delta structure so the model can be initialized directly without searching
        - kwargs - used to initialize deltatuning_args through key=value pairs, such as algo="lora"
    Return:
        DeltaTunerModel - a wrapper of the model, composed of the original properties/functions together with the advanced properties/functions provided by deltatuner
    '''
```
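
For example, based on the parameter list above, the adapter algorithm and the DE-NAS switch can be set either through `DeltaTunerArguments` or directly as keyword arguments (reusing `lora_model` and `tokenizer` from the earlier snippet):

```python
from deltatuner import deltatuner, deltatuner_args

# Option 1: configure through DeltaTunerArguments
args = deltatuner_args.DeltaTunerArguments()
args.algo = "ssf"    # "auto" (default), "lora" or "ssf"
args.denas = True    # use DE-NAS to search the delta structure (default: True)
deltatuner_model = deltatuner.optimize(model=lora_model, tokenizer=tokenizer, deltatuning_args=args)

# Option 2: pass the same settings as keyword arguments, e.g. algo="lora"
deltatuner_model = deltatuner.optimize(model=lora_model, tokenizer=tokenizer, algo="lora")
```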


### Detailed examples

Please refer to [example page](https://github.com/intel/e2eAIOK/tree/main/example) for more use cases on fine-tuning other LLMs with the help of DeltaTuner.

## Model supported matrix
We have uploaded the searched best delta structures to the [conf dir](https://github.com/intel/e2eAIOK/tree/main/e2eAIOK/deltatuner/deltatuner/conf/best_structure), so users can fine-tune directly with our searched structures by passing `DeltaTunerArguments.best_model_structure` to the `deltatuner.optimize` function.
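
For example (the structure file path below is illustrative; use the file from the conf dir that matches your model):

```python
from deltatuner import deltatuner, deltatuner_args

deltatuning_args = deltatuner_args.DeltaTunerArguments()
# Illustrative path: point this at the pre-searched structure file for your model
deltatuning_args.best_model_structure = "conf/best_structure/<your-model>.json"
deltatuner_model = deltatuner.optimize(
    model=lora_model, tokenizer=tokenizer, deltatuning_args=deltatuning_args
)
```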

### Causal Language Modeling

| Model        | LoRA | SSF  |
|--------------| ---- | ---- |
| GPT-2        | ✅  |  |
| GPT-J        | ✅  | ✅ |
| Bloom        | ✅  | ✅ |
| OPT          | ✅  | ✅ |
| GPT-Neo      | ✅  | ✅ |
| Falcon       | ✅  | ✅ |
| LLaMA        | ✅  | ✅ |
| MPT          | ✅  | ✅ |



            
