# FMS Acceleration Framework Library
This contains the library code that implements the acceleration plugin framework, in particular the classes:
- `AccelerationFramework`
- `AccelerationPlugin`
The library is envisioned to:
- Provide a single integration point into [Huggingface](https://github.com/huggingface/transformers).
- Manage `AccelerationPlugin` in a flexible manner.
- Load plugins from a single configuration YAML, while enforcing compatibility rules on how plugins can be combined.
See the following resources:
- Instructions for [running acceleration framework with `fms-hf-tuning`](https://github.com/foundation-model-stack/fms-hf-tuning)
- [Sample plugin YAML configurations](../../sample-configurations) for important accelerations.
## Using AccelerationFramework with HF Trainer
Begin by instantiating an `AccelerationFramework` object, passing a YAML configuration (say via a `path_to_config`):
```python
from fms_acceleration import AccelerationFramework
framework = AccelerationFramework(path_to_config)
```
Plugins are configured automatically based on this YAML configuration; for more details on how plugins are configured, [see below](#configuration-of-plugins).
Some plugins may require a custom model loader (in place of the typical `AutoModel.from_pretrained`). In this case, call `framework.model_loader`:
```python
model = framework.model_loader(model_name_or_path, ...)
```
For example, in the GPTQ case (see the [sample GPTQ QLoRA configuration](../../sample-configurations/accelerated-peft-autogptq-sample-configuration.yaml)), `model_name_or_path` must be custom loaded from a quantized checkpoint.
We provide a flag `framework.requires_custom_loading` to check if plugins require custom loading.
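For example, a minimal sketch that gates the load on this flag (the fallback to `AutoModel.from_pretrained` and the `model_name_or_path` variable are placeholders for whatever loading path you would otherwise use):
```python
from transformers import AutoModel

# use the framework's loader only when an installed plugin asks for it;
# otherwise fall back to the usual Huggingface loading path
if framework.requires_custom_loading:
    model = framework.model_loader(model_name_or_path)
else:
    model = AutoModel.from_pretrained(model_name_or_path)
```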
Also, some plugins will require the model to be augmented, e.g., by replacing layers with plugin-compliant PEFT adapters. In this case:
```python
# will also take in some other configs that may affect augmentation
# some of these args may be modified due to the augmentation
# e.g., peft_config will be consumed in augmentation, and returned as None
# to prevent SFTTrainer from doing extraneous PEFT logic
model, (peft_config,) = framework.augmentation(
    model,
    train_args, modifiable_args=(peft_config,),
)
```
We also provide `framework.requires_agumentation` to check if augmentation is required by the plugins.
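For example, a minimal sketch that gates augmentation on this flag (the flag and method names are copied verbatim from above; `train_args` and `peft_config` are assumed to be your training arguments and PEFT config objects):
```python
# only augment when an installed plugin requires it; note that
# peft_config may be consumed and returned as None (see above)
if framework.requires_agumentation:
    model, (peft_config,) = framework.augmentation(
        model, train_args, modifiable_args=(peft_config,),
    )
```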
Finally, pass the model to the trainer and train:
```python
# e.g. using transformers.Trainer. Pass in model (with training enhancements)
trainer = Trainer(model, ...)
# call train
trainer.train()
```
That's all! The model will now reap all the acceleration speedups provided by the plugins that were installed!
## Configuration of Plugins
Each [package](#packages) in this monorepo:
- can be *independently installed*. Install only the libraries you need:
  ```shell
  pip install fms-acceleration/plugins/accelerated-peft
  pip install fms-acceleration/plugins/fused-ops-and-kernels
  ```
- can be *independently configured*. Each plugin is registered under a particular configuration path. E.g., the [autogptq plugin](libs/peft/src/fms_acceleration_peft/framework_plugin_autogptq.py) is registered under the config path `peft.quantization.auto_gptq`.
  ```python
  AccelerationPlugin.register_plugin(
      AutoGPTQAccelerationPlugin,
      configuration_and_paths=["peft.quantization.auto_gptq"],
  )
  ```
  This means that it will be configured under that exact stanza:
  ```yaml
  plugins:
    peft:
      quantization:
        auto_gptq:
          # everything under here will be passed to plugin
          # when instantiating
          ...
  ```
- When instantiating `fms_acceleration.AccelerationFramework`, it internally parses through the configuration stanzas. For plugins that are installed, it will instantiate them; for those that are not, it will simply *passthrough* (see the example after this list).
- `AccelerationFramework` will manage plugins transparently for the user. The user only needs to call `AccelerationFramework.model_loader` and `AccelerationFramework.augmentation`.
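To illustrate the passthrough behaviour, consider a hypothetical configuration that combines the registered `peft.quantization.auto_gptq` stanza with a stanza for a plugin that is not installed; the `some_other_plugin` key below is purely illustrative:
```yaml
plugins:
  peft:
    quantization:
      auto_gptq:
        ...           # consumed by the autogptq plugin, if installed
  some_other_plugin:  # hypothetical stanza: if no installed plugin is
    ...               # registered under this path, it is passed through
```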
## Adding New Plugins
To add new plugins:
1. Create an appropriately `pip`-packaged plugin in `plugins`; the package needs to be named like `fms-acceleration-<postfix>` (a minimal plugin sketch is shown after this list).
2. For the `framework` to properly load and manage the plugin, add the package `<postfix>` to [constants.py](./src/fms_acceleration/constants.py):
    ```python
    PLUGINS = [
        "peft",
        "foak",
        "<postfix>",
    ]
    ```
3. Create a sample template YAML file inside `<PLUGIN_DIR>/configs` to demonstrate how to configure the plugin. As an example, reference the [sample config for accelerated peft](../accelerated-peft/configs/autogptq.yaml).
4. Update [generate_sample_configurations.py](../../scripts/generate_sample_configurations.py) and run `tox -e gen-configs` in the top-level directory to generate the sample configurations.
    ```python
    KEY_AUTO_GPTQ = "auto_gptq"
    KEY_BNB_NF4 = "bnb-nf4"
    PLUGIN_A = "<NEW PLUGIN NAME>"

    CONFIGURATIONS = {
        KEY_AUTO_GPTQ: "plugins/accelerated-peft/configs/autogptq.yaml",
        KEY_BNB_NF4: (
            "plugins/accelerated-peft/configs/bnb.yaml",
            [("peft.quantization.bitsandbytes.quant_type", "nf4")],
        ),
        PLUGIN_A: (
            "plugins/<plugin>/configs/plugin_config.yaml",
            [
                (<1st field in plugin_config.yaml>, <value>),
                (<2nd field in plugin_config.yaml>, <value>),
            ],
        ),
    }

    # Passing a tuple of configuration keys will combine the templates together
    COMBINATIONS = [
        ("accelerated-peft-autogptq", (KEY_AUTO_GPTQ,)),
        ("accelerated-peft-bnb-nf4", (KEY_BNB_NF4,)),
        (<"combined name with your plugin">, (KEY_AUTO_GPTQ, PLUGIN_A)),
        (<"combined name with your plugin">, (KEY_BNB_NF4, PLUGIN_A)),
    ]
    ```
5. After the sample configuration is generated by `tox -e gen-configs`, update [CONTENTS.yaml](../../sample-configurations/CONTENTS.yaml) with the shortname and the full path of the configuration.
6. Update the [scenarios YAML](../../scripts/benchmarks/scenarios.yaml) to configure the benchmark scenarios that will be triggered when running `tox -e run-benches` in the top-level directory.
7. Update the [top-level tox.ini](../../tox.ini) to install the plugin for `run-benches`.
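For step 1, the following is a minimal, illustrative sketch of what a new plugin might look like. The import path for `AccelerationPlugin` and the hook names are assumptions here (they mirror the `model_loader` / `augmentation` calls shown earlier); consult an existing plugin such as `framework_plugin_autogptq.py` for the authoritative interface.
```python
from fms_acceleration import AccelerationPlugin


class MyAccelerationPlugin(AccelerationPlugin):
    # assumption: plugins expose hooks mirroring the framework calls above

    def model_loader(self, model_name_or_path, **kwargs):
        # load the model in whatever custom way the acceleration needs
        ...

    def augmentation(self, model, train_args, modifiable_args):
        # modify the model (e.g. swap layers) and/or consume some of the
        # modifiable args, then hand both back to the framework
        return model, modifiable_args


# register the plugin under the configuration stanza it should respond to
# (the path below is a placeholder)
AccelerationPlugin.register_plugin(
    MyAccelerationPlugin,
    configuration_and_paths=["<your>.<config>.<path>"],
)
```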