# FMS Acceleration Framework Library
This contains the library code that implements the acceleration plugin framework, in particular the classes:
- `AccelerationFramework`
- `AccelerationPlugin`
The library is envisioned to:
- Provide a single integration point into [Huggingface](https://github.com/huggingface/transformers).
- Manage `AccelerationPlugin` in a flexible manner.
- Load plugins from a single configuration YAML, while enforcing compatibility rules on how plugins can be combined.
See the following resources:
- Instructions for [running acceleration framework with `fms-hf-tuning`](https://github.com/foundation-model-stack/fms-hf-tuning)
- [Sample plugin YAML configurations](../../sample-configurations) for important accelerations.
## Using AccelerationFramework with HF Trainer
Begin by instantiating an `AccelerationFramework` object, passing a YAML configuration (say via a `path_to_config`):
```python
from fms_acceleration import AccelerationFramework
framework = AccelerationFramework(path_to_config)
```
Plugins are configured automatically based on this YAML configuration; for more details on how plugins are configured, [see below](#configuration-of-plugins).
Some plugins may require a custom model loader (in place of the typical `AutoModel.from_pretrained`). In this case, call `framework.model_loader`:
```python
model = framework.model_loader(model_name_or_path, ...)
```
For example, in the GPTQ case (see the [sample GPTQ QLoRA configuration](../../sample-configurations/accelerated-peft-autogptq-sample-configuration.yaml)), `model_name_or_path` must be custom loaded from a quantized checkpoint.
We provide a flag `framework.requires_custom_loading` to check if plugins require custom loading.
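For example, a minimal sketch that gates the load on this flag (the fallback to `AutoModel.from_pretrained` and the `model_name_or_path` variable are placeholders for whatever loading path you would otherwise use):
```python
from transformers import AutoModel

# use the framework's loader only when an installed plugin asks for it;
# otherwise fall back to the usual Huggingface loading path
if framework.requires_custom_loading:
    model = framework.model_loader(model_name_or_path)
else:
    model = AutoModel.from_pretrained(model_name_or_path)
```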
Also, some plugins will require the model to be augmented, e.g., by replacing layers with plugin-compliant PEFT adapters. In this case:
```python
# will also take in some other configs that may affect augmentation
# some of these args may be modified due to the augmentation
# e.g., peft_config will be consumed in augmentation, and returned as None
# to prevent SFTTrainer from doing extraneous PEFT logic
model, (peft_config,) = framework.augmentation(
    model,
    train_args, modifiable_args=(peft_config,),
)
```
We also provide `framework.requires_agumentation` to check if augmentation is required by the plugins.
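For example, a minimal sketch that gates augmentation on this flag (the flag and method names are copied verbatim from above; `train_args` and `peft_config` are assumed to be your training arguments and PEFT config objects):
```python
# only augment when an installed plugin requires it; note that
# peft_config may be consumed and returned as None (see above)
if framework.requires_agumentation:
    model, (peft_config,) = framework.augmentation(
        model, train_args, modifiable_args=(peft_config,),
    )
```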
Finally, pass the model to the trainer and train:
```python
# e.g. using transformers.Trainer. Pass in model (with training enhancements)
trainer = Trainer(model, ...)
# call train
trainer.train()
```
That's all! The model will now reap all the acceleration speedups provided by the plugins that were installed!
## Configuration of Plugins
Each [package](#packages) in this monorepo:
- can be *independently installed*. Install only the libraries you need:
  ```shell
  pip install fms-acceleration/plugins/accelerated-peft
  pip install fms-acceleration/plugins/fused-ops-and-kernels
  ```
- can be *independently configured*. Each plugin is registered under a particular configuration path. E.g., the [autogptq plugin](libs/peft/src/fms_acceleration_peft/framework_plugin_autogptq.py) is registered under the config path `peft.quantization.auto_gptq`.
  ```python
  AccelerationPlugin.register_plugin(
      AutoGPTQAccelerationPlugin,
      configuration_and_paths=["peft.quantization.auto_gptq"],
  )
  ```
  This means that it will be configured under that exact stanza:
  ```yaml
  plugins:
    peft:
      quantization:
        auto_gptq:
          # everything under here will be passed to plugin
          # when instantiating
          ...
  ```
- When instantiating `fms_acceleration.AccelerationFramework`, it internally parses through the configuration stanzas. For plugins that are installed, it will instantiate them; for those that are not, it will simply *passthrough* (see the example after this list).
- `AccelerationFramework` will manage plugins transparently for the user. The user only needs to call `AccelerationFramework.model_loader` and `AccelerationFramework.augmentation`.
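To illustrate the passthrough behaviour, consider a hypothetical configuration that combines the registered `peft.quantization.auto_gptq` stanza with a stanza for a plugin that is not installed; the `some_other_plugin` key below is purely illustrative:
```yaml
plugins:
  peft:
    quantization:
      auto_gptq:
        ...           # consumed by the autogptq plugin, if installed
  some_other_plugin:  # hypothetical stanza: if no installed plugin is
    ...               # registered under this path, it is passed through
```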
## Adding New Plugins
To add new plugins:
1. Create an appropriately `pip`-packaged plugin in `plugins`; the package needs to be named like `fms-acceleration-<postfix>` (a minimal plugin sketch is shown after this list).
2. For the `framework` to properly load and manage the plugin, add the package `<postfix>` to [constants.py](./src/fms_acceleration/constants.py):
    ```python
    PLUGINS = [
        "peft",
        "foak",
        "<postfix>",
    ]
    ```
3. Create a sample template YAML file inside `<PLUGIN_DIR>/configs` to demonstrate how to configure the plugin. As an example, reference the [sample config for accelerated peft](../accelerated-peft/configs/autogptq.yaml).
4. Update [generate_sample_configurations.py](../../scripts/generate_sample_configurations.py) and run `tox -e gen-configs` in the top-level directory to generate the sample configurations.
    ```python
    KEY_AUTO_GPTQ = "auto_gptq"
    KEY_BNB_NF4 = "bnb-nf4"
    PLUGIN_A = "<NEW PLUGIN NAME>"

    CONFIGURATIONS = {
        KEY_AUTO_GPTQ: "plugins/accelerated-peft/configs/autogptq.yaml",
        KEY_BNB_NF4: (
            "plugins/accelerated-peft/configs/bnb.yaml",
            [("peft.quantization.bitsandbytes.quant_type", "nf4")],
        ),
        PLUGIN_A: (
            "plugins/<plugin>/configs/plugin_config.yaml",
            [
                (<1st field in plugin_config.yaml>, <value>),
                (<2nd field in plugin_config.yaml>, <value>),
            ],
        ),
    }

    # Passing a tuple of configuration keys will combine the templates together
    COMBINATIONS = [
        ("accelerated-peft-autogptq", (KEY_AUTO_GPTQ,)),
        ("accelerated-peft-bnb-nf4", (KEY_BNB_NF4,)),
        (<"combined name with your plugin">, (KEY_AUTO_GPTQ, PLUGIN_A)),
        (<"combined name with your plugin">, (KEY_BNB_NF4, PLUGIN_A)),
    ]
    ```
5. After the sample configuration is generated by `tox -e gen-configs`, update [CONTENTS.yaml](../../sample-configurations/CONTENTS.yaml) with the shortname and the full path of the configuration.
6. Update the [scenarios YAML](../../scripts/benchmarks/scenarios.yaml) to configure the benchmark scenarios that will be triggered when running `tox -e run-benches` in the top-level directory.
7. Update the [top-level tox.ini](../../tox.ini) to install the plugin for `run-benches`.
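For step 1, the following is a minimal, illustrative sketch of what a new plugin might look like. The import path for `AccelerationPlugin` and the hook names are assumptions here (they mirror the `model_loader` / `augmentation` calls shown earlier); consult an existing plugin such as `framework_plugin_autogptq.py` for the authoritative interface.
```python
from fms_acceleration import AccelerationPlugin


class MyAccelerationPlugin(AccelerationPlugin):
    # assumption: plugins expose hooks mirroring the framework calls above

    def model_loader(self, model_name_or_path, **kwargs):
        # load the model in whatever custom way the acceleration needs
        ...

    def augmentation(self, model, train_args, modifiable_args):
        # modify the model (e.g. swap layers) and/or consume some of the
        # modifiable args, then hand both back to the framework
        return model, modifiable_args


# register the plugin under the configuration stanza it should respond to
# (the path below is a placeholder)
AccelerationPlugin.register_plugin(
    MyAccelerationPlugin,
    configuration_and_paths=["<your>.<config>.<path>"],
)
```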