fms-acceleration-aadp


Name: fms-acceleration-aadp
Version: 0.1.1
Summary: FMS Acceleration Plugin for Attention and Distributed Packing Optimizations
Upload time: 2024-09-16 06:41:18
Requires Python: ~=3.9
License: Apache-2.0
Keywords: acceleration, fms-hf-tuning, multipack, padding-free

# FMS Acceleration for Attention And Distributed Packing Plugin

This library contains plugins to accelerate finetuning with the following optimizations:

1. Padding-Free Flash Attention Computation
2. Multipack Distributed Sampling


## Plugins

Plugin | Description | Depends | Loading | Augmentation | Callbacks
--|--|--|--|--|--
[padding_free](./src/fms_acceleration_aadp/framework_plugin_padding_free.py) | Padding-Free Flash Attention Computation | flash_attn | | ✅ | 
[multipack sampler](./src/fms_acceleration_aadp/framework_plugin_multipack.py) | Multipack Distributed Sampling | numba | | ✅ | 


## Native Transformers Support from v4.44.0
Transformers natively supports padding-free from v4.44.0 ([see here](https://github.com/huggingface/transformers/pull/31629)). The padding-free plugin uses the native Transformers implementation when the installed version is compatible;
for `transformers < v4.44.0` it falls back to an internal implementation.
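
The version gate described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the helper names `parse_version`, `pad_release`, and `select_implementation` are hypothetical, and only the `4.44.0` threshold comes from the text. Versions are compared as integer tuples because a plain string comparison would wrongly rank `4.9.0` above `4.44.0`.

```python
import re


def parse_version(text: str) -> tuple:
    """Return the leading numeric release components, e.g. '4.44.0.dev0' -> (4, 44, 0)."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def pad_release(release: tuple, width: int = 3) -> tuple:
    """Right-pad with zeros so that (4, 45) compares like (4, 45, 0)."""
    return release + (0,) * (width - len(release))


def select_implementation(installed: str, minimum: str = "4.44.0") -> str:
    """Hypothetical helper: pick 'native' or 'internal' from the installed version.

    Pre-release suffixes such as '.dev0' are ignored here, which is a
    simplification made for this sketch.
    """
    if pad_release(parse_version(installed)) >= pad_release(parse_version(minimum)):
        return "native"
    return "internal"


print(select_implementation("4.43.4"))
print(select_implementation("4.44.0"))
```

In a real environment the installed version string would typically come from `importlib.metadata.version("transformers")`.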

## Native TRL Support for PaddingFree with DataCollatorForCompletionOnlyLM from v0.10.1
With TRL >= v0.10.1, padding-free can be used with untokenized data. `DataCollatorForCompletionOnlyLM` flattens the inputs and adds `position_ids` to the batch
when the `padding_free` keyword is passed to the collator. The plugin uses the TRL implementation when the installed version is compatible;
for `trl < v0.10.1` it falls back to an internal implementation.

If a user passes in a pretokenized dataset, the plugin still uses `DataCollatorWithFlattening` in the `collate_fn`.
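
To make "flattening of inputs and addition of `position_ids`" concrete, here is a minimal pure-Python sketch of what a flattening collator produces for a pretokenized batch. It is an illustration under stated assumptions, not the collator shipped by Transformers, TRL, or this plugin: the function name `flatten_batch` is hypothetical, and masking the first label of each example with `-100` is an assumption made here so that no loss is computed across example boundaries.

```python
IGNORE_INDEX = -100  # label value conventionally ignored by cross-entropy loss


def flatten_batch(examples):
    """Concatenate tokenized examples into one sequence without padding.

    Each example is a dict with "input_ids" and, optionally, "labels".
    position_ids restart at 0 for every example, which is how the
    boundaries between examples stay recoverable after concatenation.
    """
    input_ids, position_ids, labels = [], [], []
    for example in examples:
        ids = example["input_ids"]
        if not ids:
            continue  # nothing to add for an empty example
        example_labels = example.get("labels", ids)
        input_ids.extend(ids)
        position_ids.extend(range(len(ids)))
        # Assumption: mask the first token of each example so the previous
        # example's last token is never trained to predict it.
        labels.append(IGNORE_INDEX)
        labels.extend(example_labels[1:])
    return {"input_ids": input_ids, "position_ids": position_ids, "labels": labels}


batch = flatten_batch([{"input_ids": [5, 6, 7]}, {"input_ids": [8, 9]}])
print(batch["input_ids"])     # [5, 6, 7, 8, 9]
print(batch["position_ids"])  # [0, 1, 2, 0, 1]
print(batch["labels"])        # [-100, 6, 7, -100, 9]
```

A padded batch of these two examples would hold six token slots, one of them padding; the flattened form holds exactly the five real tokens.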

## Running Benchmarks

To reproduce the benchmarks, run the following commands:

Reproduce [Padding Free on A100 80GB](scripts/benchmarks/refs_orca/a100_80gb_pf.csv)
`tox -e run-benches -- "1 2" "4 8" benchmark_outputs scenarios-orca.yaml "none"`

Reproduce [MultiPack on A100 80GB](scripts/benchmarks/refs_orca/a100_80gb_mp.csv)
`tox -e run-benches -- "2 4 8" "16 32 64" benchmark_outputs scenarios-orca.yaml "padding-free"`

## Known Issues

### Currently Only Supports Multipack with Padding-Free

The multipack plugin currently also requires the padding-free plugin to work.
This may change in the future if there is demand for multipack to work standalone without padding-free.
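
One way to see why the two features fit together is a small packing sketch. The code below is a generic first-fit-decreasing heuristic and is not the plugin's numba-based sampler; the function name `pack_by_length` and the `max_tokens` budget are hypothetical. It groups sequence indices so that each group stays within a token budget. Such groups have uneven sequence lengths, so they only save compute if they are then concatenated without padding.

```python
def pack_by_length(lengths, max_tokens):
    """Group sequence indices into bins whose total length fits max_tokens.

    Generic first-fit-decreasing heuristic, shown for illustration only.
    """
    # Visit the longest sequences first; they are the hardest to place.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    bins = []  # each entry is [remaining_capacity, [indices]]
    for idx in order:
        length = lengths[idx]
        if length > max_tokens:
            raise ValueError(f"sequence {idx} has {length} tokens, budget is {max_tokens}")
        for entry in bins:
            if entry[0] >= length:  # first bin with enough room
                entry[0] -= length
                entry[1].append(idx)
                break
        else:
            bins.append([max_tokens - length, [idx]])
    return [indices for _, indices in bins]


print(pack_by_length([7, 3, 5, 2, 4], max_tokens=8))  # [[0], [2, 1], [4, 3]]
```

Here the three bins hold 7, 8, and 6 tokens respectively, so no bin exceeds the budget of 8.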


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "fms-acceleration-aadp",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "~=3.9",
    "maintainer_email": null,
    "keywords": "acceleration, fms-hf-tuning, multipack, padding-free",
    "author": null,
    "author_email": "Fabian Lim <flim@sg.ibm.com>, Aaron Chew <aaron.chew1@ibm.com>",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "FMS Acceleration Plugin for Attention and Distributed Packing Optimizations",
    "version": "0.1.1",
    "project_urls": null,
    "split_keywords": [
        "acceleration",
        " fms-hf-tuning",
        " multipack",
        " padding-free"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f9091ea1428ab69d28550fe4e0077f56832b4230c445fdb02b5d01439076990f",
                "md5": "b4c485b1a5227a38ebc95c650e1bcc4e",
                "sha256": "f7cf38e5d93693d084306f59efde73dba4e3bd58982e89d32ef01ef523589c3a"
            },
            "downloads": -1,
            "filename": "fms_acceleration_aadp-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b4c485b1a5227a38ebc95c650e1bcc4e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "~=3.9",
            "size": 16457,
            "upload_time": "2024-09-16T06:41:18",
            "upload_time_iso_8601": "2024-09-16T06:41:18.597198Z",
            "url": "https://files.pythonhosted.org/packages/f9/09/1ea1428ab69d28550fe4e0077f56832b4230c445fdb02b5d01439076990f/fms_acceleration_aadp-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-16 06:41:18",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "fms-acceleration-aadp"
}
        