realhf


| Field | Value |
| --- | --- |
| Name | realhf |
| Version | 0.1.0.post2 |
| home_page | None |
| Summary | ReaL: Efficient RLHF Training of Large Language Models with Parameter Reallocation |
| upload_time | 2024-06-20 10:41:18 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | `<3.12,>=3.10` |
| license | None |
| keywords | distributed-systems, reinforcement-learning-from-human-feedback, large-language-models, llm-training |
| requirements | sphinx-nefertiti, sphinx, build, wheel, distro-info, python-debian, huggingface_hub, datasets, accelerate, ninja, matplotlib, ipython, megatron_core, deepspeed, h5py, nltk, sentencepiece, wandb, tensorboardx, blosc, colorama, colorlog, einops, hydra-core, matplotlib, numba, omegaconf, packaging, pandas, pybind11, numpy, psutil, pynvml, pytest, PyYAML, pyzmq, ray, redis, scipy, seaborn, setuptools, torch, tqdm, transformers |

<p align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="docs/source/images/real_logo_dark.svg">
    <img alt="ReaL" src="docs/source/images/real_logo.svg" width="55%">
  </picture>
</p>

<p align="center">
| <a href="https://openpsi-project.github.io/ReaLHF/"><b>Documentation</b></a> | <a href="https://openpsi-project.github.io/ReaLHF/"><b>Paper</b></a> |
</p>

<h1 align="center">
<em>ReaL</em>: Efficient RLHF Training for LLMs <br>with Parameter Reallocation
</h1>

***ReaL*** (short for *<ins>ReaL</ins>location*) is a distributed system designed for efficient RLHF training with LLMs.

ReaL introduces a novel approach called *parameter reallocation*, which dynamically redistributes LLM parameters across the cluster and adapts parallelization strategies during training. By optimizing allocations and parallelism for each computation workload, ReaL minimizes redundant communication while maximizing GPU utilization.
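To make the idea concrete, here is a minimal sketch of the bookkeeping behind per-workload layouts. It is an illustration only and not ReaL's API: the `Layout` class, the workload names, and the chosen degrees are all hypothetical, and the per-GPU estimate ignores ZeRO sharding. It shows that two workloads on the same 8 GPUs can use different data/tensor/pipeline degrees, which changes how many parameters each GPU holds and therefore requires moving parameters between workloads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical 3D-parallel layout: data, tensor, and pipeline degrees."""
    dp: int
    tp: int
    pp: int

    @property
    def world_size(self) -> int:
        # Number of GPUs the layout occupies.
        return self.dp * self.tp * self.pp

    def params_per_gpu(self, n_params: int) -> float:
        # Parameters are split across tensor and pipeline ranks and
        # replicated across data-parallel ranks (ZeRO sharding ignored).
        return n_params / (self.tp * self.pp)


def needs_reallocation(src: Layout, dst: Layout) -> bool:
    # Parameters must be redistributed whenever the layout changes.
    return src != dst


N_PARAMS = 7_000_000_000  # roughly a 7B-parameter model (assumption)

# Made-up layouts for two workloads sharing the same 8 GPUs.
plan = {
    "generate": Layout(dp=4, tp=2, pp=1),
    "train": Layout(dp=2, tp=2, pp=2),
}

for name, layout in plan.items():
    billions = layout.params_per_gpu(N_PARAMS) / 1e9
    print(f"{name}: {layout}, {billions:.2f}B parameters per GPU")
```

In a real system the layouts would be chosen by a search or cost model rather than written by hand; this sketch only shows what such a plan has to track.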

ReaL achieves significantly higher PPO training throughput than state-of-the-art open-source systems.

(In the following figure, the model size scales up with the number of GPUs: from LLaMA 7B through LLaMA 13B and CodeLLaMA 34B to LLaMA 70B, the largest.)

![Throughput Comparison](docs/source/images/vws.svg)

## Highlights

### Efficiency

- Achieves state-of-the-art training throughput for RLHF using **parameter reallocation**.
- Supports large-scale training with 3D parallelism, ZeRO optimization, and sequence parallelism.
- Enables memory-efficient training with parameter and optimizer offloading.
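As a rough worked example of why ZeRO matters at this scale, the sketch below estimates model-state memory per GPU for mixed-precision Adam training using the standard ZeRO accounting: 2 bytes per parameter for fp16 weights, 2 for fp16 gradients, and 12 for fp32 optimizer states. The function name is hypothetical, the 7B parameter count is an approximation, and the estimate excludes activations and temporary buffers. It does not describe ReaL's actual memory usage.

```python
def zero_bytes_per_gpu(n_params: int, n_gpus: int, stage: int) -> float:
    """Estimate model-state bytes per GPU under ZeRO stage 0 to 3."""
    params = 2 * n_params   # fp16 weights
    grads = 2 * n_params    # fp16 gradients
    optim = 12 * n_params   # fp32 weights, momentum, and variance (Adam)
    if stage >= 1:
        optim /= n_gpus     # stage 1 partitions optimizer states
    if stage >= 2:
        grads /= n_gpus     # stage 2 also partitions gradients
    if stage >= 3:
        params /= n_gpus    # stage 3 also partitions parameters
    return params + grads + optim


N_PARAMS = 7_000_000_000  # approximate size of a 7B model (assumption)

for stage in range(4):
    gigabytes = zero_bytes_per_gpu(N_PARAMS, n_gpus=8, stage=stage) / 1e9
    print(f"ZeRO stage {stage}: {gigabytes:.2f} GB of model states per GPU")
```

Offloading parameters or optimizer states to host memory reduces the GPU-resident share further, at the cost of extra transfers.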

### Ease of Use

- Seamlessly integrates with HuggingFace checkpoints and inference frameworks like vLLM.
- Allows launching local or distributed experiments with a single command.

Check out our [tutorial](https://openpsi-project.github.io/ReaLHF/quickstart.html) to reproduce the full RLHF procedure (SFT/RW/PPO) with 4×LLaMA-7B in just **30 minutes**.

### Flexibility

- Offers versatile configuration customization with Hydra structured config.
- Supports many commonly used RLHF algorithms, including DPO, PPO, RAFT, and more.
- Allows the addition of custom algorithms with fewer than 100 lines of code.

Refer to our [customization guide](https://openpsi-project.github.io/ReaLHF/customization.html) for hands-on examples.

## Getting Started

We provide pre-built [Docker images](https://openpsi-project.github.io/ReaLHF/install.html#docker-images) and [PyPI packages](https://openpsi-project.github.io/ReaLHF/install.html#install-from-pypi-or-source).

```bash
pip3 install realhf --no-build-isolation
```
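The package metadata declares `requires_python` as `<3.12,>=3.10`, so installation will fail on other interpreters. A minimal sketch for checking this up front, using only the standard library (the helper name `python_supported` is hypothetical):

```python
import sys

MIN_VERSION = (3, 10)  # inclusive lower bound from requires_python
MAX_VERSION = (3, 12)  # exclusive upper bound from requires_python


def python_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version satisfies >=3.10,<3.12."""
    return MIN_VERSION <= tuple(version_info[:2]) < MAX_VERSION


if __name__ == "__main__":
    status = "supported" if python_supported() else "not supported"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```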

For detailed information, please visit our [documentation site](https://openpsi-project.github.io/ReaLHF/).

- [Quickstart](https://openpsi-project.github.io/ReaLHF/quickstart.html)

- [Experiment Configurations](https://openpsi-project.github.io/ReaLHF/expconfig.html)

- [Code Architecture](https://openpsi-project.github.io/ReaLHF/arch.html)

- [Contributing](https://openpsi-project.github.io/ReaLHF/contributing.html)

## Acknowledgement

We would like to thank the authors of our paper, as well as the following individuals: Shusheng Xu and Jiaxuan Gao from Tsinghua University, and Weilin Liu, Wenjie Ye, and Chuyi He from OpenPsi Inc. They thoroughly tested and used ReaL in their research and provided valuable suggestions that greatly improved the system.

## Citation

If you find our system useful for your research or production, please cite our paper.

            

## Raw data

{
    "_id": null,
    "home_page": null,
    "name": "realhf",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.12,>=3.10",
    "maintainer_email": "Zhiyu Mei <meizy20@mails.tsinghua.edu.cn>, Wei Fu <fuwth17@gmail.com>",
    "keywords": "distributed-systems, reinforcement-learning-from-human-feedback, large-language-models, llm-training",
    "author": null,
    "author_email": "Zhiyu Mei <meizy20@mails.tsinghua.edu.cn>, Wei Fu <fuwth17@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/af/b4/01a29534999c699f25b16cca13ffc6c300964ff59177e6426cbb5d07f671/realhf-0.1.0.post2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "ReaL: Efficient RLHF Training of Large Language Models with Parameter Reallocation",
    "version": "0.1.0.post2",
    "project_urls": {
        "Documentation": "https://openpsi-project.github.io/ReaLHF/",
        "Homepage": "https://github.com/openpsi-project/ReaLHF",
        "Issues": "https://github.com/openpsi-project/ReaLHF/issues",
        "Repository": "https://github.com/openpsi-project/ReaLHF"
    },
    "split_keywords": [
        "distributed-systems",
        " reinforcement-learning-from-human-feedback",
        " large-language-models",
        " llm-training"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "afb401a29534999c699f25b16cca13ffc6c300964ff59177e6426cbb5d07f671",
                "md5": "b3284c95922bd9ae3c369642babbe24a",
                "sha256": "8fcb3a7ae033c4a4dec9ed290f32dedfc44fd2de25b31be6e316c0c6409452b9"
            },
            "downloads": -1,
            "filename": "realhf-0.1.0.post2.tar.gz",
            "has_sig": false,
            "md5_digest": "b3284c95922bd9ae3c369642babbe24a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.12,>=3.10",
            "size": 299399,
            "upload_time": "2024-06-20T10:41:18",
            "upload_time_iso_8601": "2024-06-20T10:41:18.808420Z",
            "url": "https://files.pythonhosted.org/packages/af/b4/01a29534999c699f25b16cca13ffc6c300964ff59177e6426cbb5d07f671/realhf-0.1.0.post2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-20 10:41:18",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "openpsi-project",
    "github_project": "ReaLHF",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "sphinx-nefertiti",
            "specs": []
        },
        {
            "name": "sphinx",
            "specs": []
        },
        {
            "name": "build",
            "specs": [
                [
                    ">=",
                    "1.2.1"
                ]
            ]
        },
        {
            "name": "wheel",
            "specs": [
                [
                    ">=",
                    "0.43.0"
                ]
            ]
        },
        {
            "name": "distro-info",
            "specs": [
                [
                    ">=",
                    "1.0"
                ]
            ]
        },
        {
            "name": "python-debian",
            "specs": [
                [
                    ">=",
                    "0.1.49"
                ]
            ]
        },
        {
            "name": "huggingface_hub",
            "specs": []
        },
        {
            "name": "datasets",
            "specs": []
        },
        {
            "name": "accelerate",
            "specs": []
        },
        {
            "name": "ninja",
            "specs": []
        },
        {
            "name": "matplotlib",
            "specs": []
        },
        {
            "name": "ipython",
            "specs": []
        },
        {
            "name": "megatron_core",
            "specs": [
                [
                    "==",
                    "0.6.0"
                ]
            ]
        },
        {
            "name": "deepspeed",
            "specs": [
                [
                    "==",
                    "0.14.0"
                ]
            ]
        },
        {
            "name": "h5py",
            "specs": []
        },
        {
            "name": "nltk",
            "specs": []
        },
        {
            "name": "sentencepiece",
            "specs": []
        },
        {
            "name": "wandb",
            "specs": []
        },
        {
            "name": "tensorboardx",
            "specs": []
        },
        {
            "name": "blosc",
            "specs": []
        },
        {
            "name": "colorama",
            "specs": []
        },
        {
            "name": "colorlog",
            "specs": []
        },
        {
            "name": "einops",
            "specs": []
        },
        {
            "name": "hydra-core",
            "specs": []
        },
        {
            "name": "matplotlib",
            "specs": []
        },
        {
            "name": "numba",
            "specs": []
        },
        {
            "name": "omegaconf",
            "specs": []
        },
        {
            "name": "packaging",
            "specs": []
        },
        {
            "name": "pandas",
            "specs": []
        },
        {
            "name": "pybind11",
            "specs": [
                [
                    ">=",
                    "2.10.0"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "<",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "psutil",
            "specs": []
        },
        {
            "name": "pynvml",
            "specs": []
        },
        {
            "name": "pytest",
            "specs": []
        },
        {
            "name": "PyYAML",
            "specs": []
        },
        {
            "name": "pyzmq",
            "specs": []
        },
        {
            "name": "ray",
            "specs": []
        },
        {
            "name": "redis",
            "specs": []
        },
        {
            "name": "scipy",
            "specs": []
        },
        {
            "name": "seaborn",
            "specs": []
        },
        {
            "name": "setuptools",
            "specs": [
                [
                    ">=",
                    "61.0"
                ]
            ]
        },
        {
            "name": "torch",
            "specs": []
        },
        {
            "name": "tqdm",
            "specs": []
        },
        {
            "name": "transformers",
            "specs": [
                [
                    "==",
                    "4.39.3"
                ]
            ]
        }
    ],
    "lcname": "realhf"
}
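The `requirements` array above stores each dependency as a name plus a list of `[operator, version]` pairs. A small sketch for turning that structure into pip-style requirement strings follows; the function name `format_requirements` is hypothetical, and the sample entries are copied from the record above.

```python
import json


def format_requirements(requirements):
    """Convert [{"name": ..., "specs": [[op, version], ...]}] to pip-style strings."""
    lines = []
    for req in requirements:
        # Join multiple specifiers with commas, e.g. ">=1.0,<2.0".
        spec = ",".join(f"{op}{version}" for op, version in req["specs"])
        lines.append(f"{req['name']}{spec}")
    return lines


# Four entries taken from the raw data above.
sample = json.loads("""
[
    {"name": "sphinx", "specs": []},
    {"name": "build", "specs": [[">=", "1.2.1"]]},
    {"name": "numpy", "specs": [["<", "2.0.0"]]},
    {"name": "transformers", "specs": [["==", "4.39.3"]]}
]
""")

for line in format_requirements(sample):
    print(line)
```

Note that several dependencies (`megatron_core`, `deepspeed`, `transformers`) are pinned to exact versions, so installing into a fresh environment avoids conflicts with other packages.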
        