nemo-aligner


* Name: nemo-aligner
* Version: 0.2.0
* Home page: https://github.com/NVIDIA/NeMo-Aligner
* Summary: NeMo-Aligner - a toolkit for model alignment
* Upload time: 2024-03-13 23:08:14
* Maintainer: NVIDIA
* Author: NVIDIA
* License: Apache2
* Keywords: deep learning, machine learning, gpu, nlp, nemo, nvidia, pytorch, torch, language, reinforcement learning, rlhf, preference modeling, steerlm, dpo
* Requirements: nemo_toolkit, nvidia-pytriton

# NVIDIA NeMo-Aligner

## Introduction

NeMo-Aligner is a scalable toolkit for efficient model alignment. The toolkit supports state-of-the-art model alignment algorithms such as SteerLM, DPO, and Reinforcement Learning from Human Feedback (RLHF). These algorithms enable users to align language models to be safer, more harmless, and more helpful. Users can perform end-to-end model alignment on a wide range of model sizes and take advantage of the available parallelism techniques so that alignment is performant and resource-efficient.

The NeMo-Aligner toolkit is built using the [NeMo Toolkit](https://github.com/NVIDIA/NeMo), which allows training to scale to thousands of GPUs using tensor, data, and pipeline parallelism for all components of alignment. All of our checkpoints are cross-compatible with the NeMo ecosystem, allowing for inference deployment and further customization.

The toolkit is currently in its early stages, and we are committed to improving it so that developers can more easily pick and choose among alignment algorithms to build safe, helpful, and reliable models.

## Key features

* **SteerLM: Attribute Conditioned SFT as a (User-Steerable) Alternative to RLHF.**
    * Learn more in our [SteerLM](https://arxiv.org/abs/2310.05344) and [HelpSteer](https://arxiv.org/abs/2311.09528) papers. Try our [NV-Llama2-70B-SteerLM-Chat model](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/llama2-70b-steerlm) instantly for free on NVIDIA AI Foundation.
* **Supervised Fine Tuning**
* **Reward Model Training**
* **Reinforcement Learning from Human Feedback using the [PPO](https://arxiv.org/pdf/1707.06347.pdf) Algorithm**
    * Check out our aligned [NV-Llama2-70B-RLHF model](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/nv-llama2-70b-rlhf) on NVIDIA AI Foundation for free.
* **Direct Preference Optimization, as described in the [DPO paper](https://arxiv.org/pdf/2305.18290.pdf)**

## Learn More
* [Documentation](https://github.com/NVIDIA/NeMo-Aligner/blob/main/docs/README.md)
* [Examples](https://github.com/NVIDIA/NeMo-Aligner/tree/main/examples/nlp/gpt)
* [Tutorials](https://docs.nvidia.com/nemo-framework/user-guide/latest/ModelAlignment/index.html)

## Latest Release

For the latest stable release, please see the [releases page](https://github.com/NVIDIA/NeMo-Aligner/releases). All releases come with a pre-built container. Changes within each release will be documented in the [CHANGELOG](https://github.com/NVIDIA/NeMo-Aligner/blob/main/CHANGELOG.md).

## Installing your own environment

### Requirements
NeMo-Aligner has the same requirements as the [NeMo Toolkit](https://github.com/NVIDIA/NeMo#requirements), with the addition of [PyTriton](https://github.com/triton-inference-server/pytriton).

### Installation
Please follow the steps in the [NeMo Toolkit Installation Guide](https://github.com/NVIDIA/NeMo#installation), then run the following after installing NeMo:
```bash
pip install nemo-aligner
```
or, if you prefer to install the latest commit, run the following from the root of a local clone of the repository:
```bash
pip install .
```
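
After installing, one way to confirm that the expected distributions are present is to query their installed versions with the standard library. The sketch below assumes the distribution names listed in this package's metadata (`nemo-aligner`, `nemo_toolkit`, `nvidia-pytriton`); the helper names are hypothetical and not part of NeMo-Aligner.

```python
from importlib import metadata

# Distribution names as they appear in this package's metadata.
REQUIRED = ("nemo-aligner", "nemo_toolkit", "nvidia-pytriton")


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(names=REQUIRED):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in names}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

This only checks that the distributions are installed; it does not import them or validate GPU, CUDA, or other runtime requirements.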

### Docker Containers

We provide an official NeMo-Aligner Dockerfile based on stable, tested versions of NeMo, Megatron-LM, and TransformerEngine. The goal of this Dockerfile is stability, so it may not track the very latest versions of those three packages. The Dockerfile is available [in the repository](https://github.com/NVIDIA/NeMo-Aligner/blob/main/Dockerfile).

Alternatively, you can build the [NeMo Dockerfile](https://github.com/NVIDIA/NeMo/blob/main/Dockerfile) and add `RUN pip install nemo-aligner` at the end.

## Future work
- Add Rejection Sampling support
- Continue improving the stability of the PPO learning phase
- Improve the performance of RLHF

## Contributing
We welcome community contributions! Please refer to [CONTRIBUTING.md](https://github.com/NVIDIA/NeMo-Aligner/blob/main/CONTRIBUTING.md) for guidelines.

## License
This toolkit is licensed under the [Apache License, Version 2.0](https://github.com/NVIDIA/NeMo-Aligner/blob/main/LICENSE).

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/NVIDIA/NeMo-Aligner",
    "name": "nemo-aligner",
    "maintainer": "NVIDIA",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "nemo-toolkit@nvidia.com",
    "keywords": "deep learning,machine learning,gpu,NLP,NeMo,nvidia,pytorch,torch,language,reinforcement learning,RLHF,preference modeling,SteerLM,DPO",
    "author": "NVIDIA",
    "author_email": "nemo-toolkit@nvidia.com",
    "download_url": "https://files.pythonhosted.org/packages/c8/1d/00228a95227654d0cef2a19151339a6be226a20c6c13c995cb7256bc4f66/nemo_aligner-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache2",
    "summary": "NeMo-Aligner - a toolkit for model alignment",
    "version": "0.2.0",
    "project_urls": {
        "Download": "https://github.com/NVIDIA/NeMo-Aligner/releases",
        "Homepage": "https://github.com/NVIDIA/NeMo-Aligner"
    },
    "split_keywords": [
        "deep learning",
        "machine learning",
        "gpu",
        "nlp",
        "nemo",
        "nvidia",
        "pytorch",
        "torch",
        "language",
        "reinforcement learning",
        "rlhf",
        "preference modeling",
        "steerlm",
        "dpo"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "515a2cddf8e67387d93262213f9c58fb23112fe578da8aa3b27f9b8a6a7f8f7f",
                "md5": "a98d3b404d566802528b1cda233d10fc",
                "sha256": "3bc2723bc9d1dc31ab2625088c9d889ffa132c57423996ac6109088b2d518293"
            },
            "downloads": -1,
            "filename": "nemo_aligner-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a98d3b404d566802528b1cda233d10fc",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 97098,
            "upload_time": "2024-03-13T23:08:12",
            "upload_time_iso_8601": "2024-03-13T23:08:12.986996Z",
            "url": "https://files.pythonhosted.org/packages/51/5a/2cddf8e67387d93262213f9c58fb23112fe578da8aa3b27f9b8a6a7f8f7f/nemo_aligner-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c81d00228a95227654d0cef2a19151339a6be226a20c6c13c995cb7256bc4f66",
                "md5": "8f8d70a0377ff37aed88b0acf25dd7c6",
                "sha256": "bbae6e5668ffa414d4f707b7aef789fb6ebd7685f700b0d92119e43259eedbb9"
            },
            "downloads": -1,
            "filename": "nemo_aligner-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "8f8d70a0377ff37aed88b0acf25dd7c6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 70551,
            "upload_time": "2024-03-13T23:08:14",
            "upload_time_iso_8601": "2024-03-13T23:08:14.317462Z",
            "url": "https://files.pythonhosted.org/packages/c8/1d/00228a95227654d0cef2a19151339a6be226a20c6c13c995cb7256bc4f66/nemo_aligner-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-13 23:08:14",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "NVIDIA",
    "github_project": "NeMo-Aligner",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "nemo_toolkit",
            "specs": []
        },
        {
            "name": "nvidia-pytriton",
            "specs": []
        }
    ],
    "lcname": "nemo-aligner"
}
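
The `digests` entries in the raw data above can be used to check a downloaded archive. A minimal sketch, assuming the source distribution has already been downloaded to a local path; the helper names are hypothetical, and the expected value is copied from the `sha256` field of the `nemo_aligner-0.2.0.tar.gz` entry.

```python
import hashlib

# sha256 of nemo_aligner-0.2.0.tar.gz, copied from the "digests" entry above.
EXPECTED_SDIST_SHA256 = (
    "bbae6e5668ffa414d4f707b7aef789fb6ebd7685f700b0d92119e43259eedbb9"
)


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_SDIST_SHA256):
    """Return True if the file's sha256 equals the expected hex digest."""
    return sha256_of_file(path) == expected_hex.lower()
```

For example, `matches_expected("nemo_aligner-0.2.0.tar.gz")` should return `True` for an intact download of the file listed above.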
        