steering-vectors


- Name: steering-vectors
- Version: 0.10.2
- Home page: https://steering-vectors.github.io/steering-vectors
- Summary: Steering vectors for transformer language models in Pytorch / Huggingface
- Upload time: 2024-04-02 10:37:59
- Author: David Chanin
- Requires Python: <4.0,>=3.10
- License: MIT

# Steering Vectors

[![ci](https://img.shields.io/github/actions/workflow/status/steering-vectors/steering-vectors/ci.yaml?branch=main)](https://github.com/steering-vectors/steering-vectors)
[![PyPI](https://img.shields.io/pypi/v/steering-vectors?color=blue)](https://pypi.org/project/steering-vectors/)

Steering vectors / representation engineering for transformer language models in PyTorch / Hugging Face.

Check out our [example notebook](examples/caa_sycophancy.ipynb). <a target="_blank" href="https://colab.research.google.com/github/steering-vectors/steering-vectors/blob/main/examples/caa_sycophancy.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

Full docs: https://steering-vectors.github.io/steering-vectors

## About

This library provides utilities for training and applying steering vectors to language models (LMs) from [Hugging Face](https://huggingface.co/), like GPT, LLaMa, Gemma, Mistral, Pythia, and many more!

For more info on steering vectors and representation engineering, check out the following work:

- [Steering Llama 2 via Contrastive Activation Addition](https://arxiv.org/abs/2312.06681) Rimsky et al., 2023
- [Representation Engineering: A Top-Down Approach to AI Transparency](https://arxiv.org/abs/2310.01405) Zou et al., 2023
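As a rough illustration of the idea behind contrastive activation addition, the sketch below computes a steering vector as the mean difference between activations from "positive" and "negative" examples, then adds a scaled copy of it to a new activation. It uses plain Python lists as stand-in activations and is not this library's API: the function names `mean_difference_vector` and `apply_steering` and the toy numbers are assumptions made for illustration only.

```python
from statistics import fmean

Vector = list[float]


def mean_vector(vectors: list[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    return [fmean(column) for column in zip(*vectors)]


def mean_difference_vector(positive: list[Vector], negative: list[Vector]) -> Vector:
    """Hypothetical helper: mean(positive activations) - mean(negative activations)."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


def apply_steering(activation: Vector, steering: Vector, multiplier: float = 1.0) -> Vector:
    """Hypothetical helper: add the scaled steering vector to one activation."""
    return [a + multiplier * s for a, s in zip(activation, steering)]


# Toy two-dimensional "activations"; real ones would come from a model layer.
positive = [[1.0, 2.0], [3.0, 4.0]]
negative = [[0.0, 1.0], [1.0, 1.0]]

steering = mean_difference_vector(positive, negative)
steered = apply_steering([0.5, 0.5], steering, multiplier=2.0)
print(steering, steered)
```

A negative `multiplier` pushes the activation away from the "positive" direction instead of towards it. See the full docs for how the library itself trains and applies steering vectors on real models.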

## Installation

```
pip install steering-vectors
```

Check out the [full documentation](https://steering-vectors.github.io/steering-vectors/) for more usage info.
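The package metadata for version 0.10.2 lists its Python requirement as `<4.0,>=3.10`. If an install fails or the import is missing, a small check like the following may help narrow things down. It is a sketch using only the standard library; the helper names `python_supported` and `installed_version` are made up for this example.

```python
from __future__ import annotations

import sys
from importlib import metadata


def python_supported(version: tuple[int, ...]) -> bool:
    """True if `version` satisfies the `<4.0,>=3.10` constraint."""
    return (3, 10) <= tuple(version[:2]) < (4, 0)


def installed_version(package: str = "steering-vectors") -> str | None:
    """Installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python supported:", python_supported(tuple(sys.version_info)))
print("steering-vectors version:", installed_version())
```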

## Contributing

Any contributions to improve this project are welcome! Please open an issue or pull request in this repo with any bugfixes / changes / improvements you have.

This project uses [Ruff](https://docs.astral.sh/ruff/) for code formatting and linting, [MyPy](https://mypy.readthedocs.io/en/stable/) for type checking, and [Pytest](https://docs.pytest.org/) for tests. Make sure any changes you submit pass these checks. If you have trouble getting them to run, feel free to open a pull request regardless and we can discuss further there.

## License

This code is released under an MIT license.


            

Raw data

            {
    "_id": null,
    "home_page": "https://steering-vectors.github.io/steering-vectors",
    "name": "steering-vectors",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.10",
    "maintainer_email": null,
    "keywords": null,
    "author": "David Chanin",
    "author_email": "chanindav@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/91/f5/6e1f61a8d9b3b3852cad8801b8cecf089fc15ca8d48ce332ac30aac6c7b3/steering_vectors-0.10.2.tar.gz",
    "platform": null,
    "description": "# Steering Vectors\n\n[![ci](https://img.shields.io/github/actions/workflow/status/steering-vectors/steering-vectors/ci.yaml?branch=main)](https://github.com/steering-vectors/steering-vectors)\n[![PyPI](https://img.shields.io/pypi/v/steering-vectors?color=blue)](https://pypi.org/project/steering-vectors/)\n\nSteering vectors / representation engineering for transformer language models in Pytorch / Huggingface\n\nCheck out our [example notebook](examples/caa_sycophancy.ipynb). <a target=\"_blank\" href=\"https://colab.research.google.com/github/steering-vectors/steering-vectors/blob/main/examples/caa_sycophancy.ipynb\">\n  <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n</a>\n\nFull docs: https://steering-vectors.github.io/steering-vectors\n\n## About\n\nThis library provides utilies for training and applying steering vectors to language models (LMs) from [Huggingface](https://huggingface.co/), like GPT, LLaMa, Gemma, Mistral, Pythia, and many more!\n\nFor more info on steering vectors and representation engineering, check out the following work:\n\n- [Steering Llama 2 via Contrastive Activation Addition](https://arxiv.org/abs/2312.06681) Rimsky et al., 2023\n- [Representation Engineering: A Top-Down Approach to AI Transparency](https://arxiv.org/abs/2310.01405) Zou et al., 2023\n\n## Installation\n\n```\npip install steering-vectors\n```\n\nCheck out the [full documentation](https://steering-vectors.github.io/steering-vectors/) for more usage info.\n\n## Contributing\n\nAny contributions to improve this project are welcome! Please open an issue or pull request in this repo with any bugfixes / changes / improvements you have.\n\nThis project uses [Ruff](https://docs.astral.sh/ruff/) for code formatting and linting, [MyPy](https://mypy.readthedocs.io/en/stable/) for type checking, and [Pytest](https://docs.pytest.org/) for tests. Make sure any changes you submit pass these code checks in your PR. 
If you have trouble getting these to run feel free to open a pull-request regardless and we can discuss further in the PR.\n\n## License\n\nThis code is released under a MIT license.\n\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Steering vectors for transformer language models in Pytorch / Huggingface",
    "version": "0.10.2",
    "project_urls": {
        "Homepage": "https://steering-vectors.github.io/steering-vectors",
        "Repository": "https://github.com/steering-vectors/steering-vectors"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ab36fd04b696fa578efeb3daec744ab2f06f71b24f8607b7cf2afca3a4719638",
                "md5": "eea19b22c86953eb950c62481bc1882f",
                "sha256": "32d052f22f33fc4d7f42fc39441cedc8f4cd70c89fb0f117db88bec6d9498254"
            },
            "downloads": -1,
            "filename": "steering_vectors-0.10.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "eea19b22c86953eb950c62481bc1882f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.10",
            "size": 14340,
            "upload_time": "2024-04-02T10:37:58",
            "upload_time_iso_8601": "2024-04-02T10:37:58.403338Z",
            "url": "https://files.pythonhosted.org/packages/ab/36/fd04b696fa578efeb3daec744ab2f06f71b24f8607b7cf2afca3a4719638/steering_vectors-0.10.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "91f56e1f61a8d9b3b3852cad8801b8cecf089fc15ca8d48ce332ac30aac6c7b3",
                "md5": "798b8ab74385c8f98bdf55159e186e92",
                "sha256": "9924a228f06e80efd49b004b7e8769db343da38263c1e73429a1ad0e281f4618"
            },
            "downloads": -1,
            "filename": "steering_vectors-0.10.2.tar.gz",
            "has_sig": false,
            "md5_digest": "798b8ab74385c8f98bdf55159e186e92",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.10",
            "size": 12024,
            "upload_time": "2024-04-02T10:37:59",
            "upload_time_iso_8601": "2024-04-02T10:37:59.993039Z",
            "url": "https://files.pythonhosted.org/packages/91/f5/6e1f61a8d9b3b3852cad8801b8cecf089fc15ca8d48ce332ac30aac6c7b3/steering_vectors-0.10.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-02 10:37:59",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "steering-vectors",
    "github_project": "steering-vectors",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "steering-vectors"
}
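
The `sha256` digests in the record above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library; the helper names and the local filename in the usage comment are assumptions, and the expected value is the wheel's `sha256` copied from the record.

```python
import hashlib

# sha256 of steering_vectors-0.10.2-py3-none-any.whl, as listed in the record above.
EXPECTED_WHEEL_SHA256 = "32d052f22f33fc4d7f42fc39441cedc8f4cd70c89fb0f117db88bec6d9498254"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hex sha256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """True if the file's sha256 digest equals `expected_hex` (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the wheel was downloaded to the current directory:
# matches_digest("steering_vectors-0.10.2-py3-none-any.whl", EXPECTED_WHEEL_SHA256)
```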
        