llm-steer


Name: llm-steer
Version: 1.1.0
Home page: https://github.com/Mihaiii/llm_steer
Summary: Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steer vectors
Upload time: 2024-04-13 19:04:43
Author: Mihai Chirculescu

# LLM Steer
A Python module to steer LLM responses towards a certain topic/subject and to enhance capabilities (e.g., making it provide correct responses to tricky logical puzzles more often).
A practical tool for using activation engineering by adding steer vectors to different layers of a Large Language Model (LLM).
It is meant to be used together with the Hugging Face transformers library.
## Demo
Google Colab demo: https://colab.research.google.com/github/Mihaiii/llm_steer/blob/main/demo/llm_steer_demo.ipynb

## Basic usage
Install it: `pip install llm_steer`
Then use:
```python
from llm_steer import Steer

# model and tokenizer: a transformers model and its matching tokenizer, loaded beforehand
steered_model = Steer(model, tokenizer)
```
Add a steering vector on a particular layer of the model with a given coefficient and text.
The coefficient can also be negative.
```python
steered_model.add(layer_idx=20, coeff=0.4, text="logical")
```
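To make the idea concrete, here is a toy sketch of activation addition in plain Python. It is not llm_steer's implementation: the function name `add_steering_vector`, the list-of-lists layout for hidden states, and the choice to add the vector at every token position are assumptions made for illustration. It shows only the arithmetic behind a coefficient: each hidden state is shifted by `coeff` times the steering vector, and a negative coefficient shifts it the opposite way.

```python
from typing import List

Vector = List[float]


def add_steering_vector(
    hidden_states: List[Vector], steer: Vector, coeff: float
) -> List[Vector]:
    """Return a copy of hidden_states with coeff * steer added at every position.

    Hypothetical helper for illustration; not part of llm_steer.
    """
    shifted = []
    for state in hidden_states:
        if len(state) != len(steer):
            raise ValueError("steering vector must match the hidden size")
        shifted.append([h + coeff * s for h, s in zip(state, steer)])
    return shifted


# Two token positions with a hidden size of 3 (made-up numbers).
hidden = [[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]]
steer = [2.0, 0.0, -2.0]

# A positive coefficient moves activations along the steering direction,
# a negative one moves them the opposite way.
pushed_towards = add_steering_vector(hidden, steer, coeff=0.5)
pushed_away = add_steering_vector(hidden, steer, coeff=-0.5)
```

In a real model the steering vector would come from the model's own activations for the given text at the chosen layer, which is the part llm_steer handles for you.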
Get all the applied steering vectors:
```python
steered_model.get_all()
```
Remove all steering vectors to revert to the initial model:
```python
steered_model.reset_all()
```

## Advanced usage
The so-called "advanced usage" involves changing the default values of two parameters (`try_keep_nr` and `exclude_bos_token`). In my experiments, this almost always leads to the LLM outputting gibberish. In the very few cases where the LLM outputs text that does make sense, **the basic usage provides higher quality outputs**.

More info will be provided in a separate file.

## Q / A
Q: What's the difference between llm_steer and mentioning what you want in the system prompt?

A: I see llm_steer as an enhancer. It can be used together with the system prompt.

<br/>
Q: How do I determine the best parameters to use?

A: I don't have a method; it's all trial and error. I recommend starting with a small coefficient and then slowly increasing it.
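
The trial-and-error loop can be sketched as below. Only `add` and `reset_all` come from llm_steer; the helper name `sweep_coefficients`, the `generate` callable (whatever function turns a prompt into text with your model), and any coefficient values you pass in are assumptions for illustration. Each coefficient is tried on its own, so effects do not accumulate between trials.

```python
from typing import Callable, Dict, Iterable


def sweep_coefficients(
    steered_model,
    generate: Callable[[str], str],
    prompt: str,
    layer_idx: int,
    text: str,
    coeffs: Iterable[float],
) -> Dict[float, str]:
    """Try each coefficient in turn and collect the generated text.

    Hypothetical helper: `steered_model` is expected to expose the `add` and
    `reset_all` methods shown in Basic usage; `generate` is supplied by the
    caller. Note that `reset_all` drops every steering vector, including any
    that were applied before the sweep started.
    """
    outputs: Dict[float, str] = {}
    for coeff in coeffs:
        steered_model.add(layer_idx=layer_idx, coeff=coeff, text=text)
        try:
            outputs[coeff] = generate(prompt)
        finally:
            # Drop the vector before the next trial, even if generation fails.
            steered_model.reset_all()
    return outputs
```

A call such as `sweep_coefficients(steered_model, my_generate, "some prompt", layer_idx=20, text="logical", coeffs=[0.1, 0.2, 0.4])` returns one output per coefficient; reading them side by side makes it easier to spot where the text starts to degrade. Here `my_generate` stands for your own generation function.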

<br/>
Q: What models are supported?

A: I tested it on multiple architectures, including LLaMA, Mistral, Phi, and StableLM.
Keep in mind that llm_steer is meant to be used together with Hugging Face's transformers library, so it won't work on GGUF, for example.

<br/>
Q: I applied steering vectors, but the LLM outputs gibberish. What should I do?

A: Try a lower coeff value or another layer.

<br/>
Q: Can I add multiple steering vectors on the same layer? Can I add the same steering vector on multiple layers? Can I add steering vectors with negative coefficients?

A: Yes, and please do. llm_steer is built for experimenting.
See the Colab for examples: https://colab.research.google.com/github/Mihaiii/llm_steer/blob/main/demo/llm_steer_demo.ipynb
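
One way to experiment with several vectors at once is to describe them as data and apply them in a loop. The sketch below is a convenience wrapper built on assumptions, not part of llm_steer: `SteerSpec` and `steering` are hypothetical names, the example layers, coefficients, and texts are made up, and only `add` and `reset_all` come from the library. It applies every spec on entry and calls `reset_all` on exit, so an experiment cannot leave stray vectors behind.

```python
from contextlib import contextmanager
from typing import Iterable, Iterator, NamedTuple


class SteerSpec(NamedTuple):
    """One steering vector: which layer, how strongly, and from what text."""

    layer_idx: int
    coeff: float
    text: str


@contextmanager
def steering(steered_model, specs: Iterable[SteerSpec]) -> Iterator[None]:
    """Apply all specs for the duration of a `with` block, then reset.

    Hypothetical helper; relies only on the `add` and `reset_all` methods
    shown in Basic usage.
    """
    try:
        for spec in specs:
            steered_model.add(
                layer_idx=spec.layer_idx, coeff=spec.coeff, text=spec.text
            )
        yield
    finally:
        # Runs on normal exit and on errors, reverting to the initial model.
        steered_model.reset_all()


# Made-up values: two vectors on one layer, one text on two layers,
# and one negative coefficient.
EXAMPLE_SPECS = [
    SteerSpec(layer_idx=20, coeff=0.4, text="logical"),
    SteerSpec(layer_idx=20, coeff=0.2, text="concise"),
    SteerSpec(layer_idx=22, coeff=0.4, text="logical"),
    SteerSpec(layer_idx=20, coeff=-0.3, text="As an AI language model"),
]
```

Generation then goes inside `with steering(steered_model, EXAMPLE_SPECS):`, and the model is back to its unsteered state once the block ends.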

<br/>
Q: Can I use steer vectors to enhance role-play characteristics (e.g., personas that are funnier or cockier)?

A: I believe this is possible, but I haven't had good results yet. I'm considering doing some more intensive testing and I might write a new notebook for it.

<br/>
Q: Can I use negative steering vectors to force it not to say "As an AI language model"?

A: Yes.

## Credits / Thanks
- [DL Explorers](https://www.youtube.com/@DLExplorers-lg7dt) for his video on [activation engineering](https://www.youtube.com/watch?v=J2Gx6FFEaRY&t=29s), which goes over [an article](https://www.greaterwrong.com/posts/5spBue2z2tw4JuDCx/steering-gpt-2-xl-by-adding-an-activation-vector) and [a colab he made](https://colab.research.google.com/github/githubpradeep/notebooks/blob/main/activation_engineering.ipynb). The resources mentioned in that video were the starting point of llm_steer.
- Gary Bernhardt for his excellent [Python for programmers](https://www.executeprogram.com/courses/python-for-programmers) course. I needed a course that could help me go through the basics of Python without treating me like a dev noob (as most basic-level tutorials do).
- Andrej Karpathy for his [State of GPT](https://www.youtube.com/watch?v=bZQun8Y4L2A) video. I always wanted to make an open-source project, but there already was a repo for every idea I had. Not when it comes to tools for LLMs, though!

            
