# Seperability
My basic library for studying LLMs (currently only the Meta OPT models).
It includes functions for analyzing the models' activations on different inputs, and for pruning different parts of the model based on those activations.
## Pruning based on Capabilities
For a full example, see `src/seperability.ipynb`.
A simple example:
```
from model import Model
from activations import prune_and_evaluate, evaluate_all
# Load and Evaluate Model on Pile and Code
opt = Model('125m', limit=1000)
eval_data = evaluate_all(opt, 1e5)
print(eval_data)
# Prune Model, Removing coding capabilities (compared to pile), and evaluate
eval_data = prune_and_evaluate(opt, ff_prune_frac=0.05, attn_prune_frac=0.05,
ff_eps=1e-3, sample_size=1e5, eval_size=1e5, cripple='code', focus='pile')
print(eval_data)
```
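To prune more than a small fraction, one plausible pattern (a sketch, assuming `prune_and_evaluate` modifies `opt` in place so it can be called repeatedly) is to prune in small steps and track how the evaluations change:
```
# Hedged sketch: iterative pruning. Assumes each call to prune_and_evaluate
# mutates the model in place, removing a further slice of neurons.
for step in range(10):
    eval_data = prune_and_evaluate(opt, ff_prune_frac=0.02, attn_prune_frac=0.05,
        ff_eps=1e-3, sample_size=1e5, eval_size=1e5, cripple='code', focus='pile')
    print(step, eval_data)
```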
## model.py
This defines a `Model` wrapper class that encapsulates the HuggingFace implementation of Meta OPT.
To get the model, simply run:
```
from model import Model
opt = Model('125m', limit=1000)
```
You can provide any of the pre-trained OPT model sizes, and the token limit must be no larger than the maximum sequence length the model can handle.
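For instance (the size string format here is an assumption extrapolated from `'125m'`; OPT checkpoints were released at 125m, 350m, 1.3b, 2.7b, 6.7b, 13b, 30b and 66b):
```
# A larger model; '1.3b' assumes size strings follow the '125m' pattern.
opt = Model('1.3b', limit=1000)
```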
Next, you can run the model to predict, for example, the next 2 tokens:
```
text = 'Hello, my name is'
inpt, output = opt.predict( text, num=2 )
```
We can look at the residual stream to see how the hidden state changes as it passes through the layers:
```
residual_stream = opt.get_residual_stream( text )
```
This returns a tensor whose first dimension has size `2 + 2*n_layers`, consisting of (see the sketch after this list):
- the input (w/ positional encoding)
- n attention layer outputs
- n feed forward layer outputs
- the final output
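As a minimal sketch (assuming the residual stream is stacked along its first dimension in the order listed above), you can recover the layer count and the endpoints like this:
```
# Invert size = 2 + 2*n_layers to recover the layer count
# (assumes the first dimension is ordered as listed above).
n_layers = ( residual_stream.shape[0] - 2 ) // 2

inpt_stream = residual_stream[0]   # the input (w/ positional encoding)
final_out = residual_stream[-1]    # the final output
```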
If we want just the output of the attention / feed forward layers, we can instead look at the activations:
```
inpt, attn_out, ff_out, output = opt.get_text_activations( text )
```
or alternatively, reusing a residual stream you have already computed:
```
inpt, attn_out, ff_out, output = opt.get_text_activations( residual_stream=residual_stream )
```
To get the activations for the input text at all of the MLP mid layers (the feed-forward "key" activations), use:
`opt.get_ff_key_activations( text )` or `opt.get_ff_key_activations( residual_stream=residual_stream )`.
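For example, reusing the residual stream from earlier (a sketch; the exact shape of the returned tensor is an assumption):
```
# One set of mid-layer ("key") activations per feed-forward layer.
ff_keys = opt.get_ff_key_activations( residual_stream=residual_stream )
print( ff_keys.shape )
```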
## texts.py
Has some basic tools for loading the two text datasets I am using:
- `the_pile` (the validation set of The Pile)
- `codeparrot-clean-valid` (the validation set of CodeParrot)
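A rough equivalent using the HuggingFace `datasets` library directly (an assumption: the loaders in `texts.py` may differ) would be:
```
from datasets import load_dataset

# Dataset ids and split names are assumptions based on the Hub listings.
pile = load_dataset( 'the_pile', split='validation' )
code = load_dataset( 'codeparrot/codeparrot-clean-valid', split='train' )
```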
## activations.py
Has code specific to the two datasets I am using to analyze, and attempt to remove, capabilities from the OPT models.