separability


Name: separability
Version: 0.9.1
Home page: https://github.com/pesvut/separability
Summary: LLM Tools for looking at separability of LLM Capabilities
Upload time: 2023-07-28 15:45:32
Author: Nicky Pochinkov
Requires Python: >=3.9,<3.13
Requirements: No requirements were recorded.

# separability

My basic library for studying LLMs, with support for multi-GPU inference and
editing, as well as quantized inference (editing quantized models is not supported yet).
This includes functions for analysing the activations of the models for
different inputs, and for pruning different parts of the model based on those
activations.

The currently tested list of models is:
- GPT2
- EleutherAI's Pythia
- Meta OPT
- Meta Galactica

## Pruning based on Capabilities

For a full example, see `src/examples/prune_30.py`.

The simple example is:
```
from separability.data_classes import PruningConfig
from separability.parser import cli_parser
from separability.prune import run_pruning

# Configure initial model and tests
c = PruningConfig(
    wandb_project = "testing",
    model_repo   = "facebook/opt-125m",
    token_limit  = 1000,
    run_pre_test = True,

    # Removals parameters
    ff_scoring = "abs",
    ff_frac   = 0.02,
    ff_eps    = 0.001,
    attn_scoring = "abs",
    attn_frac = 0.00,
    attn_eps  = 1e-4,

    # Eval
    focus     = "pile_codeless",
    cripple   = "code",
    additional_datasets=tuple(),
)

# optionally, use parser to get CLI arguments.
# c, args = cli_parser(c)

# Run the iterated pruning
model, history = run_pruning(c)

```
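
To make the removal parameters more concrete, here is a minimal pure-Python sketch of one way a fraction-based pruning step could pick neurons. It is an illustration under assumptions, not the library's implementation: the helper `select_neurons_to_prune`, the ratio-based score, and the roles given to `frac` and `eps` are all hypothetical.

```python
def select_neurons_to_prune(focus_scores, cripple_scores, frac=0.02, eps=0.001):
    """Pick neurons that matter most for `cripple` relative to `focus`.

    Hypothetical helper: each score could be, for example, the mean absolute
    activation of one neuron on the corresponding dataset.
    """
    if len(focus_scores) != len(cripple_scores):
        raise ValueError("score lists must have the same length")
    # Importance on the dataset to cripple, relative to the dataset to keep.
    # eps (assumed role) avoids division by zero for neurons silent on focus.
    ratios = [c / (f + eps) for f, c in zip(focus_scores, cripple_scores)]
    n_prune = int(frac * len(ratios))
    # Rank neurons from highest ratio to lowest and keep the top fraction.
    ranked = sorted(range(len(ratios)), key=lambda i: ratios[i], reverse=True)
    return sorted(ranked[:n_prune])


# Toy scores for 10 neurons (made-up numbers, for illustration only).
focus = [0.9, 0.1, 0.5, 0.2, 0.8, 0.3, 0.4, 0.6, 0.7, 0.05]
cripple = [0.1, 0.9, 0.5, 0.1, 0.2, 0.3, 0.4, 0.1, 0.1, 0.5]
pruned = select_neurons_to_prune(focus, cripple, frac=0.2)
print(pruned)
```

In this toy setup, neurons that are active on the `cripple` data but nearly silent on the `focus` data are selected first, which is the intuition behind pruning one capability while trying to preserve another.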

## model.py
This defines a wrapper class that encapsulates the HuggingFace implementation of Meta OPT.
To get the model, simply run:

```
from separability import Model

opt = Model("facebook/opt-125m", limit=1000)
```

You can provide any of the pre-trained OPT model sizes. The token limit must be smaller than the maximum token length that the model can handle.

Next, you can run the model to predict 2 tokens, for example:
```
text = 'Hello, my name is'
inpt, output = opt.predict( text, num=2 )
```

We can look at the residual stream to see how the output changes from layer to layer.
```
residual_stream = opt.get_residual_stream( text )
```
This returns a tensor containing `2 + 2*n_layers` residual stream states, i.e.:
- the input (w/ positional encoding)
- n attention layer outputs
- n feed forward layer outputs
- the final output
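
As a quick sanity check on the `2 + 2*n_layers` count, the sketch below labels each entry for a given number of layers. The helper `residual_stream_layout` is hypothetical, and the interleaved ordering (attention, then feed forward, for each layer) is an assumption; the library may order the entries differently.

```python
def residual_stream_layout(n_layers):
    """Label the 2 + 2*n_layers entries (interleaved order is an assumption)."""
    labels = ["input"]
    for layer in range(n_layers):
        labels.append(f"attn_{layer}")  # state after attention in this layer
        labels.append(f"ff_{layer}")    # state after feed forward in this layer
    labels.append("output")
    return labels


# facebook/opt-125m has 12 decoder layers, so we expect 2 + 2*12 entries.
layout = residual_stream_layout(12)
print(len(layout))
```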

If we want just the output of the attention / feed forward layers, we can instead look at the activations:
```
inpt, attn_out, ff_out, output = opt.get_text_activations( text )
```
or alternatively:
```
inpt, attn_out, ff_out, output = opt.get_text_activations( residual_stream=residual_stream )
```
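
One plausible reading of how activations relate to the residual stream is that each sublayer output is the difference between consecutive residual stream states. The sketch below illustrates that idea with plain lists of floats standing in for tensors. It is an assumption for illustration only: `split_residual_stream` is a hypothetical helper, and the cumulative, interleaved layout it expects is not confirmed by the library.

```python
def split_residual_stream(residual_stream):
    """Recover per-sublayer outputs as differences between consecutive states.

    Assumed layout (hypothetical):
    [input, after attn_0, after ff_0, after attn_1, ..., final output].
    """
    if len(residual_stream) < 2 or len(residual_stream) % 2 != 0:
        raise ValueError("expected 2 + 2*n_layers entries")

    def diff(a, b):
        return [x - y for x, y in zip(a, b)]

    inpt = residual_stream[0]
    output = residual_stream[-1]
    body = residual_stream[:-1]  # input plus the 2*n_layers sublayer states
    # Odd positions follow an attention sublayer, even positions a feed forward one.
    attn_out = [diff(body[i], body[i - 1]) for i in range(1, len(body), 2)]
    ff_out = [diff(body[i], body[i - 1]) for i in range(2, len(body), 2)]
    return inpt, attn_out, ff_out, output


# Toy one-layer stream with 2-dimensional states (made-up numbers).
toy_stream = [[1.0, 2.0], [1.5, 2.0], [1.5, 3.0], [0.5, 1.0]]
toy_inpt, toy_attn, toy_ff, toy_output = split_residual_stream(toy_stream)
print(toy_attn, toy_ff)
```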

To get the activations for the input text at all of the MLP mid layers, we can look at:
`opt.get_ff_key_activations( text )` or `opt.get_ff_key_activations( residual_stream=residual_stream )`.

## texts.py
Has some basic tools for loading the two text datasets I am using:
- 'pile' (EleutherAI's 'The Pile' dataset)
- 'code' (CodeParrot's 'github-code' dataset)

## activations.py
Has code specific to the two datasets I am using to analyze and attempt to remove capabilities from the models.


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/pesvut/separability",
    "name": "separability",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9,<3.13",
    "maintainer_email": "",
    "keywords": "",
    "author": "Nicky Pochinkov",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/df/68/91ff6a36ae75b9c593de1c1e1436566bb78f2efb3180203a5316481212fd/separability-0.9.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "LLM Tools for looking at separability of LLM Capabilities",
    "version": "0.9.1",
    "project_urls": {
        "Homepage": "https://github.com/pesvut/separability",
        "Repository": "https://github.com/pesvut/separability"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e11bf682761f15f2ce27ef59a718411f63530f62e47bae9315712047c490d6cd",
                "md5": "b7d08008cababb2f93624cfc458095a2",
                "sha256": "a2dcfc93382a13056015817c59561e515f9c011bf704fac16b479d001b51a597"
            },
            "downloads": -1,
            "filename": "separability-0.9.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b7d08008cababb2f93624cfc458095a2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9,<3.13",
            "size": 39212,
            "upload_time": "2023-07-28T15:45:30",
            "upload_time_iso_8601": "2023-07-28T15:45:30.997569Z",
            "url": "https://files.pythonhosted.org/packages/e1/1b/f682761f15f2ce27ef59a718411f63530f62e47bae9315712047c490d6cd/separability-0.9.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "df6891ff6a36ae75b9c593de1c1e1436566bb78f2efb3180203a5316481212fd",
                "md5": "a606b9e01f88d14d4f032053df2e75ca",
                "sha256": "00d5e1026d9e33f32a5f67109814f21bfb3bac9cc5ebe35bc7a01111f673cfe5"
            },
            "downloads": -1,
            "filename": "separability-0.9.1.tar.gz",
            "has_sig": false,
            "md5_digest": "a606b9e01f88d14d4f032053df2e75ca",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9,<3.13",
            "size": 36953,
            "upload_time": "2023-07-28T15:45:32",
            "upload_time_iso_8601": "2023-07-28T15:45:32.197346Z",
            "url": "https://files.pythonhosted.org/packages/df/68/91ff6a36ae75b9c593de1c1e1436566bb78f2efb3180203a5316481212fd/separability-0.9.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-28 15:45:32",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "pesvut",
    "github_project": "separability",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "separability"
}
        