concept-erasure


- Name: concept-erasure
- Version: 0.2.3
- Summary: Erasing concepts from neural representations with provable guarantees
- Upload time: 2024-01-10 19:49:32
- Requires Python: >=3.10
- License: MIT License
- Keywords: fairness, interpretability, explainable-ai

# Least-Squares Concept Erasure (LEACE)
Concept erasure aims to remove specified features from a representation. It can be used to improve fairness (e.g. preventing a classifier from using gender or race) and interpretability (e.g. removing a concept to observe changes in model behavior). This is the repo for **LEAst-squares Concept Erasure (LEACE)**, a closed-form method which provably prevents all linear classifiers from detecting a concept while inflicting the least possible damage to the representation. You can check out the paper [here](https://arxiv.org/abs/2306.03819).

# Installation

We require Python 3.10 or later. You can install the package from PyPI:

```bash
pip install concept-erasure
```

# Usage

The two main classes in this repo are `LeaceFitter` and `LeaceEraser`.

- `LeaceFitter` keeps track of the covariance and cross-covariance statistics needed to compute the LEACE erasure function. These statistics can be updated in an incremental fashion with `LeaceFitter.update()`. The erasure function is lazily computed when the `.eraser` property is accessed. This class uses O(_d<sup>2</sup>_) memory, where _d_ is the dimensionality of the representation, so you may want to discard it after computing the erasure function.
- `LeaceEraser` is a compact representation of the LEACE erasure function, using only O(_dk_) memory, where _k_ is the number of classes in the concept you're trying to erase (or equivalently, the _dimensionality_ of the concept if it's not categorical).
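
The covariance and cross-covariance statistics mentioned above are enough to build a linear eraser in closed form. The following is a minimal NumPy sketch of the whitening-and-projection recipe described in the paper, not the package's own implementation; `fit_linear_eraser` and `cross_cov` are hypothetical helper names introduced only for illustration.

```python
import numpy as np


def fit_linear_eraser(X, Z, tol=1e-10):
    """Return (mean, A) such that x -> x - (x - mean) @ A erases Z linearly.

    Illustrative sketch only; this helper is hypothetical and is not part
    of the concept_erasure package.
    """
    n = X.shape[0]
    mean_x = X.mean(axis=0)
    Xc = X - mean_x
    Zc = Z - Z.mean(axis=0)

    sigma_xx = Xc.T @ Xc / n  # (d, d) covariance: the O(d^2) statistic
    sigma_xz = Xc.T @ Zc / n  # (d, k) cross-covariance

    # Whitening matrix W = sigma_xx^(-1/2) and its pseudoinverse
    evals, evecs = np.linalg.eigh(sigma_xx)
    keep = evals > tol * evals.max()
    V, lam = evecs[:, keep], evals[keep]
    W = (V / np.sqrt(lam)) @ V.T
    W_pinv = (V * np.sqrt(lam)) @ V.T

    # Orthogonal projection onto the column space of W @ sigma_xz
    U, s, _ = np.linalg.svd(W @ sigma_xz, full_matrices=False)
    U = U[:, s > tol]
    P = U @ U.T

    # Row-vector form of x - W^+ P W (x - mean); all three factors are symmetric
    return mean_x, W @ P @ W_pinv


def cross_cov(X, Z):
    """Empirical cross-covariance between the columns of X and Z."""
    return (X - X.mean(axis=0)).T @ (Z - Z.mean(axis=0)) / len(X)


rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
# Binary concept that depends on the first two features plus noise
Z = (X[:, :2].sum(axis=1) + 0.5 * rng.normal(size=1000) > 0).astype(float)[:, None]

mean_x, A = fit_linear_eraser(X, Z)
X_erased = X - (X - mean_x) @ A

print(np.abs(cross_cov(X, Z)).max())         # clearly nonzero before erasure
print(np.abs(cross_cov(X_erased, Z)).max())  # numerically zero after erasure
```

For clarity the sketch stores a dense _d_ × _d_ matrix `A`. Because the projection has rank at most _k_, the same map can be kept as two _d_ × _k_ factors, which is consistent with the O(_dk_) memory figure quoted for `LeaceEraser`.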

## Batch usage

In most cases, you have a batch of feature vectors `X` and concept labels `Z` (called `Y` in the example below) and want to erase the concept from `X`. The easiest way to do this is the `LeaceEraser.fit()` convenience method:

```python
import torch
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

from concept_erasure import LeaceEraser

n, d, k = 2048, 128, 2

X, Y = make_classification(
    n_samples=n,
    n_features=d,
    n_classes=k,
    random_state=42,
)
X_t = torch.from_numpy(X)
Y_t = torch.from_numpy(Y)

# Logistic regression does learn something before concept erasure
real_lr = LogisticRegression(max_iter=1000).fit(X, Y)
beta = torch.from_numpy(real_lr.coef_)
assert beta.norm(p=torch.inf) > 0.1

eraser = LeaceEraser.fit(X_t, Y_t)
X_ = eraser(X_t)

# But learns nothing after
null_lr = LogisticRegression(max_iter=1000, tol=0.0).fit(X_.numpy(), Y)
beta = torch.from_numpy(null_lr.coef_)
assert beta.norm(p=torch.inf) < 1e-4
```

## Streaming usage

If you have a **stream** of data, you can use `LeaceFitter.update()` to update the statistics. This is useful if you have a large dataset and want to avoid storing it all in memory.

```python
from concept_erasure import LeaceFitter
from sklearn.datasets import make_classification
import torch

n, d, k = 2048, 128, 2

X, Y = make_classification(
    n_samples=n,
    n_features=d,
    n_classes=k,
    random_state=42,
)
X_t = torch.from_numpy(X)
Y_t = torch.from_numpy(Y)

fitter = LeaceFitter(d, 1, dtype=X_t.dtype)

# Compute cross-covariance matrix using batched updates
for x, y in zip(X_t.chunk(2), Y_t.chunk(2)):
    fitter.update(x, y)

# Erase the concept from a single example (the first row of X_t)
x_ = fitter.eraser(X_t[0])
```
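
To see why batched updates can reproduce full-batch statistics without keeping the data in memory, here is a minimal NumPy sketch of a running cross-covariance accumulator. It illustrates the general technique (a batched Welford-style update) and is not claimed to match the internals of `LeaceFitter`; `StreamingCrossCov` is a hypothetical class introduced for illustration.

```python
import numpy as np


class StreamingCrossCov:
    """Running means and cross-covariance of X and Z (hypothetical helper)."""

    def __init__(self, d, k):
        self.n = 0
        self.mean_x = np.zeros(d)
        self.mean_z = np.zeros(k)
        self.m_xz = np.zeros((d, k))  # running sum of centered outer products

    def update(self, x, z):
        self.n += x.shape[0]
        # Deviations from the means *before* this batch is absorbed
        delta_x = x - self.mean_x
        self.mean_x += delta_x.sum(axis=0) / self.n
        delta_z = z - self.mean_z
        self.mean_z += delta_z.sum(axis=0) / self.n
        # Pair old-mean deviations of x with new-mean deviations of z
        self.m_xz += delta_x.T @ (z - self.mean_z)

    @property
    def sigma_xz(self):
        return self.m_xz / self.n


rng = np.random.default_rng(0)
X = rng.normal(size=(2048, 8))
Z = X[:, :2] + rng.normal(size=(2048, 2))

stats = StreamingCrossCov(d=8, k=2)
for x, z in zip(np.array_split(X, 5), np.array_split(Z, 5)):
    stats.update(x, z)

# Reference value computed from the full dataset in one pass
full = (X - X.mean(axis=0)).T @ (Z - Z.mean(axis=0)) / len(X)
print(np.allclose(stats.sigma_xz, full))
```

Only the running means and the _d_ × _k_ accumulator persist between batches, so memory use does not grow with the number of samples seen.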

# Paper replication

Scripts used to generate the part-of-speech tags for the concept scrubbing experiments can be found in [this repo](https://github.com/EleutherAI/tagged-pile). We plan to upload the tagged datasets to the HuggingFace Hub shortly.

## Concept scrubbing

The concept scrubbing code is a bit messy right now, and will probably be refactored soon. We found it necessary to write bespoke implementations for different HuggingFace model families. So far we've implemented LLaMA and GPT-NeoX. These can be found in the `concept_erasure.scrubbing` submodule.

            
