semanticlens

Name: semanticlens
Version: 0.2.0 (PyPI)
Summary: A package for mechanistic understanding and validation of large AI models with SemanticLens
Authors: Jim Berend <jim.berend@hhi.fraunhofer.de>, Maximilian Dreyer <maximilian.dreyer@hhi.fraunhofer.de>
Upload time: 2025-08-19 06:47:44
Requires Python: >=3.9
Keywords: deep learning, foundation model, mechanistic interpretability, semantic analysis
            <div align="center">
  <img src="https://github.com/jim-berend/semanticlens/blob/be718f96ba7c52b29249ff7b4806999890895c72/static/images/logo-with-name_big.svg" width="400px" alt="SemanticLens logo" align="center" />
  <p>
  An open-source PyTorch library for interpreting and validating large vision models.
  <br>
  Read the paper now as part of <a href="https://www.nature.com/articles/s42256-025-01084-w">Nature Machine Intelligence</a> (Open Access).
  </p>
</div>

<br>

<div align="center">
  <a href="https://www.nature.com/articles/s42256-025-01084-w">
    <img  src="https://img.shields.io/static/v1?label=Nature&message=Machine%20Intelligence&color=green">
  </a>
  <a href="https://doi.org/10.5281/zenodo.15233581">
    <img alt="DOI" src="https://zenodo.org/badge/DOI/10.5281/zenodo.15233581.svg">
  </a>
  <a href="https://pypi.org/project/semanticlens/">
    <img alt="pypi" src="https://img.shields.io/pypi/v/semanticlens">
  </a>
  <img  src="https://img.shields.io/badge/Python-3.9, 3.10, 3.11-efefef">
  <a href="LICENSE">
    <img alt="License" src="https://img.shields.io/badge/License-BSD%203--Clause-blue.svg">
  </a>
	<img alt="Ruff Lint" src="https://github.com/jim-berend/semanticlens/actions/workflows/ruff-lint.yml/badge.svg">
  <a href="https://jim-berend.github.io/semanticlens/">
    <img  src="https://img.shields.io/badge/Docs-SemanticLens-ff8c00">
  </a>
</div>


**SemanticLens** is a universal framework for explaining and validating large vision models. While deep learning models are powerful, their internal workings are often a "black box," making them difficult to trust and debug. SemanticLens addresses this by mapping the internal components of a model (like neurons or filters) into the rich, semantic space of a foundation model (e.g., CLIP or SigLIP).

This allows you to "translate" what the model is doing into a human-understandable format, enabling you to search, analyze, and audit its internal representations.


## How It Works


<div align="center">
  <img src="https://github.com/jim-berend/semanticlens/blob/be718f96ba7c52b29249ff7b4806999890895c72/static/images/overview-figure.svg" width="90%" alt="Overview figure" align="center" />
  <p>
  Overview of the SemanticLens framework as introduced in our <a href="https://www.nature.com/articles/s42256-025-01084-w"> research paper.</a>

  </p>
</div>

The core workflow of SemanticLens involves three main steps:
1) **Collect**: For each component in a model M, we identify the data samples that cause the highest activation (the "concept examples").
We provide a suite of [`ComponentVisualizers`](semanticlens/component_visualization) that implement different strategies, from simple activation maximization to relevance-maximization and attribution-based cropping.

2) **Embed**: These examples are then fed into a foundation model (like CLIP), which creates a meaningful vector representation for each component. SemanticLens includes built-in support for [OpenCLIP](https://github.com/mlfoundations/open_clip) and can be easily extended with other foundation models (see [base.py](semanticlens/foundation_models/base.py)).


3) **Analyze**: These vector representations enable powerful analyses. The [`Lens`](semanticlens/lens.py) class is the main interface for this, orchestrating the preprocessing, caching, and evaluation needed to search and audit your model using its new semantic embeddings.
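
Conceptually, the "Collect" step can be sketched in plain PyTorch with a forward hook that records per-channel activations and keeps the top-k most activating samples per channel. This is a simplified illustration under assumed shapes, not the library's `ComponentVisualizer` implementation:

```python
import torch
import torch.nn as nn

def collect_top_activations(model, layer, dataset, k=5):
    """Return the indices of the k most activating samples for each channel of `layer`."""
    acts = []

    def hook(module, inputs, output):
        # Spatially average: (batch, channels, H, W) -> (batch, channels)
        acts.append(output.detach().flatten(2).mean(-1))

    handle = layer.register_forward_hook(hook)
    with torch.no_grad():
        for x in dataset:
            model(x.unsqueeze(0))
    handle.remove()

    per_sample = torch.cat(acts)                    # (n_samples, n_channels)
    topk = per_sample.topk(min(k, per_sample.shape[0]), dim=0).indices
    return topk.T                                   # (n_channels, k) example indices

# Toy usage with a tiny conv net and random "images"
net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
data = [torch.randn(3, 16, 16) for _ in range(10)]
idx = collect_top_activations(net, net[0], data, k=3)  # shape (8, 3)
```

The samples indexed by each row of `idx` are that channel's "concept examples"; SemanticLens then embeds them with a foundation model in step 2.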


## Installation

You can install SemanticLens directly from PyPI:
```bash
pip install semanticlens
```

To install the latest version from this repository:

```bash
pip install git+https://github.com/jim-berend/semanticlens.git
```

## Quickstart
Example usage:
```python
import semanticlens as sl

... # dataset and model setup

# Initialization

cv = sl.component_visualization.ActivationComponentVisualizer(
    model,
    dataset_model,
    dataset_fm,
    layer_names=layer_names,
    device=device,
    cache_dir=cache_dir,
)

fm = sl.foundation_models.OpenClip(url="RN50", pretrained="openai", device=device)

lens = sl.Lens(fm, device=device)

# Semantic Embedding 

concept_db = lens.compute_concept_db(cv, batch_size=128, num_workers=8)
aggregated_cpt_db = {k: v.mean(1) for k, v in concept_db.items()}

# Analysis

polysemanticity_scores = lens.eval_polysemanticity(concept_db)

search_results = lens.text_probing(["cats", "dogs"], aggregated_cpt_db)

...
```
<a href="tutorials/quickstart.ipynb">
<img  src="https://img.shields.io/badge/Tutorial-Quickstart.ipynb-2881db">
</a>

Full quickstart guide: [quickstart.ipynb](tutorials/quickstart.ipynb)


<a href="https://jim-berend.github.io/semanticlens/">
<img  src="https://img.shields.io/badge/Docs-SemanticLens-ff8c00">
</a>

Package documentation: [docs](https://jim-berend.github.io/semanticlens/) 

## Contributing

We welcome contributions to SemanticLens! Whether you're fixing a bug, adding a new feature, or improving the documentation, your help is appreciated. 

If you'd like to contribute, please follow these steps:
1. Fork the repository on GitHub.
2. Create a new branch for your feature or bug fix (`git checkout -b feature/your-feature-name`).
3. Make your changes and commit them with a clear message.
4. Open a pull request to the main branch of the original repository.

For bug reports or feature requests, please use the GitHub Issues section. Before starting work on a major change, it's a good idea to open an issue first to discuss your plan.

## License

[BSD 3-Clause License](LICENSE)


## Citation
```bibtex
@article{dreyer_mechanistic_2025,
	title = {Mechanistic understanding and validation of large {AI} models with {SemanticLens}},
	copyright = {2025 The Author(s)},
	issn = {2522-5839},
	url = {https://www.nature.com/articles/s42256-025-01084-w},
	doi = {10.1038/s42256-025-01084-w},
	language = {en},
	urldate = {2025-08-18},
	journal = {Nature Machine Intelligence},
	author = {Dreyer, Maximilian and Berend, Jim and Labarta, Tobias and Vielhaben, Johanna and Wiegand, Thomas and Lapuschkin, Sebastian and Samek, Wojciech},
	month = aug,
	year = {2025},
	note = {Publisher: Nature Publishing Group},
	keywords = {Computer science, Information technology},
	pages = {1--14},
}
```


            
