<div align="center">
<img src="https://github.com/jim-berend/semanticlens/blob/be718f96ba7c52b29249ff7b4806999890895c72/static/images/logo-with-name_big.svg" width="400px" alt="SemanticLens logo" align="center" />
<p>
An open-source PyTorch library for interpreting and validating large vision models.
<br>
Read the paper now as part of <a href="https://www.nature.com/articles/s42256-025-01084-w">Nature Machine Intelligence</a> (Open Access).
</p>
</div>
<br>
<div align="center">
<a href="https://www.nature.com/articles/s42256-025-01084-w">
<img src="https://img.shields.io/static/v1?label=Nature&message=Machine%20Intelligence&color=green">
</a>
<a href="https://doi.org/10.5281/zenodo.15233581">
<img alt="DOI" src="https://zenodo.org/badge/DOI/10.5281/zenodo.15233581.svg">
</a>
<a href="https://pypi.org/project/semanticlens/">
<img alt="pypi" src="https://img.shields.io/pypi/v/semanticlens">
</a>
<img src="https://img.shields.io/badge/Python-3.9, 3.10, 3.11-efefef">
<a href="LICENSE">
<img alt="DOI" src="https://img.shields.io/badge/License-BSD%203--Clause-blue.svg">
</a>
<img alt="PyLint" src="https://github.com/jim-berend/semanticlens/actions/workflows/ruff-lint.yml/badge.svg">
<a href="https://jim-berend.github.io/semanticlens/">
<img src="https://img.shields.io/badge/Docs-SemanticLens-ff8c00">
</a>
</div>
**SemanticLens** is a universal framework for explaining and validating large vision models. While deep learning models are powerful, their internal workings are often a "black box," making them difficult to trust and debug. SemanticLens addresses this by mapping the internal components of a model (like neurons or filters) into the rich semantic space of a foundation model (e.g., CLIP or SigLIP).
This allows you to "translate" what the model is doing into a human-understandable format, enabling you to search, analyze, and audit its internal representations.
## How It Works
<div align="center">
<img src="https://github.com/jim-berend/semanticlens/blob/be718f96ba7c52b29249ff7b4806999890895c72/static/images/overview-figure.svg" width="90%" alt="Overview figure" align="center" />
<p>
Overview of the SemanticLens framework as introduced in our <a href="https://www.nature.com/articles/s42256-025-01084-w">research paper</a>.
</p>
</div>
The core workflow of SemanticLens involves three main steps:
1) **Collect**: For each component in a model M, we identify the data samples that cause the highest activation (the "concept examples").
We provide a suite of [`ComponentVisualizers`](semanticlens/component_visualization) that implement different strategies, from simple activation maximization to relevance maximization and attribution-based cropping.
2) **Embed**: These examples are then fed into a foundation model (like CLIP), which creates a meaningful vector representation for each component. SemanticLens includes built-in support for [OpenCLIP](https://github.com/mlfoundations/open_clip) and can be easily extended with other foundation models (see [base.py](semanticlens/foundation_models/base.py)).
3) **Analyze**: These vector embeddings enable powerful analyses. The [`Lens`](semanticlens/lens.py) class is the main interface for this, orchestrating the preprocessing, caching, and evaluation needed to search and audit your model through its semantic embeddings. A conceptual sketch of how the first two steps fit together follows below.
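To make steps 1 and 2 concrete, here is a minimal sketch of embedding concept examples with CLIP via [OpenCLIP](https://github.com/mlfoundations/open_clip). It only illustrates the idea and is not SemanticLens' internal implementation; the `embed_concept_examples` helper and the `concept_examples` mapping (component index to its top-activating image crops) are hypothetical placeholders for what the `ComponentVisualizers` produce.

```python
import torch
import open_clip


def embed_concept_examples(concept_examples, model_name="RN50", pretrained="openai"):
    """Embed each component's top-activating image crops with CLIP and average them.

    `concept_examples` maps a component index to a list of PIL images collected
    in the "Collect" step (hypothetical placeholder for a ComponentVisualizer's output).
    """
    model, _, preprocess = open_clip.create_model_and_transforms(model_name, pretrained=pretrained)
    model.eval()

    embeddings = {}
    with torch.no_grad():
        for component, images in concept_examples.items():
            batch = torch.stack([preprocess(img) for img in images])  # [k, 3, H, W]
            feats = model.encode_image(batch)                         # [k, d]
            feats = feats / feats.norm(dim=-1, keepdim=True)          # unit-normalize
            embeddings[component] = feats.mean(0)                     # one vector per component
    return embeddings
```

In the library itself, this collect-and-embed logic is handled for you by a `ComponentVisualizer` together with `Lens.compute_concept_db`, as shown in the Quickstart below.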
## Installation
You can install SemanticLens directly from PyPI:
```bash
pip install semanticlens
```
To install the latest version from this repository:
```bash
pip install git+https://github.com/jim-berend/semanticlens.git
```
## Quickstart
Example usage:
```python
import semanticlens as sl
... # dataset and model setup
# Initialization
cv = sl.component_visualization.ActivationComponentVisualizer(
    model,
    dataset_model,
    dataset_fm,
    layer_names=layer_names,
    device=device,
    cache_dir=cache_dir,
)
fm = sl.foundation_models.OpenClip(url="RN50", pretrained="openai", device=device)
lens = sl.Lens(fm, device=device)
# Semantic Embedding
concept_db = lens.compute_concept_db(cv, batch_size=128, num_workers=8)
aggregated_cpt_db = {k: v.mean(1) for k, v in concept_db.items()}
# Analysis
polysemanticity_scores = lens.eval_polysemanticity(concept_db)
search_results = lens.text_probing(["cats", "dogs"], aggregated_cpt_db)
...
```
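The `lens.text_probing` call above scores components against text prompts. If you want to inspect this manually, the following sketch ranks components for a single prompt by cosine similarity. It assumes `aggregated_cpt_db[layer]` is an `[n_components, d]` tensor of CLIP-space embeddings produced with the same `"RN50"` checkpoint as above, and it uses OpenCLIP directly; it is not necessarily how `text_probing` is implemented internally.

```python
import torch
import open_clip

# Text side: embed the prompt with the same CLIP model family used for the concept db.
clip_model, _, _ = open_clip.create_model_and_transforms("RN50", pretrained="openai")
tokenizer = open_clip.get_tokenizer("RN50")

with torch.no_grad():
    text_emb = clip_model.encode_text(tokenizer(["a photo of a dog"]))  # [1, d]
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)

# Component side: unit-normalize the aggregated embeddings of one analyzed layer.
layer = layer_names[0]                              # any layer analyzed above
comp_emb = aggregated_cpt_db[layer].float().cpu()   # [n_components, d]
comp_emb = comp_emb / comp_emb.norm(dim=-1, keepdim=True)

# Cosine similarity between every component and the prompt, then pick the top matches.
scores = comp_emb @ text_emb.squeeze(0).float()
top = torch.topk(scores, k=5)
print("Components most aligned with the prompt:", top.indices.tolist())
```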
<a href="tutorials/quickstart.ipynb">
<img src="https://img.shields.io/badge/Tutorial-Quickstart.ipynb-2881db">
</a>
Full quickstart guide: [quickstart.ipynb](tutorials/quickstart.ipynb)
<a href="https://jim-berend.github.io/semanticlens/">
<img src="https://img.shields.io/badge/Docs-SemanticLens-ff8c00">
</a>
Package documentation: [docs](https://jim-berend.github.io/semanticlens/)
## Contributing
We welcome contributions to SemanticLens! Whether you're fixing a bug, adding a new feature, or improving the documentation, your help is appreciated.
If you'd like to contribute, please follow these steps:
1. Fork the repository on GitHub.
2. Create a new branch for your feature or bug fix (`git checkout -b feature/your-feature-name`).
3. Make your changes and commit them with a clear message.
4. Open a pull request to the main branch of the original repository.
For bug reports or feature requests, please use the GitHub Issues section. Before starting work on a major change, it's a good idea to open an issue first to discuss your plan.
## License
[BSD 3-Clause License](LICENSE)
## Citation
```
@article{dreyer_mechanistic_2025,
    title = {Mechanistic understanding and validation of large {AI} models with {SemanticLens}},
    copyright = {2025 The Author(s)},
    issn = {2522-5839},
    url = {https://www.nature.com/articles/s42256-025-01084-w},
    doi = {10.1038/s42256-025-01084-w},
    language = {en},
    urldate = {2025-08-18},
    journal = {Nature Machine Intelligence},
    author = {Dreyer, Maximilian and Berend, Jim and Labarta, Tobias and Vielhaben, Johanna and Wiegand, Thomas and Lapuschkin, Sebastian and Samek, Wojciech},
    month = aug,
    year = {2025},
    note = {Publisher: Nature Publishing Group},
    keywords = {Computer science, Information technology},
    pages = {1--14},
}
```