llm_mri

Name	llm_mri
Version	0.1.9
home_page	None
Summary	Package to visualize LLM's Neural Networks activation regions
upload_time	2025-01-09 20:02:59
maintainer	None
docs_url	None
author	lipecorradini
requires_python	<4.0,>=3.10
license	None
keywords
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# LLM-MRI: a brain scanner for LLMs

As the everyday use of large language models (LLMs) expands, so does the necessity of understanding how these models achieve their designated outputs. While many approaches focus on the interpretability of LLMs through visualizing different attention mechanisms and methods that explain the model's architecture, `LLM-MRI` focuses on the activations of the feed-forward layers in a transformer-based LLM.

By adopting this approach, the library examines the neuron activations produced by the model for each distinct label. Through a series of steps, such as dimensionality reduction and representing each layer as a grid, the tool provides various visualization methods for the activation patterns in the feed-forward layers. Accordingly, the objective of this library is to contribute to LLM interpretability research, enabling users to explore visualization methods, such as heatmaps and graph representations of the hidden layers' activations in transformer-based LLMs.

This library allows users to explore questions such as:

- How do different categories of text in the corpus activate different neural regions?
- What are the differences between the properties of graphs formed by activations from two distinct categories?
- Are there regions of activation in the model more related to specific aspects of a category?

We encourage you to not only use this toolkit but also to extend it as you see fit.

## Index
- [Online Example](#online-example)
- [Installation](#installation)
- [Usage](#usage)
- [Functions](#functions)
  - [Activation Extraction](#activation-extraction)
  - [Heatmap Representation of Activations](#heatmap-representation-of-activations)
  - [Graph Representation of Activations](#graph-representation-of-activations)
  - [Composed Graph Visualization](#composed-graph-visualization)


## Online Example

The badge below launches an online example of the library in a Jupyter notebook hosted on Binder:

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/luizcelsojr/LLM-MRI/v01.2?labpath=examples%2FEmotions.ipynb)

## Installation

To see LLM-MRI in action on your own data, install the package from PyPI:

```
pip install llm_mri
```

## Usage

First, the user needs to import the `LLM_MRI` class and `matplotlib.pyplot`:

```
from llm_mri import LLM_MRI
import matplotlib.pyplot as plt
```
The user also needs to specify the Hugging Face Dataset that will be used to process the model's activations. There are two ways to do this:


- Load the dataset from the Hugging Face Hub:
  ```
  from datasets import load_dataset

  dataset_url = "https://huggingface.co/datasets/dataset_link"  # placeholder URL
  dataset = load_dataset("csv", data_files=dataset_url)
  ```
- If you have already saved the dataset on your machine, you can use the _load_from_disk_ function:
  ```
  from datasets import load_from_disk

  dataset = load_from_disk(dataset_path)  # Specify the dataset's path
  ```
> Make sure that the selected dataset is a Hugging Face `Dataset` containing the columns "text" and "label", where "label" is of type `ClassLabel`. For instructions on how to make this conversion, check out the examples in the [GitHub documentation](https://github.com/explic-ai/LLM-MRI/tree/main/examples).


Next, the user selects the model to be used as a string:
```
model_ckpt = "distilbert/distilbert-base-multilingual-cased"
```
Then, the user instantiates `LLM_MRI` to apply the methods described in [Functions](#functions):
```
llm_mri = LLM_MRI(model=model_ckpt, device="cpu", dataset=dataset)
```
## Functions
The library's functionality is divided into the following sections:

### Activation Extraction: 
Once the user provides the model and the corpus to be analyzed, the library reduces the dimensionality of the model's hidden-layer activations so that each layer can be visualized as an NxN grid.
  ```
  llm_mri.process_activation_areas(map_dimension)
  ```
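
To make the idea of an NxN activation grid more concrete, here is a minimal sketch that bins points into grid cells and counts how many fall in each cell. It assumes the activations have already been reduced to two dimensions by some method; this section does not state which reduction technique the library uses, and `assign_cells` and `activation_grid` are hypothetical helper names, not part of the `llm_mri` API.

```python
from collections import Counter


def assign_cells(points, n):
    """Map 2-D points to (row, col) cells of an n x n grid over their bounding box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)

    def to_index(value, low, high):
        if high == low:
            return 0
        # Scale to [0, n) and clamp so the maximum value lands in the last cell.
        return min(int((value - low) / (high - low) * n), n - 1)

    return [(to_index(y, y_min, y_max), to_index(x, x_min, x_max)) for x, y in points]


def activation_grid(points, n):
    """Count how many points fall into each cell of the n x n grid."""
    counts = Counter(assign_cells(points, n))
    return [[counts.get((row, col), 0) for col in range(n)] for row in range(n)]


# Hypothetical 2-D coordinates, one per document, for a single layer.
reduced = [(0.0, 0.0), (0.1, 0.2), (0.9, 0.8), (1.0, 1.0), (0.5, 0.5)]
grid = activation_grid(reduced, 2)
print(grid)
```

A grid of counts like this is the kind of data a heatmap can display, with one colored cell per region.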


  
### Heatmap Representation of Activations:
This includes the _get_layer_image_ function, which transforms the NxN grid for a selected layer into a heatmap. In this heatmap, each cell shows how many activations the corresponding region of the selected layer received for the provided corpus. Additionally, users can restrict the visualization to activations for a specific label.
  ```
  fig = llm_mri.get_layer_image(layer, category)
  ```
![hidden_state_1_true](https://github.com/user-attachments/assets/0bfbc90e-2bb9-4bd0-aa20-68c67608189f)



  
### Graph Representation of Activations:
Using the _get_graph_ function, the module connects regions from neighboring layers based on co-activations to form a graph representing the entire network. The graph's edges can also be colored according to different labels, allowing the user to identify the specific category that activated each neighboring node.

> **_colormap:_**  The default colormap is 'coolwarm'. Other options are listed in the [matplotlib colormap documentation](https://matplotlib.org/stable/users/explain/colors/colormaps.html). We recommend a 'Diverging' colormap for better visualization.
>
> **_fix_node_positions:_**  'True' keeps the nodes and edges at the same positions regardless of the category, which is useful for comparing activations between distinct categories. 'False' does not allow this comparison, but the resulting graph is easier to read.

  ```
  graph = llm_mri.get_graph(category)
  graph_image = llm_mri.get_graph_image(graph, colormap, fix_node_positions)
  ```
![true](https://github.com/user-attachments/assets/98b006ad-1e1a-40c1-9259-66e0496203b8)
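
The co-activation idea described above can be sketched in plain Python as follows. This is an illustration under simplifying assumptions, not the library's implementation: each document is assumed to activate exactly one grid cell per layer, and `coactivation_edges` is a hypothetical helper name.

```python
from collections import Counter


def coactivation_edges(cells_per_layer):
    """Count how many documents link each pair of cells in neighboring layers.

    cells_per_layer[k][d] is the grid cell that document d activates in layer k.
    """
    edges = Counter()
    layer_pairs = zip(cells_per_layer, cells_per_layer[1:])
    for layer, (current, following) in enumerate(layer_pairs):
        for src_cell, dst_cell in zip(current, following):
            # A node is identified by its layer index and its grid cell.
            edges[((layer, src_cell), (layer + 1, dst_cell))] += 1
    return edges


# Hypothetical cells for three documents across three layers.
cells = [
    [(0, 0), (0, 0), (1, 1)],  # layer 0
    [(0, 1), (0, 1), (1, 0)],  # layer 1
    [(1, 1), (0, 0), (1, 1)],  # layer 2
]
edges = coactivation_edges(cells)
for (src, dst), weight in sorted(edges.items()):
    print(src, "->", dst, "weight", weight)
```

Each key is an edge between a cell in one layer and a cell in the next, and the count acts as an edge weight: the more documents share the same pair of cells, the heavier the edge.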



### Composed Graph Visualization:
The user can also obtain a composed visualization of two different categories using the _get_composed_graph_ function. Given the two categories, each edge is colored according to its label, so the user can see which document label activated each region. Additionally, the user can select a colormap, in which node colors reflect the label that most strongly activated them. Nodes colored white indicate equal activation by both categories, as white represents the midpoint of the color spectrum.
```
g_composed = llm_mri.get_composed_graph("true", "fake")
g_composed_img = llm_mri.get_graph_image(g_composed)
```

![fake_and_true_graph](https://github.com/user-attachments/assets/7ca1c194-045f-45fd-a2a7-33941fe0dc86)
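
One way to picture the node coloring described above is as a balance score between the two categories, which a diverging colormap then turns into a color. The sketch below is an assumption about how such a score could be computed, not the library's actual formula; `balance` and the node weights are hypothetical.

```python
def balance(weight_a, weight_b):
    """Return a value in [0, 1]: 0 is only category A, 1 is only category B, 0.5 is equal."""
    total = weight_a + weight_b
    if total == 0:
        # A node with no activations is treated as neutral.
        return 0.5
    return weight_b / total


# Hypothetical activation counts per node: (category A, category B).
node_weights = {"n1": (4, 0), "n2": (3, 3), "n3": (1, 3)}
colors = {node: balance(a, b) for node, (a, b) in node_weights.items()}
print(colors)
```

A value of 0.5 corresponds to the midpoint of a diverging colormap, which matches the description of equally activated nodes sitting at the center of the color spectrum.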

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "llm_mri",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.10",
    "maintainer_email": null,
    "keywords": null,
    "author": "lipecorradini",
    "author_email": "luizfelipecorradini@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/0a/aa/87a5942be863b61726ff79be09c44cb0bc467023fb193ec1071349a4f931/llm_mri-0.1.9.tar.gz",
    "platform": null,
    "description": "# LLM-MRI: a brain scanner for LLMs\n\nAs the everyday use of large language models (LLMs) expands, so does the necessity of understanding how these models achieve their designated outputs. While many approaches focus on the interpretability of LLMs through visualizing different attention mechanisms and methods that explain the model's architecture, `LLM-MRI` focuses on the activations of the feed-forward layers in a transformer-based LLM.\n\nBy adopting this approach, the library examines the neuron activations produced by the model for each distinct label. Through a series of steps, such as dimensionality reduction and representing each layer as a grid, the tool provides various visualization methods for the activation patterns in the feed-forward layers. Accordingly, the objective of this library is to contribute to LLM interpretability research, enabling users to explore visualization methods, such as heatmaps and graph representations of the hidden layers' activations in transformer-based LLMs.\n\nThis model allows users to explore questions such as:\n\n- How do different categories of text in the corpus activate different neural regions?\n- What are the differences between the properties of graphs formed by activations from two distinct categories?\n- Are there regions of activation in the model more related to specific aspects of a category?\n\nWe encourage you to not only use this toolkit but also to extend it as you see fit.\n\n## Index\n- [Online Example](#online-example)\n- [Installation](#installation)\n- [Usage](#usage)\n- [Functions](#functions)\n  - [Activation Extraction](#activation-extraction)\n  - [Heatmap Representation of Activations](#heatmap-representation-of-activations)\n  - [Graph Representation of Activations](#graph-representation-of-activations)\n  - [Composed Graph Visualization](#composed-graph-visualization)\n\n\n## Online Example\n\nThe link below runs an online example of our library, in the Jupyter platform 
running over the Binder server:\n\n[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/luizcelsojr/LLM-MRI/v01.2?labpath=examples%2FEmotions.ipynb)\n\n## Instalation\n\nTo see LLM-MRI in action on your own data:\n\n```\npip install llm_mri\n```\n\n## Usage\n\nFirstly, the user needs to import the `LLM-MRI` and `matplotlib.pyplot` packages:\n\n```\nfrom llm_mri import LLM_MRI\nimport matplotlib.pyplot as plt\n```\nThe user also needs to specify the Hugging Face Dataset that will be used to process the model's activations. There are two ways to do this:\n\n\n- Load the Dataset from Hugging Face Hub: \n  ```\n  dataset_url = \"https://huggingface.co/datasets/dataset_link\"\n  dataset = load_dataset(\"csv\", data_files=dataset_url)\n  ```\n- If you already have the dataset loaded on your machine, you can use the _load_from_disk_ function:\n  ```\n  dataset = load_from_disk(dataset_path) # Specify the Dataset's path\n  ```\n> Make sure that the selected dataset is a HuggingFace Dataset, and contains the columns \"text\" and \"label\", the last one being \"ClassLabel\" type. 
For more instructions on how to make this conversion, check out some of the examples on the [GitHub documentation.](https://github.com/explic-ai/LLM-MRI/tree/main/examples)\n\n\nNext, the user selects the model to be used as a string:\n```\nmodel_ckpt = \"distilbert/distilbert-base-multilingual-cased\"\n```\nThen, the user instantiates `LLM-MRI`, to apply the methods defined on Functions:\n```\nllm_mri = LLM_MRI(model=model_ckpt, device=\"cpu\", dataset=dataset)\n```\n## Functions\nThe library's functionality is divided into the following sections:\n\n### Activation Extraction: \nAs the user inputs the model and corpus to be analyzed, the dimensionality of the model's hidden layers is reduced, enabling visualization as an NxN grid.\n  ```\n  llm_mri.process_activation_areas(map_dimension)\n  ```\n\n\n  \n### Heatmap representation of activations:\nThis includes the _get_layer_image_ function, which transforms the NxN grid for a selected layer into a heatmap. In this heatmap, each cell represents the number of activations that different regions on a determined layer received for the provided corpus. Additionally, users can visualize activations for a specific label.\n  ```\n  fig = llm_mri.get_layer_image(layer, category)\n  ```\n![hidden_state_1_true](https://github.com/user-attachments/assets/0bfbc90e-2bb9-4bd0-aa20-68c67608189f)\n\n\n\n  \n### Graph Representation of Activations:\nUsing the _get_graph_ function, the module connects regions from neighboring layers based on co-activations to form a graph representing the entire network. The graph's edges can also be colored according to different labels, allowing the user to identify the specific category that activated each neighboring node.\n\n> **_colormap:_**  The default used colormap is the 'coolwarm'. More can be found on [matplotlib.colors](https://matplotlib.org/stable/users/explain/colors/colormaps.html). 
We recommend the use of a 'Diverging' colormap for better visualization.\n> **_fix_node_positions:_**  'True' keeps the nodes and edges at the same positions, independently of the categories. This could be useful for comparing activations between distinct categories. Setting to 'False' does not allow this comparison, although the graph will be more easily visualized.\n\n   ```\n   graph = llm_mri.get_graph(category)\n   graph_image = llm_mri.get_graph_image(graph, colormap, fix_node_positions)\n  ```\n![true](https://github.com/user-attachments/assets/98b006ad-1e1a-40c1-9259-66e0496203b8)\n\n\n\n### Composed Graph Visualization:\nThe user is also able to obtain a composed visualization of two different categories using the _get_composed_graph_ function. By setting a category, each edge is colored based on the designated label, so the user is able to see which document label activated each region. Additionally, the user can select a colormap, where node colors reflect the label that most strongly activated them. Nodes colored white indicate equal activation by both categories, as white represents the midpoint of the color spectrum.\n```\ng_composed = llm_mri.get_composed_graph(\"true\", \"fake\")\ng_composed_img = llm_mri.get_graph_image(g_composed)\n```\n\n![fake_and_true_graph](https://github.com/user-attachments/assets/7ca1c194-045f-45fd-a2a7-33941fe0dc86)\n\n\n\n\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Package to visualize LLM's Neural Networks activation regions",
    "version": "0.1.9",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8b7fa5c594723db8ff498487e913ec0154c202b2e92ebdb525b46d544ac59f8c",
                "md5": "a78d00879192210a15a445b9b924684f",
                "sha256": "be08e615b397332b0fee895cc2ce3fd50f6cd4c7fb65205e4045a91dd8d03755"
            },
            "downloads": -1,
            "filename": "llm_mri-0.1.9-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a78d00879192210a15a445b9b924684f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.10",
            "size": 13835,
            "upload_time": "2025-01-09T20:02:56",
            "upload_time_iso_8601": "2025-01-09T20:02:56.853631Z",
            "url": "https://files.pythonhosted.org/packages/8b/7f/a5c594723db8ff498487e913ec0154c202b2e92ebdb525b46d544ac59f8c/llm_mri-0.1.9-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0aaa87a5942be863b61726ff79be09c44cb0bc467023fb193ec1071349a4f931",
                "md5": "4d55ec65246166d81f0d238773be5b26",
                "sha256": "87f9bf4df01a9d9728693c284f7816fa49a0839ae33801768cab03b09c07948c"
            },
            "downloads": -1,
            "filename": "llm_mri-0.1.9.tar.gz",
            "has_sig": false,
            "md5_digest": "4d55ec65246166d81f0d238773be5b26",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.10",
            "size": 14873,
            "upload_time": "2025-01-09T20:02:59",
            "upload_time_iso_8601": "2025-01-09T20:02:59.260835Z",
            "url": "https://files.pythonhosted.org/packages/0a/aa/87a5942be863b61726ff79be09c44cb0bc467023fb193ec1071349a4f931/llm_mri-0.1.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-09 20:02:59",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "llm_mri"
}
