inspectus

- **Name**: inspectus
- **Version**: 0.1.5
- **Summary**: Analytics for LLMs
- **Home page**: https://github.com/labmlai/inspectus
- **Author**: labml.ai
- **Upload time**: 2024-07-08 08:38:40
- **Keywords**: llm
- **Requires Python**: not specified
- **License**: not specified
            [![PyPI - Python Version](https://badge.fury.io/py/inspectus.svg)](https://badge.fury.io/py/inspectus)
[![PyPI Status](https://pepy.tech/badge/inspectus)](https://pepy.tech/project/inspectus)
[![Twitter](https://img.shields.io/twitter/follow/labmlai?style=social)](https://twitter.com/labmlai?ref_src=twsrc%5Etfw)

# Inspectus

Inspectus is a versatile visualization tool for machine learning. It runs smoothly in Jupyter notebooks via an easy-to-use Python API.

## Content

- [Installation](#installation)
- [Attention Visualization](#attention-visualization)
  - [Preview](#preview)
  - [Components](#components)
  - [Usage](#usage)
  - [Tutorials](#tutorials)
    - [Huggingface model](#huggingface-model)
    - [Custom attention map](#custom-attention-map)
- [Distribution Plot](#distribution-plot)
  - [Preview](#preview-1)
  - [Usage](#usage-1)
  - [Sample Use case](#sample-use-case)
- [Setting up for Development](#setting-up-for-development)
- [Citing](#citing)

## Installation

```bash
pip install inspectus
```

## Attention Visualization

Inspectus provides visualization tools for attention mechanisms in deep learning models, 
with a set of comprehensive views that make it easier to understand how these models work.

### Preview

![Attention visualization](https://github.com/labmlai/inspectus/raw/main/assets/preview.gif)

*Click a token to select it and deselect all others. Click it again to select all tokens. 
To toggle the state of a single token, use shift+click.*

### Components

**Attention Matrix**:
Visualizes the attention scores between tokens, highlighting how each token focuses on others during processing.

**Query Token Heatmap**:
Shows the sum of attention scores between each query token and the selected key tokens.

**Key Token Heatmap**:
Shows the sum of attention scores between each key token and the selected query tokens.

**Dimension Heatmap**:
Shows the sum of attention scores for each item along a dimension (layers and heads), normalized over that dimension.

### Usage

Import the library

```python
import inspectus
```

Simple usage

```python
# attn: Attention map; a 2-4D tensor or attention maps from Huggingface transformers
inspectus.attention(attn, tokens)
```

For different query and key tokens

```python
inspectus.attention(attns, query_tokens, key_tokens)
```
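Attention weights from a transformer are softmax-normalized over the key tokens, so each row of a 2D attention map sums to 1. A minimal standard-library sketch of building such a matrix (the shape convention and token names are illustrative, not taken from the inspectus API):

```python
import math
import random

def softmax(scores):
    """Normalize raw scores into attention weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

query_tokens = ['a', 'b', 'c']
key_tokens = ['d', 'e', 'f']

# Build a (num_queries x num_keys) attention matrix from random scores.
random.seed(0)
attn = [softmax([random.random() for _ in key_tokens]) for _ in query_tokens]

# Each row is a probability distribution over the key tokens.
# In a notebook, this matrix could then be passed to:
# inspectus.attention(attn, query_tokens, key_tokens)
```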

For detailed API documentation, please refer to the [official documentation - wip]().

### Tutorials

#### Huggingface model

```python
from transformers import AutoTokenizer, GPT2LMHeadModel, AutoConfig
import torch
import inspectus

# Initialize the tokenizer and model
context_length = 128
tokenizer = AutoTokenizer.from_pretrained("huggingface-course/code-search-net-tokenizer")

config = AutoConfig.from_pretrained(
    "gpt2",
    vocab_size=len(tokenizer),
    n_ctx=context_length,
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

model = GPT2LMHeadModel(config)

# Tokenize the input text
text = 'The quick brown fox jumps over the lazy dog'
tokenized = tokenizer(
    text,
    return_tensors='pt',
    return_offsets_mapping=True
)
input_ids = tokenized['input_ids']

tokens = [text[s: e] for s, e in tokenized['offset_mapping'][0]]

with torch.no_grad():
    res = model(input_ids=input_ids.to(model.device), output_attentions=True)

# Visualize the attention maps using the Inspectus library
inspectus.attention(res['attentions'], tokens)
```
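The `offset_mapping` trick above recovers each token's surface text by slicing the original string with character offsets. A standalone sketch with hand-written offsets (the offsets here are illustrative, not actual tokenizer output):

```python
text = 'The quick brown fox'

# Hypothetical (start, end) character offsets, as a tokenizer might return.
offset_mapping = [(0, 3), (3, 9), (9, 15), (15, 19)]

# Slice the source string to recover each token's surface form.
tokens = [text[s:e] for s, e in offset_mapping]
# → ['The', ' quick', ' brown', ' fox']
```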

Check out the notebook here: [Huggingface Tutorial](https://github.com/labmlai/inspectus/blob/main/notebooks/gpt2.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/labmlai/inspectus/blob/main/notebooks/gpt2.ipynb)


#### Custom attention map

```python
import numpy as np
import inspectus

# 2D attention representing attention values between Query and Key tokens
attn = np.random.rand(3, 3)

# Visualize the attention values using the Inspectus library
# The first argument is the attention matrix
# The second argument is the list of query tokens
# The third argument is the list of key tokens
inspectus.attention(attn, ['a', 'b', 'c'], ['d', 'e', 'f'])
```

Check out the notebook here: [Custom attention map tutorial](https://github.com/labmlai/inspectus/blob/main/notebooks/custom_attn.ipynb)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/labmlai/inspectus/blob/main/notebooks/custom_attn.ipynb)

## Distribution Plot
The distribution plot shows how a series of data is distributed across steps. At each step, 
percentiles of the data are computed, and up to five bands are drawn between nine percentile levels 
(0, 6.68, 15.87, 30.85, 50.00, 69.15, 84.13, 93.32, 100.00).
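The band boundaries can be read as percentile levels. As a sketch of how such boundaries might be computed, here is a plain-Python percentile with linear interpolation (the library's exact interpolation method is an assumption):

```python
def percentile(data, p):
    """Return the p-th percentile (0-100) of data using linear interpolation."""
    values = sorted(data)
    k = (len(values) - 1) * p / 100.0
    lo, hi = int(k), min(int(k) + 1, len(values) - 1)
    frac = k - lo
    return values[lo] * (1 - frac) + values[hi] * frac

levels = [0, 6.68, 15.87, 30.85, 50.00, 69.15, 84.13, 93.32, 100.00]
data = list(range(101))  # 0..100, so the p-th percentile equals p
bands = [percentile(data, p) for p in levels]
```

These nine levels correspond to whole standard deviations of a normal distribution, so for roughly Gaussian data the bands are evenly spaced in sigma.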


### Preview

![Distribution Plot visualization](https://github.com/labmlai/inspectus/raw/main/assets/dist_preview.gif)

### Usage

```python
import inspectus

inspectus.distribution({'x': list(range(100))})
```

To focus on parts of the plot and zoom in, the minimap can be used. To select a single plot, use the legend on the top right.

For a comprehensive usage guide, please check the notebook here: [Distribution Plot Tutorial](https://github.com/labmlai/inspectus/blob/main/notebooks/distribution_plot.ipynb)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/labmlai/inspectus/blob/main/notebooks/distribution_plot.ipynb)

### Sample Use case

This plot can be used to identify outliers in the data. The following 
notebook demonstrates how to use the distribution plot to identify outliers in the MNIST training loss.
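A spike in the training loss is exactly the kind of outlier the plot makes visible. To mimic that check by hand, one simple stand-in criterion is flagging steps more than two standard deviations from the mean (the threshold and the toy loss values below are illustrative assumptions):

```python
import statistics

# Toy per-step training losses with a spike at step 5.
losses = [0.9, 0.8, 0.7, 0.65, 0.6, 5.0, 0.55, 0.5]

mean = statistics.mean(losses)
stdev = statistics.stdev(losses)

# Flag steps whose loss is more than two standard deviations from the mean.
outliers = [i for i, v in enumerate(losses) if abs(v - mean) > 2 * stdev]
# → [5]
```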

[MNIST](https://github.com/labmlai/inspectus/blob/main/notebooks/mnist.ipynb)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/labmlai/inspectus/blob/main/notebooks/mnist.ipynb)

## Setting up for Development

[Development Docs](https://github.com/labmlai/inspectus/blob/main/development.md)

## Citing

If you use Inspectus for academic research, please cite the library using the following BibTeX entry.

```bibtex
@misc{inspectus,
 author = {Varuna Jayasiri and Lakshith Nishshanke},
 title = {inspectus: A visualization and analytics tool for large language models},
 year = {2024},
 url = {https://github.com/labmlai/inspectus},
}
```


            
