textualheatmap


* Name: textualheatmap
* Version: 1.2.0
* Summary: Create interactive textual heatmaps for Jupyter notebooks
* Upload time: 2024-05-30 22:49:11
* Requires Python: >=3.8
* License: MIT
* Keywords: saliency, heatmap, text, textual, jupyter, colab, interactive

# textualheatmap

**Create interactive textual heatmaps for Jupyter notebooks.**

I originally published this visualization method in my Distill paper
https://distill.pub/2019/memorization-in-rnns/. There, it is used
as a saliency map that shows which parts of a sentence are used to predict
the next word. However, the visualization method is more general-purpose than
that and can be used for any kind of textual heatmap.

`textualheatmap` works with Python 3.8 or newer and is distributed under the
MIT license.

![Gif of saliency in RNN models](gifs/show_meta.gif)

The notebook below is an end-to-end example of how to use the
[HuggingFace 🤗 Transformers](https://github.com/huggingface/transformers) Python
module to create a textual saliency map showing how each masked token is predicted.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AndreasMadsen/python-textualheatmap/blob/master/notebooks/huggingface_bert_example.ipynb)


![Gif of saliency in BERT models](gifs/huggingface_bert.gif)

## Install

```bash
pip install -U textualheatmap
```

## API

* [`textualheatmap.TextualHeatmap`](textualheatmap/textual_heatmap.py)

## Examples

### Example of sequential-character model with metadata visible

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AndreasMadsen/python-textualheatmap/blob/master/notebooks/general_example.ipynb)

```python
from textualheatmap import TextualHeatmap

data = [[
    # GRU data
    {"token":" ",
     "meta":["the","one","of"],
     "heat":[1,0,0,0,0,0,0,0,0]},
    {"token":"c",
     "meta":["can","called","century"],
     "heat":[1,0.22,0,0,0,0,0,0,0]},
    {"token":"o",
     "meta":["country","could","company"],
     "heat":[0.57,0.059,1,0,0,0,0,0,0]},
    {"token":"n",
     "meta":["control","considered","construction"],
     "heat":[1,0.20,0.11,0.84,0,0,0,0,0]},
    {"token":"t",
     "meta":["control","continued","continental"],
     "heat":[0.27,0.17,0.052,0.44,1,0,0,0,0]},
    {"token":"e",
     "meta":["context","content","contested"],
     "heat":[0.17,0.039,0.034,0.22,1,0.53,0,0,0]},
    {"token":"x",
     "meta":["context","contexts","contemporary"],
     "heat":[0.17,0.0044,0.021,0.17,1,0.90,0.48,0,0]},
    {"token":"t",
     "meta":["context","contexts","contentious"],
     "heat":[0.14,0.011,0.034,0.14,0.68,1,0.80,0.86,0]},
    {"token":" ",
     "meta":["of","and","the"],
     "heat":[0.014,0.0063,0.0044,0.011,0.034,0.10,0.32,0.28,1]},
    # ...
],[
    # LSTM data
    # ...
]]

heatmap = TextualHeatmap(
    width = 600,
    show_meta = True,
    facet_titles = ['GRU', 'LSTM']
)
# Set the data and render the plot. This can be called again to
# replace the data.
heatmap.set_data(data)
# Focus on the token with the given index. Especially useful when
# `interactive=False` is used in `TextualHeatmap`.
heatmap.highlight(159)
```

![Shows saliency with predicted words as metadata](gifs/show_meta.gif)
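
In practice the `data` list is usually generated from model outputs rather than typed by hand. The following is a minimal sketch of one way to build a single facet. `build_facet` and its parameter names are hypothetical and not part of `textualheatmap`. Scaling each `heat` row so that its largest value is 1 is an assumption inferred from the example data above, not a documented requirement.

```python
from typing import Dict, List, Sequence


def build_facet(
    tokens: Sequence[str],
    saliency: Sequence[Sequence[float]],
    predictions: Sequence[Sequence[str]],
) -> List[Dict]:
    """Build one facet (one inner list of `data`) from per-token values.

    Hypothetical helper, not part of textualheatmap.
    """
    if not (len(tokens) == len(saliency) == len(predictions)):
        raise ValueError("tokens, saliency and predictions must align")

    facet = []
    for token, row, meta in zip(tokens, saliency, predictions):
        if len(row) != len(tokens):
            raise ValueError("each saliency row needs one value per token")
        # Assumption: scale each row so its largest value is 1, as in the
        # example data. Rows without a positive value are left unchanged.
        peak = max(row)
        heat = [value / peak for value in row] if peak > 0 else list(row)
        facet.append({"token": token, "meta": list(meta), "heat": heat})
    return facet


facet = build_facet(
    tokens=["c", "o", "n"],
    saliency=[[2.0, 0.0, 0.0], [1.0, 4.0, 0.0], [0.5, 1.0, 2.0]],
    predictions=[
        ["can", "called", "century"],
        ["country", "could", "company"],
        ["control", "considered", "construction"],
    ],
)
data = [facet]  # one facet; this is the shape passed to heatmap.set_data(data)
```

With two models, call the helper once per model and pass both facets in the same order as `facet_titles`.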

### Example of sequential-character model without metadata

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AndreasMadsen/python-textualheatmap/blob/master/notebooks/general_example.ipynb)

When `show_meta` is not `True`, the `meta` part of the `data` object has no effect.

```python
heatmap = TextualHeatmap(
    facet_titles = ['LSTM', 'GRU'],
    rotate_facet_titles = True
)
heatmap.set_data(data)
heatmap.highlight(159)
```

![Shows saliency without metadata](gifs/no_meta_and_rotated.gif)
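
The index passed to `highlight` is easier to choose when it is computed from the text. The sketch below finds the position of the last character of a substring in a character-level facet. `index_after` is a hypothetical helper, and it assumes that token positions are 0-based, which this README does not state.

```python
from typing import Dict, List


def index_after(facet: List[Dict], text: str) -> int:
    """Return the position of the last token of the first match of `text`.

    Hypothetical helper for character-level facets, where every entry's
    "token" is a single character. Assumes 0-based token positions.
    """
    if not text:
        raise ValueError("text must not be empty")
    joined = "".join(entry["token"] for entry in facet)
    start = joined.find(text)
    if start < 0:
        raise ValueError(f"{text!r} does not occur in the facet")
    return start + len(text) - 1


# Only the "token" key matters for the lookup, so the other keys are
# left out of this demo facet.
facet = [{"token": ch} for ch in " context of"]
position = index_after(facet, "context")
# heatmap.highlight(position)
```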

### Example of non-sequential-word model

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AndreasMadsen/python-textualheatmap/blob/master/notebooks/bert_hardcoded_example.ipynb)

Set `'format': True` on an entry in the `data` object to indicate a token that
is not directly used by the model, such as the whitespace between words. This is
useful if word or sub-word tokenization is used.


```python
data = [[
{'token': '[CLS]',
 'meta': ['', '', ''],
 'heat': [1, 0, 0, 0, 0, ...]},
{'token': ' ',
 'format': True},
{'token': 'context',
 'meta': ['today', 'and', 'thus'],
 'heat': [0.13, 0.40, 0.23, 1.0, 0.56, ...]},
{'token': ' ',
 'format': True},
{'token': 'the',
 'meta': ['##ual', 'the', '##ually'],
 'heat': [0.11, 1.0, 0.34, 0.58, 0.59, ...]},
{'token': ' ',
 'format': True},
{'token': 'formal',
 'meta': ['formal', 'academic', 'systematic'],
 'heat': [0.13, 0.74, 0.26, 0.35, 1.0, ...]},
{'token': ' ',
 'format': True},
{'token': 'study',
 'meta': ['##ization', 'study', '##ity'],
 'heat': [0.09, 0.27, 0.19, 1.0, 0.26, ...]}
]]

heatmap = TextualHeatmap(facet_titles = ['BERT'], show_meta=True)
heatmap.set_data(data)
```

![Shows saliency in a BERT model, using sub-word tokenization](gifs/sub_word_tokenized.gif)
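
The whitespace entries in the example above can be generated instead of written by hand. The following sketch inserts a format-only separator between consecutive model tokens. `with_spaces` is a hypothetical helper. It passes existing entries through unchanged, because this README does not specify how `heat` positions relate to format entries.

```python
from typing import Dict, List


def with_spaces(entries: List[Dict], separator: str = " ") -> List[Dict]:
    """Insert a format-only separator entry between consecutive entries.

    Hypothetical helper, not part of textualheatmap. Existing entries
    are not modified.
    """
    result: List[Dict] = []
    for i, entry in enumerate(entries):
        if i > 0:
            result.append({"token": separator, "format": True})
        result.append(entry)
    return result


# "heat" is omitted here to keep the demo short; real entries need it.
words = [
    {"token": "formal", "meta": ["formal", "academic", "systematic"]},
    {"token": "study", "meta": ["##ization", "study", "##ity"]},
]
facet = with_spaces(words)
```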

## Citation

If you use this in a publication, please cite my [Distill publication](https://distill.pub/2019/memorization-in-rnns/) where I first demonstrated this visualization method.

```bib
@article{madsen2019visualizing,
  author = {Madsen, Andreas},
  title = {Visualizing memorization in RNNs},
  journal = {Distill},
  year = {2019},
  note = {https://distill.pub/2019/memorization-in-rnns},
  doi = {10.23915/distill.00016}
}
```

## Sponsor

Sponsored by <a href="https://www.nearform.com/research/">NearForm Research</a>.

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "textualheatmap",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "saliency, heatmap, text, textual, jupyter, colab, interactive",
    "author": null,
    "author_email": "Andreas Madsen <amwebdk@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/24/0a/a2e90d14891f84c1e6e8e70e0b0ba5866523e9ae3350accfc89d53ff9171/textualheatmap-1.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Create interactive textual heat maps for Jupiter notebooks",
    "version": "1.2.0",
    "project_urls": null,
    "split_keywords": [
        "saliency",
        " heatmap",
        " text",
        " textual",
        " jupyter",
        " colab",
        " interactive"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "178b6a1b2fb7b3aec829c677039eab566d67a020edb2475ad4d65d26101ea673",
                "md5": "9fe04ae723f072bc14e2f92ca7a60be9",
                "sha256": "27beaaaa36e84d7261f7c0bc0588fc90d971e86a9df89e7af8f1414dbdbe2a9c"
            },
            "downloads": -1,
            "filename": "textualheatmap-1.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9fe04ae723f072bc14e2f92ca7a60be9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 10080,
            "upload_time": "2024-05-30T22:49:09",
            "upload_time_iso_8601": "2024-05-30T22:49:09.938523Z",
            "url": "https://files.pythonhosted.org/packages/17/8b/6a1b2fb7b3aec829c677039eab566d67a020edb2475ad4d65d26101ea673/textualheatmap-1.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "240aa2e90d14891f84c1e6e8e70e0b0ba5866523e9ae3350accfc89d53ff9171",
                "md5": "9400d0bf158429b611c2d6d930330716",
                "sha256": "0e7b24f8b8815db1690fe8b29525f86f38738c950df2b000b8001e9e2565f457"
            },
            "downloads": -1,
            "filename": "textualheatmap-1.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "9400d0bf158429b611c2d6d930330716",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 11087,
            "upload_time": "2024-05-30T22:49:11",
            "upload_time_iso_8601": "2024-05-30T22:49:11.020075Z",
            "url": "https://files.pythonhosted.org/packages/24/0a/a2e90d14891f84c1e6e8e70e0b0ba5866523e9ae3350accfc89d53ff9171/textualheatmap-1.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-30 22:49:11",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "textualheatmap"
}
        