# 1D, 2D, and 3D Sinusoidal Positional Encoding (PyTorch and TensorFlow)
![Code Coverage](./svgs/cov.svg)
[![PyPI version](https://badge.fury.io/py/positional-encodings.svg)](https://badge.fury.io/py/positional-encodings)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
![A 2D Example](./example.png)
This is a practical, easy-to-install implementation of 1D, 2D, and 3D
sinusoidal positional encodings for PyTorch and TensorFlow.
It can encode tensors of the form `(batchsize, x, ch)`, `(batchsize,
x, y, ch)`, and `(batchsize, x, y, z, ch)`, where the positional encodings are
calculated along the `ch` dimension. The [Attention is All You
Need](https://arxiv.org/pdf/1706.03762.pdf) paper introduced positional
encodings in only one dimension; this library extends them to 2 and 3 dimensions.
This also works on tensors of the form `(batchsize, ch, x)`, etc. See the usage section below for more information.
**NOTE**: The import syntax has changed as of version `6.0.1`. See the section below for details.
To install, simply run:
```
pip install positional-encodings[pytorch,tensorflow]
```
You can also install the PyTorch and TensorFlow encodings individually with the
following commands:
* For a PyTorch only installation, run `pip install positional-encodings[pytorch]`
* For a TensorFlow only installation, run `pip install positional-encodings[tensorflow]`
## Usage:
### PyTorch
The repo comes with three main positional encoding classes,
`PositionalEncoding{1,2,3}D`. In addition, there is a `Summer` class that adds
the positional encodings to the input tensor.
```python3
import torch
from positional_encodings.torch_encodings import PositionalEncoding1D, PositionalEncoding2D, PositionalEncoding3D, Summer
# Returns the position encoding only
p_enc_1d_model = PositionalEncoding1D(10)
# Return the inputs with the position encoding added
p_enc_1d_model_sum = Summer(PositionalEncoding1D(10))
x = torch.rand(1,6,10)
penc_no_sum = p_enc_1d_model(x) # penc_no_sum.shape == (1, 6, 10)
penc_sum = p_enc_1d_model_sum(x)
print(penc_no_sum + x == penc_sum) # True
```
```python3
p_enc_2d = PositionalEncoding2D(8)
y = torch.zeros((1,6,2,8))
print(p_enc_2d(y).shape) # (1, 6, 2, 8)
p_enc_3d = PositionalEncoding3D(11)
z = torch.zeros((1,5,6,4,11))
print(p_enc_3d(z).shape) # (1, 5, 6, 4, 11)
```
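The `Summer` wrapper works with the 2D and 3D encodings in the same way. A
minimal sketch (reusing `p_enc_2d` from the block above; the check mirrors the
1D example):

```python3
p_enc_2d_sum = Summer(PositionalEncoding2D(8))
y = torch.rand((1,6,2,8))
# The summed output equals the input plus the standalone encoding.
print(torch.allclose(p_enc_2d_sum(y), y + p_enc_2d(y))) # True
```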
For tensors of the form `(batchsize, ch, x)` or their 2D and 3D
counterparts, add the word `Permute` before the number in the class name; e.g.
for a 1D input of size `(batchsize, ch, x)`, use `PositionalEncodingPermute1D`
instead of `PositionalEncoding1D`.
```python3
import torch
from positional_encodings.torch_encodings import PositionalEncodingPermute3D
p_enc_3d = PositionalEncodingPermute3D(11)
z = torch.zeros((1,11,5,6,4))
print(p_enc_3d(z).shape) # (1, 11, 5, 6, 4)
```
Note: to override the output dtype, specify it when creating the encoding:
```python3
p_enc_3d = PositionalEncodingPermute3D(11, dtype_override=torch.float64)
```
This is particularly useful when the input tensor has an `int` dtype, since
without the override the output would be all zeros (see issue #39).
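For example, a minimal sketch feeding an integer tensor through the class shown
above (same shapes as the earlier example):

```python3
import torch
from positional_encodings.torch_encodings import PositionalEncodingPermute3D

# Integer input: without dtype_override the sinusoids would be cast to int and truncate to zero.
z_int = torch.zeros((1,11,5,6,4), dtype=torch.long)
p_enc_3d = PositionalEncodingPermute3D(11, dtype_override=torch.float64)
print(p_enc_3d(z_int).dtype) # torch.float64
```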
### TensorFlow Keras
TensorFlow is also supported. Simply prepend all class names with `TF`.
```python3
import tensorflow as tf
from positional_encodings.tf_encodings import TFPositionalEncoding2D, TFSummer
# Returns the position encoding only
p_enc_2d = TFPositionalEncoding2D(170)
y = tf.zeros((1,8,6,2))
print(p_enc_2d(y).shape) # (1, 8, 6, 2)
# Return the inputs with the position encoding added
add_p_enc_2d = TFSummer(TFPositionalEncoding2D(170))
y = tf.ones((1,8,6,2))
print(add_p_enc_2d(y) - p_enc_2d(y)) # tf.ones((1,8,6,2))
```
## Changes as of version `6.0.1`
Before `6.0.1`, users had to install both the `tensorflow` and the
`torch` packages, both of which are quite large. Now the packages can be
installed individually, but the imports have to be updated:
If using PyTorch:
```
from positional_encodings import * -> from positional_encodings.torch_encodings import *
```
If using TensorFlow:
```
from positional_encodings import * -> from positional_encodings.tf_encodings import *
```
## Formulas
The formulas for the positional encodings are as follows:
1D:
```
PE(x,2i) = sin(x/10000^(2i/D))
PE(x,2i+1) = cos(x/10000^(2i/D))
Where:
x is a point in 1d space
i is an integer in [0, D/2), where D is the size of the ch dimension
```
2D:
```
PE(x,y,2i) = sin(x/10000^(4i/D))
PE(x,y,2i+1) = cos(x/10000^(4i/D))
PE(x,y,2j+D/2) = sin(y/10000^(4j/D))
PE(x,y,2j+1+D/2) = cos(y/10000^(4j/D))
Where:
(x,y) is a point in 2d space
i,j are integers in [0, D/4), where D is the size of the ch dimension
```
3D:
```
PE(x,y,z,2i) = sin(x/10000^(6i/D))
PE(x,y,z,2i+1) = cos(x/10000^(6i/D))
PE(x,y,z,2j+D/3) = sin(y/10000^(6j/D))
PE(x,y,z,2j+1+D/3) = cos(y/10000^(6j/D))
PE(x,y,z,2k+2D/3) = sin(z/10000^(6k/D))
PE(x,y,z,2k+1+2D/3) = cos(z/10000^(6k/D))
Where:
(x,y,z) is a point in 3d space
i,j,k are integers in [0, D/6), where D is the size of the ch dimension
```
The 3D formula is just a natural extension of the 2D positional encoding used
in [this](https://arxiv.org/pdf/1908.11415.pdf) paper.
Don't worry if the channel size `D` is not divisible by 2 (1D), 4 (2D), or 6
(3D); all the necessary padding is taken care of internally.
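For reference, here is a minimal sketch of the 1D formula above in PyTorch. It
is a direct transcription of the equations (assuming an even channel size `D`);
the library additionally handles batching, padding, and dtypes for you:

```python3
import torch

def sinusoidal_1d(length, d):
    # Direct transcription of the 1D formula; d is assumed even here.
    pos = torch.arange(length, dtype=torch.float32).unsqueeze(1)  # (length, 1)
    i = torch.arange(d // 2, dtype=torch.float32).unsqueeze(0)    # (1, d/2)
    angle = pos / torch.pow(10000.0, 2 * i / d)                   # (length, d/2)
    pe = torch.zeros(length, d)
    pe[:, 0::2] = torch.sin(angle)  # PE(x, 2i)   = sin(x / 10000^(2i/D))
    pe[:, 1::2] = torch.cos(angle)  # PE(x, 2i+1) = cos(x / 10000^(2i/D))
    return pe

print(sinusoidal_1d(6, 10).shape)  # torch.Size([6, 10])
```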
## Thank you
Thanks to [this](https://github.com/wzlxjtu/PositionalEncoding2D) repo for the inspiration for this method.
## Citations
1D:
```bibtex
@inproceedings{vaswani2017attention,
title={Attention is all you need},
author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
booktitle={Advances in neural information processing systems},
pages={5998--6008},
year={2017}
}
```
2D:
```bibtex
@misc{wang2019translating,
title={Translating Math Formula Images to LaTeX Sequences Using Deep Neural Networks with Sequence-level Training},
author={Zelun Wang and Jyh-Charn Liu},
year={2019},
eprint={1908.11415},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
3D:
Coming soon!