# 1D, 2D, and 3D Sinusoidal Positional Encoding (PyTorch and TensorFlow)
![Code Coverage](./svgs/cov.svg)
[![PyPI version](https://badge.fury.io/py/positional-encodings.svg)](https://badge.fury.io/py/positional-encodings)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
![A 2D Example](./example.png)
This is a practical, easy-to-install implementation of 1D, 2D, and 3D
sinusoidal positional encodings for PyTorch and TensorFlow.
It can encode tensors of the form `(batchsize, x, ch)`, `(batchsize,
x, y, ch)`, and `(batchsize, x, y, z, ch)`, where the positional encodings are
calculated along the `ch` dimension. The [Attention is All You
Need](https://arxiv.org/pdf/1706.03762.pdf) paper introduced positional
encodings in only one dimension; this library extends them to 2 and 3 dimensions.
This also works on tensors of the form `(batchsize, ch, x)`, etc. See the usage section below for more information.
**NOTE**: The import syntax has changed as of version `6.0.1`. See the section below for details.
To install, simply run:
```
pip install positional-encodings[pytorch,tensorflow]
```
You can also install the PyTorch and TensorFlow encodings individually with the
following commands:
* For a PyTorch only installation, run `pip install positional-encodings[pytorch]`
* For a TensorFlow only installation, run `pip install positional-encodings[tensorflow]`
## Usage:
### PyTorch
The repo comes with three main positional encoding classes,
`PositionalEncoding{1,2,3}D`. In addition, there is a `Summer` class that adds
the positional encodings to the input tensor.
```python3
import torch
from positional_encodings.torch_encodings import PositionalEncoding1D, PositionalEncoding2D, PositionalEncoding3D, Summer
# Returns the position encoding only
p_enc_1d_model = PositionalEncoding1D(10)
# Return the inputs with the position encoding added
p_enc_1d_model_sum = Summer(PositionalEncoding1D(10))
x = torch.rand(1,6,10)
penc_no_sum = p_enc_1d_model(x) # penc_no_sum.shape == (1, 6, 10)
penc_sum = p_enc_1d_model_sum(x)
print(penc_no_sum + x == penc_sum) # True
```
```python3
p_enc_2d = PositionalEncoding2D(8)
y = torch.zeros((1,6,2,8))
print(p_enc_2d(y).shape) # (1, 6, 2, 8)
p_enc_3d = PositionalEncoding3D(11)
z = torch.zeros((1,5,6,4,11))
print(p_enc_3d(z).shape) # (1, 5, 6, 4, 11)
```
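The `Summer` wrapper works with the 2D and 3D encodings in the same way. A
minimal sketch (reusing `p_enc_2d` from the block above; the check mirrors the
1D example):

```python3
p_enc_2d_sum = Summer(PositionalEncoding2D(8))
y = torch.rand((1,6,2,8))
# The summed output equals the input plus the standalone encoding.
print(torch.allclose(p_enc_2d_sum(y), y + p_enc_2d(y))) # True
```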
For tensors of the form `(batchsize, ch, x)` or their 2D and 3D
counterparts, add the word `Permute` before the number in the class name; e.g.
for a 1D input of size `(batchsize, ch, x)`, use `PositionalEncodingPermute1D`
instead of `PositionalEncoding1D`.
```python3
import torch
from positional_encodings.torch_encodings import PositionalEncodingPermute3D
p_enc_3d = PositionalEncodingPermute3D(11)
z = torch.zeros((1,11,5,6,4))
print(p_enc_3d(z).shape) # (1, 11, 5, 6, 4)
```
Note: to override the output dtype, specify it when creating the encoding:
```python3
p_enc_3d = PositionalEncodingPermute3D(11, dtype_override=torch.float64)
```
This is particularly useful when the input tensor has an `int` dtype, since
without the override the output would be all zeros (see issue #39).
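For example, a minimal sketch feeding an integer tensor through the class shown
above (same shapes as the earlier example):

```python3
import torch
from positional_encodings.torch_encodings import PositionalEncodingPermute3D

# Integer input: without dtype_override the sinusoids would be cast to int and truncate to zero.
z_int = torch.zeros((1,11,5,6,4), dtype=torch.long)
p_enc_3d = PositionalEncodingPermute3D(11, dtype_override=torch.float64)
print(p_enc_3d(z_int).dtype) # torch.float64
```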
### TensorFlow Keras
TensorFlow is also supported. Simply prepend all class names with `TF`.
```python3
import tensorflow as tf
from positional_encodings.tf_encodings import TFPositionalEncoding2D, TFSummer
# Returns the position encoding only
p_enc_2d = TFPositionalEncoding2D(170)
y = tf.zeros((1,8,6,2))
print(p_enc_2d(y).shape) # (1, 8, 6, 2)
# Return the inputs with the position encoding added
add_p_enc_2d = TFSummer(TFPositionalEncoding2D(170))
y = tf.ones((1,8,6,2))
print(add_p_enc_2d(y) - p_enc_2d(y)) # tf.ones((1,8,6,2))
```
## Changes as of version `6.0.1`
Before `6.0.1`, users had to install both the `tensorflow` and the
`torch` packages, both of which are quite large. Now the packages can be
installed individually, but the imports have to be updated:
If using PyTorch:
```
from positional_encodings import * -> from positional_encodings.torch_encodings import *
```
If using TensorFlow:
```
from positional_encodings import * -> from positional_encodings.tf_encodings import *
```
## Formulas
The formulas for the positional encodings are as follows:
1D:
```
PE(x,2i) = sin(x/10000^(2i/D))
PE(x,2i+1) = cos(x/10000^(2i/D))
Where:
x is a point in 1d space
i is an integer in [0, D/2), where D is the size of the ch dimension
```
2D:
```
PE(x,y,2i) = sin(x/10000^(4i/D))
PE(x,y,2i+1) = cos(x/10000^(4i/D))
PE(x,y,2j+D/2) = sin(y/10000^(4j/D))
PE(x,y,2j+1+D/2) = cos(y/10000^(4j/D))
Where:
(x,y) is a point in 2d space
i,j are integers in [0, D/4), where D is the size of the ch dimension
```
3D:
```
PE(x,y,z,2i) = sin(x/10000^(6i/D))
PE(x,y,z,2i+1) = cos(x/10000^(6i/D))
PE(x,y,z,2j+D/3) = sin(y/10000^(6j/D))
PE(x,y,z,2j+1+D/3) = cos(y/10000^(6j/D))
PE(x,y,z,2k+2D/3) = sin(z/10000^(6k/D))
PE(x,y,z,2k+1+2D/3) = cos(z/10000^(6k/D))
Where:
(x,y,z) is a point in 3d space
i,j,k are integers in [0, D/6), where D is the size of the ch dimension
```
The 3D formula is just a natural extension of the 2D positional encoding used
in [this](https://arxiv.org/pdf/1908.11415.pdf) paper.
Don't worry if the channel size `D` is not divisible by 2 (1D), 4 (2D), or 6
(3D); all the necessary padding is taken care of internally.
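For reference, here is a minimal sketch of the 1D formula above in PyTorch. It
is a direct transcription of the equations (assuming an even channel size `D`);
the library additionally handles batching, padding, and dtypes for you:

```python3
import torch

def sinusoidal_1d(length, d):
    # Direct transcription of the 1D formula; d is assumed even here.
    pos = torch.arange(length, dtype=torch.float32).unsqueeze(1)  # (length, 1)
    i = torch.arange(d // 2, dtype=torch.float32).unsqueeze(0)    # (1, d/2)
    angle = pos / torch.pow(10000.0, 2 * i / d)                   # (length, d/2)
    pe = torch.zeros(length, d)
    pe[:, 0::2] = torch.sin(angle)  # PE(x, 2i)   = sin(x / 10000^(2i/D))
    pe[:, 1::2] = torch.cos(angle)  # PE(x, 2i+1) = cos(x / 10000^(2i/D))
    return pe

print(sinusoidal_1d(6, 10).shape)  # torch.Size([6, 10])
```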
## Thank you
Thanks to [this](https://github.com/wzlxjtu/PositionalEncoding2D) repo for the inspiration for this method.
## Citations
1D:
```bibtex
@inproceedings{vaswani2017attention,
title={Attention is all you need},
author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
booktitle={Advances in neural information processing systems},
pages={5998--6008},
year={2017}
}
```
2D:
```bibtex
@misc{wang2019translating,
title={Translating Math Formula Images to LaTeX Sequences Using Deep Neural Networks with Sequence-level Training},
author={Zelun Wang and Jyh-Charn Liu},
year={2019},
eprint={1908.11415},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
3D:
Coming soon!