positional-encodings


Name: positional-encodings
Version: 6.0.4
Home page: https://github.com/tatp22/multidim-positional-encoding
Summary: 1D, 2D, and 3D Sinusoidal Positional Encodings in PyTorch
Upload time: 2024-10-23 07:33:48
Author: Peter Tatkowski
Requires Python: >=3.12
Keywords: transformers, attention
# 1D, 2D, and 3D Sinusoidal Positional Encoding (PyTorch and TensorFlow)

![Code Coverage](./svgs/cov.svg)
[![PyPI version](https://badge.fury.io/py/positional-encodings.svg)](https://badge.fury.io/py/positional-encodings)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

![A 2D Example](./example.png)

This is a practical, easy-to-install implementation of 1D, 2D, and 3D
sinusoidal positional encodings for PyTorch and TensorFlow.

It can encode tensors of the form `(batchsize, x, ch)`, `(batchsize,
x, y, ch)`, and `(batchsize, x, y, z, ch)`, where the positional encodings are
calculated along the `ch` dimension. The [Attention Is All You
Need](https://arxiv.org/pdf/1706.03762.pdf) paper introduced positional
encoding in one dimension only; this package extends the scheme to 2 and 3 dimensions.

This also works on tensors of the form `(batchsize, ch, x)`, etc. See the usage for more information.

**NOTE**: The import syntax changed as of version `6.0.1`. See the section below for details.

To install, simply run:

```
pip install positional-encodings[pytorch,tensorflow]
```

You can also install the PyTorch and TensorFlow encodings individually with the
following commands:

* For a PyTorch only installation, run `pip install positional-encodings[pytorch]`
* For a TensorFlow only installation, run `pip install positional-encodings[tensorflow]`

## Usage:

### Pytorch

The repo comes with three main positional encoding classes,
`PositionalEncoding{1,2,3}D`. In addition, there is a `Summer` class that adds
the positional encodings to the input tensor.

```python3
import torch
from positional_encodings.torch_encodings import PositionalEncoding1D, PositionalEncoding2D, PositionalEncoding3D, Summer

# Returns the position encoding only
p_enc_1d_model = PositionalEncoding1D(10)

# Return the inputs with the position encoding added
p_enc_1d_model_sum = Summer(PositionalEncoding1D(10))

x = torch.rand(1,6,10)
penc_no_sum = p_enc_1d_model(x) # penc_no_sum.shape == (1, 6, 10)
penc_sum = p_enc_1d_model_sum(x)
print(penc_no_sum + x == penc_sum) # True
```

```python3
p_enc_2d = PositionalEncoding2D(8)
y = torch.zeros((1,6,2,8))
print(p_enc_2d(y).shape) # (1, 6, 2, 8)

p_enc_3d = PositionalEncoding3D(11)
z = torch.zeros((1,5,6,4,11))
print(p_enc_3d(z).shape) # (1, 5, 6, 4, 11)
```

For tensors of the form `(batchsize, ch, x)` or their 2D and 3D
counterparts, insert the word `Permute` before the number in the class name; e.g.
for a 1D input of size `(batchsize, ch, x)`, use `PositionalEncodingPermute1D`
instead of `PositionalEncoding1D`.


```python3
import torch
from positional_encodings.torch_encodings import PositionalEncodingPermute3D

p_enc_3d = PositionalEncodingPermute3D(11)
z = torch.zeros((1,11,5,6,4))
print(p_enc_3d(z).shape) # (1, 11, 5, 6, 4)
```

Note: to override the output dtype, specify it when creating the encoding:

```python3
p_enc_3d = PositionalEncodingPermute3D(11, dtype_override=torch.float64)
```

This is particularly useful when the input tensor has an `int` dtype, since
otherwise the output would be all zeros (see issue #39).
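
For example (a minimal sketch of the scenario from issue #39; the integer input tensor here is made up for illustration):

```python3
import torch
from positional_encodings.torch_encodings import PositionalEncodingPermute3D

# A hypothetical integer-valued input of shape (batchsize, ch, x, y, z).
z_int = torch.zeros((1, 11, 5, 6, 4), dtype=torch.long)

# Without the override the encoding would inherit the integer dtype and
# collapse to all zeros, so request a floating-point output explicitly.
p_enc_3d = PositionalEncodingPermute3D(11, dtype_override=torch.float64)
print(p_enc_3d(z_int).dtype)  # torch.float64
```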

### TensorFlow Keras

This package also supports TensorFlow. Simply prepend all class names with `TF`.

```python3
import tensorflow as tf
from positional_encodings.tf_encodings import TFPositionalEncoding2D, TFSummer

# Returns the position encoding only
p_enc_2d = TFPositionalEncoding2D(170)
y = tf.zeros((1,8,6,2))
print(p_enc_2d(y).shape) # (1, 8, 6, 2)

# Return the inputs with the position encoding added
add_p_enc_2d = TFSummer(TFPositionalEncoding2D(170))
y = tf.ones((1,8,6,2))
print(add_p_enc_2d(y) - p_enc_2d(y)) # tf.ones((1,8,6,2))
```

## Changes as of version `6.0.1`

Before `6.0.1`, users had to install both the `tensorflow` and `torch`
packages, both of which are quite large. Now the packages can be installed
individually, but the imports have to be updated:

If using PyTorch:

```
from positional_encodings import * -> from positional_encodings.torch_encodings import *
```

If using TensorFlow:

```
from positional_encodings import * -> from positional_encodings.tf_encodings import *
```
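
If your code should run with whichever backend happens to be installed, one possible pattern (not part of the package, just a sketch) is a simple import guard:

```python3
# A minimal sketch, assuming exactly one of the two extras is installed.
try:
    from positional_encodings.torch_encodings import PositionalEncoding2D as Enc2D
except ImportError:
    from positional_encodings.tf_encodings import TFPositionalEncoding2D as Enc2D
```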

## Formulas

The formulas for the positional encodings are as follows:

1D:
```
PE(x,2i) = sin(x/10000^(2i/D))
PE(x,2i+1) = cos(x/10000^(2i/D))

Where:
x is a point in 1d space
i is an integer in [0, D/2), where D is the size of the ch dimension
```
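
As an illustration, the 1D table can be built directly from this formula. The sketch below is independent of the package (its values may differ from the library's output in implementation details) and assumes `D` is even:

```python3
import torch

def sinusoid_1d(length, D):
    # PE(x, 2i) = sin(x / 10000^(2i/D)), PE(x, 2i+1) = cos(x / 10000^(2i/D))
    x = torch.arange(length, dtype=torch.float32).unsqueeze(1)  # positions, shape (length, 1)
    i = torch.arange(D // 2, dtype=torch.float32)               # channel-pair index, shape (D/2,)
    angles = x / (10000 ** (2 * i / D))                         # shape (length, D/2)
    pe = torch.zeros(length, D)
    pe[:, 0::2] = torch.sin(angles)
    pe[:, 1::2] = torch.cos(angles)
    return pe

print(sinusoid_1d(6, 10).shape)  # torch.Size([6, 10])
```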

2D:
```
PE(x,y,2i) = sin(x/10000^(4i/D))
PE(x,y,2i+1) = cos(x/10000^(4i/D))
PE(x,y,2j+D/2) = sin(y/10000^(4j/D))
PE(x,y,2j+1+D/2) = cos(y/10000^(4j/D))

Where:
(x,y) is a point in 2d space
i and j are integers in [0, D/4), where D is the size of the ch dimension
```

3D:
```
PE(x,y,z,2i) = sin(x/10000^(6i/D))
PE(x,y,z,2i+1) = cos(x/10000^(6i/D))
PE(x,y,z,2j+D/3) = sin(y/10000^(6j/D))
PE(x,y,z,2j+1+D/3) = cos(y/10000^(6j/D))
PE(x,y,z,2k+2D/3) = sin(z/10000^(6k/D))
PE(x,y,z,2k+1+2D/3) = cos(z/10000^(6k/D))

Where:
(x,y,z) is a point in 3d space
i, j, and k are integers in [0, D/6), where D is the size of the ch dimension
```

The 3D formula is just a natural extension of the 2D positional encoding used
in [this](https://arxiv.org/pdf/1908.11415.pdf) paper.

Don't worry if the channel dimension `D` is not divisible by 2 (1D), 4 (2D), or
6 (3D); all the necessary padding is taken care of internally.
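
To make the extension concrete, here is a sketch of the 2D construction (again independent of the package, assuming `D` is divisible by 4): the first half of the channels encodes `x` and the second half encodes `y`; the 3D case splits the channels into thirds in the same way.

```python3
import torch

def sinusoid_2d(h, w, D):
    # First D/2 channels encode x, last D/2 channels encode y, per the 2D formulas above.
    half = D // 2
    i = torch.arange(half // 2, dtype=torch.float32)   # pair index in [0, D/4)
    inv_freq = 1.0 / (10000 ** (4 * i / D))            # 10000^(-4i/D)

    ang_x = torch.arange(h, dtype=torch.float32)[:, None] * inv_freq  # (h, D/4)
    ang_y = torch.arange(w, dtype=torch.float32)[:, None] * inv_freq  # (w, D/4)

    pe = torch.zeros(h, w, D)
    pe[:, :, 0:half:2]  = torch.sin(ang_x)[:, None, :]  # PE(x, y, 2i)
    pe[:, :, 1:half:2]  = torch.cos(ang_x)[:, None, :]  # PE(x, y, 2i+1)
    pe[:, :, half::2]   = torch.sin(ang_y)[None, :, :]  # PE(x, y, 2j + D/2)
    pe[:, :, half+1::2] = torch.cos(ang_y)[None, :, :]  # PE(x, y, 2j + 1 + D/2)
    return pe

print(sinusoid_2d(6, 2, 8).shape)  # torch.Size([6, 2, 8])
```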

## Thank you

Thanks to [this](https://github.com/wzlxjtu/PositionalEncoding2D) repo for the inspiration for this method.

## Citations
1D:
```bibtex
@inproceedings{vaswani2017attention,
  title={Attention is all you need},
  author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, {\L}ukasz and Polosukhin, Illia},
  booktitle={Advances in neural information processing systems},
  pages={5998--6008},
  year={2017}
}
```

2D:
```bibtex
@misc{wang2019translating,
    title={Translating Math Formula Images to LaTeX Sequences Using Deep Neural Networks with Sequence-level Training},
    author={Zelun Wang and Jyh-Charn Liu},
    year={2019},
    eprint={1908.11415},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

3D:
Coming soon!

            
