# S5: Simplified State Space Layers for Sequence Modeling
This is a PyTorch port derived from <https://github.com/lindermanlab/S5> and <https://github.com/kavorite/S5>.
It includes a number of functions ported from jax/lax/flax because PyTorch equivalents didn't exist yet.
~~Jax is required because this port relies on its pytree structure, but it is not used for any computation.~~
Since version 0.2.0, jax is no longer required; the code uses the PyTorch-native `torch.utils._pytree` instead (as a private API, this may be incompatible with future PyTorch versions).
PyTorch 2 or later is required because the code makes heavy use of `torch.vmap` and `torch.utils._pytree` as substitutes for their jax counterparts.
Python 3.10 or later is required due to the use of the `match` statement.
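For context, `torch.utils._pytree` mirrors the jax tree utilities this port originally depended on. A minimal sketch of that private API (illustrative only; since `_pytree` is not public, its surface may change between PyTorch releases):

```py3
import torch
import torch.utils._pytree as pytree

# A nested structure ("pytree") of tensors, in jax's terminology.
params = {"A": torch.eye(2), "B": [torch.ones(3), torch.zeros(3)]}

# Flatten into leaves + structure spec, mirroring jax.tree_util.tree_flatten.
leaves, spec = pytree.tree_flatten(params)

# Map a function over every leaf, mirroring jax.tree_util.tree_map.
doubled = pytree.tree_map(lambda t: t * 2, params)

# Rebuild the original structure from (transformed) leaves.
rebuilt = pytree.tree_unflatten([t + 1 for t in leaves], spec)
```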
---
Update:
In my experiments, the results match the findings of the [Hyena Hierarchy](https://arxiv.org/abs/2302.10866) (& H3) paper: state spaces alone lack the recall capabilities required for LLMs, but they work well for regular sequence feature extraction at linear complexity.
You can use a variable step size as described in the paper by passing a 1D tensor for `step_scale`; however, this takes **a lot of memory** because many intermediate values need to be held (which I believe is also true for the official S5 repo, though it's not mentioned in the paper unless I missed it).
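A hedged sketch of variable step-size usage follows. It assumes the forward pass accepts `step_scale` as a keyword argument and that a 1D tensor provides one step size per time step (length equal to the sequence length); both assumptions are inferred from the description above, so check the source for the exact contract:

```py3
import torch
from s5 import S5

model = S5(32, 32)
x = torch.rand([2, 256, 32])

# Scalar step scale: one uniform step size for the whole sequence.
y = model(x, step_scale=0.5)  # [2, 256, 32]

# 1D step scale (assumed: length == sequence length, one entry per step).
# Warning: this path holds many intermediate values and uses a lot of memory.
step_scale = torch.linspace(0.1, 1.0, 256)
y = model(x, step_scale=step_scale)  # [2, 256, 32]
```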
## Install
```sh
pip install s5-pytorch
```
## Example
```py3
import torch
from s5 import S5, S5Block
# Raw S5 operator
x = torch.rand([2, 256, 32])
model = S5(32, 32)
model(x) # [2, 256, 32]
# S5-former block (S5+FFN-GLU w/ layernorm, dropout & residual)
model = S5Block(32, 32, False)
model(x) # [2, 256, 32]
```
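As a follow-up usage sketch, blocks can be stacked into the kind of linear-complexity feature extractor the update above describes (hypothetical model; the layer count and mean pooling are arbitrary illustrative choices, and it assumes `S5Block` works with a single tensor argument, i.e. the default step scale):

```py3
import torch
from s5 import S5Block

# Hypothetical: four stacked S5 blocks as a sequence feature extractor.
backbone = torch.nn.Sequential(*[S5Block(32, 32, False) for _ in range(4)])

x = torch.rand([2, 256, 32])
features = backbone(x)         # [2, 256, 32], per-step features
pooled = features.mean(dim=1)  # [2, 32], sequence-level representation
```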