[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)
# HSSS
Implementation of a hierarchical Mamba as described in the paper "Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling", but using Mamba blocks in place of the traditional SSMs. The flow is: single input -> low-level Mambas -> concatenation -> high-level SSM -> multiple outputs.
I believe strongly in this architecture because it separates local and global learning.
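
The actual wiring happens inside the `HSSS` class (see Usage below); the snippet here is only a minimal, self-contained sketch of the data flow, with plain `nn.Linear` stand-ins in place of the Mamba blocks and an assumed projection after the concatenation.

```python
import torch
import torch.nn as nn

# Stand-ins for the real modules, purely to illustrate the hierarchical data flow.
low_level = nn.ModuleList([nn.Linear(8, 8) for _ in range(3)])  # stand-ins for three LowLevelMamba models
proj = nn.Linear(3 * 8, 12)  # fuses the concatenated low-level features (assumed projection)
high_level = nn.Linear(12, 12)  # stand-in for the high-level SSM

x = torch.randn(1, 10, 8)  # (batch, sequence length, feature dim)
local_views = [m(x) for m in low_level]  # each low-level model reads the same input
fused = proj(torch.cat(local_views, dim=-1))  # concatenate along features, then project
out = high_level(fused)  # global modeling over the fused representation
print(out.shape)  # torch.Size([1, 10, 12])
```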
## Install
`pip install hsss`
## Usage
```python
import torch
from hsss import LowLevelMamba, HSSS
# Random input tensor: (batch, sequence length, feature dim)
x = torch.randn(1, 10, 8)
# Low-level Mamba 1
mamba = LowLevelMamba(
    dim=8,  # input feature dimension
    depth=6,  # number of Mamba blocks
    dt_rank=4,  # rank of the dt projection
    d_state=4,  # SSM state dimension
    expand_factor=4,  # block expansion factor
    d_conv=6,  # local convolution kernel size
    dt_min=0.001,  # minimum dt at initialization
    dt_max=0.1,  # maximum dt at initialization
    dt_init="random",  # dt initialization method
    dt_scale=1.0,  # dt initialization scale
    bias=False,  # bias in the linear projections
    conv_bias=True,  # bias in the convolution layer
    pscan=True,  # use the parallel scan instead of the sequential one
)
# Low-level Mamba 2
mamba2 = LowLevelMamba(
    dim=8,  # input feature dimension
    depth=6,  # number of Mamba blocks
    dt_rank=4,  # rank of the dt projection
    d_state=4,  # SSM state dimension
    expand_factor=4,  # block expansion factor
    d_conv=6,  # local convolution kernel size
    dt_min=0.001,  # minimum dt at initialization
    dt_max=0.1,  # maximum dt at initialization
    dt_init="random",  # dt initialization method
    dt_scale=1.0,  # dt initialization scale
    bias=False,  # bias in the linear projections
    conv_bias=True,  # bias in the convolution layer
    pscan=True,  # use the parallel scan instead of the sequential one
)
# Low-level Mamba 3
mamba3 = LowLevelMamba(
    dim=8,  # input feature dimension
    depth=6,  # number of Mamba blocks
    dt_rank=4,  # rank of the dt projection
    d_state=4,  # SSM state dimension
    expand_factor=4,  # block expansion factor
    d_conv=6,  # local convolution kernel size
    dt_min=0.001,  # minimum dt at initialization
    dt_max=0.1,  # maximum dt at initialization
    dt_init="random",  # dt initialization method
    dt_scale=1.0,  # dt initialization scale
    bias=False,  # bias in the linear projections
    conv_bias=True,  # bias in the convolution layer
    pscan=True,  # use the parallel scan instead of the sequential one
)
# High-level HSSS model composed over the three low-level Mambas
hsss = HSSS(
    layers=[mamba, mamba2, mamba3],  # low-level models whose outputs are concatenated
    dim=12,  # high-level model dimension
    depth=3,  # number of Mamba blocks
    dt_rank=2,  # rank of the dt projection
    d_state=2,  # SSM state dimension
    expand_factor=2,  # block expansion factor
    d_conv=3,  # local convolution kernel size
    dt_min=0.001,  # minimum dt at initialization
    dt_max=0.1,  # maximum dt at initialization
    dt_init="random",  # dt initialization method
    dt_scale=1.0,  # dt initialization scale
    bias=False,  # bias in the linear projections
    conv_bias=True,  # bias in the convolution layer
    pscan=True,  # use the parallel scan instead of the sequential one
    proj_layer=True,  # project the concatenated low-level outputs to `dim`
)
# Forward pass
out = hsss(x)
print(out.shape)
```
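
With the configuration above, the high-level model runs at `dim=12`, so the printed shape should be on the order of `(1, 10, 12)` (batch, sequence length, high-level dim); the exact output shape depends on how `proj_layer` and the high-level model are configured.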
## Citation
```bibtex
@misc{bhirangi2024hierarchical,
title={Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling},
author={Raunaq Bhirangi and Chenyu Wang and Venkatesh Pattabiraman and Carmel Majidi and Abhinav Gupta and Tess Hellebrekers and Lerrel Pinto},
year={2024},
eprint={2402.10211},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
```
## License
MIT