[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)
# VIMA
A simple implementation of "VIMA: General Robot Manipulation with Multimodal Prompts"
[Original implementation](https://github.com/vimalabs/VIMA)
# Appreciation
* Lucidrains
* Agorians
# Install
`pip install vima`
---
# Usage
```python
import torch
from vima import Vima
# Generate a random sequence of token IDs on the GPU
x = torch.randint(0, 256, (1, 1024)).cuda()

# Initialize the VIMA model and move it to the same device as the input
model = Vima().cuda()
# Pass the input sequence through the model
output = model(x)
```
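
For inference-only use, it can help to switch the model to eval mode and disable gradient tracking. The snippet below is standard PyTorch rather than anything VIMA-specific, and the exact output shape depends on the model configuration.

```python
# Standard PyTorch inference pattern: eval mode + no autograd graph
model.eval()
with torch.no_grad():
    output = model(x)

# The exact shape depends on the model configuration
print(output.shape)
```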
## Multimodal Usage
* Pass text and image tensors into the model:
```python
import torch
from vima.vima import VimaMultiModal
# Create a random image tensor and a random sequence of text token IDs
img = torch.randn(1, 3, 256, 256)
text = torch.randint(0, 20000, (1, 1024))

# Initialize the multimodal model and run a forward pass
model = VimaMultiModal()
output = model(text, img)
```
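
For completeness, here is a minimal, hypothetical training-step sketch built on the example above. The target tensor, and the assumption that the model returns per-token logits over the text vocabulary, are illustrative guesses rather than documented behavior of this library.

```python
import torch
import torch.nn.functional as F
from vima.vima import VimaMultiModal

model = VimaMultiModal()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

img = torch.randn(1, 3, 256, 256)
text = torch.randint(0, 20000, (1, 1024))

# Hypothetical targets: assumes the model emits logits of shape
# (batch, seq_len, vocab_size) over the text vocabulary.
targets = torch.randint(0, 20000, (1, 1024))

optimizer.zero_grad()
logits = model(text, img)
loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))
loss.backward()
optimizer.step()
```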
# License
MIT
# Citations
```bibtex
@inproceedings{jiang2023vima,
  title     = {VIMA: General Robot Manipulation with Multimodal Prompts},
  author    = {Yunfan Jiang and Agrim Gupta and Zichen Zhang and Guanzhi Wang and Yongqiang Dou and Yanjun Chen and Li Fei-Fei and Anima Anandkumar and Yuke Zhu and Linxi Fan},
  booktitle = {Fortieth International Conference on Machine Learning},
  year      = {2023}
}
```