[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)
# VisionLLaMA
Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta. [PAPER LINK](https://arxiv.org/abs/2403.00522)
## Install
`$ pip install vision-llama`
## Usage
```python
import torch
from vision_llama.main import VisionLlama
# Input image tensor: (batch, channels, height, width)
x = torch.randn(1, 3, 224, 224)
# Create an instance of the VisionLlama model with the specified parameters
model = VisionLlama(
    dim=768, depth=12, channels=3, heads=12, num_classes=1000
)
# Pass the input tensor through the model and print the output
print(model(x))
```
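The snippet above prints the raw model output. Assuming the forward pass returns class logits of shape `(batch, num_classes)` (an assumption for illustration, not documented here), a minimal inference sketch looks like this:

```python
import torch
from vision_llama.main import VisionLlama

# Assumption: VisionLlama returns class logits of shape (batch, num_classes)
model = VisionLlama(
    dim=768, depth=12, channels=3, heads=12, num_classes=1000
)
model.eval()

x = torch.randn(4, 3, 224, 224)   # batch of 4 random "images"
with torch.no_grad():
    logits = model(x)             # expected shape: (4, 1000)

probs = logits.softmax(dim=-1)    # class probabilities per image
top1 = probs.argmax(dim=-1)       # predicted class index per image
print(logits.shape, top1.shape)
```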
## License
MIT
## Citation
```bibtex
@misc{chu2024visionllama,
    title={VisionLLaMA: A Unified LLaMA Interface for Vision Tasks},
    author={Xiangxiang Chu and Jianlin Su and Bo Zhang and Chunhua Shen},
    year={2024},
    eprint={2403.00522},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```
## Todo
- [ ] Implement AS2DRoPE; the current implementation is rough, so axial rotary embeddings may be used instead (see the sketch after this list)
- [x] Implement GSA attention (implemented, but the current version needs improvement)
- [ ] Add an ImageNet training script with distributed training
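For reference, here is a rough sketch of the axial rotary embedding idea mentioned above. This is not this repo's code; the function name, shapes, and frequency schedule are assumptions for illustration. Half of each head dimension is rotated by the patch row index and the other half by the column index.

```python
import torch


def rotate_half(x: torch.Tensor) -> torch.Tensor:
    # Standard RoPE helper: pairs (x1, x2) -> (-x2, x1)
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)


def axial_rope_2d(q: torch.Tensor, h: int, w: int, theta: float = 10000.0) -> torch.Tensor:
    """Apply 2D axial RoPE to q of shape (batch, heads, h * w, head_dim)."""
    b, heads, n, dim = q.shape
    assert n == h * w and dim % 4 == 0
    quarter = dim // 4
    freqs = 1.0 / (theta ** (torch.arange(quarter, dtype=torch.float32) / quarter))

    # Row and column indices for every patch position
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=torch.float32),
        torch.arange(w, dtype=torch.float32),
        indexing="ij",
    )
    ang_y = ys.flatten()[:, None] * freqs       # (n, dim/4) angles from row index
    ang_x = xs.flatten()[:, None] * freqs       # (n, dim/4) angles from column index
    angles = torch.cat((ang_y, ang_x), dim=-1)  # (n, dim/2)
    cos = angles.cos().repeat(1, 2)             # (n, dim)
    sin = angles.sin().repeat(1, 2)             # (n, dim)
    return q * cos + rotate_half(q) * sin


# Example: 14 x 14 patches (224 / 16) with 64-dim heads
q = torch.randn(1, 12, 14 * 14, 64)
print(axial_rope_2d(q, 14, 14).shape)  # torch.Size([1, 12, 196, 64])
```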