[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)
# Kosmos2.5
My implementation of Kosmos-2.5 from Microsoft Research, as described in the paper "KOSMOS-2.5: A Multimodal Literate Model".
[Paper Link](https://arxiv.org/pdf/2309.11419.pdf)
# Appreciation
* Lucidrains
* Agorians
# Install
`pip install kosmos2-torch`
# Dataset Strategy
The table below summarizes the datasets used in the paper "KOSMOS-2.5: A Multimodal Literate Model", with the modality, size, domain, and source of each:
| Dataset | Modality | # Samples | Domain | Source |
|-|-|-|-|-|
| IIT-CDIP | Text + Layout | 27.6M pages | Scanned documents | [Link](https://ir.nist.gov/cdip/) |
| arXiv papers | Text + Layout | 20.9M pages | Research papers | [Link](https://arxiv.org/) |
| PowerPoint slides | Text + Layout | 6.2M pages | Presentation slides | Web crawl |
| General PDF | Text + Layout | 155.2M pages | Diverse PDF files | Web crawl |
| Web screenshots | Text + Layout | 100M pages | Webpage screenshots | [Link](https://www.tensorflow.org/datasets/catalog/c4) |
| README | Text + Markdown | 2.9M files | GitHub README files | [Link](https://github.com/) |
| DOCX | Text + Markdown | 1.1M pages | Word documents | Web crawl |
| LaTeX | Text + Markdown | 3.7M pages | Research papers | [Link](https://arxiv.org/) |
| HTML | Text + Markdown | 6.3M pages | Webpages | [Link](https://www.tensorflow.org/datasets/catalog/c4) |
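
To get a feel for how the corpus is balanced, the counts in the table can be tallied per modality. The sketch below copies the sizes from the table (in millions of pages or files) and also computes size-proportional sampling weights. The weighting scheme and the names `DATASETS`, `totals_by_modality`, and `proportional_weights` are assumptions made for illustration. They are not taken from the paper or from this repository.

```python
# Sample counts in millions (pages, or files for README), copied from the table above.
DATASETS = {
    "IIT-CDIP": ("Text + Layout", 27.6),
    "arXiv papers": ("Text + Layout", 20.9),
    "PowerPoint slides": ("Text + Layout", 6.2),
    "General PDF": ("Text + Layout", 155.2),
    "Web screenshots": ("Text + Layout", 100.0),
    "README": ("Text + Markdown", 2.9),
    "DOCX": ("Text + Markdown", 1.1),
    "LaTeX": ("Text + Markdown", 3.7),
    "HTML": ("Text + Markdown", 6.3),
}


def totals_by_modality(datasets):
    """Sum the sample counts (in millions) for each modality."""
    totals = {}
    for modality, millions in datasets.values():
        totals[modality] = totals.get(modality, 0.0) + millions
    # Round to one decimal to hide floating-point noise from the addition.
    return {modality: round(total, 1) for modality, total in totals.items()}


def proportional_weights(datasets):
    """Hypothetical size-proportional sampling weights that sum to 1."""
    total = sum(millions for _, millions in datasets.values())
    return {name: millions / total for name, (_, millions) in datasets.items()}


if __name__ == "__main__":
    print(totals_by_modality(DATASETS))
    weights = proportional_weights(DATASETS)
    for name, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        print(f"{name:18s} {weight:.3f}")
```

Under these numbers the layout-oriented data dominates the markdown-oriented data by a wide margin, so a purely size-proportional sampler would rarely see markdown samples. A real training setup might reweight the two groups.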
# License
MIT
# Citations
```bibtex
@misc{2309.11419,
Author = {Tengchao Lv and Yupan Huang and Jingye Chen and Lei Cui and Shuming Ma and Yaoyao Chang and Shaohan Huang and Wenhui Wang and Li Dong and Weiyao Luo and Shaoxiang Wu and Guoxin Wang and Cha Zhang and Furu Wei},
Title = {Kosmos-2.5: A Multimodal Literate Model},
Year = {2023},
Eprint = {arXiv:2309.11419},
}
```
Raw data
{
"_id": null,
"home_page": "https://github.com/kyegomez/Kosmos2.5",
"name": "kosmos2-torch",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6,<4.0",
"maintainer_email": "",
"keywords": "artificial intelligence,deep learning,optimizers,Prompt Engineering",
"author": "Kye Gomez",
"author_email": "kye@apac.ai",
"download_url": "https://files.pythonhosted.org/packages/f0/e5/fda311172cbc46c8c2b6ce196dc5fafd379e29185f88105e960c589ef5fb/kosmos2_torch-0.0.1.tar.gz",
"platform": null,
"description": "[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)\n\n# Kosmos2.5\nMy implementation of Kosmos2.5 from Microsoft research and the paper: \"KOSMOS-2.5: A Multimodal Literate Model\"\n\n[Paper Link](https://arxiv.org/pdf/2309.11419.pdf)\n\n# Appreciation\n* Lucidrains\n* Agorians\n\n\n\n# Install\n\n\n# Dataset Strategy\nHere is a table summarizing the datasets used in the paper KOSMOS-2.5: A Multimodal Literate Model with metadata and source links:\n\n| Dataset | Modality | # Samples | Domain | Source | \n|-|-|-|-|-| \n| IIT-CDIP | Text + Layout | 27.6M pages | Scanned documents | [Link](https://ir.nist.gov/cdip/)|\n| arXiv papers | Text + Layout | 20.9M pages | Research papers | [Link](https://arxiv.org/) | \n| PowerPoint slides | Text + Layout | 6.2M pages | Presentation slides | Web crawl |\n| General PDF | Text + Layout | 155.2M pages | Diverse PDF files | Web crawl |\n| Web screenshots | Text + Layout | 100M pages | Webpage screenshots | [Link](https://www.tensorflow.org/datasets/catalog/c4) |\n| README | Text + Markdown | 2.9M files | GitHub README files | [Link](https://github.com/) | \n| DOCX | Text + Markdown | 1.1M pages | WORD documents | Web crawl |\n| LaTeX | Text + Markdown | 3.7M pages | Research papers | [Link](https://arxiv.org/) |\n| HTML | Text + Markdown | 6.3M pages | Webpages | [Link](https://www.tensorflow.org/datasets/catalog/c4) |\n\n\n\n# License\nMIT\n\n# Citations\n```bibtex\n@misc{2309.11419,\nAuthor = {Tengchao Lv and Yupan Huang and Jingye Chen and Lei Cui and Shuming Ma and Yaoyao Chang and Shaohan Huang and Wenhui Wang and Li Dong and Weiyao Luo and Shaoxiang Wu and Guoxin Wang and Cha Zhang and Furu Wei},\nTitle = {Kosmos-2.5: A Multimodal Literate Model},\nYear = {2023},\nEprint = {arXiv:2309.11419},\n}\n```\n\n**bold**\n*italics*\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Kosmos - Pytorch",
"version": "0.0.1",
"project_urls": {
"Homepage": "https://github.com/kyegomez/Kosmos2.5",
"Repository": "https://github.com/kyegomez/Kosmos2.5"
},
"split_keywords": [
"artificial intelligence",
"deep learning",
"optimizers",
"prompt engineering"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "994a5f0b4cf15224cf011da1ba9dde583bbc614dbc4615af2705412f6a788321",
"md5": "426abd2596828d89322500c70e6f4851",
"sha256": "cbabda0ddfddeef7db1370b311f4e8a7fd5adc67df53bbe8bc6faef9347be159"
},
"downloads": -1,
"filename": "kosmos2_torch-0.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "426abd2596828d89322500c70e6f4851",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6,<4.0",
"size": 4615,
"upload_time": "2023-09-22T03:53:28",
"upload_time_iso_8601": "2023-09-22T03:53:28.456812Z",
"url": "https://files.pythonhosted.org/packages/99/4a/5f0b4cf15224cf011da1ba9dde583bbc614dbc4615af2705412f6a788321/kosmos2_torch-0.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f0e5fda311172cbc46c8c2b6ce196dc5fafd379e29185f88105e960c589ef5fb",
"md5": "897ce443f60a1a0f0521ef709a6b217e",
"sha256": "c8b426518c224c052a2387ee2e33c8da2ed64a4a02a7d244bf9a51d127df2ff3"
},
"downloads": -1,
"filename": "kosmos2_torch-0.0.1.tar.gz",
"has_sig": false,
"md5_digest": "897ce443f60a1a0f0521ef709a6b217e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6,<4.0",
"size": 4264,
"upload_time": "2023-09-22T03:53:30",
"upload_time_iso_8601": "2023-09-22T03:53:30.150811Z",
"url": "https://files.pythonhosted.org/packages/f0/e5/fda311172cbc46c8c2b6ce196dc5fafd379e29185f88105e960c589ef5fb/kosmos2_torch-0.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-09-22 03:53:30",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "kyegomez",
"github_project": "Kosmos2.5",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "kosmos2-torch"
}
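
The `digests` fields in the metadata above can be used to check a downloaded wheel or sdist before installing it. The following is a minimal sketch, assuming the metadata has already been parsed into a Python dict with the same shape as the JSON above. The helper names `find_file_entry` and `matches_sha256` are hypothetical and are not part of any PyPI tooling.

```python
import hashlib


def find_file_entry(metadata: dict, packagetype: str) -> dict:
    """Return the first entry in metadata["urls"] with the given packagetype."""
    for entry in metadata["urls"]:
        if entry["packagetype"] == packagetype:
            return entry
    raise KeyError(f"no file with packagetype {packagetype!r}")


def matches_sha256(payload: bytes, entry: dict) -> bool:
    """Compare the SHA-256 of the payload with the digest recorded in the entry."""
    return hashlib.sha256(payload).hexdigest() == entry["digests"]["sha256"]


if __name__ == "__main__":
    # Stand-in metadata with the same structure as the record above;
    # the payload and digest here are made up for the demonstration.
    payload = b"demo archive bytes"
    metadata = {
        "urls": [
            {
                "packagetype": "sdist",
                "digests": {"sha256": hashlib.sha256(payload).hexdigest()},
            }
        ]
    }
    entry = find_file_entry(metadata, "sdist")
    print(matches_sha256(payload, entry))
```

In practice `payload` would be the bytes of the file fetched from the entry's `url`, and a mismatch would be a reason to discard the download.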