kosmos2-torch


| Field | Value |
|-|-|
| Name | kosmos2-torch |
| Version | 0.0.1 |
| Home page | https://github.com/kyegomez/Kosmos2.5 |
| Summary | Kosmos - Pytorch |
| Upload time | 2023-09-22 03:53:30 |
| Maintainer | |
| Docs URL | None |
| Author | Kye Gomez |
| Requires Python | >=3.6,<4.0 |
| License | MIT |
| Keywords | artificial intelligence, deep learning, optimizers, prompt engineering |
| VCS | |
| Bug tracker URL | |
| Requirements | No requirements were recorded. |
| Travis CI | No Travis. |
| Coveralls test coverage | No coveralls. |

[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)

# Kosmos2.5
My implementation of Kosmos-2.5 from Microsoft Research, as described in the paper "KOSMOS-2.5: A Multimodal Literate Model".

[Paper Link](https://arxiv.org/pdf/2309.11419.pdf)

# Appreciation
* Lucidrains
* Agorians



# Install
Install the package from PyPI with `pip install kosmos2-torch`.


# Dataset Strategy
The table below summarizes the datasets used in the paper "KOSMOS-2.5: A Multimodal Literate Model", with the modality, sample count, domain, and source of each:

| Dataset | Modality | # Samples | Domain | Source | 
|-|-|-|-|-|  
| IIT-CDIP | Text + Layout | 27.6M pages | Scanned documents | [Link](https://ir.nist.gov/cdip/)|
| arXiv papers | Text + Layout | 20.9M pages | Research papers | [Link](https://arxiv.org/) |  
| PowerPoint slides | Text + Layout | 6.2M pages | Presentation slides | Web crawl |
| General PDF | Text + Layout | 155.2M pages | Diverse PDF files | Web crawl |
| Web screenshots | Text + Layout | 100M pages | Webpage screenshots | [Link](https://www.tensorflow.org/datasets/catalog/c4) |
| README | Text + Markdown | 2.9M files | GitHub README files | [Link](https://github.com/) |  
| DOCX | Text + Markdown | 1.1M pages | WORD documents | Web crawl |
| LaTeX | Text + Markdown | 3.7M pages | Research papers | [Link](https://arxiv.org/) |
| HTML | Text + Markdown | 6.3M pages | Webpages | [Link](https://www.tensorflow.org/datasets/catalog/c4) |
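
To get a feel for how the data splits between the two modalities, the sample counts in the table can be summed per modality. This is only a minimal sketch: the `DATASETS` list and the `totals_by_modality` helper are illustrative names that are not part of this package, and the figures are copied from the table above (README counts are files, the rest are pages).

```python
# Sample counts in millions, copied from the dataset table above.
# DATASETS and totals_by_modality are hypothetical names for illustration only.
DATASETS = [
    ("IIT-CDIP", "Text + Layout", 27.6),
    ("arXiv papers", "Text + Layout", 20.9),
    ("PowerPoint slides", "Text + Layout", 6.2),
    ("General PDF", "Text + Layout", 155.2),
    ("Web screenshots", "Text + Layout", 100.0),
    ("README", "Text + Markdown", 2.9),
    ("DOCX", "Text + Markdown", 1.1),
    ("LaTeX", "Text + Markdown", 3.7),
    ("HTML", "Text + Markdown", 6.3),
]


def totals_by_modality(rows):
    """Sum the sample counts (in millions) for each modality."""
    totals = {}
    for _name, modality, millions in rows:
        totals[modality] = totals.get(modality, 0.0) + millions
    return totals


totals = totals_by_modality(DATASETS)
for modality, millions in totals.items():
    print(f"{modality}: {millions:.1f}M samples")
# Text + Layout: 309.9M samples
# Text + Markdown: 14.0M samples
```

By these figures, the layout-annotated data outweighs the markdown-annotated data by more than 20 to 1.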



# License
MIT

# Citations
```bibtex
@misc{2309.11419,
Author = {Tengchao Lv and Yupan Huang and Jingye Chen and Lei Cui and Shuming Ma and Yaoyao Chang and Shaohan Huang and Wenhui Wang and Li Dong and Weiyao Luo and Shaoxiang Wu and Guoxin Wang and Cha Zhang and Furu Wei},
Title = {Kosmos-2.5: A Multimodal Literate Model},
Year = {2023},
Eprint = {arXiv:2309.11419},
}
```


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/kyegomez/Kosmos2.5",
    "name": "kosmos2-torch",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6,<4.0",
    "maintainer_email": "",
    "keywords": "artificial intelligence,deep learning,optimizers,Prompt Engineering",
    "author": "Kye Gomez",
    "author_email": "kye@apac.ai",
    "download_url": "https://files.pythonhosted.org/packages/f0/e5/fda311172cbc46c8c2b6ce196dc5fafd379e29185f88105e960c589ef5fb/kosmos2_torch-0.0.1.tar.gz",
    "platform": null,
    "description": "[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)\n\n# Kosmos2.5\nMy implementation of Kosmos2.5 from Microsoft research and the paper: \"KOSMOS-2.5: A Multimodal Literate Model\"\n\n[Paper Link](https://arxiv.org/pdf/2309.11419.pdf)\n\n# Appreciation\n* Lucidrains\n* Agorians\n\n\n\n# Install\n\n\n# Dataset Strategy\nHere is a table summarizing the datasets used in the paper KOSMOS-2.5: A Multimodal Literate Model with metadata and source links:\n\n| Dataset | Modality | # Samples | Domain | Source | \n|-|-|-|-|-|  \n| IIT-CDIP | Text + Layout | 27.6M pages | Scanned documents | [Link](https://ir.nist.gov/cdip/)|\n| arXiv papers | Text + Layout | 20.9M pages | Research papers | [Link](https://arxiv.org/) |  \n| PowerPoint slides | Text + Layout | 6.2M pages | Presentation slides | Web crawl |\n| General PDF | Text + Layout | 155.2M pages | Diverse PDF files | Web crawl |\n| Web screenshots | Text + Layout | 100M pages | Webpage screenshots | [Link](https://www.tensorflow.org/datasets/catalog/c4) |\n| README | Text + Markdown | 2.9M files | GitHub README files | [Link](https://github.com/) |  \n| DOCX | Text + Markdown | 1.1M pages | WORD documents | Web crawl |\n| LaTeX | Text + Markdown | 3.7M pages | Research papers | [Link](https://arxiv.org/) |\n| HTML | Text + Markdown | 6.3M pages | Webpages | [Link](https://www.tensorflow.org/datasets/catalog/c4) |\n\n\n\n# License\nMIT\n\n# Citations\n```bibtex\n@misc{2309.11419,\nAuthor = {Tengchao Lv and Yupan Huang and Jingye Chen and Lei Cui and Shuming Ma and Yaoyao Chang and Shaohan Huang and Wenhui Wang and Li Dong and Weiyao Luo and Shaoxiang Wu and Guoxin Wang and Cha Zhang and Furu Wei},\nTitle = {Kosmos-2.5: A Multimodal Literate Model},\nYear = {2023},\nEprint = {arXiv:2309.11419},\n}\n```\n\n**bold**\n*italics*\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Kosmos - Pytorch",
    "version": "0.0.1",
    "project_urls": {
        "Homepage": "https://github.com/kyegomez/Kosmos2.5",
        "Repository": "https://github.com/kyegomez/Kosmos2.5"
    },
    "split_keywords": [
        "artificial intelligence",
        "deep learning",
        "optimizers",
        "prompt engineering"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "994a5f0b4cf15224cf011da1ba9dde583bbc614dbc4615af2705412f6a788321",
                "md5": "426abd2596828d89322500c70e6f4851",
                "sha256": "cbabda0ddfddeef7db1370b311f4e8a7fd5adc67df53bbe8bc6faef9347be159"
            },
            "downloads": -1,
            "filename": "kosmos2_torch-0.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "426abd2596828d89322500c70e6f4851",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6,<4.0",
            "size": 4615,
            "upload_time": "2023-09-22T03:53:28",
            "upload_time_iso_8601": "2023-09-22T03:53:28.456812Z",
            "url": "https://files.pythonhosted.org/packages/99/4a/5f0b4cf15224cf011da1ba9dde583bbc614dbc4615af2705412f6a788321/kosmos2_torch-0.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f0e5fda311172cbc46c8c2b6ce196dc5fafd379e29185f88105e960c589ef5fb",
                "md5": "897ce443f60a1a0f0521ef709a6b217e",
                "sha256": "c8b426518c224c052a2387ee2e33c8da2ed64a4a02a7d244bf9a51d127df2ff3"
            },
            "downloads": -1,
            "filename": "kosmos2_torch-0.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "897ce443f60a1a0f0521ef709a6b217e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6,<4.0",
            "size": 4264,
            "upload_time": "2023-09-22T03:53:30",
            "upload_time_iso_8601": "2023-09-22T03:53:30.150811Z",
            "url": "https://files.pythonhosted.org/packages/f0/e5/fda311172cbc46c8c2b6ce196dc5fafd379e29185f88105e960c589ef5fb/kosmos2_torch-0.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-09-22 03:53:30",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "kyegomez",
    "github_project": "Kosmos2.5",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "kosmos2-torch"
}
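
The `digests` entries in the raw data above can be used to check that a downloaded wheel or sdist is intact. The following is a minimal sketch using only the Python standard library. It assumes the raw data has already been parsed into a dictionary (for example with `json.load`) and that the file has been downloaded separately. The helper names `sha256_of_file`, `expected_sha256`, and `verify_download` are hypothetical and not part of this package.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_sha256(metadata, filename):
    """Look up the recorded SHA-256 for `filename` in the parsed raw data."""
    for entry in metadata["urls"]:
        if entry["filename"] == filename:
            return entry["digests"]["sha256"]
    raise KeyError(f"no entry for {filename!r} in metadata")


def verify_download(path, metadata, filename):
    """True if the file at `path` matches the digest recorded for `filename`."""
    return sha256_of_file(path) == expected_sha256(metadata, filename)
```

For example, `verify_download("kosmos2_torch-0.0.1.tar.gz", metadata, "kosmos2_torch-0.0.1.tar.gz")` would compare a local copy of the sdist against the `sha256` value listed in the raw data.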
        