visualchatgpt

Name: visualchatgpt
Version: 0.0.1.dev0
Home page: https://github.com/juncongmoo/visual-chatgpt
Summary: 💬 Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
Upload time: 2023-03-10 18:02:46
Author: Juncong Moo
Keywords: visual chatgpt
Requirements: none recorded
# Visual ChatGPT

**Visual ChatGPT** connects ChatGPT with a series of Visual Foundation Models, enabling **sending** and **receiving** images during a chat.

See our paper: [Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models](https://arxiv.org/abs/2303.04671)

## Demo 
<img src="./assets/demo_short.gif" width="750">

## System Architecture

<p align="center"><img src="./assets/figure.jpg" alt="System architecture"></p>


## Quick Start

```
# create a new conda environment
conda create -n visgpt python=3.8

# activate the new environment
conda activate visgpt

# install the required dependencies
pip install -r requirement.txt

# download the visual foundation models
bash download.sh

# set your private OpenAI API key
export OPENAI_API_KEY={Your_Private_Openai_Key}

# create a folder to save images
mkdir ./image

# start Visual ChatGPT
python visual_chatgpt.py
```
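Before launching, it can help to confirm that the key was actually exported. This is a small sanity check, not part of the repository; the helper name `check_api_key` is just for illustration:

```python
import os

def check_api_key(env=os.environ):
    """Return True if OPENAI_API_KEY is set and non-empty, else print a hint."""
    if env.get("OPENAI_API_KEY"):
        return True
    print("OPENAI_API_KEY is not set; run `export OPENAI_API_KEY=...` first.")
    return False

if __name__ == "__main__":
    check_api_key()
```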

## GPU memory usage
Below is the GPU memory usage of each visual foundation model. You can modify ``self.tools`` to load fewer foundation models and reduce GPU memory usage:

| Foundation Model        | Memory Usage (MB) |
|------------------------|-------------------|
| ImageEditing           | 6667              |
| ImageCaption           | 1755              |
| T2I                    | 6677              |
| canny2image            | 5540              |
| line2image             | 6679              |
| hed2image              | 6679              |
| scribble2image         | 6679              |
| pose2image             | 6681              |
| BLIPVQA                | 2709              |
| seg2image              | 5540              |
| depth2image            | 6677              |
| normal2image           | 3974              |
| InstructPix2Pix        | 2795              |



## Acknowledgement
We appreciate the open-source contributions of the following projects:

[Hugging Face](https://github.com/huggingface) &#8194;
[LangChain](https://github.com/hwchase17/langchain) &#8194;
[Stable Diffusion](https://github.com/CompVis/stable-diffusion) &#8194; 
[ControlNet](https://github.com/lllyasviel/ControlNet) &#8194; 
[InstructPix2Pix](https://github.com/timothybrooks/instruct-pix2pix) &#8194; 
[CLIPSeg](https://github.com/timojl/clipseg) &#8194;
[BLIP](https://github.com/salesforce/BLIP) &#8194;
            
