| Field | Value |
| --- | --- |
| Name | mmgpt |
| Version | 0.0.1 |
| Summary | An open-source framework for multi-modality instruction fine-tuning |
| upload_time | 2023-04-27 05:54:13 |
| license | Apache 2.0 |
| keywords | machine learning |
| requirements | No requirements were recorded. |
# 🤖 Multi-modal GPT
Train a multi-modal chatbot with visual and language instructions!
Based on the open-source multi-modal model [OpenFlamingo](https://github.com/mlfoundations/open_flamingo), we create various **visual instruction** data from open datasets, including VQA, Image Captioning, Visual Reasoning, Text OCR, and Visual Dialogue. We also train the language model component of OpenFlamingo using **language-only instruction** data.
The **joint training** of visual and language instructions effectively improves the performance of the model!
# Features
- Supports various vision and language instruction data
- Parameter-efficient fine-tuning with LoRA
- Tunes vision and language at the same time, so that the two complement each other
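To make the parameter-efficiency point concrete, the sketch below compares the number of trainable parameters for a full update of one linear layer with a LoRA update of the same layer. The layer size (4096 x 4096) and the rank (16) are illustrative assumptions, not values taken from this project's configuration, and the functions are hypothetical helpers rather than part of `mmgpt`.

```python
def full_update_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_in * d_out


def lora_update_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA update W + B @ A.

    A has shape (rank, d_in) and B has shape (d_out, rank); W stays frozen.
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_in, d_out, rank = 4096, 4096, 16
    full = full_update_params(d_in, d_out)
    lora = lora_update_params(d_in, d_out, rank)
    print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

Under these assumed sizes the low-rank factors hold well under 1% of the parameters of the full matrix, which is why LoRA keeps fine-tuning memory manageable.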
# Installation
To install the package in an existing environment, run:
```bash
git clone https://github.com/open-mmlab/Multimodal-GPT.git
# requirements.txt lives in the repository root, so enter it first
cd Multimodal-GPT
pip install -r requirements.txt
pip install -e . -v
```
Alternatively, create a new conda environment:
```bash
conda env create -f environment.yml
```
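After either installation route, a quick import check can confirm that the environment is usable. This is a minimal sketch: `mmgpt` is the package name from the metadata, while checking for `torch` is an assumption based on the `torchrun` command used for training.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if the top-level `package` can be found on the import path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # "torch" is assumed to be a dependency because training uses torchrun.
    for name in ("mmgpt", "torch"):
        print(f"{name}: {'found' if is_installed(name) else 'MISSING'}")
```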
# Demo
1. Download the pre-trained weights.
Use [this script](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/convert_llama_weights_to_hf.py) to convert the LLaMA weights to the Hugging Face format.
Download the OpenFlamingo pre-trained model from [openflamingo/OpenFlamingo-9B](https://huggingface.co/openflamingo/OpenFlamingo-9B).
Download our LoRA weights from [here](https://download.openmmlab.com/mmgpt/v0/mmgpt-lora-v0-release.pt).
Then place these models in the `checkpoints` folder like this:
```
checkpoints
├── llama-7b_hf
│   ├── config.json
│   ├── pytorch_model-00001-of-00002.bin
│   ├── ......
│   └── tokenizer.model
├── OpenFlamingo-9B
│   └── checkpoint.pt
└── mmgpt-lora-v0-release.pt
```
2. Launch the Gradio demo:
```bash
python chat_gradio_demo.py
```
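Before launching the demo, it can help to confirm that the files named in the directory tree above are in place. The following is a small sketch, not part of the repository; `missing_checkpoints` is a hypothetical helper, and it only checks the files that the tree lists by name.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the checkpoints folder.
EXPECTED = [
    "llama-7b_hf/config.json",
    "llama-7b_hf/tokenizer.model",
    "OpenFlamingo-9B/checkpoint.pt",
    "mmgpt-lora-v0-release.pt",
]


def missing_checkpoints(root: str = "checkpoints") -> list[str]:
    """Return the expected checkpoint files that are absent under `root`."""
    base = Path(root)
    return [rel for rel in EXPECTED if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing files:")
        for rel in missing:
            print(f"  checkpoints/{rel}")
    else:
        print("All expected checkpoint files are present.")
```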
# Examples
### Recipe:

### Travel plan:

### Movie:

### Famous person:

# Fine-tuning
## Prepare datasets
1. [A-OKVQA](https://allenai.org/project/a-okvqa/home)
Download the annotations from [this link](https://prior-datasets.s3.us-east-2.amazonaws.com/aokvqa/aokvqa_v1p0.tar.gz) and extract them to `data/aokvqa/annotations`.
It also requires images from the COCO dataset, which can be downloaded from [here](https://cocodataset.org/#home).
2. [COCO Caption](https://cs.stanford.edu/people/karpathy/deepimagesent/)
Download from [this link](https://cs.stanford.edu/people/karpathy/deepimagesent/coco.zip) and unzip to `data/coco`.
It also requires images from the COCO dataset, which can be downloaded from [here](https://cocodataset.org/#home).
3. [OCR VQA](https://ocr-vqa.github.io/)
Download from [this link](https://drive.google.com/drive/folders/1_GYPY5UkUy7HIcR0zq3ZCFgeZN7BAfm_?usp=sharing) and place in `data/OCR_VQA/`.
4. [LLaVA](https://llava-vl.github.io/)
Download from [liuhaotian/LLaVA-Instruct-150K](https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K) and place in `data/llava/`.
It also requires images from the COCO dataset, which can be downloaded from [here](https://cocodataset.org/#home).
5. [MiniGPT-4](https://minigpt-4.github.io/)
Download from [Vision-CAIR/cc_sbu_align](https://huggingface.co/datasets/Vision-CAIR/cc_sbu_align) and place in `data/cc_sbu_align/`.
6. [Dolly 15k](https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html)
Download from [databricks/databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) and place it at `data/dolly/databricks-dolly-15k.jsonl`.
7. [Alpaca GPT4](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)
Download it from [this link](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/raw/main/data/alpaca_gpt4_data.json) and place it at `data/alpaca_gpt4/alpaca_gpt4_data.json`.
You can also customize the data paths in [configs/dataset_config.py](configs/dataset_config.py).
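Once the two language-only instruction files are in place, a short script can confirm that they parse. This is a sketch under assumptions: it treats the Dolly file as JSON Lines and the Alpaca GPT4 file as a single JSON list, which is inferred from the file extensions rather than stated here, and the loader functions are hypothetical helpers.

```python
import json
from pathlib import Path


def load_jsonl(path: str) -> list[dict]:
    """Read a JSON Lines file: one JSON object per non-empty line."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def load_json_list(path: str) -> list[dict]:
    """Read a JSON file whose top-level value is a list of records."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError(f"{path}: expected a top-level JSON list")
    return data


if __name__ == "__main__":
    # Paths as given in the dataset list above; formats are assumed.
    sources = {
        "data/dolly/databricks-dolly-15k.jsonl": load_jsonl,
        "data/alpaca_gpt4/alpaca_gpt4_data.json": load_json_list,
    }
    for path, loader in sources.items():
        if not Path(path).is_file():
            print(f"{path}: not found")
            continue
        print(f"{path}: {len(loader(path))} records")
```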
## Start training
```bash
torchrun --nproc_per_node=8 mmgpt/train/instruction_finetune.py \
    --lm_path checkpoints/llama-7b_hf \
    --tokenizer_path checkpoints/llama-7b_hf \
    --pretrained_path checkpoints/OpenFlamingo-9B/checkpoint.pt \
    --run_name train-my-gpt4 \
    --learning_rate 1e-5 \
    --lr_scheduler cosine \
    --batch_size 1 \
    --tuning_config configs/lora_config.py \
    --dataset_config configs/dataset_config.py \
    --report_to_wandb
```
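The `--lr_scheduler cosine` and `--learning_rate 1e-5` flags suggest that the learning rate decays from its peak along a cosine curve. The sketch below shows the standard cosine-decay formula without warmup; whether the training script adds warmup steps or a nonzero floor is not stated here, so `total_steps` and `min_lr` are hypothetical parameters.

```python
import math


def cosine_lr(step: int, total_steps: int, peak_lr: float = 1e-5, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps."""
    # Clamp progress to [0, 1] so steps past the end stay at min_lr.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000  # hypothetical number of optimizer steps
    for step in (0, 250, 500, 750, 1000):
        print(f"step {step:4d}: lr = {cosine_lr(step, total):.2e}")
```

With `--nproc_per_node=8` and `--batch_size 1`, each optimizer step would see 8 samples if the batch size is per process and no gradient accumulation is used; both points are assumptions about the training script.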
# Acknowledgements
- [OpenFlamingo](https://github.com/mlfoundations/open_flamingo)
- [LAVIS](https://github.com/salesforce/LAVIS)
- [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca)
- [MiniGPT-4](https://github.com/Vision-CAIR/MiniGPT-4)
- [LLaVA](https://github.com/haotian-liu/LLaVA/tree/main)
- [Instruction Tuning with GPT-4](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)
# Raw data
{
"_id": null,
"home_page": "",
"name": "mmgpt",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "machine learning",
"author": "",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/45/02/70febd09c09cd1819b4962b1f666a3177651bc34c673f616b791adc496ca/mmgpt-0.0.1.tar.gz",
"platform": null,
"description": "# \ud83e\udd16 Multi-modal GPT\n\nTrain a multi-modal chatbot with visual and language instructions! \n\nBased on the open-source multi-modal model [OpenFlamingo](https://github.com/mlfoundations/open_flamingo), we create various **visual instruction** data with open datasets, including VQA, Image Captioning, Visual Reasoning, Text OCR, and Visual Dialogue. Additionally, we also train the language model component of OpenFlamingo using only **language-only instruction** data.\n\nThe **joint training** of visual and language instructions effectively improves the performance of the model!\n\n# Features\n\n- Support various vision and language instruction data\n- Parameter efficient fine-tuning with LoRA\n- Tuning vision and language at the same time, complement each other\n\n# Installaion\n\nTo install the package in an existing environment, run\n\n```bash\ngit clone https://github.com/open-mmlab/Multimodal-GPT.git\npip install -r requirements.txt\npip install -e. -v\n```\n\nor create a new conda environment\n\n```bash\nconda env create -f environment.yml\n```\n\n\n# Demo\n\n1. 
Download the pre-trained weights.\n\n Use [this script](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/convert_llama_weights_to_hf.py) for converting LLaMA weights to HuggingFace format.\n\n Download the OpenFlamingo pre-trained model from [openflamingo/OpenFlamingo-9B](https://huggingface.co/openflamingo/OpenFlamingo-9B)\n\n Download our LoRA Weight from [here](https://download.openmmlab.com/mmgpt/v0/mmgpt-lora-v0-release.pt)\n\n Then place these models in checkpoints folders like this:\n\n ```\n checkpoints\n \u251c\u2500\u2500 llama-7b_hf\n \u2502 \u251c\u2500\u2500 config.json\n \u2502 \u251c\u2500\u2500 pytorch_model-00001-of-00002.bin\n \u2502 \u251c\u2500\u2500 ......\n \u2502 \u2514\u2500\u2500 tokenizer.model\n \u251c\u2500\u2500 OpenFlamingo-9B\n \u2502 \u2514\u2500\u2500checkpoint.pt\n \u251c\u2500\u2500mmgpt-lora-v0-release.pt\n\n2. launch the gradio demo\n\n ```bash\n python chat_gradio_demo.py\n ```\n\n# Examples\n\n### Recipe:\n\n\n### Travel plan:\n\n### Movie:\n\n### Famous person:\n\n\n\n# Fine-tuning\n\n## Prepare datasets\n\n1. [A-OKVQA](https://allenai.org/project/a-okvqa/home)\n\n Download annotation from [this link](https://prior-datasets.s3.us-east-2.amazonaws.com/aokvqa/aokvqa_v1p0.tar.gz) and unzip to `data/aokvqa/annotations`\n\n It also requires images from coco dataset which can be downloaded from [here](https://cocodataset.org/#home). \n\n2. [COCO Caption](https://cs.stanford.edu/people/karpathy/deepimagesent/)\n\n Download from [this link](https://cs.stanford.edu/people/karpathy/deepimagesent/coco.zip) and unzip to `data/coco`\n\n It also requires images from coco dataset which can be downloaded from [here](https://cocodataset.org/#home).\n\n3. [OCR VQA](https://ocr-vqa.github.io/)\n\n Download from [this link](https://drive.google.com/drive/folders/1_GYPY5UkUy7HIcR0zq3ZCFgeZN7BAfm_?usp=sharing) and place in `data/OCR_VQA/`\n\n4. 
[LlaVA](https://llava-vl.github.io/)\n\n Download from [liuhaotian/LLaVA-Instruct-150K](https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K) and place in `data/llava/`\n\n It also requires images from coco dataset which can be downloaded from [here](https://cocodataset.org/#home).\n\n5. [Mini-GPT4](https://minigpt-4.github.io/)\n\n Download from [Vision-CAIR/cc_sbu_align](https://huggingface.co/datasets/Vision-CAIR/cc_sbu_align) and place in `data/cc_sbu_align/`\n\n6. [Dolly 15k](https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html)\n\n Download from [databricks/databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) and place it in `data/dolly/databricks-dolly-15k.jsonl`\n\n7. [Alpaca GPT4](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)\n\n Download it from [this link](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/raw/main/data/alpaca_gpt4_data.json) and place it in `data/alpaca_gpt4/alpaca_gpt4_data.json`\n\nYou can also customize the data path in the [configs/dataset_config.py](configs/dataset_config.py).\n\n\n## Start training\n\n```bash\ntorchrun --nproc_per_node=8 mmgpt/train/instruction_finetune.py \\\n--lm_path checkpoints/llama-7b_hf \\\n--tokenizer_path checkpoints/llama-7b_hf \\\n--pretrained_path checkpoints/OpenFlamingo-9B/checkpoint.pt \\\n--run_name train-my-gpt4 \\\n--learning_rate 1e-5 \\\n--lr_scheduler cosine \\\n--batch_size 1 \\ \n--tuning_config configs/lora_config.py \\\n--dataset_config configs/dataset_config.py \\\n--report_to_wandb \\\n```\n\n\n# Acknowledgements\n\n- [OpenFlamingo](https://github.com/mlfoundations/open_flamingo)\n- [LAVIS](https://github.com/salesforce/LAVIS)\n- [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca)\n- [MiniGPT-4](https://github.com/Vision-CAIR/MiniGPT-4)\n- [LLaVA](https://github.com/haotian-liu/LLaVA/tree/main)\n- [Instruction Tuning with 
GPT-4](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)\n",
"bugtrack_url": null,
"license": "Apache 2.0",
"summary": "An open-source framework for multi-modality instruction fine-tuning",
"version": "0.0.1",
"split_keywords": [
"machine",
"learning"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9bdb928a76666ee9e8c2c0894af4212160ffeaf0a3a7d4acbf540ff3cc1b334f",
"md5": "97d27045ce6bf14bb55df04318a1c7bb",
"sha256": "f3d09a490b85ac5d61372a1350706cf9e525b61655118f1d775f4b8039050662"
},
"downloads": -1,
"filename": "mmgpt-0.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "97d27045ce6bf14bb55df04318a1c7bb",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 49397,
"upload_time": "2023-04-27T05:54:10",
"upload_time_iso_8601": "2023-04-27T05:54:10.525954Z",
"url": "https://files.pythonhosted.org/packages/9b/db/928a76666ee9e8c2c0894af4212160ffeaf0a3a7d4acbf540ff3cc1b334f/mmgpt-0.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "450270febd09c09cd1819b4962b1f666a3177651bc34c673f616b791adc496ca",
"md5": "47fb8a0658f8827b1b55b9d6e03e0654",
"sha256": "83350144458406b550bfbaee76d221514d7fde106d39c4e62cd354e0ff3a6fa7"
},
"downloads": -1,
"filename": "mmgpt-0.0.1.tar.gz",
"has_sig": false,
"md5_digest": "47fb8a0658f8827b1b55b9d6e03e0654",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 35115,
"upload_time": "2023-04-27T05:54:13",
"upload_time_iso_8601": "2023-04-27T05:54:13.731838Z",
"url": "https://files.pythonhosted.org/packages/45/02/70febd09c09cd1819b4962b1f666a3177651bc34c673f616b791adc496ca/mmgpt-0.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-04-27 05:54:13",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "mmgpt"
}
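The `digests` entries above can be used to check a downloaded file. The following is a minimal sketch using the standard library; it assumes the sdist has been downloaded into the current directory, and `sha256_of` is a hypothetical helper name.

```python
import hashlib
import os


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Digest copied from the "urls" entry for mmgpt-0.0.1.tar.gz above.
    expected = "83350144458406b550bfbaee76d221514d7fde106d39c4e62cd354e0ff3a6fa7"
    path = "mmgpt-0.0.1.tar.gz"  # assumed download location
    if os.path.isfile(path):
        actual = sha256_of(path)
        print("OK" if actual == expected else f"MISMATCH: {actual}")
    else:
        print(f"{path} not found; download it first")
```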