hydra-vl4ai


| Field | Value |
| --- | --- |
| Name | hydra-vl4ai |
| Version | 0.0.0 |
| Home page | https://hydra-vl4ai.github.io/ |
| Summary | Official implementation for HYDRA. |
| Upload time | 2024-08-05 08:10:15 |
| Maintainer | None |
| Docs URL | None |
| Author | ControlNet |
| Requires Python | >=3.10 |
| License | None |
| Keywords | deep learning, pytorch, ai |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

# <img src="media/HYDRA_icon_minimal.png" width="20"> HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning

<div align="center">
    <img src="media/Frame.png">
    <p></p>
</div>


<div align="center">
    <a href="https://github.com/ControlNet/HYDRA/issues">
        <img src="https://img.shields.io/github/issues/ControlNet/HYDRA?style=flat-square">
    </a>
    <a href="https://github.com/ControlNet/HYDRA/network/members">
        <img src="https://img.shields.io/github/forks/ControlNet/HYDRA?style=flat-square">
    </a>
    <a href="https://github.com/ControlNet/HYDRA/stargazers">
        <img src="https://img.shields.io/github/stars/ControlNet/HYDRA?style=flat-square">
    </a>
    <a href="https://github.com/ControlNet/HYDRA/blob/master/LICENSE">
        <img src="https://img.shields.io/github/license/ControlNet/HYDRA?style=flat-square">
    </a>
    <a href="https://arxiv.org/abs/2403.12884">
        <img src="https://img.shields.io/badge/arXiv-2403.12884-b31b1b.svg?style=flat-square">
    </a>
</div>

**This is the code for the paper [HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning](https://arxiv.org/abs/2403.12884), accepted by ECCV 2024 \[[Project Page](https://hydra-vl4ai.github.io)\].**

## Release

- [2024/07/29] 🔥 **HYDRA** is open-sourced on GitHub.

## TODOs
`gpt-3.5-turbo-0613` is deprecated, and `gpt-3.5` will be replaced by `gpt-4o-mini`. We will release another version of HYDRA.

> As of July 2024, `gpt-4o-mini` should be used in place of `gpt-3.5-turbo`, as it is cheaper, more capable, multimodal, and just as fast. ([OpenAI API page](https://platform.openai.com/docs/models/gpt-3-5-turbo))

OpenAI has also updated its embedding models, as described in this [announcement](https://openai.com/index/new-embedding-models-and-api-updates/). Because future embedding model updates from OpenAI are uncertain, we suggest that you train a new version of the RL controller yourself and update the RL models.

- [x] GPT-4o-mini replacement.
- [x] LLaMA3.1 (ollama) replacement.
- [ ] Gradio Demo
- [ ] GPT-4o Version.
- [ ] HYDRA with RL


## Installation

### Requirements

- Python >= 3.10
- conda

Please follow the instructions below to install the required packages and set up the environment.

### 1. Clone this repository.
```Bash
git clone https://github.com/ControlNet/HYDRA
```

### 2. Set up the conda environment and install dependencies.
```Bash
bash -i build_env.sh
```

If you encounter errors, go through the `build_env.sh` file and install the packages manually.

### 3. Configure the environment

Edit the `.env` file, or set the same variables in your shell, to configure the environment.

```
OPENAI_API_KEY=your-api-key
OLLAMA_HOST=http://ollama.server:11434
# do not change this TORCH_HOME variable
TORCH_HOME=./pretrained_models
```
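
Before starting the worker, it can help to confirm that the variables above are actually set. The following is a minimal sketch, not part of HYDRA: `parse_env_text` and `missing_keys` are hypothetical helpers that read `KEY=value` lines in the format shown above using only the Python standard library. Which keys are mandatory depends on the backend you use, so the list of required keys in the example is an assumption.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and '#' comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values such as URLs stay intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str], required: list[str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    # Assumption: these names mirror the example above; adjust to your setup.
    print(missing_keys(values, ["OPENAI_API_KEY", "TORCH_HOME"]))
```

An empty list means every key you asked for is present with a non-empty value.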

### 4. Download the pretrained models
Run the following script to download the pretrained models to the `./pretrained_models` directory.

```Bash
python -m hydra_vl4ai.download_models --base_config <EXP-CONFIG-DIR> --model_config <MODEL-CONFIG-PATH>
```

For example,
```Bash
python -m hydra_vl4ai.download_models --base_config ./config/okvqa.yaml --model_config ./configs/model_config_1gpu.yaml
```

## Inference
A worker must be running before you start inference. Launch it with:

```Bash
python -m hydra_vl4ai.executor --base_config <EXP-CONFIG-DIR> --model_config <MODEL-CONFIG-PATH>
```

### Inference with a single image and prompt
```Bash
python demo_cli.py \
  --image <IMAGE_PATH> \
  --prompt <PROMPT> \
  --base_config <YOUR-CONFIG-DIR> \
  --model_config <MODEL-PATH>
```
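
Prompts usually contain spaces and punctuation, so quoting matters when the command is typed by hand. The sketch below is an illustration under stated assumptions, not part of HYDRA: `build_demo_command` is a hypothetical helper that assembles the same flags as above into an argument list, and the image path, prompt, and config paths are placeholders.

```python
import shlex


def build_demo_command(image: str, prompt: str, base_config: str, model_config: str) -> list[str]:
    """Assemble the demo_cli.py invocation as an argument list."""
    return [
        "python", "demo_cli.py",
        "--image", image,
        "--prompt", prompt,
        "--base_config", base_config,
        "--model_config", model_config,
    ]


# Hypothetical paths and prompt, for illustration only.
command = build_demo_command(
    image="example.jpg",
    prompt="What is the man holding?",
    base_config="path/to/base_config.yaml",
    model_config="path/to/model_config.yaml",
)
# shlex.join quotes arguments containing spaces or shell metacharacters,
# so the printed line can be pasted into a shell as-is.
print(shlex.join(command))
```

With the worker already running, the same list can be passed to `subprocess.run(command, check=True)`, which avoids shell quoting altogether.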

### Inference with Gradio GUI
TODO.

### Inference on a dataset

```Bash
python main.py \
  --data_root <YOUR-DATA-ROOT> \
  --base_config <YOUR-CONFIG-DIR> \
  --model_config <MODEL-PATH>
```

The inference results are saved to the `./result` directory for evaluation.

## Evaluation

```Bash
python evaluate.py <RESULT_JSON_PATH> <DATASET_NAME>
```

For example,

```Bash
python evaluate.py result/result_okvqa.jsonl okvqa
```
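
The example above points at a `.jsonl` file. The record layout is not documented in this README, so the sketch below makes only one assumption: the file follows the JSON Lines convention of one JSON value per line. `summarize_jsonl` is a hypothetical helper, not part of HYDRA; it counts records and tallies the keys it finds, which can serve as a quick sanity check before running `evaluate.py`.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_jsonl(path: str) -> tuple[int, Counter]:
    """Count records and key occurrences in a JSON Lines file."""
    record_count = 0
    key_counts: Counter = Counter()
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            record_count += 1
            # Only objects have keys; other JSON values are counted but not tallied.
            if isinstance(record, dict):
                key_counts.update(record.keys())
    return record_count, key_counts
```

A key that appears in fewer records than the total count may indicate incomplete or failed samples.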


## Citation
```bibtex
@inproceedings{ke2024hydra,
  title={HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning},
  author={Fucai Ke and Zhixi Cai and Simindokht Jahangard and Weiqing Wang and Pari Delir Haghighi and Hamid Rezatofighi},
  booktitle={European Conference on Computer Vision},
  year={2024},
  organization={Springer}
}
```

## Acknowledgements

Some code and prompts are based on [cvlab-columbia/viper](https://github.com/cvlab-columbia/viper).

            

Raw data

            {
    "_id": null,
    "home_page": "https://hydra-vl4ai.github.io/",
    "name": "hydra-vl4ai",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "deep learning, pytorch, AI",
    "author": "ControlNet",
    "author_email": "smczx@hotmail.com",
    "download_url": "https://files.pythonhosted.org/packages/f3/41/2f696e5bee28836179aaf8e547244ad04234211499cdb8eaca6cd2adcfed/hydra_vl4ai-0.0.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Official implementation for HYDRA.",
    "version": "0.0.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/ControlNet/HYDRA/issues",
        "Homepage": "https://hydra-vl4ai.github.io/",
        "Source Code": "https://github.com/ControlNet/HYDRA"
    },
    "split_keywords": [
        "deep learning",
        " pytorch",
        " ai"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ad63d708b353c8985b4039c40f9f8415ca6dfd35635367e64926a5e11dfdea46",
                "md5": "0190f33b6925519ca1f2122f6eed49aa",
                "sha256": "b239d0a7b44b6da2a1d01b8aff0e443da909aaa5384f2e7ed52c441d38bb7ff0"
            },
            "downloads": -1,
            "filename": "hydra_vl4ai-0.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "0190f33b6925519ca1f2122f6eed49aa",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 138164,
            "upload_time": "2024-08-05T08:10:13",
            "upload_time_iso_8601": "2024-08-05T08:10:13.465852Z",
            "url": "https://files.pythonhosted.org/packages/ad/63/d708b353c8985b4039c40f9f8415ca6dfd35635367e64926a5e11dfdea46/hydra_vl4ai-0.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f3412f696e5bee28836179aaf8e547244ad04234211499cdb8eaca6cd2adcfed",
                "md5": "51c1d9ca8db448fca85a1e46493961b6",
                "sha256": "a5e0f4182aaf850e2bccbcdfff476aedd35a8654e3f5878a05b67758c2dd9b61"
            },
            "downloads": -1,
            "filename": "hydra_vl4ai-0.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "51c1d9ca8db448fca85a1e46493961b6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 103823,
            "upload_time": "2024-08-05T08:10:15",
            "upload_time_iso_8601": "2024-08-05T08:10:15.546593Z",
            "url": "https://files.pythonhosted.org/packages/f3/41/2f696e5bee28836179aaf8e547244ad04234211499cdb8eaca6cd2adcfed/hydra_vl4ai-0.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-05 08:10:15",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ControlNet",
    "github_project": "HYDRA",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "hydra-vl4ai"
}
        