# Installation and Usage Guide for virtualhome-eval
## Install dependencies
```
pip install virtualhome_eval
```
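A quick way to confirm the install succeeded is to import the evaluation entry point (this is just a sanity check, not a documented command):
```
python -c "from virtualhome_eval.agent_eval import agent_evaluation; print('virtualhome_eval installed')"
```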
## Usage
To run `virtualhome_eval`:
1. Use it in Python:
```
from virtualhome_eval.agent_eval import agent_evaluation
agent_evaluation(
    mode=[generate_prompts, evaluate_results],
    eval_type=[goal_interpretation, action_sequence, subgoal_decomposition, transition_model],
    llm_response_path=[YOUR LLM OUTPUT DIR],
)
```
2. Use it directly from the command line:
```
virtualhome-eval --mode [generate_prompts, evaluate_results] \
    --eval-type [goal_interpretation, action_sequence, subgoal_decomposition, transition_model] \
    --llm-response-path [YOUR LLM OUTPUT DIR] \
    --output-dir [YOUR EVAL OUTPUT DIR]
```
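For example, to generate prompts for the `goal_interpretation` task and then evaluate your model's outputs, the invocations would look roughly like the following (the `my_llm_outputs/` and `output/` directories are placeholders for your own paths):
```
# Generate the goal_interpretation prompts and write them under output/
virtualhome-eval --mode generate_prompts --eval-type goal_interpretation --output-dir output/

# Evaluate the LLM responses stored in my_llm_outputs/ (a placeholder directory)
virtualhome-eval --mode evaluate_results --eval-type goal_interpretation --llm-response-path my_llm_outputs/ --output-dir output/
```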
### Parameters
- `mode`: Specifies whether to generate prompts or to evaluate results. Options are:
  - `generate_prompts`
  - `evaluate_results`
- `eval_type`: Specifies the evaluation task type. Options are:
  - `goal_interpretation`
  - `action_sequence`
  - `subgoal_decomposition`
  - `transition_model`
- `llm_response_path`: The path to the directory of LLM outputs to be evaluated. Defaults to `""`, which uses the existing outputs under `virtualhome_eval/llm_response/`; all LLM outputs found in the directory are evaluated.
- `dataset`: The dataset type. Options are:
  - `virtualhome`
  - `behavior`
- `output_dir`: The directory in which to store the output results. Defaults to `output/` under the current working directory (see the combined example below).
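As a sketch of how these parameters fit together (the `my_llm_outputs/` path is a placeholder, not part of the package), a full evaluation call could look like this:
```
from virtualhome_eval.agent_eval import agent_evaluation

# Evaluate existing LLM outputs for the action_sequence task on VirtualHome.
# 'my_llm_outputs/' is a placeholder directory; omit llm_response_path to use
# the bundled outputs under virtualhome_eval/llm_response/.
results = agent_evaluation(
    mode='evaluate_results',
    eval_type='action_sequence',
    llm_response_path='my_llm_outputs/',
    dataset='virtualhome',
    output_dir='output/',
)
```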
### Example usage in Python
1. To generate prompts for `goal_interpretation`:
```
agent_evaluation(mode='generate_prompts', eval_type='goal_interpretation')
```
2. To evaluate LLM outputs for `goal_interpretation`:
```
results = agent_evaluation(mode='evaluate_results', eval_type='goal_interpretation')
```
3. To generate prompts for `action_sequence`:
```
agent_evaluation(mode='generate_prompts', eval_type='action_sequence')
```
4. To evaluate LLM outputs for `action_sequence`:
```
results = agent_evaluation(mode='evaluate_results', eval_type='action_sequence')
```
5. To generate VirtualHome prompts for `transition_model`:
```
agent_evaluation(mode='generate_prompts', eval_type='transition_model')
```
6. To evaluate LLM outputs on VirtualHome for `transition_model`:
```
results = agent_evaluation(mode='evaluate_results', eval_type='transition_model')
```
7. To generate prompts for `subgoal_decomposition`:
```
agent_evaluation(mode='generate_prompts', eval_type='subgoal_decomposition')
```
8. To evaluate LLM outputs for `subgoal_decomposition`:
```
results = agent_evaluation(mode='evaluate_results', eval_type='subgoal_decomposition')
```
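The evaluation calls above return the computed results. Assuming the returned object is JSON-serializable (an assumption; the exact return type is not documented here), you could persist it for later inspection:
```
import json

from virtualhome_eval.agent_eval import agent_evaluation

results = agent_evaluation(mode='evaluate_results', eval_type='subgoal_decomposition')

# Assumption: `results` is a plain, JSON-serializable Python object.
with open('output/subgoal_decomposition_results.json', 'w') as f:
    json.dump(results, f, indent=2)
```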