# Installation and Usage Guide for virtualhome-eval
## Install dependencies
```
pip install virtualhome_eval
```
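A quick way to confirm the install succeeded is to import the evaluation entry point (this is just a sanity check, not a documented command):
```
python -c "from virtualhome_eval.agent_eval import agent_evaluation; print('virtualhome_eval installed')"
```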
## Usage
To run `virtualhome_eval`:
1. Use it in Python:
```
from virtualhome_eval.agent_eval import agent_evaluation
agent_evaluation(
    mode=[generate_prompts, evaluate_results],
    eval_type=[goal_interpretation, action_sequence, subgoal_decomposition, transition_model],
    llm_response_path=[YOUR LLM OUTPUT DIR],
)
```
2. Use it directly from the command line:
```
virtualhome-eval --mode [generate_prompts, evaluate_results] \
    --eval-type [goal_interpretation, action_sequence, subgoal_decomposition, transition_model] \
    --llm-response-path [YOUR LLM OUTPUT DIR] \
    --output-dir [YOUR EVAL OUTPUT DIR]
```
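For example, to generate prompts for the `goal_interpretation` task and then evaluate your model's outputs, the invocations would look roughly like the following (the `my_llm_outputs/` and `output/` directories are placeholders for your own paths):
```
# Generate the goal_interpretation prompts and write them under output/
virtualhome-eval --mode generate_prompts --eval-type goal_interpretation --output-dir output/

# Evaluate the LLM responses stored in my_llm_outputs/ (a placeholder directory)
virtualhome-eval --mode evaluate_results --eval-type goal_interpretation --llm-response-path my_llm_outputs/ --output-dir output/
```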
### Parameters
- `mode`: Specifies whether to generate prompts or to evaluate results. Options are:
  - `generate_prompts`
  - `evaluate_results`
- `eval_type`: Specifies the evaluation task type. Options are:
  - `goal_interpretation`
  - `action_sequence`
  - `subgoal_decomposition`
  - `transition_model`
- `llm_response_path`: The path to the directory of LLM outputs to be evaluated. Defaults to `""`, which uses the existing outputs under `virtualhome_eval/llm_response/`; all LLM outputs found in the directory are evaluated.
- `dataset`: The dataset type. Options are:
  - `virtualhome`
  - `behavior`
- `output_dir`: The directory in which to store the output results. Defaults to `output/` under the current working directory (see the combined example below).
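As a sketch of how these parameters fit together (the `my_llm_outputs/` path is a placeholder, not part of the package), a full evaluation call could look like this:
```
from virtualhome_eval.agent_eval import agent_evaluation

# Evaluate existing LLM outputs for the action_sequence task on VirtualHome.
# 'my_llm_outputs/' is a placeholder directory; omit llm_response_path to use
# the bundled outputs under virtualhome_eval/llm_response/.
results = agent_evaluation(
    mode='evaluate_results',
    eval_type='action_sequence',
    llm_response_path='my_llm_outputs/',
    dataset='virtualhome',
    output_dir='output/',
)
```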
### Example usage in Python
1. To generate prompts for `goal_interpretation`:
```
agent_evaluation(mode='generate_prompts', eval_type='goal_interpretation')
```
2. To evaluate LLM outputs for `goal_interpretation`:
```
results = agent_evaluation(mode='evaluate_results', eval_type='goal_interpretation')
```
3. To generate prompts for `action_sequence`:
```
agent_evaluation(mode='generate_prompts', eval_type='action_sequence')
```
4. To evaluate LLM outputs for `action_sequence`:
```
results = agent_evaluation(mode='evaluate_results', eval_type='action_sequence')
```
5. To generate VirtualHome prompts for `transition_model`:
```
agent_evaluation(mode='generate_prompts', eval_type='transition_model')
```
6. To evaluate LLM outputs on VirtualHome for `transition_model`:
```
results = agent_evaluation(mode='evaluate_results', eval_type='transition_model')
```
7. To generate prompts for `subgoal_decomposition`:
```
agent_evaluation(mode='generate_prompts', eval_type='subgoal_decomposition')
```
8. To evaluate LLM outputs for `subgoal_decomposition`:
```
results = agent_evaluation(mode='evaluate_results', eval_type='subgoal_decomposition')
```
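The evaluation calls above return the computed results. Assuming the returned object is JSON-serializable (an assumption; the exact return type is not documented here), you could persist it for later inspection:
```
import json

from virtualhome_eval.agent_eval import agent_evaluation

results = agent_evaluation(mode='evaluate_results', eval_type='subgoal_decomposition')

# Assumption: `results` is a plain, JSON-serializable Python object.
with open('output/subgoal_decomposition_results.json', 'w') as f:
    json.dump(results, f, indent=2)
```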