![pre-commit](https://github.com/Farama-Foundation/A2Perf/actions/workflows/pre-commit.yml/badge.svg)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[//]: # ([![Python](https://img.shields.io/pypi/pyversions/gymnasium.svg)](https://badge.fury.io/py/gymnasium) TODO: Add working Python versions once a2perf package is available)
[//]: # ([![PyPI](https://badge.fury.io/py/gymnasium.svg)](https://badge.fury.io/py/gymnasium)
TODO: Add PyPI once a2perf package is available)
[//]: # ([![arXiv](https://img.shields.io/badge/arXiv-2407.17032-b31b1b.svg)](https://arxiv.org/abs/2407.17032) TODO: Add arXiv once we have DOI link)
<p align="center">
<img src="docs/_static/img/logo/github/A2Perf-github.png" width="500px"/>
</p>
A2Perf is a benchmark for evaluating agents on sequential decision problems that
are relevant to the real world. This
repository contains code for running and evaluating participants' submissions on
the benchmark platform.
## Environments
A2Perf provides benchmark environments in the following domains:
* [Web Navigation](docs/content/web_navigation/WebNavigation-Difficulty-01-v0.ipynb) -
This environment facilitates the
creation of compositional tasks represented by dependency graphs, where
automatically generated websites are completed by the trained agent.
* [Quadruped Locomotion](docs/content/quadruped_locomotion/QuadrupedLocomotion-DogPace-v0.ipynb) -
This quadruped
locomotion environment aims to teach a legged robot with 18 degrees of freedom
to replicate animal-like behaviors by imitating real-world motion data to
develop a diverse repertoire of skills.
* [Circuit Training](docs/content/circuit_training/CircuitTraining-Ariane-v0.ipynb) -
Chip floorplanning, a
complex and traditionally manual process, has been addressed by Google's
open-source Circuit Training framework, which uses reinforcement learning to
optimize chip layouts for multiple objectives.
<!--
### Web Navigation
![Three web navigation environments](media/gminiwob_scene.png)
### Quadruped Locomotion
![Simulated quadrupeds](media/locomotion_scene.png)
### Chip Floorplanning
![Chip floorplanning environment](media/ariane_scene.png) -->
## Installation
A2Perf can be installed on your local machine:
```bash
git clone https://github.com/Farama-Foundation/A2Perf.git
cd A2Perf
git submodule sync --recursive
git submodule update --init --recursive
pip install -e ".[all]"  # quoted so shells such as zsh do not expand the brackets
```
### Specific Package installation
To install the dependencies for a single domain only, use the corresponding command:
```bash
pip install -e ".[web_navigation]"
pip install -e ".[quadruped_locomotion]"
pip install -e ".[circuit_training]"
```
Both x86-64 and AArch64 (ARM64) architectures are supported.
\
Please note that the Windows version is not as well tested as the Linux and macOS
versions.
It can be used for development and testing, but if you want to run serious
(time- and resource-intensive) experiments on Windows,
please consider
using [Docker](https://docs.docker.com/docker-for-windows/install/)
or [WSL](https://docs.microsoft.com/en-us/windows/wsl/install-win10) with the Linux
version.
## API
Environments in A2Perf are registered under specific names for each domain and
task. Here are the available environments:
1. Quadruped Locomotion:
   - `QuadrupedLocomotion-DogPace-v0`
   - `QuadrupedLocomotion-DogTrot-v0`
   - `QuadrupedLocomotion-DogSpin-v0`
2. Web Navigation:
   - `WebNavigation-Difficulty-01-v0`
   - `WebNavigation-Difficulty-02-v0`
   - `WebNavigation-Difficulty-03-v0`
3. Circuit Training:
   - `CircuitTraining-ToyMacro-v0`
   - `CircuitTraining-Ariane-v0`
For example, you can create an instance of the `WebNavigation-Difficulty-01-v0`
environment as follows:
```python
import gymnasium as gym
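# Imported for its side effect: makes the Web Navigation environments available to gym.make.
from a2perf.domains import web_navigation
# The ID must match one of the registered names listed above.
env = gym.make("WebNavigation-Difficulty-01-v0", num_websites=10, seed=0)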
```
## User Submission
A beginner's guide to benchmarking with A2Perf is
available [here](docs/content/tutorials/training.md).
- Users can pull the template repository
  at https://github.com/Farama-Foundation/a2perf-benchmark-submission
- The submission repository must include:
  - `train.py` - defines a global `train` function with the following
    signature:
    ```python
    def train():
        """Trains the user's model."""
    ```
  - `inference.py` - defines the following functions:
    ```python
    def load_policy(env, **load_kwargs):
        """Loads a trained policy model from the specified directory."""

    def infer_once(policy, observation):
        """Runs a single inference step using the given policy and observation."""

    def preprocess_observation(observation):
        """Preprocesses a raw observation from the environment into a format compatible with the policy."""
    ```
  - `requirements.txt` - lists the required Python packages and
    their versions for running the user's code
  - `__init__.py` - an empty file that allows the submission to be
    imported as a Python module
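
The required layout can be checked locally before submitting. The following is a minimal sketch, not part of A2Perf: `check_submission` and `top_level_functions` are hypothetical helpers that only verify that the files listed above exist and that the named functions are defined at module level.

```python
import ast
import tempfile
from pathlib import Path

REQUIRED_FILES = ("train.py", "inference.py", "requirements.txt", "__init__.py")
REQUIRED_FUNCTIONS = {
    "train.py": {"train"},
    "inference.py": {"load_policy", "infer_once", "preprocess_observation"},
}


def top_level_functions(path):
    """Return the names of functions defined at module level in a Python file."""
    tree = ast.parse(Path(path).read_text())
    return {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}


def check_submission(root):
    """Return a list of problems; an empty list means the layout looks complete."""
    root = Path(root)
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (root / name).is_file()
    ]
    for filename, expected in REQUIRED_FUNCTIONS.items():
        path = root / filename
        if path.is_file():
            missing = expected - top_level_functions(path)
            problems += [
                f"{filename}: missing function {name}()" for name in sorted(missing)
            ]
    return problems


# Demonstration on a throwaway directory holding an incomplete submission.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "__init__.py").write_text("")
    (root / "train.py").write_text("def train():\n    pass\n")
    (root / "inference.py").write_text(
        "def load_policy(env, **load_kwargs):\n    pass\n"
    )
    problems = check_submission(root)

print(problems)
```

The check is purely structural: it parses the files without importing them, so it says nothing about whether the functions behave as the benchmark expects.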
## Gin Configuration Files
Under [`a2perf/submission/configs`](https://github.com/Farama-Foundation/A2Perf/tree/main/a2perf/submission/configs),
there are default gin configuration files for training and inference for each
domain. These files define various settings and parameters for
benchmarking.

Here's an example of a `training.gin` file for web navigation:
```python
# ----------------------
# IMPORTS
# ----------------------
import a2perf.submission.submission_util
# ----------------------
# SUBMISSION SETUP
# ----------------------
# Set up submission object
Submission.mode = %BenchmarkMode.TRAIN
Submission.domain = %BenchmarkDomain.WEB_NAVIGATION
Submission.run_offline_metrics_only = False
Submission.measure_emissions = True
# ----------------------
# SYSTEM METRICS SETUP
# ----------------------
# Set up codecarbon for system metrics
track_emissions_decorator.project_name = 'a2perf_web_navigation_train'
track_emissions_decorator.measure_power_secs = 5
track_emissions_decorator.save_to_file = True # Save data to file
track_emissions_decorator.save_to_logger = False # Do not save data to logger
track_emissions_decorator.gpu_ids = None # Enter list of specific GPU IDs to track if desired
track_emissions_decorator.log_level = 'info' # Log level set to 'info'
track_emissions_decorator.country_iso_code = 'USA'
track_emissions_decorator.region = 'Massachusetts'
track_emissions_decorator.offline = True
```
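
Gin also accepts bindings as plain strings (for example through `gin.parse_config_files_and_bindings`), so overrides to a default file can be generated programmatically. The helper below is a hypothetical sketch of that idea, and whether A2Perf's own entry points expose such overrides is not covered here. It handles only values that are Python literals; macro references such as `%BenchmarkMode.TRAIN` would still have to be written by hand.

```python
def format_bindings(overrides):
    """Turn {'configurable.param': value} into gin binding strings."""
    return [f"{name} = {value!r}" for name, value in overrides.items()]


bindings = format_bindings({
    "track_emissions_decorator.measure_power_secs": 10,
    "track_emissions_decorator.log_level": "warning",
    "Submission.measure_emissions": False,
})
print("\n".join(bindings))
```

Each resulting string has the same `name.parameter = value` form as the lines in the file above.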
## Baselines
Baselines for all tasks are provided and are described in the article supporting
A2Perf.
## Environment Versioning
A2Perf keeps strict versioning for reproducibility reasons. All environments end
in a suffix like "-v0". When changes are made to environments that might impact
learning results, the number is increased by one to prevent potential confusion.
This follows the Gymnasium convention.
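
As an illustration of the naming scheme, the version suffix can be separated from the rest of an environment ID with a small regular expression. `split_env_id` and `next_version` are hypothetical helpers written for this example and are not part of A2Perf or Gymnasium.

```python
import re

# Everything before the final "-v<digits>" is treated as the environment name.
_ENV_ID = re.compile(r"^(?P<name>.+)-v(?P<version>\d+)$")


def split_env_id(env_id):
    """Split 'Name-vN' into ('Name', N); raise ValueError without a version suffix."""
    match = _ENV_ID.match(env_id)
    if match is None:
        raise ValueError(f"no '-v<number>' suffix in {env_id!r}")
    return match.group("name"), int(match.group("version"))


def next_version(env_id):
    """Return the ID that a behaviour-changing update would be registered under."""
    name, version = split_env_id(env_id)
    return f"{name}-v{version + 1}"


print(split_env_id("QuadrupedLocomotion-DogPace-v0"))
print(next_version("CircuitTraining-Ariane-v0"))
```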
[//]: # (## Citation)
[//]: # ()
[//]: # (You can cite A2Perf as:)
[//]: # ()
[//]: # (```bibtex)
[//]: # (@misc{TODO })
[//]: # (```)