# Robotics Environment Authoring Framework (REAF)
The Robotics Environment Authoring Framework (REAF) simplifies creating
environments that adhere to the
[GDM Robotics Environment interface](https://github.com/google-deepmind/gdm_robotics/interfaces/environment.py).
## How to install
`reaf` can be installed from PyPI using `pip`:
```bash
pip install reaf
```
## Directory Structure
The repository is currently organized as standalone subdirectories with a
well-defined dependency graph.
```
reaf
├── core: Core libraries and interfaces for REAF.
├── testing: General tooling for testing REAF interfaces and environments.
├── common: Libraries with shared functionality across setups and platforms.
```
## Design
REAF is a framework designed to simplify the creation of robotics environments.
It adopts a layered architecture to promote modularity and reusability. The core
components of a REAF environment are:
1. **Environment:** The top-level interface for interacting with the
environment, conforming to the
[GDM Robotics Environment interface](https://github.com/google-deepmind/gdm_robotics/interfaces/environment.py).
It handles stepping, resetting, and action/observation specs.
2. **Task Logic Layer (TLL):** Responsible for defining the task itself,
including reward calculation, termination conditions, features generation,
and commands processing.
3. **Data Acquisition and Control Layer (DACL):** Interfaces with the physical
or simulated robotic setup, managing commands to actuators and retrieving
measurements from sensors.
4. **Adapters:** Bridge the gap between the abstract GDMR interfaces and the
specific requirements of the TLL. These adapters translate agent actions
into TLL commands and TLL features into agent observations.
5. **Reset and End of Episode Handlers:** Support customized behavior during
environment resets and episode termination.
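To make the layering concrete, here is a minimal sketch of the data flow
through these components during a single step. It uses plain dictionaries and
stand-in functions rather than the actual REAF classes, so every name in it is
illustrative only.

```python
# Illustrative data flow through the REAF layers for one step.
# All functions below are stand-ins, not the real REAF API.

def action_adapter(action):
    # ActionSpaceAdapter: agent action -> commands dictionary for the TLL.
    return {"arm_joint_velocities": action}

def tll_process_commands(commands):
    # Task Logic Layer: CommandsProcessors may clip, remap, or augment commands.
    return commands

def dacl_step(commands):
    # Data Acquisition and Control Layer: apply commands, read sensors.
    return {"arm_joint_positions": [0.0] * 7}

def tll_compute_features(measurements):
    # FeaturesProducers derive additional features from raw measurements.
    return {**measurements, "distance_to_goal": 0.3}

def observation_adapter(features):
    # ObservationSpaceAdapter: features -> agent observations.
    return {"arm_joint_positions": features["arm_joint_positions"]}

agent_action = [0.1] * 7
commands = tll_process_commands(action_adapter(agent_action))
measurements = dacl_step(commands)
features = tll_compute_features(measurements)
observation = observation_adapter(features)
print(observation)
```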
### Environment
The `Environment` class serves as the primary interface for interacting with the
robotic environment. It coordinates the interactions between the TLL and DACL,
manages the environment's state, and handles stepping through the environment.
Key functionalities include:
* **`reset_with_options()`:** Resets the environment to a new initial state
  based on the provided options. This involves resetting the DACL and
  computing the initial features and observations.
* **`step()`:** Advances the environment by one step. This method takes an
agent action, processes it into commands, steps the DACL, computes features,
reward, discount, termination conditions, and new observations, and returns
a `TimeStep` object containing this information.
* **`action_spec()`:** Returns the specification for valid agent actions. This
is determined by the `ActionSpaceAdapter`.
* **`timestep_spec()`:** Returns the specification for the `TimeStep` objects
returned by `step()` and `reset()`.
* **Logging:** Facilitates adding and removing loggers to monitor internal
operations.
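As a usage sketch, a typical interaction loop might look like the following.
Only `reset()`, `reset_with_options()`, and `step()` are names documented
above; the `TimeStep` accessors (`.observation`, `.last()`) are assumed to
follow a dm_env-style convention and may differ in practice.

```python
def run_episode(env, policy, max_steps=100):
    """Runs one episode; `policy` maps an observation to an action."""
    timestep = env.reset()  # env.reset_with_options(...) allows custom reset options.
    for _ in range(max_steps):
        action = policy(timestep.observation)
        timestep = env.step(action)
        if timestep.last():  # Assumed dm_env-style termination check.
            break
    return timestep
```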
### Task Logic Layer (TLL)
The TLL defines the logic and rules governing the robotic task. It comprises
several core components:
* **`FeaturesProducer`:** Generates additional features based on existing
features and measurements from the DACL. Each producer has a
`produced_features_spec()` defining the features it generates and
`required_features_keys()` indicating the features it depends on.
* **`CommandsProcessor`:** Modifies commands before they are sent to the DACL.
Processors can transform, filter, or augment commands. The
`consumed_commands_spec()` describes the commands accepted by the processor,
and `produced_commands_keys()` defines the output commands.
* **`RewardProvider`:** Calculates the reward signal based on the current
features. It exposes a `reward_spec()` defining the structure of the reward.
* **`TerminationChecker`:** Determines whether the episode should terminate
based on features and returns a `TerminationResult` indicating the type of
termination.
* **`DiscountProvider`:** Computes the discount factor based on features and
the termination state.
* **`FeaturesObserver`:** A passive component that observes features without
  modifying them; useful for logging or analysis.
* **`Logger`:** Records measurements, features, and commands during
environment interactions. Methods like `record_measurements()`,
`record_features()`, and `record_commands_processing()` are called at
specific points in the environment's lifecycle.
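To illustrate these interfaces, the sketch below shows what a task-specific
features producer and reward provider could look like. The classes do not
subclass the real REAF base classes, and any method or argument not named in
the list above is an assumption.

```python
import numpy as np


class DistanceToGoalProducer:
    """Illustrative features producer that adds a scalar distance feature.

    In REAF this would implement the `FeaturesProducer` interface; only
    `required_features_keys()` and `produced_features_spec()` are documented
    names, and plain dicts stand in for proper spec objects.
    """

    def required_features_keys(self):
        return {"end_effector_position", "goal_position"}

    def produced_features_spec(self):
        return {"distance_to_goal": np.float64}

    def produce_features(self, features):  # Assumed method name.
        distance = np.linalg.norm(
            features["end_effector_position"] - features["goal_position"]
        )
        return {"distance_to_goal": distance}


class ReachReward:
    """Illustrative reward provider: negative distance to the goal."""

    def reward_spec(self):
        return {"reward": np.float64}

    def compute_reward(self, features):  # Assumed method name.
        return -float(features["distance_to_goal"])
```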
The TLL also provides the following methods:
* **`compute_all_features()`:** Computes all features based on measurements
from the DACL and the output of `FeaturesProducer`s.
* **`compute_final_commands()`:** Processes the policy's commands through the
  `CommandsProcessor`s and outputs the final commands for the DACL.
* **`compute_reward()`:** Calculates the reward using the `RewardProvider`.
* **`check_for_termination()`:** Checks termination conditions using
`TerminationChecker`s.
* **`compute_discount()`:** Computes the discount using the
`DiscountProvider`.
* **`validate_spec()`:** Verifies the consistency of the specs across the TLL
and DACL.
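Putting these together, the environment invokes the TLL roughly in the order
sketched below during a step. The TLL method names match the list above; the
attribute names (`tll`, `dacl`, the adapters) and the adapter method names are
assumptions made for illustration.

```python
def step(env, action):
    # Agent action -> commands dictionary (ActionSpaceAdapter).
    commands = env.action_adapter.commands_from_action(action)
    # Commands processing, hardware step, and feature computation.
    final_commands = env.tll.compute_final_commands(commands)
    measurements = env.dacl.step(final_commands)
    features = env.tll.compute_all_features(measurements)
    # Reward, termination, and discount from the computed features.
    reward = env.tll.compute_reward(features)
    termination = env.tll.check_for_termination(features)
    discount = env.tll.compute_discount(features, termination)
    # Features -> agent observations (ObservationSpaceAdapter).
    observation = env.observation_adapter.observation_from_features(features)
    return observation, reward, discount, termination
```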
### Data Acquisition and Control Layer (DACL)
The DACL serves as the bridge between the REAF environment and the robotic
hardware. It's responsible for sending commands to the robot and receiving
measurements from sensors. The DACL is built around:
* **`Device`:** Represents a single hardware component (e.g., robot arm,
camera). It provides methods like `set_commands()` and `get_measurements()`
for interacting with the hardware.
* **`DeviceCoordinator`:** Manages a collection of `Device` objects,
coordinating their actions and data exchange. It provides lifecycle
management through `start()` and `stop()` methods and synchronization points
through `before_set_commands()` and `before_get_measurements()`.
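A minimal device for a simple actuator could look roughly like the sketch
below. It does not subclass the real REAF `Device` interface, and anything
beyond `set_commands()` and `get_measurements()` is an assumption; plain dicts
stand in for proper spec objects.

```python
import numpy as np


class FakeGripperDevice:
    """Illustrative device with one command and one measurement.

    A real device would implement the REAF `Device` interface and talk to
    hardware or a simulator; this stand-in just keeps state in memory.
    """

    def __init__(self):
        self._target_width = 0.0

    def commands_spec(self):  # Assumed per-device spec method.
        return {"gripper_target_width": np.float64}

    def measurements_spec(self):  # Assumed per-device spec method.
        return {"gripper_width": np.float64}

    def set_commands(self, commands):
        self._target_width = float(commands["gripper_target_width"])

    def get_measurements(self):
        # Pretend the gripper reaches its target instantly.
        return {"gripper_width": self._target_width}
```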
The DACL's key functions are:
* **`begin_stepping()`:** Initializes the DACL and returns the initial
measurements.
* **`step()`:** Sends commands to the devices, retrieves new measurements, and
returns them.
* **`end_stepping()`:** Performs cleanup operations at the end of an episode.
* **`commands_spec()`:** Returns the specification for valid commands. Callers
  are expected to pass the full commands dictionary described by this spec.
* **`measurements_spec()`:** Returns the specification for the measurements
returned by `get_measurements()`.
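The stepping lifecycle then looks roughly like the loop below, where `dacl` and
`compute_commands` are placeholders for a constructed DACL and an external
controller.

```python
def run_dacl(dacl, compute_commands, num_steps=50):
    """Drives a DACL directly; `compute_commands` is a placeholder controller."""
    measurements = dacl.begin_stepping()  # Start devices, get initial measurements.
    try:
        for _ in range(num_steps):
            commands = compute_commands(measurements)  # Must satisfy commands_spec().
            measurements = dacl.step(commands)
    finally:
        dacl.end_stepping()  # Cleanup, even if stepping fails.
    return measurements
```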
### Adapters
REAF utilizes adapters to translate between the generic agent interface and the
specific format required by the TLL.
* **`ActionSpaceAdapter`:** Converts the agent's actions into a commands
dictionary understood by the TLL.
* **`ObservationSpaceAdapter`:** Transforms the features generated by the TLL
into observations suitable for the agent.
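For instance, an action-space adapter for a velocity-controlled arm might map a
flat action array onto named commands, as in the sketch below. The class and
method names are illustrative rather than the real REAF interface, and a plain
dict stands in for a proper action spec.

```python
import numpy as np


class ArmVelocityActionAdapter:
    """Illustrative action adapter: flat action array -> commands dictionary."""

    def __init__(self, num_joints=7, max_speed=1.0):
        self._num_joints = num_joints
        self._max_speed = max_speed

    def action_spec(self):
        return {
            "shape": (self._num_joints,),
            "minimum": -self._max_speed,
            "maximum": self._max_speed,
        }

    def commands_from_action(self, action):
        action = np.clip(np.asarray(action), -self._max_speed, self._max_speed)
        return {"arm_joint_velocities": action}
```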
### Reset and End of Episode Handlers
* **`EnvironmentReset`:** Defines the reset behavior of the environment,
  including a `do_reset()` method and a default reset configuration. Resetting
  is a complex step, and it is hard to predict in advance how it will be
  implemented; the intention is therefore to leave complete control to the
  user, who can pass whatever object is needed to the reset without any
  restrictions. This can be used, for example, to check that everything is
  working properly or to reset the state of any features producers.
* **`EndOfEpisodeHandler`:** Provides a callback function,
`on_end_of_episode_stepping()`, that is invoked at the end of each episode
after the last step. This is useful for logging, cleanup, or custom logic
that needs to be executed when an episode ends.
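As a small example, an end-of-episode handler might simply report episode
statistics, as sketched below. Only `on_end_of_episode_stepping()` is a
documented name; the step-counting hook and the registration with the
environment are assumptions.

```python
class EpisodeStatsHandler:
    """Illustrative end-of-episode handler that reports the episode length."""

    def __init__(self):
        self._steps = 0

    def count_step(self):  # Assumed hook, called once per environment step.
        self._steps += 1

    def on_end_of_episode_stepping(self):
        print(f"Episode finished after {self._steps} steps.")
        self._steps = 0
```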
## License and Disclaimer
Copyright 2025 Google LLC
All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you
may not use this file except in compliance with the Apache 2.0 license. You may
obtain a copy of the Apache 2.0 license at: https://www.apache.org/licenses/LICENSE-2.0
All other materials are licensed under the Creative Commons Attribution 4.0
International License (CC-BY). You may obtain a copy of the CC-BY license at:
https://creativecommons.org/licenses/by/4.0/legalcode
Unless required by applicable law or agreed to in writing, all software and
materials distributed here under the Apache 2.0 or CC-BY licenses are
distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,
either express or implied. See the licenses for the specific language governing
permissions and limitations under those licenses.
This is not an official Google product.