# ✈️ Plane: Reinforcement Learning Environment for Aircraft Control

**Plane** is a lightweight yet realistic **reinforcement learning environment** simulating a 2D side view of an Airbus A320-like aircraft.
It’s designed for **fast, end-to-end training on GPU with JAX** while staying **physics-based** and **realistic enough** to capture the core challenges of aircraft control.
Plane allows you to benchmark RL agents on **delays, irrecoverable states, partial observability, and competing objectives** — challenges that are often ignored in standard toy environments.

---
## ✨ Features
* 🏎 **Fast & parallelizable** thanks to JAX — scale to thousands of parallel environments on GPU/TPU.
* 📐 **Physics-based**: Dynamics are derived from airplane modeling equations (not arcade physics).
* 🧪 **Reliable**: Covered by unit tests to ensure stability and reproducibility.
* 🎯 **Challenging**: Captures real-world aviation control problems (momentum, delays, irrecoverable states).
* 🔄 **Multiple interfaces**: Works with both Gymnasium and JAX-based (gymnax-style) APIs.
* 🌟 **Upcoming features**: Environmental perturbations (e.g., wind) will be available in future releases.

---
## 📊 Stable Altitude vs. Power & Pitch
Below is an example of how stable altitude changes with engine power and pitch:

This highlights the **multi-stability** phenomenon: holding constant power and pitch settings lets the plane naturally converge to a stable altitude, with different settings converging to different altitudes.

---
## 🚀 Installation
Plane is released on PyPI; install it with:
```bash
# Using pip
pip install plane-env
# Or with Poetry
poetry add plane-env
```
---
## 🎮 Usage
Here’s a minimal example of running an episode and saving a video:
```python
from plane_env.env_jax import Airplane2D, EnvParams
# Create env
env = Airplane2D()
seed = 42
env_params = EnvParams(max_steps_in_episode=1_000)
# Simple constant policy with 80% power and 0° stick input.
action = (0.8, 0.0)
# Save the video
env.save_video(lambda o: action, seed, folder="videos", episode_index=0, params=env_params, format="gif")
```
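Because the whole environment is written in JAX, you can also batch it with `jax.vmap`. Below is a minimal sketch that assumes gymnax-style `reset(key, params)` / `step(key, state, action, params)` signatures (the package advertises gymnax support); check the `env_jax` module for the exact API:

```python
import jax
import jax.numpy as jnp
from plane_env.env_jax import Airplane2D, EnvParams

env = Airplane2D()
params = EnvParams(max_steps_in_episode=1_000)

# One PRNG key per parallel environment
n_envs = 1024
keys = jax.random.split(jax.random.PRNGKey(0), n_envs)

# Reset all environments in one vectorized call
obs, state = jax.vmap(env.reset, in_axes=(0, None))(keys, params)

# Step all environments with the same constant action (80% power, neutral stick)
actions = jnp.tile(jnp.array([0.8, 0.0]), (n_envs, 1))
obs, state, reward, done, info = jax.vmap(env.step, in_axes=(0, 0, 0, None))(
    keys, state, actions, params
)
```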
You can also use it directly to train an agent with your favorite RL library (here: Stable-Baselines3):
```python
from plane_env.env_gymnasium import Airplane2D
from stable_baselines3 import SAC
# Create env
env = Airplane2D()
# Model training (adapted from https://stable-baselines3.readthedocs.io/en/master/modules/sac.html)
model = SAC("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000, log_interval=4)
model.save("sac_plane")
del model # remove to demonstrate saving and loading
model = SAC.load("sac_plane")
obs, info = env.reset()
while True:
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        break
```
---
## 🛩️ Environment Overview (Reinforcement Learning Perspective)
**State (`EnvState`)**: 14 variables describing the aircraft dynamics:

| Variable | Description |
| ----------------- | --------------------------------------- |
| `x` | Horizontal position (m) |
| `x_dot` | Horizontal speed (m/s) |
| `z` | Altitude (m) |
| `z_dot` | Vertical speed (m/s) |
| `theta` | Pitch angle (rad) |
| `theta_dot` | Pitch angular velocity (rad/s) |
| `alpha` | Angle of attack (rad) |
| `gamma` | Flight path angle (rad) |
| `m` | Aircraft mass (kg) |
| `power` | Normalized engine thrust (0–1) |
| `stick` | Control stick input for pitch (–1 to 1) |
| `fuel` | Remaining fuel (kg) |
| `t` | Current timestep |
| `target_altitude` | Desired target altitude (m) |
The state also provides **derived properties** like air density, Mach number, and speed of sound.
The agent currently observes the full state except **x** and **t** (which should be irrelevant for control) and **fuel** (which is not yet used).

**Action Space**: Continuous 2D vector `[power_requested, stick_requested]` controlling engine thrust and pitch.
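Since the Gymnasium wrapper follows the standard Gymnasium API (as the SAC example above suggests), sampling and applying random actions is straightforward:

```python
from plane_env.env_gymnasium import Airplane2D

env = Airplane2D()
obs, info = env.reset(seed=42)

# Sample random [power_requested, stick_requested] pairs from the action space
for _ in range(100):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
```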
**Reward Function**:
* Encourages maintaining **target altitude**.
* Terminal altitude violations (`z < min_alt` or `z > max_alt`) incur a penalty of `-max_steps_in_episode`.
* Otherwise, the reward is the squared normalized closeness to the target altitude:

$`r_t = \left( \frac{\text{max\_alt} - | \text{target\_altitude} - z_t |}{\text{max\_alt} - \text{min\_alt}} \right)^2`$
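For concreteness, here is a minimal sketch of this reward in plain Python, with illustrative `min_alt`/`max_alt`/`max_steps` placeholders (the actual values live in `EnvParams`):

```python
def altitude_reward(z, target_altitude, min_alt=0.0, max_alt=12_000.0, max_steps=1_000):
    """Squared normalized closeness to the target altitude.

    min_alt/max_alt/max_steps are illustrative placeholders, not the env defaults.
    """
    if z < min_alt or z > max_alt:
        return -max_steps  # terminal altitude violation
    closeness = (max_alt - abs(target_altitude - z)) / (max_alt - min_alt)
    return closeness**2
```

The reward peaks when `z == target_altitude` and decays quadratically as the plane drifts away from it.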
**Episode Termination**:
* **Altitude limits exceeded** → terminated
* **Maximum episode length reached** → truncated
**Time step**: `delta_t = 0.5 s`, `max_steps_in_episode = 1_000` (i.e., up to 500 s of simulated flight per episode).

---
## 🧩 Challenges Modeled
Plane is designed to test RL agents under **realistic aviation challenges**:
* ⏳ **Delay**: Engine power changes take time to fully apply (see the lag sketch after this list).
* 👀 **Partial observability**: Some forces cannot be directly measured.
* 🏁 **Competing objectives**: Reach target altitude fast while minimizing fuel and overshoot.
* 🌀 **Momentum effects**: Control inputs show delayed impact due to physical inertia.
* ⚠️ **Irrecoverable states**: Certain trajectories inevitably lead to failure (crash).
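For intuition on the delay point above, actuator lag of this kind is often modeled as a first-order response. The following is a hypothetical sketch, not necessarily the environment's exact dynamics:

```python
def lagged_power(power, power_requested, delta_t=0.5, tau=5.0):
    """First-order lag: actual power moves toward the request with time constant tau.

    tau is an illustrative value, not taken from the environment.
    """
    return power + (power_requested - power) * (delta_t / tau)

# A full-throttle request is only ~10% applied after one 0.5 s step
power = lagged_power(power=0.0, power_requested=1.0)  # -> 0.1
```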
> Environmental perturbations (wind, turbulence) are coming in a future release.
---
## 📦 Roadmap
* [ ] Add perturbations (wind with varying speeds and directions) to model the non-stationarity of the dynamics.
* [ ] Add an easier interface to create partially-observable versions of the environment.
* [ ] Provide ready-to-use benchmark results for popular RL baselines.
* [ ] Add fuel consumption.

---
## 🤝 Contributing
Contributions are welcome!
Please open an issue or PR if you have suggestions, bug reports, or new features.

---
## 📜 License
MIT License – feel free to use it in your own research and projects.