Name | gym-sailing |
Version | 0.2.1 |
home_page | None |
Summary | A sailing environment for OpenAI Gym / Gymnasium |
upload_time | 2024-08-27 23:24:52 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.8 |
license | Copyright 2024 Gabriel Torre Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
keywords | gym, gymnasium, reinforcement learning, sailing |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# gym-sailing: A sailing environment for OpenAI Gym / Gymnasium
This is a Gymnasium (OpenAI Gym) environment designed to train reinforcement learning (RL) agents to control a sailboat. The environment simulates the dynamics of a sailboat and allows the agent to learn tacking behavior to reach a target point.
![sailboat gif](https://github.com/Gabo-Tor/gym-sailing/raw/main/img/env.gif?raw=True "sailboat")
## Environments
| Environment | Description |
| --- | --- |
| **Sailboat-v0** | The main environment with a continuous action space. |
| **SailboatDiscrete-v0** | A variation of the environment with a discrete action space. |
| **Motorboat-v0** | An easy test environment with a motorboat instead of a sailboat. |
## Installation
You can install the latest release using pip:
```bash
pip install gym-sailing
```
Alternatively, you can clone the repository and install it locally (for example with `pip install -e .`).
## Usage
### Basic Usage
Bare minimum code to run the environment:
```python
import gymnasium as gym
import gym_sailing

env = gym.make("Sailboat-v0", render_mode="human")
observation, info = env.reset(seed=42)

for _ in range(1000):
    action = env.action_space.sample()  # this is where you would insert your policy
    observation, reward, terminated, truncated, info = env.step(action)

    if terminated or truncated:
        observation, info = env.reset()

env.close()
```
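The discrete variant is driven in exactly the same way; only the action space changes. A minimal sketch (illustrative, not part of the package's documentation):

```python
import gymnasium as gym
import gym_sailing

# The discrete-action variant registers under its own environment ID
env = gym.make("SailboatDiscrete-v0")
print(env.action_space)  # discrete rudder commands, see "Action Space" below

observation, info = env.reset(seed=42)
observation, reward, terminated, truncated, info = env.step(env.action_space.sample())
env.close()
```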
### Training an RL Agent
To train an RL agent using stable-baselines3:
```python
from stable_baselines3 import PPO
import gymnasium as gym
import gym_sailing

env = gym.make("Sailboat-v0")
model = PPO('MlpPolicy', env, verbose=1)

# Train the agent
model.learn(total_timesteps=1_000_000)

# Test the trained model
observation, info = env.reset()
for _ in range(1000):
    action, _ = model.predict(observation)
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:  # start a new episode when the current one ends
        observation, info = env.reset()

env.close()
```
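To keep the trained policy and watch it sail afterwards, the standard stable-baselines3 save/load calls apply. The sketch below assumes you first saved the model from the previous snippet with `model.save("ppo_sailboat")`; the file name is just a placeholder, not something the package defines:

```python
from stable_baselines3 import PPO
import gymnasium as gym
import gym_sailing

# Reload a policy previously saved with model.save("ppo_sailboat")
model = PPO.load("ppo_sailboat")

# Roll it out with on-screen rendering
env = gym.make("Sailboat-v0", render_mode="human")
observation, info = env.reset()
for _ in range(1000):
    action, _ = model.predict(observation, deterministic=True)
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()
env.close()
```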
## Environment Details
### Observation Space
The observation space includes the following components (see the inspection snippet after the list):
- **Boat Speed:** The current speed of the boat.
- **Boat Heading:** The angle of the boat relative to the wind, ranging from $-\pi$ to $\pi$.
- **Heading Rate:** The rate of change of the boat's heading.
- **Course to Target:** The angle between the boat's heading and the target, ranging from $-\pi$ to $\pi$.
- **Distance to Target:** The normalized distance between the boat and the target.
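A quick way to check the concrete bounds of these quantities is to inspect the space directly; this is plain Gymnasium API, and the ordering of the components is not guaranteed to match the list above:

```python
import gymnasium as gym
import gym_sailing

env = gym.make("Sailboat-v0")
print(env.observation_space)           # bounds of the observation vector
print(env.observation_space.sample())  # one example observation
env.close()
```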
### Action Space
The action space consists of:
- **Rudder Angle:** The angle of the rudder, ranging from -1 to 1 in *Sailboat-v0* and *Motorboat-v0*, and taking one of the discrete values {-1, 0, 1} in *SailboatDiscrete-v0*.
### Reward
The default reward function includes:
- **Alive Penalty:** A penalty for each time step to encourage the agent to reach the target quickly.
- **Target Reward:** A reward for reaching the target.
- **Course Penalty:** A penalty for leaving the course area.
- **Progress Reward:** A reward for making progress towards the target, measured with an L8 norm to encourage the agent to move upwind (see the illustrative sketch below).
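As a rough illustration of why a high-order norm helps here (my own sketch, not the package's actual reward code): under an L8 norm the distance to an upwind target is dominated by the upwind component, so a tack that trades cross-wind distance for upwind distance still registers as progress.

```python
import numpy as np

def l8_distance(boat_xy, target_xy, p=8):
    """Distance from boat to target under an L8 (p = 8) norm. Illustrative only."""
    d = np.abs(np.asarray(target_xy, dtype=float) - np.asarray(boat_xy, dtype=float))
    return float(np.sum(d**p) ** (1.0 / p))

# Target straight upwind at (0, 100); the boat tacks from (0, 0) to (20, 15)
before = l8_distance((0.0, 0.0), (0.0, 100.0))   # = 100.0
after = l8_distance((20.0, 15.0), (0.0, 100.0))  # ~ 85.0, the sideways 20 barely matters
progress = before - after                        # positive: the tack counts as progress
print(progress)
```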
### Episode End
- The environment is **terminated** if the boat reaches the target or leaves the course area.
- The environment is **truncated** after 3000 steps.
## Benchmarks
Benchmarks use stable-baselines3 with default hyperparameters. Good policies that tack only once tend to achieve a total reward of about 390 in the sailboat environment. PPO seems to perform better, but SAC is also a good option and even converges faster.
![benchmarks](https://github.com/Gabo-Tor/gym-sailing/raw/main/img/benchmarks.png?raw=True "benchmarks")
## Contributing
Contributions are welcome. Please fork the repository and submit a pull request with your changes. For any questions or suggestions, feel free to open an issue.
## Future Work
Here are some features I'd like to add in the future:
- Add currents of different intensities and directions.
- Add wind shifts.
- Add wind gusts and lulls.
- Make the polar diagram more accurate, using the data from this paper: *R. Binns, F. W. Bethwaite, and N. R. Saunders, “Development of A More Realistic Sailing Simulator,” High Performance Yacht Design Conference. RINA, pp. 243–250, Dec. 04, 2002. doi: 10.3940/rina.ya.2002.29.*
## Inspiration
This project was inspired by this fork: https://github.com/openai/gym/compare/master...JonAsbury:gym:Sailing-Simulator-Env
## License
This project is licensed under the MIT License - see the [LICENSE](https://github.com/Gabo-Tor/gym-sailing/raw/main/LICENSE) file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "gym-sailing",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "gym, gymnasium, reinforcement learning, sailing",
"author": null,
"author_email": "Gabriel Torre <g-torre@hotmail.com>",
"download_url": "https://files.pythonhosted.org/packages/c7/57/2527aac22527e0d0823ac9a5c4be22d9a52f5ded5671fb3ef50ba2565c79/gym_sailing-0.2.1.tar.gz",
"platform": null,
"description": "# gym-sailing: A sailing environment for OpenAI Gym / Gymnasium\n\nThis is a Gymnasium (OpenAI Gym) environment designed to train reinforcement learning (RL) agents to control a sailboat. The environment simulates the dynamics of a sailboat and allows the agent to learn tacking behavior to reach a target point.\n\n![sailboat gif](https://github.com/Gabo-Tor/gym-sailing/raw/main/img/env.gif?raw=True \"sailboat\")\n\n## Environments\n\n| Environment | Description |\n| --- | --- |\n| **Sailboat-v0** | The main environment with a continuous action space. |\n| **SailboatDiscrete-v0** | A variation of the environment with a discrete action space. |\n| **Motorboat-v0** | An easy test environment with a motorboat instead of a sailboat. |\n\n## Installation\n\nYou can install the latest release using pip:\n\n```bash\npip install gym-sailing\n```\n\nAlternatively, if you prefer, you can clone the repository and install it locally.\n\n## Usage\n\n### Basic Usage\n\nBare minimum code to run the environment:\n\n```python\nimport gymnasium as gym\nimport gym_sailing\n\nenv = gym.make(\"Sailboat-v0\", render_mode=\"human\")\nobservation, info = env.reset(seed=42)\n\nfor _ in range(1000):\n action = env.action_space.sample() # this is where you would insert your policy\n observation, reward, terminated, truncated, info = env.step(action)\n\n if terminated or truncated:\n observation, info = env.reset()\n\nenv.close()\n```\n\n### Training an RL Agent\n\nTo train an RL agent using stable-baselines3:\n\n```python\nfrom stable_baselines3 import PPO\nimport gymnasium as gym\nimport gym_sailing\n\nenv = gym.make(\"Sailboat-v0\")\nmodel = PPO('MlpPolicy', env, verbose=1)\n\n# Train the agent\nmodel.learn(total_timesteps=1_000_000)\n\n# Test the trained model\nobservation, info = env.reset()\nfor _ in range(1000):\n action, _ = model.predict(observation)\n observation, reward, terminated, truncated, info = env.step(action)\n\nenv.close()\n```\n\n## Environment Details\n\n### Observation Space\n\nThe observation space includes:\n\n- **Boat Speed:** The current speed of the boat.\n- **Boat Heading:** The angle of the boat relative to the wind, ranging from -$\\pi$ to $\\pi$.\n- **Heading Rate:** The rate of change of the boat's heading.\n- **Course to Target:** The angle between the boat's heading and the target, ranging from -$\\pi$ to $\\pi$.\n- **Distance to Target:** The normalized distance between the boat and the target.\n\n### Action Space\n\nThe action space consists of:\n\n- **Rudder Angle:** The angle of the rudder, ranging from -1 to 1 for *Sailboat-v0* and *Motorboat-v0*, and {-1, 0, 1} for *SailboatDiscrete-v0*.\n\n### Reward\n\nThe default reward function includes:\n\n- **Alive Penalty:** A penalty for each time step to encourage the agent to reach the target quickly.\n- **Target Reward:** A reward for reaching the target.\n- **Course Penalty:** A penalty for leaving the course area.\n- **Progress Reward:** A reward for making progress towards the target, using the L8 norm, to encourage the agent to move upwind.\n\n### Episode End\n\n- The environment is **terminated** if the boat reaches the target or leaves the course area.\n- The environment is **truncated** after 3000 steps.\n\n## Benchmarks\n\nBenchmarks using stable-baselines3 with default hyperparameters. Good policies that tack only once tend to achieve ~390 total reward for the sailboat environment. 
PPO seems to perform better, but SAC is also a good option, that even converging faster.\n\n![benchmarks](https://github.com/Gabo-Tor/gym-sailing/raw/main/img/benchmarks.png?raw=True \"benchmarks\")\n\n## Contributing\n\nContributions are welcome. Please fork the repository and submit a pull request with your changes. For any questions or suggestions, feel free to open an issue.\n\n## Future Work\n\nHere are some features I'd like to add in the future:\n\n- Add currents of different intensities and directions.\n- Add wind shifts.\n- Add wind gusts and lulls.\n- Make the polar diagram more accurate, using the data from this paper: *R. Binns, F. W. Bethwaite, and N. R. Saunders, \u201cDevelopment of A More Realistic Sailing Simulator,\u201d High Performance Yacht Design Conference. RINA, pp. 243\u2013250, Dec. 04, 2002. doi: 10.3940/rina.ya.2002.29.*\n\n## Inspiration\n\nThis project was inspired by this fork: https://github.com/openai/gym/compare/master...JonAsbury:gym:Sailing-Simulator-Env\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](https://github.com/Gabo-Tor/gym-sailing/raw/main/LICENSE) file for details.\n",
"bugtrack_url": null,
"license": "Copyright 2024 Gabriel Torre Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \u201cSoftware\u201d), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \u201cAS IS\u201d, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
"summary": "A sailing environment for OpenAI Gym / Gymnasium",
"version": "0.2.1",
"project_urls": {
"Documentation": "https://github.com/Gabo-Tor/gym-sailing/blob/main/README.md",
"Homepage": "https://github.com/Gabo-Tor/gym-sailing",
"Issues": "https://github.com/Gabo-Tor/gym-sailing/issues",
"Repository": "https://github.com/Gabo-Tor/gym-sailing"
},
"split_keywords": [
"gym",
" gymnasium",
" reinforcement learning",
" sailing"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "deb6b50efc6670694a41580f1bf04416d22f86ae71b4c8e103492659c4e544e9",
"md5": "603649faed28319a298484c67b932360",
"sha256": "4f49dc4180c54c7fc92837ae6fd5e1e15b32867a0cee3f0a47054891dc5cb397"
},
"downloads": -1,
"filename": "gym_sailing-0.2.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "603649faed28319a298484c67b932360",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 23105,
"upload_time": "2024-08-27T23:24:48",
"upload_time_iso_8601": "2024-08-27T23:24:48.913372Z",
"url": "https://files.pythonhosted.org/packages/de/b6/b50efc6670694a41580f1bf04416d22f86ae71b4c8e103492659c4e544e9/gym_sailing-0.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "c7572527aac22527e0d0823ac9a5c4be22d9a52f5ded5671fb3ef50ba2565c79",
"md5": "77cf66e4c4bab2e06c021fd13092a91b",
"sha256": "67938c3e3df52580351b79fcf82c6131f2b078c6d14d50bfaa63e8ccd58fe568"
},
"downloads": -1,
"filename": "gym_sailing-0.2.1.tar.gz",
"has_sig": false,
"md5_digest": "77cf66e4c4bab2e06c021fd13092a91b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 923572,
"upload_time": "2024-08-27T23:24:52",
"upload_time_iso_8601": "2024-08-27T23:24:52.030163Z",
"url": "https://files.pythonhosted.org/packages/c7/57/2527aac22527e0d0823ac9a5c4be22d9a52f5ded5671fb3ef50ba2565c79/gym_sailing-0.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-27 23:24:52",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Gabo-Tor",
"github_project": "gym-sailing",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "gym-sailing"
}