# **Tornado Cliff Walking**
<p align="center">
<img src="src/tornado_cliff/vid/example.gif" alt="TornadoCliffWalking Example" height="240"/>
</p>
Cliff Walking with Tornados is a variation of the original Cliff Walking environment.
The agent must cross a grid world while avoiding both falling off a cliff
and running into a tornado, which blows the character away to a random square in the
grid (including the cliff).
## Description
The game starts with the Elf at the top left corner of a gridworld (i.e. `[0, 0]`).
The goal location is always at the bottom right corner
(i.e. `[-1, -1]`), and the episode ends when the Elf reaches it.
A cliff runs along the middle of the grid. If the player moves to a cliff location, they
return to the start location.
A tornado starts at a random square, excluding the cliff, and makes a random walk through
the grid at a given pace (default is 1). If the Elf crosses the tornado, it is
blown away to a random square in the grid, including the cliff.
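
The tornado dynamics described above can be sketched as follows. This is only an illustration under stated assumptions, not the package's implementation: the function names `tornado_step` and `blow_away` are hypothetical, `pace` is interpreted as the number of squares the tornado moves per time step, moves are clipped at the grid borders, and the cliff is not modelled.

```python
import random

# Assumed (row, col) offsets for one random-walk move: up, right, down, left.
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]

def tornado_step(position, shape=(6, 8), pace=1, rng=random):
    """Move the tornado `pace` squares at random, staying inside the grid."""
    row, col = position
    n_rows, n_cols = shape
    for _ in range(pace):
        d_row, d_col = rng.choice(MOVES)
        # Clip to the grid so the tornado never leaves it (an assumption).
        row = min(max(row + d_row, 0), n_rows - 1)
        col = min(max(col + d_col, 0), n_cols - 1)
    return row, col

def blow_away(shape=(6, 8), rng=random):
    """Pick a uniformly random square for the Elf after meeting the tornado."""
    n_rows, n_cols = shape
    return rng.randrange(n_rows), rng.randrange(n_cols)

rng = random.Random(0)  # seeded only to make the sketch reproducible
tornado = tornado_step((2, 3), rng=rng)
elf = blow_away(rng=rng)
```

Passing a seeded `random.Random` instance keeps the sketch reproducible without touching the global random state.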
The player makes moves until they reach the goal.
The environment resembles Example 6.6 (page 132) from *Reinforcement Learning: An Introduction*
by Sutton and Barto [<a href="#cliffwalk_ref">1</a>].
It is an adaptation of Gymnasium's Cliff Walking [<a href="#gymnasium_ref">2</a>].
## Action Space
The action has shape `(1,)` and takes a value in `{0, 1, 2, 3}`, indicating
which direction to move the player.
- 0: Move up
- 1: Move right
- 2: Move down
- 3: Move left
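
As an illustration of the action encoding, the sketch below maps each action to a `(row, col)` offset and applies it to a position. It assumes that row 0 is the top of the grid and that moves off the grid are clipped at the border, as in Gymnasium's Cliff Walking; the names `ACTION_DELTAS` and `move` are hypothetical and not part of this package.

```python
# Assumed (row, col) offsets for the four actions listed above.
ACTION_DELTAS = {
    0: (-1, 0),  # up
    1: (0, 1),   # right
    2: (1, 0),   # down
    3: (0, -1),  # left
}

def move(position, action, shape=(6, 8)):
    """Return the new (row, col) after applying an action, clipped to the grid."""
    row, col = position
    d_row, d_col = ACTION_DELTAS[action]
    n_rows, n_cols = shape
    new_row = min(max(row + d_row, 0), n_rows - 1)
    new_col = min(max(col + d_col, 0), n_cols - 1)
    return new_row, new_col

print(move((0, 0), 1))  # (0, 1): one square to the right
print(move((0, 0), 0))  # (0, 0): moving up from the top row is clipped
```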
## Observation Space
The observation depends on the shape of the grid. For a `(6, 8)` grid there are 48 * 48 = 2304 possible states, one for each combination of the Elf's position and
the tornado's position. The player cannot be at the cliff, nor at the goal, as the latter
results in the end of the episode.
The observation is a tuple holding the player's and the tornado's current
positions, each encoded as `current_row * ncols + current_col` (where both the row and col start at 0).
The observation is returned as a `Tuple[int, int]`: the first number is the state of the Elf and the second is the state of the tornado.
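
The position encoding can be sketched as below. This assumes a row-major flattening, where the stride is the number of columns so that every cell of a non-square grid gets a unique index; the helper names `encode` and `decode` are hypothetical.

```python
def encode(row, col, n_cols=8):
    """Flatten a (row, col) position into a single integer (row-major)."""
    return row * n_cols + col

def decode(index, n_cols=8):
    """Recover (row, col) from a flattened index."""
    return divmod(index, n_cols)

# Example for a (6, 8) grid: Elf at the start, tornado at row 3, column 5.
observation = (encode(0, 0), encode(3, 5))
print(observation)  # (0, 29)
print(decode(29))   # (3, 5)

# Number of (Elf, tornado) position pairs for a (6, 8) grid.
n_states = (6 * 8) ** 2
print(n_states)     # 2304
```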
## Starting State
The episode starts with the player in state `0` (location `[0, 0]`) and the tornado
at a random state.
## Reward
Each time step incurs a -1 reward, unless the player steps into the cliff,
which incurs a -100 reward.
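
A minimal sketch of this reward rule, with the undiscounted return of an episode computed from per-step cliff flags. The function names are hypothetical and the episode data is made up for illustration.

```python
def step_reward(stepped_into_cliff):
    """Reward for one time step: -100 on the cliff, -1 otherwise."""
    return -100 if stepped_into_cliff else -1

def episode_return(cliff_flags):
    """Sum of rewards over an episode, given one cliff flag per step."""
    return sum(step_reward(flag) for flag in cliff_flags)

print(episode_return([False] * 12))          # -12: twelve safe steps
print(episode_return([False, True, False]))  # -102: one fall plus two safe steps
```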
## Episode End
The episode terminates when the player reaches the goal at the bottom right corner.
## Information
`step()` and `reset()` return a dict with the following keys:
- "p" - transition probability for the state.
As cliff walking is not stochastic, the transition probability returned is always 1.0.
## Installation
```bash
git clone https://github.com/davera-017/TornadoCliffWalking
cd TornadoCliffWalking
pip install -e .
```
## References
<a id="cliffwalk_ref"></a>[1] R. Sutton and A. Barto, “Reinforcement Learning:
An Introduction” 2020. [Online].
Available: [http://www.incompleteideas.net/book/RLbook2020.pdf](http://www.incompleteideas.net/book/RLbook2020.pdf)
<a id="gymnasium_ref"></a>[2] Farama Foundation, “Gymnasium” 2023. (v0.28.1).
See: [https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/toy_text/cliffwalking.py](https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/toy_text/cliffwalking.py)
## Version History
- v0.0.1: Initial version release
## Package Metadata
- Name: `tornado-cliff-walking`
- Version: 0.1.0
- Summary: A variation of Gymnasium's CliffWalking environment.
- Author: Daniel Ávila Vera (davera.017@gmail.com)
- License: MIT
- Requires Python: `>=3.9,<4.0`
- Keywords: Reinforcement Learning, CliffWalking, gymnasium
- Homepage, documentation, and repository: https://github.com/davera-017/TornadoCliffWalking
- Uploaded: 2023-06-05
- Files: `tornado_cliff_walking-0.1.0-py3-none-any.whl` (wheel, 1872898 bytes) and `tornado_cliff_walking-0.1.0.tar.gz` (sdist, 1871170 bytes)