gym-simplegrid


| Field | Value |
|---|---|
| Name | gym-simplegrid |
| Version | 1.0.5 |
| Home page | https://github.com/damat-le/gym-simplegrid |
| Summary | Simple Gridworld Environment for Gymnasium |
| Upload time | 2023-08-23 13:01:01 |
| Author | Leo D'Amato |
| Requires Python | >=3.7 |
| Keywords | reinforcement learning, environment, gridworld, agent, rl, openaigym, openai-gym, gym, gymnasium, farama-foundation |
| Requirements | No requirements were recorded. |

# Simple Gridworld Environment for OpenAI Gym

SimpleGrid is a super simple gridworld environment for [Gymnasium](https://gymnasium.farama.org/). It is easy to use and customise and it is intended to offer an environment for quickly testing and prototyping different RL algorithms.

It is also efficient, lightweight and has few dependencies (gymnasium, numpy, matplotlib). 

![](img/simplegrid.gif)

In SimpleGrid, the agent navigates a grid from a Start state (red tile) to a Goal state (green tile), walking over Empty cells (white tiles) without colliding with any Wall (black tiles). The yellow circle marks the agent's current position.


## Installation

To install SimpleGrid, you can either use pip

```bash
pip install gym-simplegrid
```

or you can clone the repository and run an editable installation

```bash
git clone https://github.com/damat-le/gym-simplegrid.git
cd gym-simplegrid
pip install -e .
```


## Citation

Please use this BibTeX entry if you want to cite this repository in your publications:

```tex
@misc{gym_simplegrid,
  author = {Leo D'Amato},
  title = {Simple Gridworld Environment for OpenAI Gym},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/damat-le/gym-simplegrid}},
}
```

## Getting Started

Basic usage options:

```python
import gymnasium as gym
import gym_simplegrid

# Load the default 8x8 map
env = gym.make('SimpleGrid-8x8-v0', render_mode='human')

# Load the default 4x4 map
env = gym.make('SimpleGrid-4x4-v0', render_mode='human')

# Load a custom map
obstacle_map = [
        "10001000",
        "10010000",
        "00000001",
        "01000001",
    ]

env = gym.make(
    'SimpleGrid-v0', 
    obstacle_map=obstacle_map, 
    render_mode='human'
)

# Use the options dict in the reset method
# This initialises the agent in location (0,0) and the goal in location (7,7)
env = gym.make('SimpleGrid-8x8-v0', render_mode='human')
obs, info = env.reset(options={'start_loc':0, 'goal_loc':63})
```

Basic example with rendering:

```python
import gymnasium as gym
import gym_simplegrid

env = gym.make('SimpleGrid-8x8-v0', render_mode='human')
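obs, info = env.reset()

for _ in range(50):
    # Sample a random action from the action space
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    # Stop as soon as the episode ends, for either reason
    if terminated or truncated:
        break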
env.close()
```

For another example, take a look at the [example script](example.py).


## Environment Description

### Action Space

The action space is `gymnasium.spaces.Discrete(4)`. An action is an `int` that represents a direction according to the following scheme:

- 0: UP
- 1: DOWN
- 2: LEFT
- 3: RIGHT
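
The scheme above can be sketched as a lookup from action index to a grid offset. This is an illustrative sketch, not the package's own code: the `Action` enum, the `MOVES` table, the `target_cell` helper, and the assumption that `x` is the row index (so UP decreases `x`) are not taken from the source.

```python
from enum import IntEnum
from typing import Tuple


class Action(IntEnum):
    """Hypothetical names for the four action indices."""
    UP = 0
    DOWN = 1
    LEFT = 2
    RIGHT = 3


# Assumed offsets (dx, dy), with x as the row index and y as the column index.
MOVES = {
    Action.UP: (-1, 0),
    Action.DOWN: (1, 0),
    Action.LEFT: (0, -1),
    Action.RIGHT: (0, 1),
}


def target_cell(x: int, y: int, action: int) -> Tuple[int, int]:
    """Return the cell the agent would try to move into."""
    dx, dy = MOVES[Action(action)]
    return x + dx, y + dy
```

Under these assumptions, `target_cell(2, 1, Action.UP)` gives `(1, 1)`. Whether the move is actually allowed is a separate question, decided by the environment dynamics.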

### Observation Space

For an environment of size `(nrow, ncol)`, the observation space is `gymnasium.spaces.Discrete(nrow * ncol)`. Hence, an observation is an integer from `0` to `nrow * ncol - 1` that represents the agent's current position. We can convert an observation `s` to a tuple `(x,y)` using the following formulae:

```python
 x = s // ncol # integer division
 y = s % ncol  # modulo operation
```

For example: let `nrow=4`, `ncol=5` and let `s=11`. Then `x=11//5=2` and `y=11%5=1`.

Conversely, we can convert a tuple `(x,y)` to an observation `s` using the following formula:

```python
s = x * ncol + y
```

For example: let `nrow=4`, `ncol=5` and let `x=2`, `y=1`. Then `s=2*5+1=11`.
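
Both conversions can be wrapped in a pair of small helpers. The function names `to_xy` and `to_s` are hypothetical and are not part of the package; the arithmetic is the one given above.

```python
from typing import Tuple


def to_xy(s: int, ncol: int) -> Tuple[int, int]:
    """Convert an observation s to (x, y): x = s // ncol, y = s % ncol."""
    x, y = divmod(s, ncol)
    return x, y


def to_s(x: int, y: int, ncol: int) -> int:
    """Convert a position (x, y) back to an observation s."""
    return x * ncol + y
```

For instance, `to_xy(11, 5)` returns `(2, 1)`, and on an 8x8 map `to_s(7, 7, 8)` returns `63`, which is the `goal_loc` value used for location (7,7) in the Getting Started example.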

### Environment Dynamics

In the current implementation, an episode terminates only when the agent reaches the goal state. If the agent takes a non-valid action (e.g. it tries to walk over a wall or exit the grid), it stays in the same position and receives a negative reward.

You can subclass `SimpleGridEnv` and override its `step()` method to define custom dynamics (e.g. truncating the episode when the agent takes a non-valid action).
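
The default rule described above (stay in place on a non-valid move, terminate only at the goal) can be illustrated with a standalone function. This is a sketch under stated assumptions, not the package's `step()` implementation: `simulate_step` and `MOVES` are hypothetical names, `"1"` is assumed to mark a wall and `"0"` a free cell in the obstacle map, and `x` is assumed to be the row index.

```python
from typing import List, Tuple

# Assumed (dx, dy) offsets for UP, DOWN, LEFT, RIGHT, with x as the row index.
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


def simulate_step(
    obstacle_map: List[str],
    pos: Tuple[int, int],
    goal: Tuple[int, int],
    action: int,
) -> Tuple[Tuple[int, int], float, bool]:
    """Return (new_pos, reward, terminated) for a single move."""
    nrow, ncol = len(obstacle_map), len(obstacle_map[0])
    dx, dy = MOVES[action]
    x, y = pos[0] + dx, pos[1] + dy

    in_bounds = 0 <= x < nrow and 0 <= y < ncol
    if not in_bounds or obstacle_map[x][y] == "1":
        # Non-valid move: stay in place and receive a negative reward.
        return pos, -1.0, False

    if (x, y) == goal:
        # Reaching the goal is the only event that ends the episode.
        return (x, y), 1.0, True
    return (x, y), 0.0, False
```

A custom `step()` that truncates on a non-valid action would differ only in the first branch, where it would report the episode as over instead of letting it continue.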

### Rewards

Currently, the reward map is defined in the `get_reward()` method of the `SimpleGridEnv` class.

For a given position `(x,y)`, the default reward function is defined as follows:

```python 
def get_reward(self, x: int, y: int) -> float:
    """
    Get the reward of a given cell.
    """
    if not self.is_in_bounds(x, y):
        # if the agent tries to exit the grid, it receives a negative reward
        return -1.0
    elif not self.is_free(x, y):
        # if the agent tries to walk over a wall, it receives a negative reward
        return -1.0
    elif (x, y) == self.goal_xy:
        # if the agent reaches the goal, it receives a positive reward
        return 1.0
    else:
        # otherwise, it receives no reward
        return 0.0
```

You can subclass `SimpleGridEnv` and override this method to define custom rewards.
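
As a sketch of what such an override might look like, the example below charges a small cost for every ordinary move. To stay runnable in isolation it subclasses a minimal stand-in rather than the real `SimpleGridEnv`: `StandInGridEnv`, `StepCostGridEnv`, the `"0"`/`"1"` map encoding, and the `-0.01` step cost are all assumptions made for illustration. Only the `get_reward(x, y)` signature and the `is_in_bounds`, `is_free`, and `goal_xy` names come from the snippet above.

```python
class StandInGridEnv:
    """Minimal stand-in for SimpleGridEnv, reduced to what get_reward() needs."""

    def __init__(self, obstacle_map, goal_xy):
        self.obstacle_map = obstacle_map  # list of strings, "1" assumed to be a wall
        self.goal_xy = goal_xy

    def is_in_bounds(self, x, y):
        return 0 <= x < len(self.obstacle_map) and 0 <= y < len(self.obstacle_map[0])

    def is_free(self, x, y):
        return self.obstacle_map[x][y] == "0"

    def get_reward(self, x, y):
        # Same cases as the default reward function shown above.
        if not self.is_in_bounds(x, y) or not self.is_free(x, y):
            return -1.0
        return 1.0 if (x, y) == self.goal_xy else 0.0


class StepCostGridEnv(StandInGridEnv):
    """Hypothetical variant that charges a small cost for every ordinary move."""

    def get_reward(self, x, y):
        reward = super().get_reward(x, y)
        # Replace the default 0.0 for ordinary moves with a small step cost.
        return -0.01 if reward == 0.0 else reward
```

With the real package, the same pattern would presumably apply by inheriting from `SimpleGridEnv` instead of the stand-in. A per-step cost like this is a common way to encourage shorter paths, since the default reward does not distinguish a short route from a long one.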

## Notes on rendering

The default frame rate is 5 FPS. It is possible to change it through the `metadata` dictionary. 

To properly render the environment, remember that the point (x,y) in the desc matrix corresponds to the point (y,x) in the rendered matrix.
This is because the rendering code works in terms of width and height while the computation in the environment is done using x and y coordinates.
You don't have to worry about this unless you play with the environment's internals.



            
