# PySuperTuxKart gymnasium wrapper
[![PyPI
version](https://badge.fury.io/py/pystk2-gymnasium.svg)](https://badge.fury.io/py/pystk2-gymnasium)
Read the [Changelog](./CHANGELOG.md)
## Install
The PySuperTuxKart2 gymnasium wrapper is a Python package, so it can be
installed with pip:
`pip install pystk2-gymnasium`
Note that during the first run, SuperTuxKart assets are downloaded to the cache
directory.
## AgentSpec
Each controlled kart is parametrized by `pystk2_gymnasium.AgentSpec`:
- `name` defines the name of the player (displayed on top of the kart)
- `rank_start` defines the starting position (None for random, which is the
default)
- `use_ai` flag (False by default) to ignore actions (when calling `step`, a
SuperTuxKart bot is used instead of using the action)
- `camera_mode` can be set to `AUTO` (camera on for non-STK bots), `ON` (camera
on) or `OFF` (no camera).
## Current limitations
- no graphics information is available (i.e. pixmap)
## Environments
After importing `pystk2_gymnasium`, the following environments are available:
- `supertuxkart/full-v0` is the main environment containing complete
  observations. The observation and action spaces are both dictionaries with
  continuous or discrete variables (see below). The exact structure can be found
  using `env.observation_space` and `env.action_space`. The following options
  can be used to modify the environment:
  - `agent` is an `AgentSpec` (see above)
  - `render_mode` can be None or `human`
  - `track` defines the SuperTuxKart track to use (None for random). The full
    list can be found in `STKRaceEnv.TRACKS` after
    `initialize.initialize(with_graphics: bool)` has been called.
  - `num_kart` defines the number of karts on the track (3 by default)
  - `max_paths` is the maximum number of (nearest) paths (a track is made of
    paths) to consider in the observation state
  - `laps` is the number of laps (1 by default)
  - `difficulty` is the difficulty of the AI bots (from 0, the lowest, to 2,
    the highest; defaults to 2)
Some environments are created using wrappers (see below for wrapper
documentation):
- `supertuxkart/simple-v0` (wrappers: `ConstantSizedObservations`) is a
simplified environment with a fixed number of observations for paths
(controlled by `state_paths`, default 5), items (`state_items`, default 5),
and karts (`state_karts`, default 5)
- `supertuxkart/flattened-v0` (wrappers: `ConstantSizedObservations`,
`PolarObservations`, `FlattenerWrapper`) has observation and action spaces
simplified as much as possible (only `discrete` and `continuous` keys)
- `supertuxkart/flattened_continuous_actions-v0` (wrappers:
`ConstantSizedObservations`, `PolarObservations`,
`OnlyContinuousActionsWrapper`, `FlattenerWrapper`) removes discrete actions
(they default to 0), so only steering and acceleration remain, both in the
continuous domain
- `supertuxkart/flattened_multidiscrete-v0` (wrappers:
`ConstantSizedObservations`, `PolarObservations`, `DiscreteActionsWrapper`,
`FlattenerWrapper`) is like the previous one, but with fully multi-discrete
actions. `acceleration_steps` and `steer_steps` (default to 5) control the
number of discrete values for acceleration and steering respectively.
- `supertuxkart/flattened_discrete-v0` (wrappers: `ConstantSizedObservations`,
`PolarObservations`, `DiscreteActionsWrapper`, `FlattenerWrapper`,
`FlattenMultiDiscreteActions`) is like the previous one, but with fully
discretized actions
The reward $r_t$ at time $t$ is given by
$$ r_{t} = \frac{1}{10}(d_{t} - d_{t-1}) + \left(1 - \frac{\mathrm{pos}_t}{K}\right)
\times (3 + 7 f_t) - 0.1 + 10 f_t $$
where $d_t$ is the overall track distance at time $t$, $\mathrm{pos}_t$ the
position among the $K$ karts at time $t$, and $f_t$ is $1$ when the kart
finishes the race (and $0$ otherwise).
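
As a quick sanity check, the formula above can be transcribed directly into a small function. This is only a sketch of the arithmetic, not the package's own code; the function and argument names are made up for illustration, and whether $\mathrm{pos}_t$ starts at 0 or 1 is not stated here, so the value is passed in as-is.

```python
def step_reward(d_t, d_prev, pos, num_karts, finished):
    """Evaluate the reward formula for one time step.

    d_t, d_prev: overall track distance at times t and t-1
    pos: position among the karts, exactly as used in the formula
    num_karts: K, the number of karts
    finished: True when the kart finishes the race at this step
    """
    f_t = 1.0 if finished else 0.0
    progress = (d_t - d_prev) / 10.0
    rank_term = (1.0 - pos / num_karts) * (3.0 + 7.0 * f_t)
    return progress + rank_term - 0.1 + 10.0 * f_t


# 5 units of progress, position 1 among 4 karts, race not finished
print(step_reward(105.0, 100.0, 1, 4, False))
```

The constant `-0.1` is paid at every step, so a kart that does not move forward only collects the position term minus that penalty.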
## Wrappers
Wrappers can be used to modify the environment.
### Constant-size observation
`pystk2_gymnasium.ConstantSizedObservations(env, state_items=5, state_karts=5, state_paths=5)`
ensures that the number of observed items, karts and paths is constant. By
default, the number of observations per category is 5.
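
To illustrate what "constant size" means for a list of 3D vectors, here is a minimal sketch that truncates or pads a list to a fixed length. It is not the wrapper's implementation: the helper name `fix_length` is hypothetical, and padding with zeros is an assumption (the wrapper may fill missing entries differently).

```python
def fix_length(vectors, size, pad_value=0.0):
    """Return exactly `size` 3D vectors: extra ones are dropped,
    missing ones are filled with `pad_value` (an assumed padding choice)."""
    kept = [list(v) for v in vectors[:size]]
    padding = [[pad_value] * 3 for _ in range(size - len(kept))]
    return kept + padding


items = [[1.0, 0.0, 4.0], [-2.0, 0.0, 9.0]]
print(fix_length(items, 5))  # two real items followed by three padded ones
print(fix_length(items, 1))  # only the first item is kept
```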
### Polar observations
`pystk2_gymnasium.PolarObservations(env)` changes the Cartesian coordinates of
all 3D vectors to polar ones (angle in the horizontal plane, angle in the
vertical plane, and distance).
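
The conversion can be sketched as follows for a single vector in the kart frame (`x` left, `y` up, `z` front). The exact angle conventions used by the wrapper (sign, reference axis, ordering of the outputs) are not documented here, so the choices below are assumptions made for illustration only.

```python
import math


def to_polar(x, y, z):
    """Convert a kart-frame vector to (horizontal angle, vertical angle, distance).

    Assumed conventions: the horizontal angle is measured from the front
    axis `z` towards `x`, the vertical angle from the horizontal plane
    towards `y`. Angles are in radians.
    """
    distance = math.sqrt(x * x + y * y + z * z)
    horizontal = math.atan2(x, z)               # 0 when straight ahead
    vertical = math.atan2(y, math.hypot(x, z))  # 0 when level with the kart
    return horizontal, vertical, distance


print(to_polar(0.0, 0.0, 2.0))  # straight ahead, 2 units away
print(to_polar(1.0, 0.0, 1.0))  # 45 degrees off the front axis
```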
### Discrete actions
`pystk2_gymnasium.DiscreteActionsWrapper(env, acceleration_steps=5, steer_steps=7)` discretizes acceleration and steer actions (5 and 7 values respectively).
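
One common way to discretize a continuous action is an evenly spaced grid, where the discrete action is an index into the grid. The sketch below shows that idea; it is not taken from the wrapper's source, and the value ranges (acceleration in `[0, 1]`, steering in `[-1, 1]`) are assumptions.

```python
def discretization_grid(low, high, steps):
    """Return `steps` evenly spaced values from `low` to `high` (inclusive)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    width = (high - low) / (steps - 1)
    return [low + i * width for i in range(steps)]


# Assumed ranges, for illustration only
acceleration_values = discretization_grid(0.0, 1.0, 5)
steer_values = discretization_grid(-1.0, 1.0, 7)

# A discrete action is then an index into the grid
print(acceleration_values[2])  # middle acceleration value
print(steer_values[3])         # middle steering value (no steering)
```

Using an odd number of steering steps keeps a grid point at the centre of the range, which is convenient for driving straight.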
### Flattener (actions and observations)
This wrapper groups all continuous and discrete spaces together.
`pystk2_gymnasium.FlattenerWrapper(env)` flattens **actions and
observations**. The observation and action spaces of the base environment
should be dictionaries of spaces. The transformed space is a dictionary with
two entries, `discrete` and `continuous`, when both continuous and discrete
observations/actions are present in the initial environment; otherwise it is
directly the `discrete` or the `continuous` space. `discrete` is a
`MultiDiscrete` space that combines all the discrete (and multi-discrete)
observations, while `continuous` is a `Box` space.
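
The grouping idea can be illustrated on plain Python values. The sketch below is only an analogy: the real wrapper operates on gymnasium spaces, and the helper names, the sorted-key ordering and the int/float test used here are assumptions made to keep the example self-contained.

```python
def _leaves(value):
    """Yield the scalar leaves of a possibly nested list or tuple."""
    if isinstance(value, (list, tuple)):
        for item in value:
            yield from _leaves(item)
    else:
        yield value


def flatten_observation(obs):
    """Group a dictionary observation into two flat vectors."""
    continuous, discrete = [], []
    for key in sorted(obs):  # fixed key order so the layout is stable
        for leaf in _leaves(obs[key]):
            # bool is a subclass of int, so flags end up in `discrete`
            if isinstance(leaf, int):
                discrete.append(int(leaf))
            else:
                continuous.append(float(leaf))
    return {"continuous": continuous, "discrete": discrete}


obs = {"velocity": [0.0, 0.1, 5.0], "powerup": 3, "jumping": False, "energy": 1.5}
print(flatten_observation(obs))
```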
### Flatten multi-discrete actions
`pystk2_gymnasium.FlattenMultiDiscreteActions(env)` flattens a multi-discrete
action space into a discrete one, with one action per possible unique choice of
actions. For instance, if the initial space is $\{0, 1\} \times \{0, 1, 2\}$,
there are $2 \times 3 = 6$ combinations, so the action space becomes
$\{0, 1, \ldots, 5\}$.
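
A standard way to enumerate every combination of a multi-discrete space is a mixed-radix encoding. The sketch below shows that technique; it is not necessarily the ordering the wrapper uses, and the function names are hypothetical.

```python
from math import prod


def flatten_index(choices, sizes):
    """Encode one choice per dimension as a single integer (mixed radix)."""
    index = 0
    for choice, size in zip(choices, sizes):
        if not 0 <= choice < size:
            raise ValueError("choice out of range")
        index = index * size + choice
    return index


def unflatten_index(index, sizes):
    """Decode a flat index back into one choice per dimension."""
    choices = []
    for size in reversed(sizes):
        index, choice = divmod(index, size)
        choices.append(choice)
    return list(reversed(choices))


sizes = [2, 3]                         # the space {0, 1} x {0, 1, 2}
print(prod(sizes))                     # number of flat actions
print(flatten_index([1, 2], sizes))    # index of the last combination
print(unflatten_index(4, sizes))       # combination behind flat action 4
```

With sizes `[2, 3]` the flat indices run from 0 to 5, one for each of the 6 combinations.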
## Multi-agent environment
`supertuxkart/multi-full-v0` can be used to control multiple karts. It takes an
`agents` parameter that is a list of `AgentSpec`. Observations and actions are
dictionaries of single-kart ones, with **string** keys that range from `0` to
`n-1`, where `n` is the number of controlled karts.
To apply different gymnasium wrappers to each kart, one can use a
`MonoAgentWrapperAdapter`.
Let's look at an example to illustrate this:
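```py
from functools import partial

# Assumption: all of these names are exported at the top level of the package
from pystk2_gymnasium import (
    AgentSpec,
    CameraMode,
    ConstantSizedObservations,
    MonoAgentWrapperAdapter,
    PolarObservations,
)

agents = [
    AgentSpec(use_ai=True, name="Yin Team", camera_mode=CameraMode.ON),
    AgentSpec(use_ai=True, name="Yang Team", camera_mode=CameraMode.ON),
    AgentSpec(use_ai=True, name="Zen Team", camera_mode=CameraMode.ON)
]

# One wrapper factory per agent, keyed by the agent's string index
wrappers = [
    partial(MonoAgentWrapperAdapter, wrapper_factories={
        "0": lambda env: ConstantSizedObservations(env),
        "1": lambda env: PolarObservations(ConstantSizedObservations(env)),
        "2": lambda env: PolarObservations(ConstantSizedObservations(env))
    }),
]

# `make_env` is not defined in this README: it stands for the environment
# factory provided by your own training code
make_stkenv = partial(
    make_env,
    "supertuxkart/multi-full-v0",
    render_mode="human",
    num_kart=5,
    agents=agents,
    wrappers=wrappers
)
```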
## Action and observation space
All 3D vectors are expressed in the kart's frame of reference (`z` front, `x`
left, `y` up):
- `distance_down_track`: The distance from the start
- `energy`: remaining collected energy
- `front`: front of the kart (3D vector)
- `attachment`: the item attached to the kart (bonus box, banana, nitro/big,
nitro/small, bubble gum, easter egg)
- `attachment_time_left`: how much time the attachment will be kept
- `items_position`: position of the items (3D vectors)
- `items_type`: type of the item
- `jumping`: is the kart jumping
- `karts_position`: position of other karts, beginning with the ones in front
- `max_steer_angle`: the max angle of the steering (given the current speed)
- `center_path_distance`: distance to the center of the path
- `center_path`: vector to the center of the path
- `paths_start`, `paths_end`, `paths_width`: 3D vectors to the paths start and
end, and vector of their widths (scalar). The paths are sorted so that the
first element of the array is the current one.
- `paths_distance`: the distance of the paths starts and ends (vector of
dimension 2)
- `powerup`: collected power-up
- `shield_time`
- `skeed_factor`
- `velocity`: velocity vector
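
Because the vectors are given in the kart frame, simple heuristics can be computed directly from an observation. The sketch below steers towards the end of the current path. It is only an illustration: the function name and the `gain` parameter are hypothetical, and both the `[-1, 1]` steering range and the sign convention (positive meaning "towards `x`") are assumptions that should be checked against `env.action_space` and the simulator's behaviour.

```python
import math


def steer_towards_path_end(obs, gain=1.0):
    """Return a steering value in [-1, 1] aimed at the end of the current path."""
    x, _, z = obs["paths_end"][0]  # the first path is the current one
    angle = math.atan2(x, z)       # 0 when the target is straight ahead
    return max(-1.0, min(1.0, gain * angle))


# Hand-made observation containing only the key used by the heuristic
obs = {"paths_end": [[-1.0, 0.0, 1.0], [0.0, 0.0, 20.0]]}
print(steer_towards_path_end(obs))
```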
## Example
```py3
import gymnasium as gym
from pystk2_gymnasium import AgentSpec


# STK gymnasium uses one process
if __name__ == '__main__':
    # Use a flattened version of the observation and action spaces
    # In both cases, this corresponds to a dictionary with two keys:
    # - `continuous` is a vector corresponding to the continuous observations
    # - `discrete` is a vector (of integers) corresponding to discrete observations
    env = gym.make("supertuxkart/flattened-v0", render_mode="human", agent=AgentSpec(use_ai=False))

    ix = 0
    done = False
    state, *_ = env.reset()

    while not done:
        ix += 1
        # Sample a random action from the (flattened) action space
        action = env.action_space.sample()
        state, reward, terminated, truncated, _ = env.step(action)
        done = truncated or terminated

    # Important to stop the STK process
    env.close()
```
Raw data
{
"_id": null,
"home_page": "https://github.com/bpiwowar/pystk2-gymnasium",
"name": "pystk2-gymnasium",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.8",
"maintainer_email": null,
"keywords": null,
"author": "Benjamin Piwowarski",
"author_email": "benjamin@piwowarski.fr",
"download_url": "https://files.pythonhosted.org/packages/63/5e/23f807106a0adc1ed5bd5a90a9fc43ac23d11ed54d8805793a5d4ecd1b41/pystk2_gymnasium-0.7.0.tar.gz",
"platform": null,
"description": "# PySuperTuxKart gymnasium wrapper\n\n[![PyPI\nversion](https://badge.fury.io/py/pystk2-gymnasium.svg)](https://badge.fury.io/py/pystk2-gymnasium)\n\nRead the [Changelog](./CHANGELOG.md)\n\n## Install\n\nThe PySuperKart2 gymnasium wrapper is a Python package, so installing is fairly\neasy\n\n`pip install pystk2-gymnasium`\n\nNote that during the first run, SuperTuxKart assets are downloaded in the cache\ndirectory.\n\n## AgentSpec\n\nEach controlled kart is parametrized by `pystk2_gymnasium.AgentSpec`:\n\n- `name` defines name of the player (displayed on top of the kart)\n- `rank_start` defines the starting position (None for random, which is the\n default)\n- `use_ai` flag (False by default) to ignore actions (when calling `step`, a\n SuperTuxKart bot is used instead of using the action)\n- `camera_mode` can be set to `AUTO` (camera on for non STK bots), `ON` (camera\n on) or `OFF` (no camera).\n\n\n## Current limitations\n\n- no graphics information is available (i.e. pixmap)\n\n\n## Environments\n\nAfter importing `pystk2_gymnasium`, the following environments are available:\n\n- `supertuxkart/full-v0` is the main environment containing complete\n observations. The observation and action spaces are both dictionaries with\n continuous or discrete variables (see below). The exact structure can be found\n using `env.observation_space` and `env.action_space`. The following options\n can be used to modify the environment:\n - `agent` is an `AgentSpec (see above)`\n - `render_mode` can be None or `human`\n - `track` defines the SuperTuxKart track to use (None for random). 
The full\n list can be found in `STKRaceEnv.TRACKS` after initialization with\n `initialize.initialize(with_graphics: bool)` has been called.\n - `num_kart` defines the number of karts on the track (3 by default)\n - `max_paths` the maximum number of the (nearest) paths (a track is made of\n paths) to consider in the observation state\n - `laps` is the number of laps (1 by default)\n - `difficulty` is the difficulty of the AI bots (lowest 0 to highest 2,\n default to 2)\n\nSome environments are created using wrappers (see below for wrapper\ndocumentation),\n- `supertuxkart/simple-v0` (wrappers: `ConstantSizedObservations`) is a\n simplified environment with a fixed number of observations for paths\n (controlled by `state_paths`, default 5), items (`state_items`, default 5),\n karts (`state_karts`, default 5)\n- `supertuxkart/flattened-v0` (wrappers: `ConstantSizedObservations`,\n `PolarObservations`, `FlattenerWrapper`) has observation and action spaces\n simplified at the maximum (only `discrete` and `continuous` keys)\n- `supertuxkart/flattened_continuous_actions-v0` (wrappers: `ConstantSizedObservations`, `PolarObservations`, `OnlyContinuousActionsWrapper`, `FlattenerWrapper`) removes discrete actions\n (default to 0) so this is steer/acceleration only in the continuous domain\n- `supertuxkart/flattened_multidiscrete-v0` (wrappers: `ConstantSizedObservations`, `PolarObservations`, `DiscreteActionsWrapper`, `FlattenerWrapper`) is like the previous one, but with\n fully multi-discrete actions. 
`acceleration_steps` and `steer_steps` (default\n to 5) control the number of discrete values for acceleration and steering\n respectively.\n- `supertuxkart/flattened_discrete-v0` (wrappers: `ConstantSizedObservations`, `PolarObservations`, `DiscreteActionsWrapper`, `FlattenerWrapper`, `FlattenMultiDiscreteActions`) is like the previous one, but with fully\n discretized actions\n\nThe reward $r_t$ at time $t$ is given by\n\n$$ r_{t} = \\frac{1}{10}(d_{t} - d_{t-1}) + (1 - \\frac{\\mathrm{pos}_t}{K})\n\\times (3 + 7 f_t) - 0.1 + 10 * f_t $$\n\nwhere $d_t$ is the overall track distance at time $t$, $\\mathrm{pos}_t$ the\nposition among the $K$ karts at time $t$, and $f_t$ is $1$ when the kart\nfinishes the race.\n\n## Wrappers\n\nWrappers can be used to modify the environment.\n\n### Constant-size observation\n\n`pystk2_gymnasium.ConstantSizedObservations( env, state_items=5,\n state_karts=5, state_paths=5 )` ensures that the number of observed items,\nkarts and paths is constant. By default, the number of observations per category\nis 5.\n\n### Polar observations\n\n`pystk2_gymnasium.PolarObservations(env)` changes Cartesian\ncoordinates to polar ones (angle in the horizontal plane, angle in the vertical plan, and distance) of all 3D vectors.\n\n### Discrete actions\n\n`pystk2_gymnasium.DiscreteActionsWrapper(env, acceleration_steps=5, steer_steps=7)` discretizes acceleration and steer actions (5 and 7 values respectively).\n\n### Flattener (actions and observations)\n\nThis wrapper groups all continuous and discrete spaces together.\n\n`pystk2_gymnasium.FlattenerWrapper(env)` flattens **actions and\nobservations**. The base environment should be a dictionary of observation\nspaces. The transformed environment is a dictionary made with two entries,\n`discrete` and `continuous` (if both continuous and discrete\nobservations/actions are present in the initial environment, otherwise it is\neither the type of `discrete` or `continuous`). 
`discrete` is `MultiDiscrete`\nspace that combines all the discrete (and multi-discrete) observations, while\n`continuous` is a `Box` space.\n\n### Flatten multi-discrete actions\n\n`pystk2_gymnasium.FlattenMultiDiscreteActions(env)` flattens a multi-discrete\naction space into a discrete one, with one action per possible unique choice of\nactions. For instance, if the initial space is $\\{0, 1\\} \\times \\{0, 1, 2\\}$,\nthe action space becomes $\\{0, 1, \\ldots, 6\\}$.\n\n\n## Multi-agent environment\n\n`supertuxkart/multi-full-v0` can be used to control multiple karts. It takes an\n`agents` parameter that is a list of `AgentSpec`. Observations and actions are a\ndictionary of single-kart ones where **string** keys that range from `0` to\n`n-1` with `n` the number of karts.\n\nTo use different gymnasium wrappers, one can use a `MonoAgentWrapperAdapter`.\n\nLet's look at an example to illustrate this:\n\n```py\n\nfrom pystk_gymnasium import AgentSpec\n\nagents = [\n AgentSpec(use_ai=True, name=\"Yin Team\", camera_mode=CameraMode.ON),\n AgentSpec(use_ai=True, name=\"Yang Team\", camera_mode=CameraMode.ON),\n AgentSpec(use_ai=True, name=\"Zen Team\", camera_mode=CameraMode.ON)\n]\n\nwrappers = [\n partial(MonoAgentWrapperAdapter, wrapper_factories={\n \"0\": lambda env: ConstantSizedObservations(env),\n \"1\": lambda env: PolarObservations(ConstantSizedObservations(env)),\n \"2\": lambda env: PolarObservations(ConstantSizedObservations(env))\n }),\n]\n\nmake_stkenv = partial(\n make_env,\n \"supertuxkart/multi-full-v0\",\n render_mode=\"human\",\n num_kart=5,\n agents=agents,\n wrappers=wrappers\n)\n```\n\n## Action and observation space\n\nAll the 3D vectors are within the kart referential (`z` front, `x` left, `y`\nup):\n\n- `distance_down_track`: The distance from the start\n- `energy`: remaining collected energy\n- `front`: front of the kart (3D vector)\n- `attachment`: the item attached to the kart (bonus box, banana, nitro/big,\n nitro/small, bubble gum, 
easter egg)\n- `attachment_time_left`: how much time the attachment will be kept\n- `items_position`: position of the items (3D vectors)\n- `items_type`: type of the item\n- `jumping`: is the kart jumping\n- `karts_position`: position of other karts, beginning with the ones in front\n- `max_steer_angle` the max angle of the steering (given the current speed)\n- `center_path_distance`: distance to the center of the path\n- `center_path`: vector to the center of the path\n- `paths_start`, `paths_end`, `paths_width`: 3D vectors to the paths start and\n end, and vector of their widths (scalar). The paths are sorted so that the\n first element of the array is the current one.\n- `paths_distance`: the distance of the paths starts and ends (vector of\n dimension 2)\n- `powerup`: collected power-up\n- `shield_time`\n- `skeed_factor`\n- `velocity`: velocity vector\n\n## Example\n\n```py3\nimport gymnasium as gym\nfrom pystk2_gymnasium import AgentSpec\n\n\n# STK gymnasium uses one process\nif __name__ == '__main__':\n # Use a a flattened version of the observation and action spaces\n # In both case, this corresponds to a dictionary with two keys:\n # - `continuous` is a vector corresponding to the continuous observations\n # - `discrete` is a vector (of integers) corresponding to discrete observations\n env = gym.make(\"supertuxkart/flattened-v0\", render_mode=\"human\", agent=AgentSpec(use_ai=False))\n\n ix = 0\n done = False\n state, *_ = env.reset()\n\n while not done:\n ix += 1\n action = env.action_space.sample()\n state, reward, terminated, truncated, _ = env.step(action)\n done = truncated or terminated\n\n # Important to stop the STK process\n env.close()\n```\n",
"bugtrack_url": null,
"license": "GPL",
"summary": "Gymnasium wrapper for PySTK2",
"version": "0.7.0",
"project_urls": {
"Homepage": "https://github.com/bpiwowar/pystk2-gymnasium",
"Repository": "https://github.com/bpiwowar/pystk2-gymnasium"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "8019af7b754c97bb9bbf654619664ce797060c70001befb15a243576b8e65fbd",
"md5": "61ca2160ac14faef5a0af74b4e782011",
"sha256": "73f3fc410434785e771e43be5f9a9c88bcd05f7bc0e0f3a0985431901365b033"
},
"downloads": -1,
"filename": "pystk2_gymnasium-0.7.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "61ca2160ac14faef5a0af74b4e782011",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.8",
"size": 32034,
"upload_time": "2024-11-12T13:12:10",
"upload_time_iso_8601": "2024-11-12T13:12:10.218254Z",
"url": "https://files.pythonhosted.org/packages/80/19/af7b754c97bb9bbf654619664ce797060c70001befb15a243576b8e65fbd/pystk2_gymnasium-0.7.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "635e23f807106a0adc1ed5bd5a90a9fc43ac23d11ed54d8805793a5d4ecd1b41",
"md5": "91146dd76d9ce7d01b052298f3cc75a0",
"sha256": "14d72afa7f2e8be0e14b02078162e74005b2207c76f934e9c83db753a439fe4e"
},
"downloads": -1,
"filename": "pystk2_gymnasium-0.7.0.tar.gz",
"has_sig": false,
"md5_digest": "91146dd76d9ce7d01b052298f3cc75a0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.8",
"size": 31622,
"upload_time": "2024-11-12T13:12:11",
"upload_time_iso_8601": "2024-11-12T13:12:11.242460Z",
"url": "https://files.pythonhosted.org/packages/63/5e/23f807106a0adc1ed5bd5a90a9fc43ac23d11ed54d8805793a5d4ecd1b41/pystk2_gymnasium-0.7.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-12 13:12:11",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "bpiwowar",
"github_project": "pystk2-gymnasium",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "pystk2-gymnasium"
}