soccer-twos 0.1.13

- Home page: https://github.com/bryanoliveira/soccer-twos-env
- Summary: A pre-compiled soccer-twos (Unity ML Agents) environment with a nice visualizer.
- Author: Bryan L M Oliveira
- Requires Python: >=3.6
- Uploaded: 2023-02-07 18:33:26
- Requirements: gym==0.19.0, gym-unity==0.27.0, mlagents==0.27.0, mlagents-envs==0.27.0, numpy==1.19.5
# Soccer-Twos Gym Environment

A pre-compiled [Soccer-Twos](https://github.com/Unity-Technologies/ml-agents/blob/92ff2c26fef7174b443115454fa1c6045d622bc2/docs/Learning-Environment-Examples.md#soccer-twos) environment with multi-agent Gym-compatible wrappers and a human-friendly visualizer. Built on top of [Unity ML Agents](https://github.com/Unity-Technologies/ml-agents) to be used as the final assignment for the Reinforcement Learning Minicourse at CEIA / Deep Learning Brazil.

<div align="center">
    <img class="text-img mw-100" src="https://raw.githubusercontent.com/bryanoliveira/soccer-twos-env/main/images/soccer.gif">
</div>
<br/>

Pre-compiled versions of this environment are available for Linux, Windows and macOS (x86-64). The source code for this environment is available [here](https://github.com/bryanoliveira/unity-soccer). Example agent training procedures are available [here](https://github.com/bryanoliveira/ceia-rl-tournament-starter).

## Install

On a Python 3.6+ environment, run:

`pip install soccer-twos`

## Requirements

See [requirements.txt](https://github.com/bryanoliveira/soccer-twos-env/blob/main/requirements.txt).

## Usage

### For training

Import this package and instantiate the environment:

```python
import soccer_twos

env = soccer_twos.make()
```

The `make` method accepts several options:

| Option             | Description                                                                                                                                                                                                             |
| ------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `render`           | Whether to render the environment. Defaults to `False`.                                                                                                                                                                 |
| `watch`            | Whether to run an audience-friendly version of the provided Soccer-Twos environment. Forces `render` to `True`, `time_scale` to `1` and `quality_level` to `5`. Has no effect when `env_path` is set. Defaults to `False`. |
| `variation`        | A soccer environment variation in `EnvType`. Defaults to `EnvType.multiagent_player`.                                                                                                                                   |
| `blue_team_name`   | The name of the blue team. Defaults to "BLUE".                                                                                                                                                                          |
| `orange_team_name` | The name of the orange team. Defaults to "ORANGE".                                                                                                                                                                      |
| `env_channel`      | The side channel to use for communication with the environment. Defaults to None.                                                                                                                                       |
| `time_scale`       | The time scale to use for the environment. This should be less than `100`x for better simulation accuracy. Defaults to `20`x realtime.                                                                                  |
| `quality_level`    | The quality level to use when rendering the environment. Ranges between `0` (lowest) and `5` (highest). Defaults to `0`.                                                                                                |
| `base_port`        | The base port to use to communicate with the environment. Defaults to `50039`.                                                                                                                                          |
| `worker_id`        | Used as a base port shift to avoid communication conflicts. Defaults to `0`.                                                                                                                                             |
| `env_path`         | The path to the environment executable. Overrides `watch`. Defaults to the provided Soccer-Twos environment.                                                                                                            |
| `flatten_branched` | If `True`, turn branched discrete action spaces into a `Discrete` space rather than `MultiDiscrete`. Defaults to `False`.                                                                                               |
| `opponent_policy`  | The policy to use for the opponent when `variation==team_vs_policy`. Defaults to a random agent.                                                                                                                        |
| `single_player`    | Whether to let the agent control a single player, while the other stays still. Only works when `variation==team_vs_policy`. Defaults to `False`.                                                                        |
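
For example, the following call (a minimal sketch; it assumes the variation enum `EnvType` is importable from the top-level `soccer_twos` package) starts an audience-friendly match with custom team names:

```python
import soccer_twos
from soccer_twos import EnvType  # assumed top-level export of the variation enum

# watch=True forces render=True, time_scale=1 and quality_level=5
env = soccer_twos.make(
    watch=True,
    variation=EnvType.multiagent_player,
    blue_team_name="MY_TEAM",
    orange_team_name="BASELINE",
)
```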

The created `env` exposes a basic [Gym](https://gym.openai.com/) interface.
Namely, the methods `reset()`, `step(action: Dict[int, np.ndarray])` and `close()` are available.
The `render()` method currently has no effect; use `soccer_twos.make(render=True)` instead.
The `step()` method returns extra information about the players and the ball in the last tuple element. This includes position (x, y) and velocity (x, y) for the ball and the players, as well as the y rotation (in degrees) of the players.

We expose an RLlib-compatible multi-agent interface.
This means, for example, that `action` should be a `dict` where keys are integers in `{0, 1, 2, 3}` corresponding to each agent.
Additionally, values should be single actions shaped like `env.action_space.shape`.
Observations and rewards follow the same structure. Dones are only set for the key `__all__`, which means "all agents".
Agents 0 and 1 correspond to the blue team and agents 2 and 3 correspond to the orange team.

Here's a full example:

```python
import soccer_twos

env = soccer_twos.make(render=True)
print("Observation Space: ", env.observation_space.shape)
print("Action Space: ", env.action_space.shape)

team0_reward = 0
team1_reward = 0
env.reset()
while True:
    obs, reward, done, info = env.step(
        {
            0: env.action_space.sample(),
            1: env.action_space.sample(),
            2: env.action_space.sample(),
            3: env.action_space.sample(),
        }
    )

    team0_reward += reward[0] + reward[1]
    team1_reward += reward[2] + reward[3]
    if done["__all__"]:
        print("Total Reward: ", team0_reward, " x ", team1_reward)
        team0_reward = 0
        team1_reward = 0
        env.reset()
```
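
Because the interface follows RLlib's multi-agent conventions, you can register the environment with RLlib by name. The following is a minimal sketch, not an official recipe from this package: it assumes `ray` is installed and that the returned env is accepted by your RLlib version (an extra `MultiAgentEnv` wrapper may be required).

```python
import soccer_twos
from ray.tune.registry import register_env

# register a named factory; RLlib forwards env_config to soccer_twos.make
register_env("SoccerTwos", lambda env_config: soccer_twos.make(**env_config))
```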

#### Environment State Configuration

The `env_channel` parameter allows for state configuration inside the simulation. To use it, you must first instantiate a `soccer_twos.side_channels.EnvConfigurationChannel` and pass it in the `soccer_twos.make` call. Here's a full example:

```python
import soccer_twos
from soccer_twos.side_channels import EnvConfigurationChannel
env_channel = EnvConfigurationChannel()
env = soccer_twos.make(env_channel=env_channel)
env.reset()
env_channel.set_parameters(
    ball_state={
        "position": [1, -1],
        "velocity": [-1.2, 3],
    },
    players_states={
        3: {
            "position": [-5, 10],
            "rotation_y": 45,
            "velocity": [5, 0],
        }
    }
)
# env.step()
```

All the `env_channel.set_parameters` method parameters and dict keys are optional. You can set a single parameter at a time or the full game state if you need to.
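
For instance, to move only the ball (a sketch reusing the `env_channel` instance from the example above):

```python
# set a single parameter: only the ball's position changes
env_channel.set_parameters(ball_state={"position": [0, 0]})
```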

### Evaluating

To quickly evaluate one agent against another and generate comprehensive statistics, you may use the `evaluate` script:

`python -m soccer_twos.evaluate -m1 agent_module -m2 opponent_module`

You can also provide the `--episodes` option to specify the number of episodes to evaluate on (defaults to 100).
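
For example, to run a 50-episode evaluation (where `my_agent` and `baseline_agent` are placeholders for your own agent modules):

`python -m soccer_twos.evaluate -m1 my_agent -m2 baseline_agent --episodes 50`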

### Watching

To roll out via the CLI, you must create an implementation (a subclass) of `soccer_twos.AgentInterface` and run:

`python -m soccer_twos.watch -m agent_module`

This will run a human-friendly version of the environment, where your agent will play against itself.
You may instead use the options `-m1 agent_module -m2 opponent_module` to play against a different opponent.
You may also implement your own rollout script using `soccer_twos.make(watch=True)`.
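
A minimal agent module might look like the sketch below. It assumes the interface expects an `act(observation)` method and receives the environment in its constructor; check the package source for the exact contract.

```python
from soccer_twos import AgentInterface


class RandomAgent(AgentInterface):
    """Placeholder agent that samples random actions."""

    def __init__(self, env):
        self.env = env

    def act(self, observation):
        # ignore the observation and act uniformly at random
        return self.env.action_space.sample()
```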

<div align="center">
    <img src="https://raw.githubusercontent.com/bryanoliveira/soccer-twos-env/main/images/screenshot.png" width="480"/>
</div>

## Environment Specs

This environment is based on Unity ML Agents' [Soccer Twos](https://github.com/Unity-Technologies/ml-agents/blob/92ff2c26fef7174b443115454fa1c6045d622bc2/docs/Learning-Environment-Examples.md#soccer-twos), so most of the specs are the same. Here, four agents compete in a 2 vs 2 toy soccer game, aiming to get the ball into the opponent's goal while preventing the ball from entering their own goal.

<div align="center">
    <img src="https://raw.githubusercontent.com/bryanoliveira/soccer-twos-env/main/images/obs.png" width="480"/>
</div>
<br/>

- Observation space: a 336-dimensional vector corresponding to 11 forward ray-casts distributed over 120 degrees and 3 backward ray-casts distributed over 90 degrees, each detecting 6 possible object types along with the object's distance. Over three observation stacks, the forward ray-casts contribute 264 state dimensions and the backward ray-casts 72.
- Action space: 3 branched discrete actions (a `MultiDiscrete` space) corresponding to forward/backward movement, sideways movement, and rotation (27 action combinations in total).
- Agent Reward Function:
  - `1 - accumulated time penalty`: when the ball enters the opponent's goal. The accumulated time penalty is incremented by `(1 / MaxSteps)` every fixed update and is reset to 0 at the beginning of each episode. In this build, `MaxSteps = 5000`.
  - `-1`: when the ball enters the team's own goal.
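
For example, a goal scored at step 2500 of an episode yields `1 - 2500/5000 = 0.5` for the scoring team and `-1` for the conceding team.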

Note that while this holds when `variation == EnvType.multiagent_player`, observation and action spaces may vary for other variations.

## Citation

```bibtex
@misc{soccertwos,
  author = {Bryan Oliveira},
  title = {A pre-compiled Soccer-Twos reinforcement learning environment with multi-agent Gym-compatible wrappers and human-friendly visualizers.},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/bryanoliveira/soccer-twos-env}}
}
```


            
