jumanji


- Name: jumanji
- Version: 1.0.1
- Home page: https://github.com/instadeepai/jumanji/
- Summary: A diverse suite of scalable reinforcement learning environments in JAX
- Upload time: 2024-03-29 11:24:18
- Author: InstaDeep
- Requires Python: >=3.8
- License: Apache 2.0
- Keywords: reinforcement-learning, python, jax

<p align="center">
    <a href="docs/img/jumanji_logo.png">
        <img src="docs/img/jumanji_logo.png" alt="Jumanji logo" width="50%"/>
    </a>
</p>

[![Python Versions](https://img.shields.io/pypi/pyversions/jumanji.svg?style=flat-square)](https://www.python.org/doc/versions/)
[![PyPI Version](https://badge.fury.io/py/jumanji.svg)](https://badge.fury.io/py/jumanji)
[![Tests](https://github.com/instadeepai/jumanji/actions/workflows/tests_linters.yml/badge.svg)](https://github.com/instadeepai/jumanji/actions/workflows/tests_linters.yml)
[![Code Style](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![MyPy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)
[![License](https://img.shields.io/badge/License-Apache%202.0-orange.svg)](https://opensource.org/licenses/Apache-2.0)
[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97-Hugging%20Face-F8D521)](https://huggingface.co/InstaDeepAI)

[**Environments**](#environments)
| [**Installation**](#install)
| [**Quickstart**](#quickstart)
| [**Training**](#training)
| [**Citation**](#citing)
| [**Docs**](https://instadeepai.github.io/jumanji)
---

<div class="collage">
  <div class="row" align="center">
    <img src="docs/env_anim/bin_pack.gif" alt="BinPack" width="16%">
    <img src="docs/env_anim/cleaner.gif" alt="Cleaner" width="16%">
    <img src="docs/env_anim/connector.gif" alt="Connector" width="16%">
    <img src="docs/env_anim/cvrp.gif" alt="CVRP" width="16%">
    <img src="docs/env_anim/flat_pack.gif" alt="FlatPack" width="16%">
    <img src="docs/env_anim/game_2048.gif" alt="Game2048" width="16%">
  </div>
  <div class="row" align="center">
    <img src="docs/env_anim/graph_coloring.gif" alt="GraphColoring" width="16%">
    <img src="docs/env_anim/job_shop.gif" alt="JobShop" width="16%">
    <img src="docs/env_anim/knapsack.gif" alt="Knapsack" width="16%">
    <img src="docs/env_anim/maze.gif" alt="Maze" width="16%">
    <img src="docs/env_anim/minesweeper.gif" alt="Minesweeper" width="16%">
    <img src="docs/env_anim/mmst.gif" alt="MMST" width="16%">
  </div>
  <div class="row" align="center">
    <img src="docs/env_anim/multi_cvrp.gif" alt="MultiCVRP" width="16%">
    <img src="docs/env_anim/pac_man.gif" alt="PacMan" width="16%">
    <img src="docs/env_anim/robot_warehouse.gif" alt="RobotWarehouse" width="16%">
    <img src="docs/env_anim/rubiks_cube.gif" alt="RubiksCube" width="16%">
    <img src="docs/env_anim/sliding_tile_puzzle.gif" alt="SlidingTilePuzzle" width="16%">
    <img src="docs/env_anim/snake.gif" alt="Snake" width="16%">
  </div>
  <div class="row" align="center">
    <img src="docs/env_anim/sokoban.gif" alt="Sokoban" width="16%">
    <img src="docs/env_anim/sudoku.gif" alt="Sudoku" width="16%">
    <img src="docs/env_anim/tetris.gif" alt="Tetris" width="16%">
    <img src="docs/env_anim/tsp.gif" alt="TSP" width="16%">
  </div>
</div>

## Jumanji @ ICLR 2024

Jumanji has been accepted at [ICLR 2024](https://iclr.cc/). Check out our [research paper](https://arxiv.org/abs/2306.09884).

## Welcome to the Jungle! 🌴

Jumanji is a diverse suite of scalable reinforcement learning environments written in JAX. It now features 22 environments!

Jumanji is helping pioneer a new wave of hardware-accelerated research and development in the
field of RL. Jumanji's high-speed environments enable faster iteration and large-scale
experimentation while simultaneously reducing complexity. Originating in the research team at
[InstaDeep](https://www.instadeep.com/), Jumanji is now developed jointly with the open-source
community. To join us in these efforts, reach out, raise issues and read our
[contribution guidelines](https://github.com/instadeepai/jumanji/blob/main/CONTRIBUTING.md) or just
[star](https://github.com/instadeepai/jumanji) 🌟 to stay up to date with the latest developments!

### Goals 🚀

1. Provide a simple, well-tested API for JAX-based environments.
2. Make research in RL more accessible.
3. Facilitate research on RL for industrial problems and help close the gap between
research and industrial applications.
4. Provide environments whose difficulty can be scaled to be arbitrarily hard.

### Overview 🦜

- 🥑 **Environment API**: core abstractions for JAX-based environments.
- 🕹️ **Environment Suite**: a collection of RL environments ranging from simple games to NP-hard
combinatorial problems.
- 🍬 **Wrappers**: easily connect to your favourite RL frameworks and libraries such as
[Acme](https://github.com/deepmind/acme),
[Stable Baselines3](https://github.com/DLR-RM/stable-baselines3),
[RLlib](https://docs.ray.io/en/latest/rllib/index.html), [OpenAI Gym](https://github.com/openai/gym)
and [DeepMind-Env](https://github.com/deepmind/dm_env) through our `dm_env` and `gym` wrappers.
- 🎓 **Examples**: guides to facilitate Jumanji's adoption and highlight the added value of
JAX-based environments.
- 🏎️ **Training**: example agents that can be used as inspiration for the agents one may implement
in their research.

<h2 name="environments" id="environments">Environments 🌍</h2>

Jumanji provides a diverse set of environments, from simple games to NP-hard combinatorial
problems.

| Environment | Category | Registered Version(s) | Source | Description |
|-------------|----------|-----------------------|--------|-------------|
| 🔢 Game2048 | Logic | `Game2048-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/game_2048/) | [doc](https://instadeepai.github.io/jumanji/environments/game_2048/) |
| 🎨 GraphColoring | Logic | `GraphColoring-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/graph_coloring/) | [doc](https://instadeepai.github.io/jumanji/environments/graph_coloring/) |
| 💣 Minesweeper | Logic | `Minesweeper-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/minesweeper/) | [doc](https://instadeepai.github.io/jumanji/environments/minesweeper/) |
| 🎲 RubiksCube | Logic | `RubiksCube-v0`<br/>`RubiksCube-partly-scrambled-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/rubiks_cube/) | [doc](https://instadeepai.github.io/jumanji/environments/rubiks_cube/) |
| 🔀 SlidingTilePuzzle | Logic | `SlidingTilePuzzle-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/sliding_tile_puzzle/) | [doc](https://instadeepai.github.io/jumanji/environments/sliding_tile_puzzle/) |
| ✏️ Sudoku | Logic | `Sudoku-v0`<br/>`Sudoku-very-easy-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/sudoku/) | [doc](https://instadeepai.github.io/jumanji/environments/sudoku/) |
| 📦 BinPack (3D BinPacking Problem) | Packing | `BinPack-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/bin_pack/) | [doc](https://instadeepai.github.io/jumanji/environments/bin_pack/) |
| 🧩 FlatPack (2D Grid Filling Problem) | Packing | `FlatPack-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/flat_pack/) | [doc](https://instadeepai.github.io/jumanji/environments/flat_pack/) |
| 🏭 JobShop (Job Shop Scheduling Problem) | Packing | `JobShop-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/job_shop/) | [doc](https://instadeepai.github.io/jumanji/environments/job_shop/) |
| 🎒 Knapsack | Packing | `Knapsack-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/knapsack/) | [doc](https://instadeepai.github.io/jumanji/environments/knapsack/) |
| ▒ Tetris | Packing | `Tetris-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/tetris/) | [doc](https://instadeepai.github.io/jumanji/environments/tetris/) |
| 🧹 Cleaner | Routing | `Cleaner-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/cleaner/) | [doc](https://instadeepai.github.io/jumanji/environments/cleaner/) |
| :link: Connector | Routing | `Connector-v2` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/connector/) | [doc](https://instadeepai.github.io/jumanji/environments/connector/) |
| 🚚 CVRP (Capacitated Vehicle Routing Problem) | Routing | `CVRP-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/cvrp/) | [doc](https://instadeepai.github.io/jumanji/environments/cvrp/) |
| 🚚 MultiCVRP (Multi-Agent Capacitated Vehicle Routing Problem) | Routing | `MultiCVRP-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/multi_cvrp/) | [doc](https://instadeepai.github.io/jumanji/environments/multi_cvrp/) |
| :mag: Maze | Routing | `Maze-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/maze/) | [doc](https://instadeepai.github.io/jumanji/environments/maze/) |
| :robot: RobotWarehouse | Routing | `RobotWarehouse-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/robot_warehouse/) | [doc](https://instadeepai.github.io/jumanji/environments/robot_warehouse/) |
| 🐍 Snake | Routing | `Snake-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/snake/) | [doc](https://instadeepai.github.io/jumanji/environments/snake/) |
| 📬 TSP (Travelling Salesman Problem) | Routing | `TSP-v1` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/tsp/) | [doc](https://instadeepai.github.io/jumanji/environments/tsp/) |
| Multi Minimum Spanning Tree Problem | Routing | `MMST-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/mmst) | [doc](https://instadeepai.github.io/jumanji/environments/mmst/) |
| ᗧ•••ᗣ•• PacMan | Routing | `PacMan-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/pac_man/) | [doc](https://instadeepai.github.io/jumanji/environments/pac_man/) |
| 👾 Sokoban | Routing | `Sokoban-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/sokoban/) | [doc](https://instadeepai.github.io/jumanji/environments/sokoban/) |

<h2 name="install" id="install">Installation 🎬</h2>

You can install the latest release of Jumanji from PyPI:

```bash
pip install -U jumanji
```

Alternatively, you can install the latest development version directly from GitHub:

```bash
pip install git+https://github.com/instadeepai/jumanji.git
```

Jumanji has been tested on Python 3.8 and 3.9.
Note that because the installation of JAX differs depending on your hardware accelerator,
we advise users to explicitly install the correct JAX version (see the
[official installation guide](https://github.com/google/jax#installation)).
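
Once JAX is installed, one quick sanity check is to ask it which backend and devices it sees. The snippet below is a minimal sketch that only relies on `jax.default_backend()` and `jax.devices()`; what it prints depends entirely on your machine and on the JAX build you installed.

```python
import jax

# Name of the backend JAX selected on this machine (for example "cpu").
backend = jax.default_backend()

# Devices JAX can see; a CPU-only install lists CPU devices only.
devices = jax.devices()

print(f"JAX {jax.__version__} uses the '{backend}' backend with {len(devices)} device(s).")
```

If the reported backend is not the accelerator you expected, the JAX installation guide linked above is the place to look.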

**Rendering:** Matplotlib is used for rendering all the environments. To visualize the environments
you will need a GUI backend. For example, on Linux, you can install Tk via:
`apt-get install python3-tk`, or using conda: `conda install tk`. Check out
[Matplotlib backends](https://matplotlib.org/stable/users/explain/backends.html) for a list of
backends you can use.
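
To check whether Python's Tk bindings are present before trying to render, a small standard-library test such as the one below may help. It is only a sketch: `tk_available` is a hypothetical helper name, and a successful import does not guarantee that a display is reachable (for example on a headless server).

```python
def tk_available() -> bool:
    """Return True if Python's Tk bindings (the `tkinter` module) can be imported."""
    try:
        import tkinter  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


if tk_available():
    print("tkinter is importable, so a Tk-based Matplotlib backend should be usable.")
else:
    print("tkinter is missing; install Tk or pick another Matplotlib GUI backend.")
```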

<h2 name="quickstart" id="quickstart">Quickstart ⚡</h2>

RL practitioners will find Jumanji's interface familiar as it combines the widely adopted
[OpenAI Gym](https://github.com/openai/gym) and
[DeepMind Environment](https://github.com/deepmind/dm_env) interfaces. From OpenAI Gym, we adopted
the idea of a `registry` and the `render` method, while our `TimeStep` structure is inspired by
DeepMind Environment.

### Basic Usage 🧑‍💻

```python
import jax
import jumanji

# Instantiate a Jumanji environment using the registry
env = jumanji.make('Snake-v1')

# Reset your (jit-able) environment
key = jax.random.PRNGKey(0)
state, timestep = jax.jit(env.reset)(key)

# (Optional) Render the env state
env.render(state)

# Interact with the (jit-able) environment
action = env.action_spec.generate_value()          # Action selection (dummy value here)
state, timestep = jax.jit(env.step)(state, action)   # Take a step and observe the next state and time step
```

- `state` represents the internal state of the environment: it contains all the information required
to take a step when executing an action. This should **not** be confused with the `observation`
contained in the `timestep`, which is the information perceived by the agent.
- `timestep` is a dataclass containing `step_type`, `reward`, `discount`, `observation` and
`extras`. This structure is similar to
[`dm_env.TimeStep`](https://github.com/deepmind/dm_env/blob/master/docs/index.md) except for the
`extras` field that was added to allow users to log environment metrics that are neither part of
the agent's observation nor part of the environment's internal state.
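
To make the distinction between `state` and `timestep` concrete, here is a plain-Python toy that follows the same functional pattern. It is not Jumanji code: `ToyState`, `ToyTimeStep`, `reset` and `step` are hypothetical names, `step_type` is simplified to a string, and only the field names of the time step are taken from the description above.

```python
from typing import Any, Dict, NamedTuple, Tuple


class ToyState(NamedTuple):
    """Everything the environment needs to compute the next step."""
    position: int
    target: int       # hidden from the agent
    step_count: int


class ToyTimeStep(NamedTuple):
    """Uses the field names listed above (illustrative only)."""
    step_type: str    # "first", "mid" or "last"
    reward: float
    discount: float
    observation: int  # what the agent perceives
    extras: Dict[str, Any]


def reset(target: int) -> Tuple[ToyState, ToyTimeStep]:
    state = ToyState(position=0, target=target, step_count=0)
    timestep = ToyTimeStep("first", 0.0, 1.0, observation=state.position, extras={})
    return state, timestep


def step(state: ToyState, action: int) -> Tuple[ToyState, ToyTimeStep]:
    """Pure function: returns a new state instead of mutating the old one."""
    position = state.position + action
    new_state = ToyState(position, state.target, state.step_count + 1)
    done = position == state.target
    timestep = ToyTimeStep(
        step_type="last" if done else "mid",
        reward=1.0 if done else 0.0,
        discount=0.0 if done else 1.0,
        observation=position,
        # A logged metric that is neither observed by the agent nor stored in the state.
        extras={"distance_to_target": abs(state.target - position)},
    )
    return new_state, timestep


state, timestep = reset(target=2)
state, timestep = step(state, action=1)
state, timestep = step(state, action=1)
print(timestep.step_type, timestep.reward, timestep.extras)
```

The agent only ever sees `timestep.observation`; the hidden `target` lives in the state, and the distance metric is reported through `extras`.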

### Advanced Usage 🧑‍🔬

Being written in JAX, Jumanji's environments benefit from many of its features including
automatic vectorization/parallelization (`jax.vmap`, `jax.pmap`) and JIT-compilation (`jax.jit`),
which can be composed arbitrarily.
We provide an example of a more advanced usage in the
[advanced usage guide](https://instadeepai.github.io/jumanji/guides/advanced_usage/).
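
As a rough illustration of how these transformations compose, the sketch below batches a toy functional environment with `jax.vmap` and compiles it with `jax.jit`. The functions `toy_reset` and `toy_step` are hypothetical stand-ins written for this example, not Jumanji environments; the assumption is that any pure `reset(key)` and `step(state, action)` pair can be transformed the same way.

```python
import jax
import jax.numpy as jnp


def toy_reset(key):
    """Toy reset: sample a start position in [0, 10) from the PRNG key."""
    return jax.random.randint(key, shape=(), minval=0, maxval=10)


def toy_step(state, action):
    """Toy step: move by `action` and return (next_state, reward)."""
    next_state = state + action
    reward = -jnp.abs(next_state).astype(jnp.float32)
    return next_state, reward


batch_size = 4
keys = jax.random.split(jax.random.PRNGKey(0), batch_size)

# vmap maps the single-environment functions over a leading batch axis;
# jit then compiles the batched functions.
batched_reset = jax.jit(jax.vmap(toy_reset))
batched_step = jax.jit(jax.vmap(toy_step))

states = batched_reset(keys)                        # shape (4,)
actions = jnp.ones(batch_size, dtype=states.dtype)  # one action per environment
states, rewards = batched_step(states, actions)     # both have shape (4,)
print(states.shape, rewards.shape)
```

Each environment in the batch gets its own PRNG key, which is what keeps the batched resets independent of one another.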

### Registry and Versioning 📖

Like OpenAI Gym, Jumanji keeps a strict versioning of its environments for reproducibility reasons.
We maintain a registry of standard environments with their configuration.
For each environment, a version suffix is appended, e.g. `Snake-v1`.
When changes are made to environments that might impact learning results,
the version number is incremented by one to prevent potential confusion.
For a full list of registered versions of each environment, check out
[the documentation](https://instadeepai.github.io/jumanji/environments/tsp/).
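
The naming scheme can be illustrated with a toy registry in plain Python. This is not Jumanji's registry implementation: `parse_env_id`, `register`, `make` and the `Toy` configurations below are hypothetical, and the ID pattern is an assumption inferred from names such as `Snake-v1` and `RubiksCube-partly-scrambled-v0`.

```python
import re
from typing import Callable, Dict, Tuple

# Assumed ID shape: "<Name>-v<N>", where the name itself may contain hyphens.
_ENV_ID = re.compile(r"^(?P<name>[\w-]+)-v(?P<version>\d+)$")
_registry: Dict[str, Callable[[], dict]] = {}


def parse_env_id(env_id: str) -> Tuple[str, int]:
    """Split an ID such as 'Snake-v1' into ('Snake', 1)."""
    match = _ENV_ID.match(env_id)
    if match is None:
        raise ValueError(f"Expected '<Name>-v<N>', got {env_id!r}")
    return match.group("name"), int(match.group("version"))


def register(env_id: str, factory: Callable[[], dict]) -> None:
    """Store a factory under a versioned ID; existing IDs are never overwritten."""
    parse_env_id(env_id)  # validates the ID format
    if env_id in _registry:
        raise ValueError(f"{env_id!r} is already registered")
    _registry[env_id] = factory


def make(env_id: str) -> dict:
    """Build the configuration registered under `env_id`."""
    if env_id not in _registry:
        raise KeyError(f"Unknown environment ID {env_id!r}")
    return _registry[env_id]()


# A change that could affect learning results gets a new version, not an in-place edit.
register("Toy-v0", lambda: {"grid_size": 6})
register("Toy-v1", lambda: {"grid_size": 12})

print(parse_env_id("RubiksCube-partly-scrambled-v0"))
print(make("Toy-v0"), make("Toy-v1"))
```

Keeping both versions registered means results reported against `Toy-v0` stay reproducible after `Toy-v1` is introduced.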

<h2 name="training" id="training">Training 🏎️</h2>

To showcase how to train RL agents on Jumanji environments, we provide a random agent and a vanilla
actor-critic (A2C) agent. These agents can be found in
[jumanji/training/](https://github.com/instadeepai/jumanji/tree/main/jumanji/training/).

Because Jumanji's environment framework is flexible, almost any problem can be implemented as a
Jumanji environment, which gives rise to very diverse observations. For this reason,
environment-specific networks are required to capture the symmetries of each environment.
Alongside the A2C agent implementation, we provide examples of such environment-specific
actor-critic networks in
[jumanji/training/networks](https://github.com/instadeepai/jumanji/tree/main/jumanji/training/networks/).

> ⚠️ The example agents in `jumanji/training` are **only** meant to serve as inspiration for how one
> can implement an agent. Jumanji is first and foremost a library of environments - as such, the
> agents and networks will **not** be maintained to a production standard.

For more information on how to use the example agents, see the
[training guide](https://instadeepai.github.io/jumanji/guides/training/).

## Contributing 🤝

Contributions are welcome! See our issue tracker for
[good first issues](https://github.com/instadeepai/jumanji/labels/good%20first%20issue). Please read
our [contributing guidelines](https://github.com/instadeepai/jumanji/blob/main/CONTRIBUTING.md) for
details on how to submit pull requests, our Contributor License Agreement, and community guidelines.

<h2 name="citing" id="citing">Citing Jumanji ✏️</h2>

If you use Jumanji in your work, please cite the library using:

```
@misc{bonnet2024jumanji,
    title={Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX},
    author={Clément Bonnet and Daniel Luo and Donal Byrne and Shikha Surana and Sasha Abramowitz and Paul Duckworth and Vincent Coyette and Laurence I. Midgley and Elshadai Tegegn and Tristan Kalloniatis and Omayma Mahjoub and Matthew Macfarlane and Andries P. Smit and Nathan Grinsztajn and Raphael Boige and Cemlyn N. Waters and Mohamed A. Mimouni and Ulrich A. Mbou Sob and Ruan de Kock and Siddarth Singh and Daniel Furelos-Blanco and Victor Le and Arnu Pretorius and Alexandre Laterre},
    year={2024},
    eprint={2306.09884},
    url={https://arxiv.org/abs/2306.09884},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

## See Also 🔎

Other works have embraced the approach of writing RL environments in JAX.
In particular, we suggest users check out the following sister repositories:

- 🤖 [QDax](https://github.com/adaptive-intelligent-robotics/QDax) is a library to accelerate
Quality-Diversity and neuro-evolution algorithms through hardware accelerators and parallelization.
- 🌳 [Evojax](https://github.com/google/evojax) provides tools to enable neuroevolution algorithms
to work with neural networks running across multiple TPU/GPUs.
- 🦾 [Brax](https://github.com/google/brax) is a differentiable physics engine that simulates
environments made up of rigid bodies, joints, and actuators.
- 🏋️ [Gymnax](https://github.com/RobertTLange/gymnax) implements classic environments including
classic control, bsuite, MinAtar and a collection of meta RL tasks.
- 🎲 [Pgx](https://github.com/sotetsuk/pgx) provides classic board game environments like
Backgammon, Shogi, and Go.

## Acknowledgements 🙏

The development of this library was supported with Cloud TPUs
from Google's [TPU Research Cloud](https://sites.research.google/trc/about/) (TRC) 🌤.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/instadeepai/jumanji/",
    "name": "jumanji",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "reinforcement-learning python jax",
    "author": "InstaDeep",
    "author_email": "clement.bonnet16@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/13/f7/af897bb7918745b9a75cd3376b9da69bd6c625f964f76bccde5c80314141/jumanji-1.0.1.tar.gz",
    "platform": null,
    "description": "<p align=\"center\">\n    <a href=\"docs/img/jumanji_logo.png\">\n        <img src=\"docs/img/jumanji_logo.png\" alt=\"Jumanji logo\" width=\"50%\"/>\n    </a>\n</p>\n\n[![Python Versions](https://img.shields.io/pypi/pyversions/jumanji.svg?style=flat-square)](https://www.python.org/doc/versions/)\n[![PyPI Version](https://badge.fury.io/py/jumanji.svg)](https://badge.fury.io/py/jumanji)\n[![Tests](https://github.com/instadeepai/jumanji/actions/workflows/tests_linters.yml/badge.svg)](https://github.com/instadeepai/jumanji/actions/workflows/tests_linters.yml)\n[![Code Style](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n[![MyPy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)\n[![License](https://img.shields.io/badge/License-Apache%202.0-orange.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97-Hugging%20Face-F8D521)](https://huggingface.co/InstaDeepAI)\n\n[**Environments**](#environments)\n| [**Installation**](#install)\n| [**Quickstart**](#quickstart)\n| [**Training**](#training)\n| [**Citation**](#citing)\n| [**Docs**](https://instadeepai.github.io/jumanji)\n---\n\n<div class=\"collage\">\n  <div class=\"row\" align=\"center\">\n    <img src=\"docs/env_anim/bin_pack.gif\" alt=\"BinPack\" width=\"16%\">\n    <img src=\"docs/env_anim/cleaner.gif\" alt=\"Cleaner\" width=\"16%\">\n    <img src=\"docs/env_anim/connector.gif\" alt=\"Connector\" width=\"16%\">\n    <img src=\"docs/env_anim/cvrp.gif\" alt=\"CVRP\" width=\"16%\">\n    <img src=\"docs/env_anim/flat_pack.gif\" alt=\"FlatPack\" width=\"16%\">\n    <img src=\"docs/env_anim/game_2048.gif\" alt=\"Game2048\" width=\"16%\">\n  </div>\n  <div class=\"row\" align=\"center\">\n    <img src=\"docs/env_anim/graph_coloring.gif\" alt=\"GraphColoring\" width=\"16%\">\n    <img src=\"docs/env_anim/job_shop.gif\" alt=\"JobShop\" width=\"16%\">\n    <img 
src=\"docs/env_anim/knapsack.gif\" alt=\"Knapsack\" width=\"16%\">\n    <img src=\"docs/env_anim/maze.gif\" alt=\"Maze\" width=\"16%\">\n    <img src=\"docs/env_anim/minesweeper.gif\" alt=\"Minesweeper\" width=\"16%\">\n    <img src=\"docs/env_anim/mmst.gif\" alt=\"MMST\" width=\"16%\">\n  </div>\n  <div class=\"row\" align=\"center\">\n    <img src=\"docs/env_anim/multi_cvrp.gif\" alt=\"MultiCVRP\" width=\"16%\">\n    <img src=\"docs/env_anim/pac_man.gif\" alt=\"PacMan\" width=\"16%\">\n    <img src=\"docs/env_anim/robot_warehouse.gif\" alt=\"RobotWarehouse\" width=\"16%\">\n    <img src=\"docs/env_anim/rubiks_cube.gif\" alt=\"RubiksCube\" width=\"16%\">\n    <img src=\"docs/env_anim/sliding_tile_puzzle.gif\" alt=\"SlidingTilePuzzle\" width=\"16%\">\n    <img src=\"docs/env_anim/snake.gif\" alt=\"Snake\" width=\"16%\">\n  </div>\n    <div class=\"row\" align=\"center\">\n    <img src=\"docs/env_anim/sokoban.gif\" alt=\"RobotWarehouse\" width=\"16%\">\n    <img src=\"docs/env_anim/sudoku.gif\" alt=\"Sudoku\" width=\"16%\">\n    <img src=\"docs/env_anim/tetris.gif\" alt=\"Tetris\" width=\"16%\">\n    <img src=\"docs/env_anim/tsp.gif\" alt=\"Tetris\" width=\"16%\">\n  </div>\n</div>\n\n## Jumanji @ ICLR 2024\n\nJumanji has been accepted at [ICLR 2024](https://iclr.cc/), check out our [research paper](https://arxiv.org/abs/2306.09884).\n\n## Welcome to the Jungle! \ud83c\udf34\n\nJumanji is a diverse suite of scalable reinforcement learning environments written in JAX. It now features 22 environments!\n\nJumanji is helping pioneer a new wave of hardware-accelerated research and development in the\nfield of RL. Jumanji's high-speed environments enable faster iteration and large-scale\nexperimentation while simultaneously reducing complexity. Originating in the research team at\n[InstaDeep](https://www.instadeep.com/), Jumanji is now developed jointly with the open-source\ncommunity. 
To join us in these efforts, reach out, raise issues and read our\n[contribution guidelines](https://github.com/instadeepai/jumanji/blob/main/CONTRIBUTING.md) or just\n[star](https://github.com/instadeepai/jumanji) \ud83c\udf1f to stay up to date with the latest developments!\n\n### Goals \ud83d\ude80\n\n1. Provide a simple, well-tested API for JAX-based environments.\n2. Make research in RL more accessible.\n3. Facilitate the research on RL for problems in the industry and help close the gap between\nresearch and industrial applications.\n4. Provide environments whose difficulty can be scaled to be arbitrarily hard.\n\n### Overview \ud83e\udd9c\n\n- \ud83e\udd51 **Environment API**: core abstractions for JAX-based environments.\n- \ud83d\udd79\ufe0f **Environment Suite**: a collection of RL environments ranging from simple games to NP-hard\ncombinatorial problems.\n- \ud83c\udf6c **Wrappers**: easily connect to your favourite RL frameworks and libraries such as\n[Acme](https://github.com/deepmind/acme),\n[Stable Baselines3](https://github.com/DLR-RM/stable-baselines3),\n[RLlib](https://docs.ray.io/en/latest/rllib/index.html), [OpenAI Gym](https://github.com/openai/gym)\nand [DeepMind-Env](https://github.com/deepmind/dm_env) through our `dm_env` and `gym` wrappers.\n- \ud83c\udf93 **Examples**: guides to facilitate Jumanji's adoption and highlight the added value of\nJAX-based environments.\n- \ud83c\udfce\ufe0f **Training:** example agents that can be used as inspiration for the agents one may implement\nin their research.\n\n<h2 name=\"environments\" id=\"environments\">Environments \ud83c\udf0d</h2>\n\nJumanji provides a diverse range of environments ranging from simple games to NP-hard combinatorial\nproblems.\n\n| Environment                              | Category | Registered Version(s)                                | Source                                                                                           | Description                                
                            |\n|------------------------------------------|----------|------------------------------------------------------|--------------------------------------------------------------------------------------------------|------------------------------------------------------------------------|\n| \ud83d\udd22 Game2048                              | Logic  | `Game2048-v1`                                        | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/game_2048/)   | [doc](https://instadeepai.github.io/jumanji/environments/game_2048/)   |\n| \ud83c\udfa8 GraphColoring                              | Logic  | `GraphColoring-v0`                                   | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/graph_coloring/)   | [doc](https://instadeepai.github.io/jumanji/environments/graph_coloring/)   |\n| \ud83d\udca3 Minesweeper                           | Logic    | `Minesweeper-v0`                                     | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/minesweeper/) | [doc](https://instadeepai.github.io/jumanji/environments/minesweeper/) |\n| \ud83c\udfb2 RubiksCube                            | Logic    | `RubiksCube-v0`<br/>`RubiksCube-partly-scrambled-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/rubiks_cube/) | [doc](https://instadeepai.github.io/jumanji/environments/rubiks_cube/) |\n| \ud83d\udd00 SlidingTilePuzzle                       | Logic    | `SlidingTilePuzzle-v0` | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/sliding_tile_puzzle/) | [doc](https://instadeepai.github.io/jumanji/environments/sliding_tile_puzzle/) |\n| \u270f\ufe0f Sudoku                       | Logic    | `Sudoku-v0` <br/>`Sudoku-very-easy-v0`| [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/logic/sudoku/) | 
[doc](https://instadeepai.github.io/jumanji/environments/sudoku/) |\n| \ud83d\udce6 BinPack (3D BinPacking Problem)       | Packing  | `BinPack-v1`                                         | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/bin_pack/)  | [doc](https://instadeepai.github.io/jumanji/environments/bin_pack/)    |\n| \ud83e\udde9 FlatPack (2D Grid Filling Problem) | Packing  | `FlatPack-v0`                                         | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/flat_pack/)  | [doc](https://instadeepai.github.io/jumanji/environments/flat_pack/)    |\n| \ud83c\udfed JobShop (Job Shop Scheduling Problem) | Packing  | `JobShop-v0`                                         | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/job_shop/)  | [doc](https://instadeepai.github.io/jumanji/environments/job_shop/)    |\n| \ud83c\udf92 Knapsack                              | Packing  | `Knapsack-v1`                                        | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/knapsack/)  | [doc](https://instadeepai.github.io/jumanji/environments/knapsack/)    |\n| \u2592 Tetris                              | Packing  | `Tetris-v0`                                        | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/packing/tetris/)  | [doc](https://instadeepai.github.io/jumanji/environments/tetris/)    |\n| \ud83e\uddf9 Cleaner                               | Routing  | `Cleaner-v0`                                         | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/cleaner/)   | [doc](https://instadeepai.github.io/jumanji/environments/cleaner/)     |\n| :link: Connector                         | Routing  | `Connector-v2`                                       | 
[code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/connector/) | [doc](https://instadeepai.github.io/jumanji/environments/connector/)   |\n| \ud83d\ude9a CVRP (Capacitated Vehicle Routing Problem)  | Routing  | `CVRP-v1`                                            | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/cvrp/)      | [doc](https://instadeepai.github.io/jumanji/environments/cvrp/)        |\n| \ud83d\ude9a MultiCVRP (Multi-Agent Capacitated Vehicle Routing Problem)  | Routing  | `MultiCVRP-v0`                                            | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/multi_cvrp/)      | [doc](https://instadeepai.github.io/jumanji/environments/multi_cvrp/)        |\n| :mag: Maze   | Routing  | `Maze-v0`                                            | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/maze/)      | [doc](https://instadeepai.github.io/jumanji/environments/maze/)        |\n| :robot: RobotWarehouse  | Routing  | `RobotWarehouse-v0`                                  | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/robot_warehouse/)      | [doc](https://instadeepai.github.io/jumanji/environments/robot_warehouse/)        |\n| \ud83d\udc0d Snake                                       | Routing  | `Snake-v1`                                           | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/snake/)     | [doc](https://instadeepai.github.io/jumanji/environments/snake/)       |\n| \ud83d\udcec TSP (Travelling Salesman Problem)           | Routing  | `TSP-v1`                                             | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/tsp/)       | [doc](https://instadeepai.github.io/jumanji/environments/tsp/)         |\n| Multi Minimum Spanning Tree Problem | Routing 
 | `MMST-v0`                                | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/mmst)    | [doc](https://instadeepai.github.io/jumanji/environments/mmst/)    |\n| \u15e7\u2022\u2022\u2022\u15e3\u2022\u2022 PacMan   | Routing  | `PacMan-v0`                                            | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/pac_man/)      | [doc](https://instadeepai.github.io/jumanji/environments/pac_man/)        |\n| \ud83d\udc7e Sokoban                                                     | Routing  | `Sokoban-v0`                                         | [code](https://github.com/instadeepai/jumanji/tree/main/jumanji/environments/routing/sokoban/)          | [doc](https://instadeepai.github.io/jumanji/environments/sokoban/)         |\n\n<h2 name=\"install\" id=\"install\">Installation \ud83c\udfac</h2>\n\nYou can install the latest release of Jumanji from PyPI:\n\n```bash\npip install -U jumanji\n```\n\nAlternatively, you can install the latest development version directly from GitHub:\n\n```bash\npip install git+https://github.com/instadeepai/jumanji.git\n```\n\nJumanji has been tested on Python 3.8 and 3.9.\nNote that because the installation of JAX differs depending on your hardware accelerator,\nwe advise users to explicitly install the correct JAX version (see the\n[official installation guide](https://github.com/google/jax#installation)).\n\n**Rendering:** Matplotlib is used for rendering all the environments. To visualize the environments\nyou will need a GUI backend. For example, on Linux, you can install Tk via:\n`apt-get install python3-tk`, or using conda: `conda install tk`. 
Check out\n[Matplotlib backends](https://matplotlib.org/stable/users/explain/backends.html) for a list of\nbackends you can use.\n\n<h2 name=\"quickstart\" id=\"quickstart\">Quickstart \u26a1</h2>\n\nRL practitioners will find Jumanji's interface familiar as it combines the widely adopted\n[OpenAI Gym](https://github.com/openai/gym) and\n[DeepMind Environment](https://github.com/deepmind/dm_env) interfaces. From OpenAI Gym, we adopted\nthe idea of a `registry` and the `render` method, while our `TimeStep` structure is inspired by\nDeepMind Environment.\n\n### Basic Usage \ud83e\uddd1\u200d\ud83d\udcbb\n\n```python\nimport jax\nimport jumanji\n\n# Instantiate a Jumanji environment using the registry\nenv = jumanji.make('Snake-v1')\n\n# Reset your (jit-able) environment\nkey = jax.random.PRNGKey(0)\nstate, timestep = jax.jit(env.reset)(key)\n\n# (Optional) Render the env state\nenv.render(state)\n\n# Interact with the (jit-able) environment\naction = env.action_spec.generate_value()          # Action selection (dummy value here)\nstate, timestep = jax.jit(env.step)(state, action)   # Take a step and observe the next state and time step\n```\n\n- `state` represents the internal state of the environment: it contains all the information required\nto take a step when executing an action. This should **not** be confused with the `observation`\ncontained in the `timestep`, which is the information perceived by the agent.\n- `timestep` is a dataclass containing `step_type`, `reward`, `discount`, `observation` and\n`extras`. 
This structure is similar to\n[`dm_env.TimeStep`](https://github.com/deepmind/dm_env/blob/master/docs/index.md) except for the\n`extras` field that was added to allow users to log environment metrics that are neither part of\nthe agent's observation nor part of the environment's internal state.\n\n### Advanced Usage \ud83e\uddd1\u200d\ud83d\udd2c\n\nBeing written in JAX, Jumanji's environments benefit from many of its features, including\nautomatic vectorization/parallelization (`jax.vmap`, `jax.pmap`) and JIT-compilation (`jax.jit`),\nwhich can be composed arbitrarily.\nWe provide an example of more advanced usage in the\n[advanced usage guide](https://instadeepai.github.io/jumanji/guides/advanced_usage/).\n\n### Registry and Versioning \ud83d\udcd6\n\nLike OpenAI Gym, Jumanji keeps strict versioning of its environments for reproducibility.\nWe maintain a registry of standard environments with their configuration.\nFor each environment, a version suffix is appended, e.g. `Snake-v1`.\nWhen changes are made to environments that might impact learning results,\nthe version number is incremented by one to prevent potential confusion.\nFor a full list of registered versions of each environment, check out\n[the documentation](https://instadeepai.github.io/jumanji/environments/tsp/).\n\n<h2 name=\"training\" id=\"training\">Training \ud83c\udfce\ufe0f</h2>\n\nTo showcase how to train RL agents on Jumanji environments, we provide a random agent and a vanilla\nactor-critic (A2C) agent. These agents can be found in\n[jumanji/training/](https://github.com/instadeepai/jumanji/tree/main/jumanji/training/).\n\nBecause the environment framework in Jumanji is so flexible, it allows pretty much any problem to\nbe implemented as a Jumanji environment, giving rise to very diverse observations. 
For this reason,\nenvironment-specific networks are required to capture the symmetries of each environment.\nAlongside the A2C agent implementation, we provide examples of such environment-specific\nactor-critic networks in\n[jumanji/training/networks](https://github.com/instadeepai/jumanji/tree/main/jumanji/training/networks/).\n\n> \u26a0\ufe0f The example agents in `jumanji/training` are **only** meant to serve as inspiration for how one\n> can implement an agent. Jumanji is first and foremost a library of environments - as such, the\n> agents and networks will **not** be maintained to a production standard.\n\nFor more information on how to use the example agents, see the\n[training guide](https://instadeepai.github.io/jumanji/guides/training/).\n\n## Contributing \ud83e\udd1d\n\nContributions are welcome! See our issue tracker for\n[good first issues](https://github.com/instadeepai/jumanji/labels/good%20first%20issue). Please read\nour [contributing guidelines](https://github.com/instadeepai/jumanji/blob/main/CONTRIBUTING.md) for\ndetails on how to submit pull requests, our Contributor License Agreement, and community guidelines.\n\n<h2 name=\"citing\" id=\"citing\">Citing Jumanji \u270f\ufe0f</h2>\n\nIf you use Jumanji in your work, please cite the library using:\n\n```\n@misc{bonnet2024jumanji,\n    title={Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX},\n    author={Cl\u00e9ment Bonnet and Daniel Luo and Donal Byrne and Shikha Surana and Sasha Abramowitz and Paul Duckworth and Vincent Coyette and Laurence I. Midgley and Elshadai Tegegn and Tristan Kalloniatis and Omayma Mahjoub and Matthew Macfarlane and Andries P. Smit and Nathan Grinsztajn and Raphael Boige and Cemlyn N. Waters and Mohamed A. Mimouni and Ulrich A. 
Mbou Sob and Ruan de Kock and Siddarth Singh and Daniel Furelos-Blanco and Victor Le and Arnu Pretorius and Alexandre Laterre},\n    year={2024},\n    eprint={2306.09884},\n    url={https://arxiv.org/abs/2306.09884},\n    archivePrefix={arXiv},\n    primaryClass={cs.LG}\n}\n```\n\n## See Also \ud83d\udd0e\n\nOther works have embraced the approach of writing RL environments in JAX.\nIn particular, we suggest users check out the following sister repositories:\n\n- \ud83e\udd16 [QDax](https://github.com/adaptive-intelligent-robotics/QDax) is a library to accelerate\nQuality-Diversity and neuro-evolution algorithms through hardware accelerators and parallelization.\n- \ud83c\udf33 [EvoJAX](https://github.com/google/evojax) provides tools to enable neuroevolution algorithms\nto work with neural networks running across multiple TPU/GPUs.\n- \ud83e\uddbe [Brax](https://github.com/google/brax) is a differentiable physics engine that simulates\nenvironments made up of rigid bodies, joints, and actuators.\n- \ud83c\udfcb\ufe0f [Gymnax](https://github.com/RobertTLange/gymnax) implements classic environments including\nclassic control, bsuite, MinAtar and a collection of meta RL tasks.\n- \ud83c\udfb2 [Pgx](https://github.com/sotetsuk/pgx) provides classic board game environments like\nBackgammon, Shogi, and Go.\n\n## Acknowledgements \ud83d\ude4f\n\nThe development of this library was supported with Cloud TPUs\nfrom Google's [TPU Research Cloud](https://sites.research.google/trc/about/) (TRC) \ud83c\udf24.\n",
    "bugtrack_url": null,
    "license": "Apache 2.0",
    "summary": "A diverse suite of scalable reinforcement learning environments in JAX",
    "version": "1.0.1",
    "project_urls": {
        "Homepage": "https://github.com/instadeepai/jumanji/"
    },
    "split_keywords": [
        "reinforcement-learning",
        "python",
        "jax"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "aa17c8fcdcbdf631092cd86e375c18bb199d08165eed8e3ed2c9ea947e7398c9",
                "md5": "829b93d3d0c512da599c3a8f721e14cf",
                "sha256": "807b9aa1c98ab315944efc4b377d8c3812a92bc2318be8ece644594b927390ab"
            },
            "downloads": -1,
            "filename": "jumanji-1.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "829b93d3d0c512da599c3a8f721e14cf",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 829159,
            "upload_time": "2024-03-29T11:24:17",
            "upload_time_iso_8601": "2024-03-29T11:24:17.124752Z",
            "url": "https://files.pythonhosted.org/packages/aa/17/c8fcdcbdf631092cd86e375c18bb199d08165eed8e3ed2c9ea947e7398c9/jumanji-1.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "13f7af897bb7918745b9a75cd3376b9da69bd6c625f964f76bccde5c80314141",
                "md5": "8626a4d74121a2ab7d2e971885fb38fa",
                "sha256": "cdbc0245deb2f72cfdaf3719793a484d26ac25b06f2bfaa0e8323a1d79fef8d2"
            },
            "downloads": -1,
            "filename": "jumanji-1.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "8626a4d74121a2ab7d2e971885fb38fa",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 629464,
            "upload_time": "2024-03-29T11:24:18",
            "upload_time_iso_8601": "2024-03-29T11:24:18.634143Z",
            "url": "https://files.pythonhosted.org/packages/13/f7/af897bb7918745b9a75cd3376b9da69bd6c625f964f76bccde5c80314141/jumanji-1.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-29 11:24:18",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "instadeepai",
    "github_project": "jumanji",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "jumanji"
}
        