gymcts

- Name: gymcts
- Version: 1.4.5
- Summary: A minimalistic implementation of the Monte Carlo Tree Search algorithm for planning problems formulated as Gymnasium reinforcement learning environments.
- Upload time: 2025-07-17 11:34:57
- Author: Alexander Nasuta
- Requires Python: >=3.11
- License: MIT License, Copyright (c) 2025 Alexander Nasuta
- Requirements: cloudpickle, contourpy, cycler, farama-notifications, fonttools, gymnasium, kiwisolver, markdown-it-py, matplotlib, mdurl, numpy, packaging, pillow, pygments, pyparsing, python-dateutil, rich, six, typing-extensions
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.15283390.svg)](https://doi.org/10.5281/zenodo.15283390)
[![Python Badge](https://img.shields.io/badge/Python-3776AB?logo=python&logoColor=fff&style=flat)](https://www.python.org/downloads/)
[![PyPI version](https://img.shields.io/pypi/v/gymcts)](https://pypi.org/project/gymcts/)
[![License](https://img.shields.io/pypi/l/gymcts)](https://github.com/Alexander-Nasuta/gymcts/blob/master/LICENSE)
[![Documentation Status](https://readthedocs.org/projects/gymcts/badge/?version=latest)](https://gymcts.readthedocs.io/en/latest/?badge=latest)

# GYMCTS

A Monte Carlo Tree Search Implementation for Gymnasium-style Environments.

- Github: [GYMCTS on Github](https://github.com/Alexander-Nasuta/gymcts)
- GitLab: [GYMCTS on GitLab](https://git-ce.rwth-aachen.de/alexander.nasuta/gymcts)
- Pypi: [GYMCTS on PyPi](https://pypi.org/project/gymcts/)
- Documentation: [GYMCTS Docs](https://gymcts.readthedocs.io/en/latest/)

## Description

This project provides a Monte Carlo Tree Search (MCTS) implementation for Gymnasium-style environments as an installable Python package.
The package is designed to be used with the Gymnasium interface.
It is especially useful for combinatorial optimization problems or planning problems, such as the Job Shop Scheduling Problem (JSP).
The documentation provides numerous examples of how to use the package with different environments, with a focus on scheduling problems.

A minimal working example is provided in the [Quickstart](#quickstart) section.

It comes with a variety of visualisation options, which are useful for research and debugging purposes.
It aims to serve as a basis for further research and development on neural-guided search algorithms.

## Quickstart
To use the package, install it via pip:

```shell
pip install gymcts
```
The usage of an MCTS agent can roughly be organised into the following steps:

- Create a Gymnasium-style environment
- Wrap the environment with a GYMCTS wrapper
- Create an MCTS agent
- Solve the environment with the MCTS agent
- Render the solution

The GYMCTS package provides two types of wrappers for Gymnasium-style environments:
- `DeepCopyMCTSGymEnvWrapper`: a wrapper that uses deep copies of the environment to save a snapshot of the environment state for each node in the MCTS tree.
- `ActionHistoryMCTSGymEnvWrapper`: a wrapper that saves the action sequence that led to the current state in the MCTS node.

These wrappers can be used with the `GymctsAgent` to solve the environment.
The wrappers implement the methods that the `GymctsAgent` requires to interact with the environment.
GYMCTS is designed to use a single environment instance and to reconstruct the environment state from a state snapshot when needed.
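
To give an idea of what these wrappers provide, the sketch below outlines the interface a custom wrapper exposes to the agent. It is based on the `GymctsABC` methods used in the Job Shop Scheduling example further down; the class name is hypothetical and the exact signatures in your installed version may differ slightly.

```python
from typing import Any

import gymnasium as gym

from gymcts.gymcts_env_abc import GymctsABC


class MyCustomGymctsWrapper(GymctsABC, gym.Wrapper):
    """Hypothetical skeleton: the MCTS agent only interacts with these methods."""

    def get_state(self) -> Any:
        # return a snapshot of the current environment state
        # (e.g. a deep copy of the env or the action history so far)
        raise NotImplementedError

    def load_state(self, state: Any) -> None:
        # restore the environment to a previously snapshotted state
        raise NotImplementedError

    def is_terminal(self) -> bool:
        # report whether the current state is terminal
        raise NotImplementedError

    def get_valid_actions(self) -> list[int]:
        # list the actions that are valid in the current state
        raise NotImplementedError

    def rollout(self) -> float:
        # play out the episode (e.g. with random valid actions) and return the episode return
        raise NotImplementedError
```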

NOTE: MCTS works best when the return of an episode is in the range [-1, 1]. Please adjust the reward function of the environment accordingly (or change the ubc-scaling parameter of the MCTS agent).
Adjusting the reward function of the environment is easily done with a [NormalizeReward](https://gymnasium.farama.org/api/wrappers/reward_wrappers/#gymnasium.wrappers.NormalizeReward) or [TransformReward](https://gymnasium.farama.org/api/wrappers/reward_wrappers/#gymnasium.wrappers.TransformReward) wrapper:
```python
from gymnasium.wrappers import NormalizeReward

env = NormalizeReward(env, gamma=0.99, epsilon=1e-8)
```

```python
from gymnasium.wrappers import TransformReward

# n_steps_per_episode: placeholder for the (maximum) number of steps per episode of your environment
env = TransformReward(env, lambda r: r / n_steps_per_episode)
```
### FrozenLake Example (DeepCopyMCTSGymEnvWrapper)

A minimal example of how to use the package with the FrozenLake environment and the `DeepCopyMCTSGymEnvWrapper` is provided in the code snippet below.
The `DeepCopyMCTSGymEnvWrapper` can be used with non-deterministic environments, such as the FrozenLake environment with slippery ice.

```python
import gymnasium as gym

from gymcts.gymcts_agent import GymctsAgent
from gymcts.gymcts_deepcopy_wrapper import DeepCopyMCTSGymEnvWrapper

from gymcts.logger import log

# set log level to 20 (INFO) 
# set log level to 10 (DEBUG) to see more detailed information
log.setLevel(20)

if __name__ == '__main__':
    # 0. create the environment
    env = gym.make('FrozenLake-v1', desc=None, map_name="4x4", is_slippery=True, render_mode="ansi")
    env.reset()

    # 1. wrap the environment with the deep copy wrapper or a custom gymcts wrapper
    env = DeepCopyMCTSGymEnvWrapper(env)

    # 2. create the agent
    agent = GymctsAgent(
        env=env,
        clear_mcts_tree_after_step=False,
        render_tree_after_step=True,
        number_of_simulations_per_step=50,
        exclude_unvisited_nodes_from_render=True
    )

    # 3. solve the environment
    actions = agent.solve()

    # 4. render the environment solution in the terminal
    print(env.render())
    for a in actions:
        obs, rew, term, trun, info = env.step(a)
        print(env.render())

    # 5. print the solution
    # read the solution from the info provided by the RecordEpisodeStatistics wrapper 
    # (that DeepCopyMCTSGymEnvWrapper uses internally)
    episode_length = info["episode"]["l"]
    episode_return = info["episode"]["r"]

    if episode_return == 1.0:
        print(f"Environment solved in {episode_length} steps.")
    else:
        print(f"Environment not solved in {episode_length} steps.")
```

### FrozenLake Example (ActionHistoryMCTSGymEnvWrapper)

A minimal example of how to use the package with the FrozenLake environment and the `ActionHistoryMCTSGymEnvWrapper` is provided in the code snippet below.
The `ActionHistoryMCTSGymEnvWrapper` can be used with deterministic environments, such as the FrozenLake environment without slippery ice.

The `ActionHistoryMCTSGymEnvWrapper` saves the action sequence that led to the current state in the MCTS node.

```python
import gymnasium as gym

from gymcts.gymcts_agent import GymctsAgent
from gymcts.gymcts_action_history_wrapper import ActionHistoryMCTSGymEnvWrapper

from gymcts.logger import log

# set log level to 20 (INFO)
# set log level to 10 (DEBUG) to see more detailed information
log.setLevel(20)

if __name__ == '__main__':
    # 0. create the environment
    env = gym.make('FrozenLake-v1', desc=None, map_name="4x4", is_slippery=False, render_mode="ansi")
    env.reset()

    # 1. wrap the environment with the wrapper
    env = ActionHistoryMCTSGymEnvWrapper(env)

    # 2. create the agent
    agent = GymctsAgent(
        env=env,
        clear_mcts_tree_after_step=False,
        render_tree_after_step=True,
        number_of_simulations_per_step=50,
        exclude_unvisited_nodes_from_render=True
    )

    # 3. solve the environment
    actions = agent.solve()

    # 4. render the environment solution in the terminal
    print(env.render())
    for a in actions:
        obs, rew, term, trun, info = env.step(a)
        print(env.render())

    # 5. print the solution
    # read the solution from the info provided by the RecordEpisodeStatistics wrapper
    # (that ActionHistoryMCTSGymEnvWrapper uses internally)
    episode_length = info["episode"]["l"]
    episode_return = info["episode"]["r"]

    if episode_return == 1.0:
        print(f"Environment solved in {episode_length} steps.")
    else:
        print(f"Environment not solved in {episode_length} steps.")
```


### FrozenLake Video Example

![FrozenLake Video as .gif](./resources/frozenlake_4x4-episode-0-video-to-gif-converted.gif)

To create a video of the solution of the FrozenLake environment, you can use the following code snippet:

```python  
import gymnasium as gym

from gymcts.gymcts_agent import GymctsAgent
from gymcts.gymcts_deepcopy_wrapper import DeepCopyMCTSGymEnvWrapper

from gymcts.logger import log

log.setLevel(20)

if __name__ == '__main__':
    log.debug("Starting example")

    # 0. create the environment
    env = gym.make('FrozenLake-v1', desc=None, map_name="4x4", is_slippery=False, render_mode="rgb_array")
    env.reset()

    # 1. wrap the environment with the deep copy wrapper or a custom gymcts wrapper
    env = DeepCopyMCTSGymEnvWrapper(env)

    # 2. create the agent
    agent = GymctsAgent(
        env=env,
        clear_mcts_tree_after_step=False,
        render_tree_after_step=True,
        number_of_simulations_per_step=200,
        exclude_unvisited_nodes_from_render=True
    )

    # 3. solve the environment
    actions = agent.solve()

    # 4. render the environment solution
    env = gym.wrappers.RecordVideo(
        env,
        video_folder="./videos",
        episode_trigger=lambda episode_id: True,
        name_prefix="frozenlake_4x4"
    )
    env.reset()

    for a in actions:
        obs, rew, term, trun, info = env.step(a)
    env.close()

    # 5. print the solution
    # read the solution from the info provided by the RecordEpisodeStatistics wrapper (that DeepCopyMCTSGymEnvWrapper wraps internally)
    episode_length = info["episode"]["l"]
    episode_return = info["episode"]["r"]

    if episode_return == 1.0:
        print(f"Environment solved in {episode_length} steps.")
    else:
        print(f"Environment not solved in {episode_length} steps.")
```

### Job Shop Scheduling (CustomWrapper)

![](https://github.com/Alexander-Nasuta/GraphMatrixJobShopEnv/raw/master/resources/default-render.gif)

The following code snippet shows how to use the package with the [graph-jsp-env](https://github.com/Alexander-Nasuta/graph-jsp-env) environment.

First, install the environment via pip:

```shell
pip install graph-jsp-env
```

and a utility package for JSP instances:

```shell
pip install jsp-instance-utils
```

Then, you can use the following code snippet to solve the environment with the MCTS agent:

```python
from typing import Any

import random

import gymnasium as gym

from graph_jsp_env.disjunctive_graph_jsp_env import DisjunctiveGraphJspEnv
from jsp_instance_utils.instances import ft06, ft06_makespan

from gymcts.gymcts_agent import GymctsAgent
from gymcts.gymcts_env_abc import GymctsABC

from gymcts.logger import log


class GraphJspGYMCTSWrapper(GymctsABC, gym.Wrapper):

    def __init__(self, env: DisjunctiveGraphJspEnv):
        gym.Wrapper.__init__(self, env)

    def load_state(self, state: Any) -> None:
        self.env.reset()
        for action in state:
            self.env.step(action)

    def is_terminal(self) -> bool:
        return self.env.unwrapped.is_terminal()

    def get_valid_actions(self) -> list[int]:
        return list(self.env.unwrapped.valid_actions())

    def rollout(self) -> float:
        terminal = self.is_terminal()

        if terminal:
            # the schedule is already complete: score it by the negated makespan scaled
            # by the 'scaling_divisor', shifted into a positive range
            lower_bound = self.env.unwrapped.reward_function_parameters['scaling_divisor']
            return - self.env.unwrapped.get_makespan() / lower_bound + 2

        # otherwise play random valid actions until the schedule is complete
        reward = 0
        while not terminal:
            action = random.choice(self.get_valid_actions())
            obs, reward, terminal, truncated, _ = self.env.step(action)

        # shift the final reward into a positive range (mirroring the terminal case above)
        return reward + 2

    def get_state(self) -> Any:
        return self.env.unwrapped.get_action_history()


if __name__ == '__main__':
    log.setLevel(20)

    env_kwargs = {
        "jps_instance": ft06,
        "default_visualisations": ["gantt_console", "graph_console"],
        "reward_function_parameters": {
            "scaling_divisor": ft06_makespan
        },
        "reward_function": "nasuta",
    }

    env = DisjunctiveGraphJspEnv(**env_kwargs)
    env.reset()

    env = GraphJspGYMCTSWrapper(env)

    agent = GymctsAgent(
        env=env,
        clear_mcts_tree_after_step=True,
        render_tree_after_step=True,
        exclude_unvisited_nodes_from_render=True,
        number_of_simulations_per_step=50,
    )

    root = agent.search_root_node.get_root()

    actions = agent.solve(render_tree_after_step=True)
    for a in actions:
        obs, rew, term, trun, info = env.step(a)

    env.render()
    makespan = env.unwrapped.get_makespan()
    print(f"makespan: {makespan}")

```

## Visualizations

The MCTS agent provides a visualisation of the MCTS tree.
Below is an example code snippet that shows how to use the visualisation options of the MCTS agent.

The following metrics are displayed in the visualisation:
- `N`: the number of visits of the node
- `Q_v`: the average return of the node
- `ubc`: the upper confidence bound of the node
- `a`: the action that leads to the node
- `best`: the highest return of any rollout from the node

`Q_v` and `ubc` have a color gradient from red to green, where red indicates a low value and green indicates a high value.
The color gradient is based on the minimum and maximum values of the respective metric in the tree.
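
For reference, the `ubc` value is an upper-confidence-bound score in the spirit of UCT. A common formulation (the exact constant and scaling used by GYMCTS may differ; see the ubc-scaling parameter mentioned in the Quickstart) is:

$$
\mathrm{UCB}(v) = Q_v + c \cdot \sqrt{\frac{\ln N_{\mathrm{parent}(v)}}{N_v}}
$$

where $N_v$ is the visit count of node $v$, $N_{\mathrm{parent}(v)}$ the visit count of its parent, $Q_v$ its average return, and $c$ the exploration constant.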

The visualisation is rendered in the terminal and can be limited to a certain depth of the tree.
The default depth is 2.

```python
import gymnasium as gym

from gymcts.gymcts_agent import GymctsAgent
from gymcts.gymcts_action_history_wrapper import ActionHistoryMCTSGymEnvWrapper

from gymcts.logger import log

# set log level to 20 (INFO)
# set log level to 10 (DEBUG) to see more detailed information
log.setLevel(20)

if __name__ == '__main__':
    # create the environment
    env = gym.make('FrozenLake-v1', desc=None, map_name="4x4", is_slippery=False, render_mode="ansi")
    env.reset()

    # wrap the environment with the wrapper or a custom gymcts wrapper
    env = ActionHistoryMCTSGymEnvWrapper(env)

    # create the agent
    agent = GymctsAgent(
        env=env,
        clear_mcts_tree_after_step=False,
        render_tree_after_step=False,
        number_of_simulations_per_step=50,
        exclude_unvisited_nodes_from_render=True,  # whether to exclude unvisited nodes from the render
        render_tree_max_depth=2  # the maximum depth of the tree to render
    )

    # solve the environment
    actions = agent.solve()

    # render the MCTS tree from the root
    # search_root_node is the node that corresponds to the current state of the environment in the search process
    # since we called agent.solve() we are at the end of the search process
    log.info(f"MCTS Tree starting at the final state of the environment (actions: {agent.search_root_node.state})")
    agent.show_mcts_tree(
        start_node=agent.search_root_node,
    )

    # the parent of the terminal node (which we are rendering below) is the search root node of the previous step in the
    # MCTS solving process
    log.info(
        f"MCTS Tree starting at the pre-final state of the environment (actions: {agent.search_root_node.parent.state})")
    agent.show_mcts_tree(
        start_node=agent.search_root_node.parent,
    )

    # render the MCTS tree from the root
    log.info(f"MCTS Tree starting at the root state (actions: {agent.search_root_node.get_root().state})")
    agent.show_mcts_tree(
        start_node=agent.search_root_node.get_root(),
        # you can limit the depth of the tree to render to any number
        tree_max_depth=1
    )
```

![visualisation example on the FrozenLake environment](./resources/mcts_visualisation.png)


## State of the Project

This project is complementary material for a research paper. It will not be frequently updated.
Minor updates might occur.
Significant further development will most likely result in a new project. In that case, a note with a link will be added in the `README.md` of this project.  

## Dependencies

This project specifies multiple requirements files.
`requirements.txt` contains the dependencies needed for the package to work. These requirements are installed automatically when installing the package via `pip`.
`requirements_dev.txt` contains the dependencies for development purposes. It includes the dependencies for testing, linting, and building the project on top of the dependencies in `requirements.txt`.
`requirements_examples.txt` contains the dependencies for running the examples inside the project. It includes the dependencies in `requirements.txt` and additional dependencies for the examples.
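
For example, to run the bundled examples from a clone of the repository, the example dependencies can be installed like this:

```shell
pip install -r requirements_examples.txt
```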

In this project, the dependencies are specified in the `pyproject.toml` file with as few version constraints as possible.
The tool `pip-compile` translates the `pyproject.toml` file into a `requirements.txt` file with pinned versions.
That way, version conflicts can be avoided (as much as possible) and the project can be built in a reproducible way.
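
A typical invocation looks like the following; the `--extra` name is a placeholder and depends on how the optional dependencies are declared in `pyproject.toml`:

```shell
# pin the runtime dependencies declared in pyproject.toml
pip-compile pyproject.toml -o requirements.txt

# pin the development dependencies on top (assuming a 'dev' extra is declared)
pip-compile --extra dev pyproject.toml -o requirements_dev.txt
```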

## Development Setup

If you want to check out the code and implement new features or fix bugs, you can set up the project as follows:

### Clone the Repository

Clone the repository in your favorite code editor (for example PyCharm, VSCode, Neovim, etc.).

Using HTTPS:
```shell
git clone https://github.com/Alexander-Nasuta/gymcts.git
```
or by using the GitHub CLI:
```shell
gh repo clone Alexander-Nasuta/gymcts
```

If you are using PyCharm, I recommend doing the following additional steps:

- mark the `src` folder as source root (by right-clicking on the folder and selecting `Mark Directory as` -> `Sources Root`)
- mark the `tests` folder as test root (by right-clicking on the folder and selecting `Mark Directory as` -> `Test Sources Root`)
- mark the `resources` folder as resources root (by right-clicking on the folder and selecting `Mark Directory as` -> `Resources Root`)


### Create a Virtual Environment (optional)

Most developers use a virtual environment to manage the dependencies of their projects.
I personally use `conda` for this purpose.

When using `conda`, you can create a new environment named `gymcts` with the following command:

```shell
conda create -n gymcts python=3.11
```

Feel free to use any other name for the environment or a more recent version of Python.
Activate the environment with the following command:

```shell
conda activate gymcts
```

Replace `gymcts` with the name of your environment, if you used a different name.

You can also use `venv` or `virtualenv` to create a virtual environment. In that case please refer to the respective documentation.
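
For reference, a roughly equivalent setup with `venv` (assuming Python 3.11 is available as `python3.11`) looks like this:

```shell
# create and activate a virtual environment in the project directory
python3.11 -m venv .venv
source .venv/bin/activate
```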

### Install the Dependencies

To install the dependencies for development purposes, run the following command:

```shell
pip install -r requirements_dev.txt
pip install tox
```

The testing package `tox` is not included in the `requirements_dev.txt` file, because it sometimes causes issues when
used with GitHub Actions.
GitHub Actions uses its own tox setup (namely `tox-gh-actions`), which can cause conflicts with the tox environment on your local machine.

Reference: [Automated Testing in Python with pytest, tox, and GitHub Actions](https://www.youtube.com/watch?v=DhUpxWjOhME).

### Install the Project in Editable Mode

To install the project in editable mode, run the following command:

```shell
pip install -e .
```

This will install the project in editable mode, so you can make changes to the code and test them immediately.

### Run the Tests

This project uses `pytest` for testing. To run the tests, run the following command:

```shell
pytest
```

For testing with `tox` run the following command:

```shell
tox
```

### Building and Publishing the Project to PyPI

In order to publish the project to PyPI, it needs to be built and then uploaded to PyPI.

To build the project, run the following command:

```shell
python -m build
```

It is considered good practice to use the tool `twine` for checking the build and uploading the project to PyPI.
By default, the build command creates a `dist` folder with the built project files.
To check all the files in the `dist` folder, run the following command:

```shell
twine check dist/**
```

If the check is successful, you can upload the project to PyPI with the following command:

```shell
twine upload dist/**
```

### Documentation
This project uses `sphinx` for generating the documentation.
It also uses a number of Sphinx extensions to make the documentation more readable and interactive.
For example, the extension `myst-parser` is used to enable Markdown support in the documentation (instead of the usual .rst files).
It also uses the `sphinx-autobuild` extension to automatically rebuild the documentation when changes are made.
Running the following command (from the root directory of the project) builds and serves the documentation and rebuilds it automatically whenever changes are made:

```shell
sphinx-autobuild ./docs/source/ ./docs/build/html/
```

This project features most of the extensions featured in this Tutorial: [Document Your Scientific Project With Markdown, Sphinx, and Read the Docs | PyData Global 2021](https://www.youtube.com/watch?v=qRSb299awB0).


## Contact

If you have any questions or feedback, feel free to contact me via [email](mailto:alexander.nasuta@wzl-iqs.rwth-aachen.de) or open an issue on the repository.

            
