# DeepCubeAI
This repository contains the code and materials for the paper [Learning Discrete World Models for Heuristic Search](https://rlj.cs.umass.edu/2024/papers/Paper225.html).
> [!NOTE]
> This README file is currently being updated and will include more details soon.
<br>
<div align="center">
<img src="https://raw.githubusercontent.com/misaghsoltani/DeepCubeAI/master/images/dcai_rubiks_cube.gif" width="256" height="128" style="margin: 10px;">
<img src="https://raw.githubusercontent.com/misaghsoltani/DeepCubeAI/master/images/dcai_sokoban.gif" width="128" height="128" style="margin: 10px;">
<img src="https://raw.githubusercontent.com/misaghsoltani/DeepCubeAI/master/images/dcai_iceslider.gif" width="128" height="128" style="margin: 10px;">
<img src="https://raw.githubusercontent.com/misaghsoltani/DeepCubeAI/master/images/dcai_digitjump.gif" width="128" height="128" style="margin: 10px;">
</div>
## About DeepCubeAI
DeepCubeAI is an algorithm that learns a discrete world model and employs deep reinforcement learning to learn a heuristic function that generalizes over start and goal states. The learned model and heuristic function are then combined with heuristic search, such as Q* search, to solve sequential decision-making problems. For more details, read the [paper](https://rlj.cs.umass.edu/2024/papers/Paper225.html).
### Key Contributions
DeepCubeAI comprises three key components:
1. **Discrete World Model**
- Learns a world model that represents states in a discrete latent space.
   - This discrete representation tackles two challenges that arise when planning with learned models: model degradation (accumulating prediction error) and state re-identification.
   - Because latent states are binary, any prediction error smaller than 0.5 is corrected by rounding.
   - States are re-identified by exact comparison of their binary vectors.
<div align="center">
<img src="https://raw.githubusercontent.com/misaghsoltani/DeepCubeAI/master/images/dcai_discrete_world_model.png" width="450" height="450" style="margin: 10px;">
</div>
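The two mechanisms above can be illustrated with a minimal sketch (hypothetical function names; the actual implementation lives in this repository's model code). Because the latent space is binary, rounding eliminates any prediction error below 0.5, and exact vector comparison suffices to re-identify states:

```python
def round_to_discrete(pred):
    """Snap continuous predictions onto the binary latent space.
    Any prediction error smaller than 0.5 is removed by rounding."""
    return [1 if p >= 0.5 else 0 for p in pred]

def same_state(a, b):
    """Re-identify two states by exact comparison of their binary vectors."""
    return a == b

# A noisy prediction of the true latent state [1, 0, 1, 1]:
noisy = [0.93, 0.12, 0.71, 0.58]
latent = round_to_discrete(noisy)  # -> [1, 0, 1, 1], the error is gone
```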
2. **Generalizable Heuristic Function**
- Utilizes Deep Q-Network (DQN) and hindsight experience replay (HER) to learn a heuristic function that generalizes over start and goal states.
3. **Optimized Search**
- Integrates the learned model and the learned heuristic function with heuristic search to solve problems. It uses [Q* search](https://prl-theworkshop.github.io/prl2024-icaps/papers/9.pdf), a variant of A* search optimized for DQNs, which enables faster and more memory-efficient planning.
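The hindsight-relabeling idea behind the generalizable heuristic function (component 2 above) can be sketched as follows. This is an illustration of the general HER technique with hypothetical names, not the repository's API: transitions are duplicated with goals drawn from states actually reached later in the same trajectory, giving dense training signal for a goal-conditioned Q-function.

```python
import random

def her_relabel(trajectory, num_samples=4):
    """Hindsight experience replay: for each (state, action, next_state)
    transition, emit copies whose goal is a latent state actually reached
    later in the same trajectory, with a unit cost per step."""
    relabeled = []
    for t, (state, action, next_state) in enumerate(trajectory):
        for _ in range(num_samples):
            future = random.randrange(t, len(trajectory))
            goal = trajectory[future][2]               # a future next_state
            cost = 0.0 if next_state == goal else 1.0  # zero cost at the goal
            relabeled.append((state, action, goal, cost, next_state))
    return relabeled
```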
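A simplified sketch of the Q*-style search (component 3 above) is shown below. This illustrates the core idea only, under assumed interfaces (`q_values` returning one cost-to-go estimate per action, `step` applying the learned world model); it is not the repository's implementation. The efficiency gain is that one Q-network evaluation prices every action of a node, so the model is applied only to actions actually popped from the priority queue.

```python
import heapq

def q_star_search(start, is_goal, step, q_values, weight=1.0, max_pops=100_000):
    """Q*-style best-first search over a learned model.
    `q_values(state)` -> list of estimated costs-to-go, one per action.
    `step(state, action)` -> successor state (the world model)."""
    if is_goal(start):
        return []
    parents = {start: (None, None)}  # child -> (parent state, action taken)
    frontier = [(1.0 + weight * q, 0.0, start, a)
                for a, q in enumerate(q_values(start))]
    heapq.heapify(frontier)
    for _ in range(max_pops):
        if not frontier:
            return None
        _, g, state, action = heapq.heappop(frontier)
        child = step(state, action)  # model is applied only here, on pop
        if child in parents:
            continue
        parents[child] = (state, action)
        if is_goal(child):
            path = []  # walk parent pointers back to the start state
            while parents[child][0] is not None:
                child, action = parents[child]
                path.append(action)
            return path[::-1]
        # One network call prices all of this child's actions at once.
        for a, q in enumerate(q_values(child)):
            heapq.heappush(frontier, (g + 1.0 + weight * q, g + 1.0, child, a))
    return None
```

On a toy number line with actions `+1`/`-1` and an exact heuristic, the search recovers the shortest action sequence to the goal.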
### Main Results
* Accurate reconstruction of ground truth images after thousands of timesteps.
* Achieved 100% success on Rubik's Cube (canonical goal), Sokoban, IceSlider, and DigitJump.
* 99.9% success on Rubik's Cube with reversed start/goal states.
* Demonstrated significant improvement in solving complex planning problems and generalizing to unseen goals.
## Running the Code
To run the code, please follow these steps:
1. Create a Conda environment:
- **For macOS:** Create an environment with dependencies specified in `environment_macos.yaml` using the following command:
```bash
conda env create --file environment_macos.yaml
```
- **For Linux and Windows:** Create an environment with dependencies specified in `environment.yaml` using the following command:
```bash
conda env create --file environment.yaml
```
> [!NOTE]
> The only difference between the macOS environment and the Linux/Windows environments is that `pytorch-cuda` is not installed for macOS, as CUDA is not available for macOS devices.
2. Run the setup script by executing `./setup.sh`
## Running the Code and Reproducing Results
The `reproduce_results` folder contains scripts for running the code. To run the different stages of the pipeline and reproduce the results in the paper, use the corresponding `.sh` files in the `reproduce_results` folder, in the order given by the commands inside those files.
Additionally, if you want to run the code on a machine with a SLURM scheduler, you can use the job submission files in the `job_submissions` folder.