# DeepCubeAI
This repository contains the code and materials for the paper [Learning Discrete World Models for Heuristic Search](https://rlj.cs.umass.edu/2024/papers/Paper225.html).
> [!NOTE]
> This README file is currently being updated and will include more details soon.
<br>
<div align="center">
<img src="https://raw.githubusercontent.com/misaghsoltani/DeepCubeAI/master/images/dcai_rubiks_cube.gif" width="256" height="128" style="margin: 10px;">
<img src="https://raw.githubusercontent.com/misaghsoltani/DeepCubeAI/master/images/dcai_sokoban.gif" width="128" height="128" style="margin: 10px;">
<img src="https://raw.githubusercontent.com/misaghsoltani/DeepCubeAI/master/images/dcai_iceslider.gif" width="128" height="128" style="margin: 10px;">
<img src="https://raw.githubusercontent.com/misaghsoltani/DeepCubeAI/master/images/dcai_digitjump.gif" width="128" height="128" style="margin: 10px;">
</div>
## About DeepCubeAI
DeepCubeAI is an algorithm that learns a discrete world model and employs deep reinforcement learning to learn a heuristic function that generalizes over start and goal states. The learned model and heuristic function are then combined with heuristic search, such as Q* search, to solve sequential decision-making problems. For more details, read the [paper](https://rlj.cs.umass.edu/2024/papers/Paper225.html).
### Key Contributions
DeepCubeAI comprises three key components:
1. **Discrete World Model**
- Learns a world model that represents states in a discrete latent space.
   - This discrete representation tackles two challenges that arise when planning with learned models: model degradation (accumulating prediction error) and state re-identification.
   - Because latent states are binary, any prediction error smaller than 0.5 is corrected by rounding.
   - States are re-identified by exact comparison of their binary vectors.
<div align="center">
<img src="https://raw.githubusercontent.com/misaghsoltani/DeepCubeAI/master/images/dcai_discrete_world_model.png" width="450" height="450" style="margin: 10px;">
</div>
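The two mechanisms above can be illustrated with a minimal sketch (hypothetical function names; the actual implementation lives in this repository's model code). Because the latent space is binary, rounding eliminates any prediction error below 0.5, and exact vector comparison suffices to re-identify states:

```python
def round_to_discrete(pred):
    """Snap continuous predictions onto the binary latent space.
    Any prediction error smaller than 0.5 is removed by rounding."""
    return [1 if p >= 0.5 else 0 for p in pred]

def same_state(a, b):
    """Re-identify two states by exact comparison of their binary vectors."""
    return a == b

# A noisy prediction of the true latent state [1, 0, 1, 1]:
noisy = [0.93, 0.12, 0.71, 0.58]
latent = round_to_discrete(noisy)  # -> [1, 0, 1, 1], the error is gone
```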
2. **Generalizable Heuristic Function**
- Utilizes Deep Q-Network (DQN) and hindsight experience replay (HER) to learn a heuristic function that generalizes over start and goal states.
3. **Optimized Search**
- Integrates the learned model and the learned heuristic function with heuristic search to solve problems. It uses [Q* search](https://prl-theworkshop.github.io/prl2024-icaps/papers/9.pdf), a variant of A* search optimized for DQNs, which enables faster and more memory-efficient planning.
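The hindsight-relabeling idea behind the generalizable heuristic function (component 2 above) can be sketched as follows. This is an illustration of the general HER technique with hypothetical names, not the repository's API: transitions are duplicated with goals drawn from states actually reached later in the same trajectory, giving dense training signal for a goal-conditioned Q-function.

```python
import random

def her_relabel(trajectory, num_samples=4):
    """Hindsight experience replay: for each (state, action, next_state)
    transition, emit copies whose goal is a latent state actually reached
    later in the same trajectory, with a unit cost per step."""
    relabeled = []
    for t, (state, action, next_state) in enumerate(trajectory):
        for _ in range(num_samples):
            future = random.randrange(t, len(trajectory))
            goal = trajectory[future][2]               # a future next_state
            cost = 0.0 if next_state == goal else 1.0  # zero cost at the goal
            relabeled.append((state, action, goal, cost, next_state))
    return relabeled
```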
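A simplified sketch of the Q*-style search (component 3 above) is shown below. This illustrates the core idea only, under assumed interfaces (`q_values` returning one cost-to-go estimate per action, `step` applying the learned world model); it is not the repository's implementation. The efficiency gain is that one Q-network evaluation prices every action of a node, so the model is applied only to actions actually popped from the priority queue.

```python
import heapq

def q_star_search(start, is_goal, step, q_values, weight=1.0, max_pops=100_000):
    """Q*-style best-first search over a learned model.
    `q_values(state)` -> list of estimated costs-to-go, one per action.
    `step(state, action)` -> successor state (the world model)."""
    if is_goal(start):
        return []
    parents = {start: (None, None)}  # child -> (parent state, action taken)
    frontier = [(1.0 + weight * q, 0.0, start, a)
                for a, q in enumerate(q_values(start))]
    heapq.heapify(frontier)
    for _ in range(max_pops):
        if not frontier:
            return None
        _, g, state, action = heapq.heappop(frontier)
        child = step(state, action)  # model is applied only here, on pop
        if child in parents:
            continue
        parents[child] = (state, action)
        if is_goal(child):
            path = []  # walk parent pointers back to the start state
            while parents[child][0] is not None:
                child, action = parents[child]
                path.append(action)
            return path[::-1]
        # One network call prices all of this child's actions at once.
        for a, q in enumerate(q_values(child)):
            heapq.heappush(frontier, (g + 1.0 + weight * q, g + 1.0, child, a))
    return None
```

On a toy number line with actions `+1`/`-1` and an exact heuristic, the search recovers the shortest action sequence to the goal.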
### Main Results
* Accurate reconstruction of ground truth images after thousands of timesteps.
* Achieved 100% success on Rubik's Cube (canonical goal), Sokoban, IceSlider, and DigitJump.
* 99.9% success on Rubik's Cube with reversed start/goal states.
* Demonstrated significant improvement in solving complex planning problems and generalizing to unseen goals.
## Running the Code
To run the code, please follow these steps:
1. Create a Conda environment:
- **For macOS:** Create an environment with dependencies specified in `environment_macos.yaml` using the following command:
```bash
conda env create --file environment_macos.yaml
```
- **For Linux and Windows:** Create an environment with dependencies specified in `environment.yaml` using the following command:
```bash
conda env create --file environment.yaml
```
> [!NOTE]
> The only difference between the macOS environment and the Linux/Windows environments is that `pytorch-cuda` is not installed for macOS, as CUDA is not available for macOS devices.
2. Run the setup script by executing `./setup.sh`
## Running the Code and Reproducing Results
The `reproduce_results` folder contains scripts for running the code. To run the different stages of the pipeline and reproduce the results in the paper, use the corresponding `.sh` files in the `reproduce_results` folder, in the order given by the commands inside those files.
Additionally, if you want to run the code on a machine with a SLURM scheduler, you can use the job submission files in the `job_submissions` folder.