# DeepCubeAI
This repository contains code for the paper [Learning Discrete World Models for Heuristic Search](https://rlj.cs.umass.edu/2024/papers/Paper225.html).
## About DeepCubeAI
DeepCubeAI is an algorithm that learns a discrete world model and employs deep reinforcement learning to learn a heuristic function that generalizes over start and goal states. The learned model and heuristic function are then combined with heuristic search, such as Q* search, to solve sequential decision-making problems. For more details, please refer to the [paper](https://rlj.cs.umass.edu/2024/papers/Paper225.html).
## Quick links
- Key contributions: [Key Contributions](#key-contributions)
- Main results: [Main Results](#main-results)
- Quick start: [Quick start](#quick-start)
- Install: [docs/installation.md](https://github.com/misaghsoltani/DeepCubeAI/blob/main/docs/installation.md)
- CLI reference: [docs/cli.md](https://github.com/misaghsoltani/DeepCubeAI/blob/main/docs/cli.md)
- Stage-by-stage usage (all flags and paths): [docs/usage.md](https://github.com/misaghsoltani/DeepCubeAI/blob/main/docs/usage.md)
- Reproduce the paper results: [docs/reproduce.md](https://github.com/misaghsoltani/DeepCubeAI/blob/main/docs/reproduce.md)
- SLURM and Distributed training: [docs/qlearning_distributed.md](https://github.com/misaghsoltani/DeepCubeAI/blob/main/docs/qlearning_distributed.md)
- Environments and integration: [docs/environments.md](https://github.com/misaghsoltani/DeepCubeAI/blob/main/docs/environments.md)
- Python usage (API snippets): [docs/python_api.md](https://github.com/misaghsoltani/DeepCubeAI/blob/main/docs/python_api.md)
- Citing the paper: [Citation](#citation)
- Contact: [Contact](#contact)
## Key Contributions
### Overview
DeepCubeAI comprises three key components (a minimal sketch of all three follows this list):
1. **Discrete World Model**
   - Learns a world model that represents states in a discrete latent space.
   - This approach tackles two challenges: model degradation and state re-identification.
   - Prediction errors smaller than 0.5 are corrected by rounding to the nearest binary value, so they cannot compound over long rollouts.
   - States are re-identified by exact comparison of their binary latent vectors.
2. **Generalizable Heuristic Function**
   - Uses a Deep Q-Network (DQN) with hindsight experience replay (HER) to learn a heuristic function that generalizes over start and goal states.
3. **Optimized Search**
- Integrates the learned model and the learned heuristic function with heuristic search to solve problems. It uses [Q* search](https://prl-theworkshop.github.io/prl2024-icaps/papers/9.pdf), a variant of A* search optimized for DQNs, which enables faster and more memory-efficient planning.
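To make these components concrete, here is a minimal, self-contained sketch. It is not the repository's implementation: `q_values` and `next_state` are hypothetical stand-ins for the trained DQN and the learned world model, transition costs are assumed to be 1, and the search loop is a simplified Q*-style best-first search.

```python
import heapq
import itertools
import random

import numpy as np


def discretize(pred: np.ndarray) -> np.ndarray:
    """Round continuous model predictions to a binary latent vector.
    Per-bit errors below 0.5 vanish after rounding, which stops model
    error from compounding over long rollouts."""
    return (pred >= 0.5).astype(np.uint8)


def same_state(a: np.ndarray, b: np.ndarray) -> bool:
    """Re-identify states by exact comparison of binary latent vectors."""
    return np.array_equal(a, b)


def her_relabel(states: list, actions: list) -> list:
    """Hindsight relabeling: pick a future state on the same trajectory as
    the goal, so every rollout yields (state, action, goal, cost-to-go)
    training tuples even when the original goal was never reached."""
    samples = []
    for t in range(len(actions)):
        k = random.randrange(t + 1, len(states))  # index of a future state
        samples.append((states[t], actions[t], states[k], float(k - t)))
    return samples


def q_star_search(start, goal, q_values, next_state, max_expansions=100_000):
    """Simplified Q*-style best-first search over binary latent states.
    `q_values(s, g)` returns one cost-to-go estimate per action, and
    `next_state(s, a)` is one step of the learned world model. Children
    are scored with the parent's Q-values, so the DQN runs once per
    expansion rather than once per generated child."""
    tie = itertools.count()  # unique tiebreaker; avoids comparing arrays
    frontier = [(0.0, next(tie), 0.0, start, [])]
    best_g = {start.tobytes(): 0.0}
    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, g, s, path = heapq.heappop(frontier)
        if same_state(s, goal):
            return path  # sequence of actions from start to goal
        for a, q in enumerate(q_values(s, goal)):
            child = discretize(next_state(s, a))
            child_g = g + 1.0  # assumed unit transition cost
            key = child.tobytes()
            if child_g < best_g.get(key, float("inf")):
                best_g[key] = child_g
                # f = g(parent) + Q(parent, a): Q already estimates the
                # action cost plus the cost-to-go from the child
                heapq.heappush(
                    frontier, (g + q, next(tie), child_g, child, path + [a]))
    return None  # no solution found within the expansion budget
```

In the actual pipeline these roles are played by the trained encoder, world model, and DQN; see [docs/usage.md](https://github.com/misaghsoltani/DeepCubeAI/blob/main/docs/usage.md) for the real commands.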
### Main Results
- Accurate reconstruction of ground-truth images after thousands of timesteps.
- 100% success on Rubik's Cube (canonical goal), Sokoban, IceSlider, and DigitJump.
- 99.9% success on Rubik's Cube with reversed start and goal states.
- Significant improvement in solving complex planning problems and in generalizing to unseen goals.
## Quick start
DeepCubeAI provides a Python package and CLI. You can install it from PyPI or build it from source. The package supports Python 3.10-3.12.
> [!NOTE]
>
> You can find detailed installation instructions, including using Conda for environment management, in the [installation guide](https://github.com/misaghsoltani/DeepCubeAI/blob/main/docs/installation.md).
### Install `deepcubeai` Package from PyPI with [uv](https://docs.astral.sh/uv/) (Recommended if Running as a Package)
`deepcubeai` is available on PyPI; install it with the following commands.
1. **Install `uv`** from the official website: [Install uv](https://docs.astral.sh/uv/getting-started/installation/).
2. Create and activate a virtual environment:
```bash
# create a .venv in the current folder
uv venv
# macOS & Linux
source .venv/bin/activate
# Windows (PowerShell)
.venv\Scripts\activate
```
If you have multiple Python versions, ensure you use a supported one (3.10-3.12), e.g.:
```bash
uv venv --python 3.12
```
3. Install the package (using [uv’s pip interface](https://docs.astral.sh/uv/pip/)):
```bash
uv pip install deepcubeai
```
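To confirm the installation, you can print the package version (the same attribute used in the Python snippet later in this README):
```bash
python -c "import deepcubeai; print(deepcubeai.__version__)"
```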
### Install from Source with Pixi (Recommended if Working from Source)
[Pixi](https://pixi.sh/) is a package management tool that provides fast, reproducible environments with support for both Conda and PyPI dependencies. The repository's `pixi.toml` and `pixi.lock` files pin exact dependency versions so the environment can be reproduced exactly.
1. **Install Pixi**: Follow the [official installation guide](https://pixi.sh/latest/installation/).
2. **Clone repository**:
```bash
git clone https://github.com/misaghsoltani/DeepCubeAI.git
cd DeepCubeAI
```
3. **Enter the default environment** (first run performs dependency resolution):
```bash
pixi shell # or: pixi shell -e default
# or
pixi install -e default # non-interactive solve only
```
### Running DeepCubeAI
To see the available CLI options, run:
```bash
# If you have already entered the environment:
deepcubeai --help # or -h
# or
# Without entering the environment:
pixi run deepcubeai --help # or -h
```
Or use it as a Python package:
```python
import deepcubeai
print(deepcubeai.__version__)
```
## License
MIT License - see [LICENSE](LICENSE).
## Citation
If you use DeepCubeAI in your research, please cite:
```bibtex
@article{agostinelli2025learning,
  title={Learning Discrete World Models for Heuristic Search},
  author={Agostinelli, Forest and Soltani, Misagh},
  journal={Reinforcement Learning Journal},
  volume={4},
  pages={1781--1792},
  year={2025}
}
```
## Contact
If you have any questions or issues, please contact Misagh Soltani ([msoltani@email.sc.edu](mailto:msoltani@email.sc.edu)).