# **abcdRL** (Implement an RL algorithm in four simple steps)
English | [简体中文](./README.cn.md)
[![license](https://img.shields.io/pypi/l/abcdrl)](https://github.com/sdpkjc/abcdrl)
[![pytest](https://github.com/sdpkjc/abcdrl/actions/workflows/test.yml/badge.svg)](https://github.com/sdpkjc/abcdrl/actions/workflows/test.yml)
[![pre-commit](https://github.com/sdpkjc/abcdrl/actions/workflows/pre-commit.yml/badge.svg)](https://github.com/sdpkjc/abcdrl/actions/workflows/pre-commit.yml)
[![pypi](https://img.shields.io/pypi/v/abcdrl)](https://pypi.org/project/abcdrl)
[![docker autobuild](https://img.shields.io/docker/cloud/build/sdpkjc/abcdrl)](https://hub.docker.com/r/sdpkjc/abcdrl/)
[![docs](https://img.shields.io/github/deployments/sdpkjc/abcdrl/Production?label=docs&logo=vercel)](https://docs.abcdrl.xyz/)
[![Gitpod ready-to-code](https://img.shields.io/badge/Gitpod-ready--to--code-908a85?logo=gitpod)](https://gitpod.io/#https://github.com/sdpkjc/abcdrl)
[![benchmark](https://img.shields.io/badge/Weights%20&%20Biases-benchmark-FFBE00?logo=weightsandbiases)](https://report.abcdrl.xyz/)
[![mirror repo](https://img.shields.io/badge/Gitee-mirror%20repo-black?style=flat&labelColor=C71D23&logo=gitee)](https://gitee.com/sdpkjc/abcdrl/)
[![Checked with mypy](https://img.shields.io/badge/mypy-checked-blue)](http://mypy-lang.org/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)
[![python versions](https://img.shields.io/pypi/pyversions/abcdrl)](https://pypi.org/project/abcdrl)
abcdRL is a **Modular Single-file Reinforcement Learning Algorithms Library** that provides a lightweight modular design, without strict abstractions, together with clean single-file algorithm implementations.
<img src="https://abcdrl.xyz/logo/adam.svg" width="300"/>
*When reading the code, you can quickly grasp the full implementation details of an algorithm within its single file; when modifying an algorithm, the lightweight modular design lets you focus on only a small number of modules.*
> abcdRL mainly references the single-file design philosophy of [vwxyzjn/cleanrl](https://github.com/vwxyzjn/cleanrl/) and the module design of [PaddlePaddle/PARL](https://github.com/PaddlePaddle/PARL/).
***Documentation ➡️ [docs.abcdrl.xyz](https://docs.abcdrl.xyz)***
***Roadmap🗺️ [#57](https://github.com/sdpkjc/abcdrl/issues/57)***
## 🚀 Quickstart
Open the project in Gitpod🌐 and start coding immediately.
[![Open in Gitpod](https://gitpod.io/button/open-in-gitpod.svg)](https://gitpod.io/#https://github.com/sdpkjc/abcdrl)
Using Docker📦:
```bash
# 0. Prerequisites: Docker & NVIDIA Driver & NVIDIA Container Toolkit
# 1. Run DQN algorithm
docker run --rm --gpus all sdpkjc/abcdrl python abcdrl/dqn.py
```
***[For detailed installation instructions 👀](https://docs.abcdrl.xyz/install/)***
## 🐼 Features
- 👨👩👧👦 Unified code structure
- 📄 Single-file implementation
- 🐷 Low code reuse
- 📐 Minimizing code differences
- 📈 Tensorboard & Wandb support
- 🛤 PEP 8 (code style) & PEP 526 (type hints) compliant
## 🗽 Design Philosophy
- "Copy📋", ~~not "Inheritance🧬"~~
- "Single-file📜", ~~not "Multi-file📚"~~
- "Features reuse🛠", ~~not "Algorithms reuse🖨"~~
- "Unified logic🤖", ~~not "Unified interface🔌"~~
## ✅ Implemented Algorithms
***Weights & Biases Benchmark Report ➡️ [report.abcdrl.xyz](https://report.abcdrl.xyz)***
- [Deep Q Network (DQN)](https://doi.org/10.1038/nature14236)
- [Deep Deterministic Policy Gradient (DDPG)](http://arxiv.org/abs/1509.02971)
- [Twin Delayed Deep Deterministic Policy Gradient (TD3)](http://arxiv.org/abs/1802.09477)
- [Soft Actor-Critic (SAC)](http://arxiv.org/abs/1801.01290)
- [Proximal Policy Optimization (PPO)](http://arxiv.org/abs/1707.06347)
---
- [Double Deep Q Network (DDQN)](http://arxiv.org/abs/1509.06461)
- [Prioritized Deep Q Network (PDQN)](http://arxiv.org/abs/1511.05952)
## Citing abcdRL
```bibtex
@misc{zhao_abcdrl_2022,
    author = {Zhao, Yanxiao},
    month = {12},
    title = {{abcdRL: Modular Single-file Reinforcement Learning Algorithms Library}},
    url = {https://github.com/sdpkjc/abcdrl},
    year = {2022}
}
```