abcdrl


- Name: abcdrl
- Version: 0.2.0a4
- Home page: https://abcdrl.xyz/
- Summary: Modular Single-file Reinforcement Learning Algorithms Library
- Upload time: 2023-01-02 08:39:47
- Maintainer: Adam Zhao
- Author: Adam Zhao
- Requires Python: >=3.8,<3.11
- License: MIT
- Keywords: reinforcement, machine, learning

# **abcdRL** (Implement an RL algorithm in four simple steps)

English | [简体中文](./README.cn.md)

[![license](https://img.shields.io/pypi/l/abcdrl)](https://github.com/sdpkjc/abcdrl)
[![pytest](https://github.com/sdpkjc/abcdrl/actions/workflows/test.yml/badge.svg)](https://github.com/sdpkjc/abcdrl/actions/workflows/test.yml)
[![pre-commit](https://github.com/sdpkjc/abcdrl/actions/workflows/pre-commit.yml/badge.svg)](https://github.com/sdpkjc/abcdrl/actions/workflows/pre-commit.yml)
[![pypi](https://img.shields.io/pypi/v/abcdrl)](https://pypi.org/project/abcdrl)
[![docker autobuild](https://img.shields.io/docker/cloud/build/sdpkjc/abcdrl)](https://hub.docker.com/r/sdpkjc/abcdrl/)
[![docs](https://img.shields.io/github/deployments/sdpkjc/abcdrl/Production?label=docs&logo=vercel)](https://docs.abcdrl.xyz/)
[![Gitpod ready-to-code](https://img.shields.io/badge/Gitpod-ready--to--code-908a85?logo=gitpod)](https://gitpod.io/#https://github.com/sdpkjc/abcdrl)
[![benchmark](https://img.shields.io/badge/Weights%20&%20Biases-benchmark-FFBE00?logo=weightsandbiases)](https://report.abcdrl.xyz/)
[![mirror repo](https://img.shields.io/badge/Gitee-mirror%20repo-black?style=flat&labelColor=C71D23&logo=gitee)](https://gitee.com/sdpkjc/abcdrl/)
[![Checked with mypy](https://img.shields.io/badge/mypy-checked-blue)](http://mypy-lang.org/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)
[![python versions](https://img.shields.io/pypi/pyversions/abcdrl)](https://pypi.org/project/abcdrl)

abcdRL is a **Modular Single-file Reinforcement Learning Algorithms Library** that combines a loose, non-strict modular design with clean single-file implementations.

<img src="https://abcdrl.xyz/logo/adam.svg" width="300"/>

*When reading the code, you can quickly understand the full implementation details of an algorithm within a single file; when modifying an algorithm, the lightweight modular design means you only need to focus on a small number of modules.*

> abcdRL mainly references the single-file design philosophy of [vwxyzjn/cleanrl](https://github.com/vwxyzjn/cleanrl/) and the module design of [PaddlePaddle/PARL](https://github.com/PaddlePaddle/PARL/).

***Documentation ➡️ [docs.abcdrl.xyz](https://abcdrl.xyz)***

***Roadmap🗺️ [#57](https://github.com/sdpkjc/abcdrl/issues/57)***

## 🚀 Quickstart

Open the project in Gitpod🌐 and start coding immediately.

[![Open in Gitpod](https://gitpod.io/button/open-in-gitpod.svg)](https://gitpod.io/#https://github.com/sdpkjc/abcdrl)

Using Docker📦:

```bash
# 0. Prerequisites: Docker & NVIDIA Driver & NVIDIA Container Toolkit
# 1. Run DQN algorithm
docker run --rm --gpus all sdpkjc/abcdrl python abcdrl/dqn.py
```

***[For detailed installation instructions 👀](https://docs.abcdrl.xyz/install/)***

## 🐼 Features

- 👨‍👩‍👧‍👦 Unified code structure
- 📄 Single-file implementation
- 🐷 Low code reuse
- 📐 Minimizing code differences
- 📈 Tensorboard & Wandb support
- 🛤 PEP 8 (code style) & PEP 526 (type hints) compliant

## 🗽 Design Philosophy

- "Copy📋", ~~not "Inheritance🧬"~~
- "Single-file📜", ~~not "Multi-file📚"~~
- "Features reuse🛠", ~~not "Algorithms reuse🖨"~~
- "Unified logic🤖", ~~not "Unified interface🔌"~~
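To make the principles above more concrete, here is a minimal sketch of what a single-file layout with lightweight modules could look like. Everything in it is an assumption made for illustration: the class names `Model`, `Algorithm`, and `Agent` are hypothetical rather than taken from abcdRL's code, and tabular Q-learning stands in for the deep RL algorithms so that the file needs only the standard library.

```python
# Hypothetical single-file layout; the class names are illustrative only
# and are not abcdRL's actual API.
import random
from typing import List


class Model:
    """Learnable parameters: a tabular Q-function instead of a network."""

    def __init__(self, n_states: int, n_actions: int) -> None:
        self.q: List[List[float]] = [
            [0.0] * n_actions for _ in range(n_states)
        ]


class Algorithm:
    """Update rule: one-step tabular Q-learning."""

    def __init__(self, model: Model, lr: float = 0.5, gamma: float = 0.9) -> None:
        self.model = model
        self.lr = lr
        self.gamma = gamma

    def learn(self, s: int, a: int, r: float, s_next: int, done: bool) -> float:
        # No bootstrapping from the next state once the episode has ended.
        bootstrap = 0.0 if done else self.gamma * max(self.model.q[s_next])
        td_error = r + bootstrap - self.model.q[s][a]
        self.model.q[s][a] += self.lr * td_error
        return td_error


class Agent:
    """Action selection: epsilon-greedy over the model's Q-values."""

    def __init__(
        self, alg: Algorithm, n_actions: int, epsilon: float = 0.1, seed: int = 0
    ) -> None:
        self.alg = alg
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def act(self, s: int) -> int:
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        q_row = self.alg.model.q[s]
        return max(range(self.n_actions), key=q_row.__getitem__)
```

In this style, a new variant would be created by copying the file and editing the one class that changes (for example `Algorithm.learn`), rather than by subclassing across files, so each file stays readable on its own.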

## ✅ Implemented Algorithms

***Weights & Biases Benchmark Report ➡️ [report.abcdrl.xyz](https://report.abcdrl.xyz)***

- [Deep Q Network (DQN)](https://doi.org/10.1038/nature14236)
- [Deep Deterministic Policy Gradient (DDPG)](http://arxiv.org/abs/1509.02971)
- [Twin Delayed Deep Deterministic Policy Gradient (TD3)](http://arxiv.org/abs/1802.09477)
- [Soft Actor-Critic (SAC)](http://arxiv.org/abs/1801.01290)
- [Proximal Policy Optimization (PPO)](http://arxiv.org/abs/1707.06347)

---

- [Double Deep Q Network (DDQN)](http://arxiv.org/abs/1509.06461)
- [Prioritized Deep Q Network (PDQN)](http://arxiv.org/abs/1511.05952)
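As an illustration of how small the difference between two of the listed algorithms can be, the sketch below contrasts the bootstrap targets of DQN and Double DQN as described in the linked papers. It is not abcdRL's code: the function names and argument layout are assumptions, and plain Python lists stand in for the Q-value outputs of the online and target networks.

```python
from typing import List, Sequence


def dqn_target(
    rewards: Sequence[float],
    dones: Sequence[bool],
    q_next_target: Sequence[Sequence[float]],
    gamma: float = 0.99,
) -> List[float]:
    """DQN: the target network both selects and evaluates the next action."""
    targets = []
    for r, done, q_t in zip(rewards, dones, q_next_target):
        bootstrap = 0.0 if done else max(q_t)
        targets.append(r + gamma * bootstrap)
    return targets


def ddqn_target(
    rewards: Sequence[float],
    dones: Sequence[bool],
    q_next_online: Sequence[Sequence[float]],
    q_next_target: Sequence[Sequence[float]],
    gamma: float = 0.99,
) -> List[float]:
    """Double DQN: the online network selects, the target network evaluates."""
    targets = []
    for r, done, q_o, q_t in zip(rewards, dones, q_next_online, q_next_target):
        best_action = max(range(len(q_o)), key=q_o.__getitem__)
        bootstrap = 0.0 if done else q_t[best_action]
        targets.append(r + gamma * bootstrap)
    return targets
```

The two functions are deliberately written as near copies of each other: only the line that chooses the bootstrap value differs, which is the kind of minimal diff between files that the "Minimizing code differences" feature points to.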

## Citing abcdRL

```bibtex
@misc{zhao_abcdrl_2022,
    author = {Zhao, Yanxiao},
    month = {12},
    title = {{abcdRL: Modular Single-file Reinforcement Learning Algorithms Library}},
    url = {https://github.com/sdpkjc/abcdrl},
    year = {2022}
}
```

            

Raw data

            {
    "_id": null,
    "home_page": "https://abcdrl.xyz/",
    "name": "abcdrl",
    "maintainer": "Adam Zhao",
    "docs_url": null,
    "requires_python": ">=3.8,<3.11",
    "maintainer_email": "pazyx728@gmail.com",
    "keywords": "reinforcement,machine,learning",
    "author": "Adam Zhao",
    "author_email": "pazyx728@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/a3/23/7c2dd2dd33e5a9c14c657f461db7b2cfaa81e633bccfe90571575fd276e9/abcdrl-0.2.0a4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Modular Single-file Reinforcement Learning Algorithms Library",
    "version": "0.2.0a4",
    "split_keywords": [
        "reinforcement",
        "machine",
        "learning"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "4e04a5d22f547f59157132897181498c",
                "sha256": "ed0274b4ac92c20fc85c81258fe1a69afcf085f55917f2063cb5349de1762968"
            },
            "downloads": -1,
            "filename": "abcdrl-0.2.0a4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4e04a5d22f547f59157132897181498c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<3.11",
            "size": 45495,
            "upload_time": "2023-01-02T08:39:46",
            "upload_time_iso_8601": "2023-01-02T08:39:46.611393Z",
            "url": "https://files.pythonhosted.org/packages/10/94/b819c3d72a88e8a12f57a97982eef0b308a91c0fcedd5580aed5435099a6/abcdrl-0.2.0a4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "05c6ea7b48c7a3fc65d23d60f5473d22",
                "sha256": "4c52cbfc2cd0a294278fbcbfaf6faaa26d10c375f5471f4e36c5e0e6fa059878"
            },
            "downloads": -1,
            "filename": "abcdrl-0.2.0a4.tar.gz",
            "has_sig": false,
            "md5_digest": "05c6ea7b48c7a3fc65d23d60f5473d22",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<3.11",
            "size": 24336,
            "upload_time": "2023-01-02T08:39:47",
            "upload_time_iso_8601": "2023-01-02T08:39:47.928067Z",
            "url": "https://files.pythonhosted.org/packages/a3/23/7c2dd2dd33e5a9c14c657f461db7b2cfaa81e633bccfe90571575fd276e9/abcdrl-0.2.0a4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-01-02 08:39:47",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "abcdrl"
}
        