dopamine-rl


Namedopamine-rl JSON
Version 4.1.2 PyPI version JSON
download
home_pagehttps://github.com/google/dopamine
SummaryDopamine: A framework for flexible Reinforcement Learning research
upload_time2024-10-31 13:26:47
maintainerNone
docs_urlNone
authorThe Dopamine Team
requires_python<4,>=3.5
licenseApache 2.0
keywords dopamine reinforcement machine learning research
VCS
bugtrack_url
requirements absl-py astunparse cachetools certifi chardet cloudpickle cycler flax future gast gin-config google-auth google-auth-oauthlib google-pasta grpcio gym h5py idna jax jaxlib Keras-Preprocessing kiwisolver Markdown matplotlib msgpack numpy oauthlib opencv-python opt-einsum pandas Pillow protobuf pyasn1 pyasn1-modules pygame pyglet pyparsing python-dateutil pytz requests requests-oauthlib rsa scipy six setuptools tensorboard tensorboard-plugin-wit tensorflow tensorflow-estimator tensorflow-probability termcolor tf-slim urllib3 Werkzeug wrapt tqdm
Travis-CI No Travis.
coveralls test coverage No coveralls.
            # Dopamine
[Getting Started](#getting-started) |
[Docs][docs] |
[Baseline Results][baselines] |
[Changelist](https://google.github.io/dopamine/docs/changelist)

<div align="center">
  <img src="https://google.github.io/dopamine/images/dopamine_logo.png"><br><br>
</div>

Dopamine is a research framework for fast prototyping of reinforcement learning
algorithms. It aims to fill the need for a small, easily grokked codebase in
which users can freely experiment with wild ideas (speculative research).

Our design principles are:

* _Easy experimentation_: Make it easy for new users to run benchmark
                          experiments.
* _Flexible development_: Make it easy for new users to try out research ideas.
* _Compact and reliable_: Provide implementations for a few, battle-tested
                          algorithms.
* _Reproducible_: Facilitate reproducibility in results. In particular, our
                  setup follows the recommendations given by
                  [Machado et al. (2018)][machado].

Dopamine supports the following agents, implemented with jax:

* DQN ([Mnih et al., 2015][dqn])
* C51 ([Bellemare et al., 2017][c51])
* Rainbow ([Hessel et al., 2018][rainbow])
* IQN ([Dabney et al., 2018][iqn])
* SAC ([Haarnoja et al., 2018][sac])
* PPO ([Schulman et al., 2017][ppo])

For more information on the available agents, see the [docs](https://google.github.io/dopamine/docs).

Many of these agents also have a tensorflow (legacy) implementation, though
newly added agents are likely to be jax-only.

This is not an official Google product.

## Getting Started


We provide docker containers for using Dopamine.
Instructions can be found [here](https://google.github.io/dopamine/docker/).

Alternatively, Dopamine can be installed from source (preferred) or installed
with pip. For either of these methods, continue reading at prerequisites.

### Prerequisites

Dopamine supports Atari environments and Mujoco environments. Install the
environments you intend to use before you install Dopamine:

**Atari**

1. These should now come packaged with
   [ale_py](https://github.com/Farama-Foundation/Arcade-Learning-Environment).
1. You may need to manually run some steps to properly install `baselines`, see
   [instructions](https://github.com/openai/baselines).

**Mujoco**

1. Install Mujoco and get a license
[here](https://github.com/openai/mujoco-py#install-mujoco).
2. Run `pip install mujoco-py` (we recommend using a
[virtual environment](virtualenv)).

### Installing from Source


The most common way to use Dopamine is to install it from source and modify
the source code directly:

```
git clone https://github.com/google/dopamine
```

After cloning, install dependencies:

```
pip install -r dopamine/requirements.txt
```

Dopamine supports tensorflow (legacy) and jax (actively maintained) agents.
View the [Tensorflow documentation](https://www.tensorflow.org/install) for
more information on installing tensorflow.

Note: We recommend using a [virtual environment](virtualenv) when working with Dopamine.

### Installing with Pip

Note: We strongly recommend installing from source for most users.

Installing with pip is simple, but Dopamine is designed to be modified
directly. We recommend installing from source for writing your own experiments.

```
pip install dopamine-rl
```

### Running tests

You can test whether the installation was successful by running the following
from the dopamine root directory.

```
export PYTHONPATH=$PYTHONPATH:$PWD
python -m tests.dopamine.atari_init_test
```

## Next Steps

View the [docs][docs] for more information on training agents.

We supply [baselines][baselines] for each Dopamine agent.

We also provide a set of [Colaboratory notebooks](https://github.com/google/dopamine/tree/master/dopamine/colab)
which demonstrate how to use Dopamine.

## References

[Bellemare et al., *The Arcade Learning Environment: An evaluation platform for
general agents*. Journal of Artificial Intelligence Research, 2013.][ale]

[Machado et al., *Revisiting the Arcade Learning Environment: Evaluation
Protocols and Open Problems for General Agents*, Journal of Artificial
Intelligence Research, 2018.][machado]

[Hessel et al., *Rainbow: Combining Improvements in Deep Reinforcement Learning*.
Proceedings of the AAAI Conference on Artificial Intelligence, 2018.][rainbow]

[Mnih et al., *Human-level Control through Deep Reinforcement Learning*. Nature,
2015.][dqn]

[Schaul et al., *Prioritized Experience Replay*. Proceedings of the International
Conference on Learning Representations, 2016.][prioritized_replay]

[Haarnoja et al., *Soft Actor-Critic Algorithms and Applications*,
arXiv preprint arXiv:1812.05905, 2018.][sac]

[Schulman et al., *Proximal Policy Optimization Algorithms*.][ppo]

## Giving credit

If you use Dopamine in your work, we ask that you cite our
[white paper][dopamine_paper]. Here is an example BibTeX entry:

```
@article{castro18dopamine,
  author    = {Pablo Samuel Castro and
               Subhodeep Moitra and
               Carles Gelada and
               Saurabh Kumar and
               Marc G. Bellemare},
  title     = {Dopamine: {A} {R}esearch {F}ramework for {D}eep {R}einforcement {L}earning},
  year      = {2018},
  url       = {http://arxiv.org/abs/1812.06110},
  archivePrefix = {arXiv}
}
```


[docs]: https://google.github.io/dopamine/docs/
[baselines]: https://google.github.io/dopamine/baselines
[machado]: https://jair.org/index.php/jair/article/view/11182
[ale]: https://jair.org/index.php/jair/article/view/10819
[dqn]: https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf
[a3c]: http://proceedings.mlr.press/v48/mniha16.html
[prioritized_replay]: https://arxiv.org/abs/1511.05952
[c51]: http://proceedings.mlr.press/v70/bellemare17a.html
[rainbow]: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/download/17204/16680
[iqn]: https://arxiv.org/abs/1806.06923
[sac]: https://arxiv.org/abs/1812.05905
[ppo]: https://arxiv.org/abs/1707.06347
[dopamine_paper]: https://arxiv.org/abs/1812.06110
[vitualenv]: https://docs.python.org/3/library/venv.html#creating-virtual-environments

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/google/dopamine",
    "name": "dopamine-rl",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4,>=3.5",
    "maintainer_email": null,
    "keywords": "dopamine, reinforcement, machine, learning, research",
    "author": "The Dopamine Team",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/fc/d2/6b6372afdb9b62c32a237af4a68551e863314363fb4a065605c6a990a61c/dopamine_rl-4.1.2.tar.gz",
    "platform": null,
    "description": "# Dopamine\n[Getting Started](#getting-started) |\n[Docs][docs] |\n[Baseline Results][baselines] |\n[Changelist](https://google.github.io/dopamine/docs/changelist)\n\n<div align=\"center\">\n  <img src=\"https://google.github.io/dopamine/images/dopamine_logo.png\"><br><br>\n</div>\n\nDopamine is a research framework for fast prototyping of reinforcement learning\nalgorithms. It aims to fill the need for a small, easily grokked codebase in\nwhich users can freely experiment with wild ideas (speculative research).\n\nOur design principles are:\n\n* _Easy experimentation_: Make it easy for new users to run benchmark\n                          experiments.\n* _Flexible development_: Make it easy for new users to try out research ideas.\n* _Compact and reliable_: Provide implementations for a few, battle-tested\n                          algorithms.\n* _Reproducible_: Facilitate reproducibility in results. In particular, our\n                  setup follows the recommendations given by\n                  [Machado et al. (2018)][machado].\n\nDopamine supports the following agents, implemented with jax:\n\n* DQN ([Mnih et al., 2015][dqn])\n* C51 ([Bellemare et al., 2017][c51])\n* Rainbow ([Hessel et al., 2018][rainbow])\n* IQN ([Dabney et al., 2018][iqn])\n* SAC ([Haarnoja et al., 2018][sac])\n* PPO ([Schulman et al., 2017][ppo])\n\nFor more information on the available agents, see the [docs](https://google.github.io/dopamine/docs).\n\nMany of these agents also have a tensorflow (legacy) implementation, though\nnewly added agents are likely to be jax-only.\n\nThis is not an official Google product.\n\n## Getting Started\n\n\nWe provide docker containers for using Dopamine.\nInstructions can be found [here](https://google.github.io/dopamine/docker/).\n\nAlternatively, Dopamine can be installed from source (preferred) or installed\nwith pip. For either of these methods, continue reading at prerequisites.\n\n### Prerequisites\n\nDopamine supports Atari environments and Mujoco environments. Install the\nenvironments you intend to use before you install Dopamine:\n\n**Atari**\n\n1. These should now come packaged with\n   [ale_py](https://github.com/Farama-Foundation/Arcade-Learning-Environment).\n1. You may need to manually run some steps to properly install `baselines`, see\n   [instructions](https://github.com/openai/baselines).\n\n**Mujoco**\n\n1. Install Mujoco and get a license\n[here](https://github.com/openai/mujoco-py#install-mujoco).\n2. Run `pip install mujoco-py` (we recommend using a\n[virtual environment](virtualenv)).\n\n### Installing from Source\n\n\nThe most common way to use Dopamine is to install it from source and modify\nthe source code directly:\n\n```\ngit clone https://github.com/google/dopamine\n```\n\nAfter cloning, install dependencies:\n\n```\npip install -r dopamine/requirements.txt\n```\n\nDopamine supports tensorflow (legacy) and jax (actively maintained) agents.\nView the [Tensorflow documentation](https://www.tensorflow.org/install) for\nmore information on installing tensorflow.\n\nNote: We recommend using a [virtual environment](virtualenv) when working with Dopamine.\n\n### Installing with Pip\n\nNote: We strongly recommend installing from source for most users.\n\nInstalling with pip is simple, but Dopamine is designed to be modified\ndirectly. We recommend installing from source for writing your own experiments.\n\n```\npip install dopamine-rl\n```\n\n### Running tests\n\nYou can test whether the installation was successful by running the following\nfrom the dopamine root directory.\n\n```\nexport PYTHONPATH=$PYTHONPATH:$PWD\npython -m tests.dopamine.atari_init_test\n```\n\n## Next Steps\n\nView the [docs][docs] for more information on training agents.\n\nWe supply [baselines][baselines] for each Dopamine agent.\n\nWe also provide a set of [Colaboratory notebooks](https://github.com/google/dopamine/tree/master/dopamine/colab)\nwhich demonstrate how to use Dopamine.\n\n## References\n\n[Bellemare et al., *The Arcade Learning Environment: An evaluation platform for\ngeneral agents*. Journal of Artificial Intelligence Research, 2013.][ale]\n\n[Machado et al., *Revisiting the Arcade Learning Environment: Evaluation\nProtocols and Open Problems for General Agents*, Journal of Artificial\nIntelligence Research, 2018.][machado]\n\n[Hessel et al., *Rainbow: Combining Improvements in Deep Reinforcement Learning*.\nProceedings of the AAAI Conference on Artificial Intelligence, 2018.][rainbow]\n\n[Mnih et al., *Human-level Control through Deep Reinforcement Learning*. Nature,\n2015.][dqn]\n\n[Schaul et al., *Prioritized Experience Replay*. Proceedings of the International\nConference on Learning Representations, 2016.][prioritized_replay]\n\n[Haarnoja et al., *Soft Actor-Critic Algorithms and Applications*,\narXiv preprint arXiv:1812.05905, 2018.][sac]\n\n[Schulman et al., *Proximal Policy Optimization Algorithms*.][ppo]\n\n## Giving credit\n\nIf you use Dopamine in your work, we ask that you cite our\n[white paper][dopamine_paper]. Here is an example BibTeX entry:\n\n```\n@article{castro18dopamine,\n  author    = {Pablo Samuel Castro and\n               Subhodeep Moitra and\n               Carles Gelada and\n               Saurabh Kumar and\n               Marc G. Bellemare},\n  title     = {Dopamine: {A} {R}esearch {F}ramework for {D}eep {R}einforcement {L}earning},\n  year      = {2018},\n  url       = {http://arxiv.org/abs/1812.06110},\n  archivePrefix = {arXiv}\n}\n```\n\n\n[docs]: https://google.github.io/dopamine/docs/\n[baselines]: https://google.github.io/dopamine/baselines\n[machado]: https://jair.org/index.php/jair/article/view/11182\n[ale]: https://jair.org/index.php/jair/article/view/10819\n[dqn]: https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf\n[a3c]: http://proceedings.mlr.press/v48/mniha16.html\n[prioritized_replay]: https://arxiv.org/abs/1511.05952\n[c51]: http://proceedings.mlr.press/v70/bellemare17a.html\n[rainbow]: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/download/17204/16680\n[iqn]: https://arxiv.org/abs/1806.06923\n[sac]: https://arxiv.org/abs/1812.05905\n[ppo]: https://arxiv.org/abs/1707.06347\n[dopamine_paper]: https://arxiv.org/abs/1812.06110\n[vitualenv]: https://docs.python.org/3/library/venv.html#creating-virtual-environments\n",
    "bugtrack_url": null,
    "license": "Apache 2.0",
    "summary": "Dopamine: A framework for flexible Reinforcement Learning research",
    "version": "4.1.2",
    "project_urls": {
        "Bug Reports": "https://github.com/google/dopamine/issues",
        "Documentation": "https://github.com/google/dopamine",
        "Homepage": "https://github.com/google/dopamine",
        "Source": "https://github.com/google/dopamine"
    },
    "split_keywords": [
        "dopamine",
        " reinforcement",
        " machine",
        " learning",
        " research"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cf4b79287a4889f51f143ddf6d73526be97d104d742b17e18f5335fbe5e9fa95",
                "md5": "3eee0c842af3e60cf2156ee62375da03",
                "sha256": "bd03119a54caf839fb61cf22179bad0bde712bc8ccf62aebd93ebce5381e54aa"
            },
            "downloads": -1,
            "filename": "dopamine_rl-4.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "3eee0c842af3e60cf2156ee62375da03",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4,>=3.5",
            "size": 290498,
            "upload_time": "2024-10-31T13:26:46",
            "upload_time_iso_8601": "2024-10-31T13:26:46.047066Z",
            "url": "https://files.pythonhosted.org/packages/cf/4b/79287a4889f51f143ddf6d73526be97d104d742b17e18f5335fbe5e9fa95/dopamine_rl-4.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fcd26b6372afdb9b62c32a237af4a68551e863314363fb4a065605c6a990a61c",
                "md5": "98835099a08efb8ecf1e303598e423b0",
                "sha256": "61b791d11edf80e08a164db7bc60166bb669db8a57b6cf21c8717a4aea1a84ef"
            },
            "downloads": -1,
            "filename": "dopamine_rl-4.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "98835099a08efb8ecf1e303598e423b0",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4,>=3.5",
            "size": 190419,
            "upload_time": "2024-10-31T13:26:47",
            "upload_time_iso_8601": "2024-10-31T13:26:47.851735Z",
            "url": "https://files.pythonhosted.org/packages/fc/d2/6b6372afdb9b62c32a237af4a68551e863314363fb4a065605c6a990a61c/dopamine_rl-4.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-31 13:26:47",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "google",
    "github_project": "dopamine",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "absl-py",
            "specs": [
                [
                    ">=",
                    "0.9.0"
                ]
            ]
        },
        {
            "name": "astunparse",
            "specs": [
                [
                    ">=",
                    "1.6.3"
                ]
            ]
        },
        {
            "name": "cachetools",
            "specs": [
                [
                    ">=",
                    "4.1.1"
                ]
            ]
        },
        {
            "name": "certifi",
            "specs": [
                [
                    ">=",
                    "2020.6.20"
                ]
            ]
        },
        {
            "name": "chardet",
            "specs": [
                [
                    ">=",
                    "3.0.4"
                ]
            ]
        },
        {
            "name": "cloudpickle",
            "specs": [
                [
                    ">=",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "cycler",
            "specs": [
                [
                    ">=",
                    "0.10.0"
                ]
            ]
        },
        {
            "name": "flax",
            "specs": [
                [
                    ">=",
                    "0.5.3"
                ]
            ]
        },
        {
            "name": "future",
            "specs": [
                [
                    ">=",
                    "0.18.2"
                ]
            ]
        },
        {
            "name": "gast",
            "specs": [
                [
                    ">=",
                    "0.3.3"
                ]
            ]
        },
        {
            "name": "gin-config",
            "specs": [
                [
                    ">=",
                    "0.3.0"
                ]
            ]
        },
        {
            "name": "google-auth",
            "specs": [
                [
                    ">=",
                    "1.19.2"
                ]
            ]
        },
        {
            "name": "google-auth-oauthlib",
            "specs": [
                [
                    ">=",
                    "0.4.1"
                ]
            ]
        },
        {
            "name": "google-pasta",
            "specs": [
                [
                    ">=",
                    "0.2.0"
                ]
            ]
        },
        {
            "name": "grpcio",
            "specs": [
                [
                    ">=",
                    "1.30.0"
                ]
            ]
        },
        {
            "name": "gym",
            "specs": [
                [
                    "<=",
                    "0.25.2"
                ]
            ]
        },
        {
            "name": "h5py",
            "specs": [
                [
                    ">=",
                    "2.10.0"
                ]
            ]
        },
        {
            "name": "idna",
            "specs": [
                [
                    ">=",
                    "2.10"
                ]
            ]
        },
        {
            "name": "jax",
            "specs": [
                [
                    ">=",
                    "0.3.16"
                ]
            ]
        },
        {
            "name": "jaxlib",
            "specs": [
                [
                    ">=",
                    "0.3.15"
                ]
            ]
        },
        {
            "name": "Keras-Preprocessing",
            "specs": [
                [
                    ">=",
                    "1.1.2"
                ]
            ]
        },
        {
            "name": "kiwisolver",
            "specs": [
                [
                    ">=",
                    "1.2.0"
                ]
            ]
        },
        {
            "name": "Markdown",
            "specs": [
                [
                    ">=",
                    "3.2.2"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    ">=",
                    "3.3.0"
                ]
            ]
        },
        {
            "name": "msgpack",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.18.5"
                ]
            ]
        },
        {
            "name": "oauthlib",
            "specs": [
                [
                    ">=",
                    "3.1.0"
                ]
            ]
        },
        {
            "name": "opencv-python",
            "specs": [
                [
                    ">=",
                    "4.3.0.36"
                ]
            ]
        },
        {
            "name": "opt-einsum",
            "specs": [
                [
                    ">=",
                    "3.3.0"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "1.0.5"
                ]
            ]
        },
        {
            "name": "Pillow",
            "specs": [
                [
                    ">=",
                    "7.2.0"
                ]
            ]
        },
        {
            "name": "protobuf",
            "specs": [
                [
                    ">=",
                    "3.12.2"
                ]
            ]
        },
        {
            "name": "pyasn1",
            "specs": [
                [
                    ">=",
                    "0.4.8"
                ]
            ]
        },
        {
            "name": "pyasn1-modules",
            "specs": [
                [
                    ">=",
                    "0.2.8"
                ]
            ]
        },
        {
            "name": "pygame",
            "specs": [
                [
                    ">=",
                    "1.9.6"
                ]
            ]
        },
        {
            "name": "pyglet",
            "specs": [
                [
                    ">=",
                    "1.5.0"
                ]
            ]
        },
        {
            "name": "pyparsing",
            "specs": [
                [
                    ">=",
                    "2.4.7"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    ">=",
                    "2.8.1"
                ]
            ]
        },
        {
            "name": "pytz",
            "specs": [
                [
                    ">=",
                    "2020.1"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    ">=",
                    "2.24.0"
                ]
            ]
        },
        {
            "name": "requests-oauthlib",
            "specs": [
                [
                    ">=",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "rsa",
            "specs": [
                [
                    ">=",
                    "4.6"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    ">=",
                    "1.4.1"
                ]
            ]
        },
        {
            "name": "six",
            "specs": [
                [
                    ">=",
                    "1.15.0"
                ]
            ]
        },
        {
            "name": "setuptools",
            "specs": [
                [
                    ">=",
                    "49.2.01"
                ]
            ]
        },
        {
            "name": "tensorboard",
            "specs": []
        },
        {
            "name": "tensorboard-plugin-wit",
            "specs": []
        },
        {
            "name": "tensorflow",
            "specs": []
        },
        {
            "name": "tensorflow-estimator",
            "specs": []
        },
        {
            "name": "tensorflow-probability",
            "specs": [
                [
                    ">=",
                    "0.13.0"
                ]
            ]
        },
        {
            "name": "termcolor",
            "specs": [
                [
                    ">=",
                    "1.1.0"
                ]
            ]
        },
        {
            "name": "tf-slim",
            "specs": [
                [
                    ">=",
                    "1.1.0"
                ]
            ]
        },
        {
            "name": "urllib3",
            "specs": [
                [
                    ">=",
                    "1.25.10"
                ]
            ]
        },
        {
            "name": "Werkzeug",
            "specs": [
                [
                    ">=",
                    "1.0.1"
                ]
            ]
        },
        {
            "name": "wrapt",
            "specs": [
                [
                    ">=",
                    "1.12.1"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    ">=",
                    "4.64.1"
                ]
            ]
        }
    ],
    "lcname": "dopamine-rl"
}
        
Elapsed time: 0.35598s