# Mighty-RL

- **Version**: 1.0.0
- **Summary**: A modular, meta-learning-ready RL library.
- **Requires Python**: >=3.11, <3.12
- **License**: BSD 3-Clause License (Copyright (c) 2023, AutoML Hannover, all rights reserved)
- **Keywords**: reinforcement learning, metarl, generalization in rl
- **Upload time**: 2025-08-08 11:49:23
<p align="center">
    <a href="./docs/img/logo.png">
        <img src="./docs/img/logo_with_font.svg" alt="Mighty Logo" width="80%"/>
    </a>
</p>

<div align="center">
    
[![PyPI Version](https://img.shields.io/pypi/v/mighty-rl.svg)](https://pypi.org/project/Mighty-RL/)
![Python](https://img.shields.io/badge/Python-3.11-3776AB)
![License](https://img.shields.io/badge/License-BSD3-orange)
[![Test](https://github.com/automl/Mighty/actions/workflows/test.yaml/badge.svg)](https://github.com/automl/Mighty/actions/workflows/test.yaml)
[![Doc Status](https://github.com/automl/Mighty/actions/workflows/docs_test.yaml/badge.svg)](https://github.com/automl/Mighty/actions/workflows/docs_test.yaml)
    
</div>

<div align="center">
    <h3>
      <a href="#installation">Installation</a> |
      <a href="https://automl.github.io/Mighty/">Documentation</a> |
      <a href="#run-a-mighty-agent">Run a Mighty Agent</a> |
      <a href="#cite-us">Cite Us</a>
    </h3>
</div>

---

# Mighty

Welcome to Mighty, hopefully your future one-stop shop for everything contextual RL (cRL).
Currently, Mighty is still in its early stages, with support for standard Gymnasium environments, DACBench, and CARL.
The interface is controlled through Hydra, and we provide DQN, PPO, and SAC algorithms.
We log training and regular evaluations to file, and optionally also to wandb.
If you have any questions or feedback, please let us know, ideally via GitHub issues!
If you want to get started immediately, use our [Template repository](https://github.com/automl/mighty_project_template).

Mighty features:
- Modular structure for easy (Meta-)RL tinkering
- PPO, SAC and DQN as base algorithms
- Environment integrations via Gymnasium, Pufferlib, CARL & DACBench
- Implementations of some important baselines: RND, PLR, Cosine LR Schedule and more!

## Installation
We recommend using uv to install and run Mighty in a virtual environment.
The code has been tested with Python 3.11 on Unix systems.

First, create a clean Python environment:

```bash
uv venv --python=3.11
source .venv/bin/activate
```

Then, from a clone of the repository, install Mighty:

```bash
make install
```

Optionally, you can install the dev requirements directly:
```bash
make install-dev
```

Alternatively, you can install Mighty from PyPI:
```bash
pip install mighty-rl
```
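
Whichever route you take, a quick import check confirms the install worked. A minimal sketch, assuming the top-level package name is `mighty`, matching the repository layout:

```bash
# Sanity check: the package name 'mighty' is assumed from the repo layout.
python -c "import mighty; print('Mighty is installed')"
```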

## Run a Mighty Agent
To run a Mighty agent, use the `run_mighty.py` script and provide any training options as keyword arguments.
If you want to know more about the configuration options, call:
```bash
python mighty/run_mighty.py --help
```

An example of running the PPO agent on the Pendulum Gymnasium environment looks like this:
```bash
python mighty/run_mighty.py 'algorithm=ppo' 'environment=gymnasium/pendulum'
```
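
Hydra overrides compose freely, so hyperparameters can be adjusted from the command line in the same call. A minimal sketch, where the buffer-size key is taken from the CARL example below and `seed` is an assumed config entry:

```bash
# Sketch: composes multiple overrides in one call. The buffer-size key is
# copied from the CARL example below; 'seed' is an assumed config entry.
python mighty/run_mighty.py 'algorithm=ppo' 'environment=gymnasium/pendulum' \
    'algorithm_kwargs.rollout_buffer_kwargs.buffer_size=2048' 'seed=42'
```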

### Train your Agent on a CARL Environment
Mighty is designed with contextual RL in mind and is therefore fully compatible with CARL.
Before you start training, however, please follow the installation instructions in the [CARL repo](https://github.com/automl/CARL).

Then use the same command as before, but provide the CARL environment (in this example `CARLCartPole`)
and information about the context distribution as keyword arguments:
```bash
python mighty/run_mighty.py 'algorithm=ppo' 'env=CARLCartPole' \
    '+env_kwargs.num_contexts=10' \
    '+env_kwargs.context_feature_args.gravity=[normal, 9.8, 1.0, -100.0, 100.0]' \
    'env_wrappers=[mighty.mighty_utils.wrappers.FlattenVecObs]' \
    'algorithm_kwargs.rollout_buffer_kwargs.buffer_size=2048'
```

For more complex configurations like this, we recommend creating an environment configuration file. Check out our [CARL Ant](mighty/configs/environment/carl_walkers/ant_goals.yaml) config to see how this simplifies working with configurable environments.
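
Assuming the config group mirrors the Pendulum example above, such a file reduces the whole command to a single override:

```bash
# Assumes 'environment=carl_walkers/ant_goals' resolves to
# mighty/configs/environment/carl_walkers/ant_goals.yaml, following the
# same pattern as 'environment=gymnasium/pendulum' above.
python mighty/run_mighty.py 'algorithm=ppo' 'environment=carl_walkers/ant_goals'
```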

### Learning a Configuration Policy via DAC

To use Mighty with DACBench, you need to install DACBench first.
We recommend following the instructions in the [DACBench repo](https://github.com/automl/DACBench).

Afterwards, configure the benchmark you want to run. Since most DACBench benchmarks have Dict action and observation spaces, some of them fairly complex, you might need to wrap the benchmarks to translate observations and actions into an easy-to-handle format. We have a version of the FunctionApproximationBenchmark configured for you, so you can get started like this:
```bash
python mighty/run_mighty.py 'algorithm=ppo' 'environment=dacbench/function_approximation'
```
The matching [configuration file](mighty/configs/environment/dacbench/function_approximation.yaml) shows you how to set the search spaces and benchmark type. Refer to DACBench itself to learn how to configure other elements like observation spaces or instance sets.
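
If you do run into a benchmark with Dict spaces, the `FlattenVecObs` wrapper from the CARL example above is one candidate for flattening observations; whether it suffices depends on the benchmark. A hypothetical pairing:

```bash
# Hypothetical: reuses the FlattenVecObs wrapper from the CARL example above;
# benchmarks with complex Dict action spaces may still need a custom wrapper.
python mighty/run_mighty.py 'algorithm=ppo' 'environment=dacbench/function_approximation' \
    'env_wrappers=[mighty.mighty_utils.wrappers.FlattenVecObs]'
```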

### Optimize Hyperparameters
You can optimize the hyperparameters of your algorithm with the [Hypersweeper](https://github.com/automl/hypersweeper) package, e.g. using [SMAC3](https://github.com/automl/SMAC3). Mighty is directly compatible with Hypersweeper and thus with smart, distributed HPO! There are also other HPO options; check out our [examples](examples/README.md) for more information.
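
Even without an external sweeper, Hydra's built-in `--multirun` flag already launches simple grid sweeps over overrides, which is a useful baseline before moving to SMAC3 via Hypersweeper. A sketch, with `seed` again an assumed config entry:

```bash
# Plain Hydra multirun: a grid sweep over seeds ('seed' is an assumed key).
# For SMAC3-driven HPO, swap in Hypersweeper's sweeper as described in its docs.
python mighty/run_mighty.py --multirun 'algorithm=ppo' \
    'environment=gymnasium/pendulum' 'seed=0,1,2'
```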

## Build Your Own Mighty Project
If you want to implement your own method in Mighty, we recommend using the [Mighty template repository](https://github.com/automl/mighty_project_template) as a base. It contains a runscript, the most relevant config files, and basic scripts for plotting. Our [domain randomization example](https://github.com/automl/mighty_dr_example) shows how you can get started right away. Since Mighty offers many options for implementing your idea, here's a rough guide to which Mighty class you'll want to look at:

```mermaid
stateDiagram
  direction TB
  classDef Neutral stroke-width:1px,stroke-dasharray:none,stroke:#000000,fill:#FFFFFF,color:#000000;
  classDef Peach stroke-width:1px,stroke-dasharray:none,stroke:#FBB35A,fill:#FFEFDB,color:#8F632D;
  classDef Aqua stroke-width:1px,stroke-dasharray:none,stroke:#46EDC8,fill:#DEFFF8,color:#378E7A;
  classDef Sky stroke-width:1px,stroke-dasharray:none,stroke:#374D7C,fill:#E2EBFF,color:#374D7C;
  classDef Pine stroke-width:1px,stroke-dasharray:none,stroke:#254336,fill:#8faea5,color:#FFFFFF;
  classDef Rose stroke-width:1px,stroke-dasharray:none,stroke:#FF5978,fill:#FFDFE5,color:#8E2236;
  classDef Ash stroke-width:1px,stroke-dasharray:none,stroke:#999999,fill:#EEEEEE,color:#000000;
  classDef Seven fill:#E1BEE7,color:#D50000,stroke:#AA00FF;
  Still --> root_end:Yes
  Still --> Moving:No
  Moving --> Crash:Yes
  Moving --> s2:No, only current transitions, env and network
  s2 --> s6:Action Sampling
  s2 --> s10:Policy Update
  s2 --> s8:Training Batch Sampling
  s2 --> Crash:More than one/not listed
  s2 --> s12:Direct algorithm change
  s12 --> s13:Yes
  s12 --> s14:No
  Still:Modify training settings and then repeated runs?
  root_end:Runner
  Moving:Access to update infos (gradients, batches, etc.)?
  Crash:Meta Component
  s2:Which interaction point with the algorithm?
  s6:Exploration Policy
  s10:Update
  s8:Buffer
  s12:Change only the model architecture?
  s13:Network and/or Model
  s14:Agent
  class root_end Peach
  class Crash Aqua
  class s6 Sky
  class s8 Pine
  class s10 Rose
  class s13 Ash
  class s14 Seven
  class Still Neutral
  class Moving Neutral
  class s2 Neutral
  class s12 Neutral
  style root_end color:none
  style s8 color:#FFFFFF
```

## Pre-Implemented Methods
Mighty is meant to be a platform to build upon and not a large collection of methods in itself. We have a few relevant methods pre-implemented, however, and this collection will likely grow over time:

- **Agents**: SAC, PPO, DQN
- **Updates**: SAC, PPO, Q-learning, double Q-learning, clipped double Q-learning
- **Buffers**: Rollout Buffer, Replay Buffer, Prioritized Replay Buffer
- **Exploration Policies**: ε-greedy (with and without decay), εz-greedy, standard stochastic
- **Models** (with MLP, CNN or ResNet backbone): SAC, PPO, DQN (with soft and hard reset options)
- **Meta Components**: RND, NovelD, SPaCE, PLR
- **Runners**: online RL runner, ES runner

## Cite Us

If you use Mighty in your work, please cite us:

```bibtex
@misc{mohaneimer24,
  author = {A. Mohan and T. Eimer and C. Benjamins and M. Lindauer and A. Biedenkapp},
  title  = {Mighty},
  year   = {2024},
  url    = {https://github.com/automl/mighty}
}
```

            
