<div align="center">
<a href="http://www.offline-saferl.org"><img width="300px" height="auto" src="https://github.com/liuzuxin/dsrl/raw/main/docs/dsrl-logo.png"></a>
</div>
<br/>
<div align="center">
<a>![Python 3.8+](https://img.shields.io/badge/Python-3.8%2B-brightgreen.svg)</a>
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](#license)
</div>
---
**DSRL (Datasets for Safe Reinforcement Learning)** provides a rich collection of datasets specifically designed for offline Safe Reinforcement Learning (RL). Created with the objective of fostering progress in offline safe RL research, DSRL bridges a crucial gap in the availability of safety-centric public benchmarks and datasets.
DSRL provides:
1. **Diverse datasets:** 38 datasets across different safe RL environments and difficulty levels in SafetyGymnasium, BulletSafetyGym, and MetaDrive, all prepared with safety considerations.
2. **Consistent API with D4RL:** For easy use and evaluation of offline learning methods.
3. **Data post-processing filters:** Allowing alteration of data density, noise level, and reward distributions to simulate various data collection conditions.
This package is part of a comprehensive benchmarking suite that includes [FSRL](https://github.com/liuzuxin/fsrl) and [OSRL](https://github.com/liuzuxin/osrl), and aims to promote advancements in the development and evaluation of safe learning algorithms.
To learn more, please visit our [project website](http://www.offline-saferl.org).
<!-- To learn more, please visit our [project website](http://www.offline-saferl.org) or refer to our [documentation](./docs). -->
## Installation
Pull this repo and install:
```bash
git clone https://github.com/liuzuxin/DSRL.git
cd DSRL
# install bullet_safety_gym only (by default)
pip install -e .
# install mujoco-based safety_gymnasium
pip install -e .[mujoco]
# install metadrive
pip install -e .[metadrive]
# install all at once
pip install -e .[all]
```
## How to use DSRL
DSRL uses the [Gymnasium](https://gymnasium.farama.org/) API. Tasks are created via the `gymnasium.make` function. Each task is associated with a fixed offline dataset, which can be obtained with the `env.get_dataset()` method. This method returns a dictionary with:
- `observations`: An N × obs_dim array of observations.
- `next_observations`: An N × obs_dim array of next observations.
- `actions`: An N × act_dim array of actions.
- `rewards`: An N dimensional array of rewards.
- `costs`: An N dimensional array of costs.
- `terminals`: An N dimensional array of episode termination flags, set to true when an episode ends due to a termination condition such as falling over.
- `timeouts`: An N dimensional array of timeout flags, set to true when an episode ends by reaching the maximum episode length. (A sketch after the example below shows how to recover episode boundaries from these two flags.)
The usage is similar to [D4RL](https://github.com/Farama-Foundation/D4RL). Here is an example:
```python
import gymnasium as gym
import dsrl
# Create the environment
env = gym.make('OfflineCarCircle-v0')
# Each task is associated with a dataset
# dataset contains observations, next_observations, actions, rewards, costs, terminals, timeouts
dataset = env.get_dataset()
print(dataset['observations']) # An N x obs_dim Numpy array of observations
# dsrl follows the Gymnasium interface
obs, info = env.reset()
obs, reward, terminal, timeout, info = env.step(env.action_space.sample())
cost = info["cost"]
# Apply dataset filters [optional]
# dataset = env.pre_process_data(dataset, filter_cfgs)
```
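The `terminals` and `timeouts` flags together mark episode boundaries, so the flat arrays can be split back into per-episode sequences. Below is a minimal sketch continuing from the example above; the variable names (`done`, `ep_returns`, `ep_costs`) are illustrative, not part of the DSRL API:

```python
import numpy as np

# An episode ends wherever either flag is set.
done = dataset["terminals"].astype(bool) | dataset["timeouts"].astype(bool)
ends = np.where(done)[0]
starts = np.concatenate(([0], ends[:-1] + 1))

# Undiscounted return and cost return of each completed episode.
ep_returns = [dataset["rewards"][s:e + 1].sum() for s, e in zip(starts, ends)]
ep_costs = [dataset["costs"][s:e + 1].sum() for s, e in zip(starts, ends)]
print(f"{len(ends)} episodes recovered")
```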
Datasets are automatically downloaded to the `~/.dsrl/datasets` directory when `get_dataset()` is called. To change this location, set the `$DSRL_DATASET_DIR` environment variable to a directory of your choosing, or pass the dataset filepath directly to the `get_dataset` method.
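For example, to redirect the download cache from within Python (a sketch; the path is illustrative, and the variable should be set before `dsrl` is imported so it is in effect before any download):

```python
import os

# Point DSRL at a custom dataset cache before importing the package.
os.environ["DSRL_DATASET_DIR"] = "/data/dsrl_datasets"  # illustrative path

import gymnasium as gym
import dsrl

env = gym.make("OfflineCarCircle-v0")
dataset = env.get_dataset()  # now downloads under /data/dsrl_datasets
```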
You can run the following example scripts to play with the offline datasets of all the supported environments:
``` bash
python examples/run_mujoco.py --agent [your_agent] --task [your_task]
python examples/run_bullet.py --agent [your_agent] --task [your_task]
python examples/run_metadrive.py --road [your_road] --traffic [your_traffic]
```
### Normalizing Scores
- Set the target cost with the `env.set_target_cost(target_cost)` function, where `target_cost` is the undiscounted sum of costs over an episode.
- Use the `env.get_normalized_score(return, cost_return)` function to compute a normalized reward and cost for an episode, where `return` and `cost_return` are the undiscounted sums of rewards and costs of the episode (see the sketch below).
- The per-task minimum and maximum reference returns used for normalization are stored in `dsrl/infos.py`.
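Putting these together, here is a minimal evaluation sketch. It assumes, as described above, that `get_normalized_score` returns the normalized return and normalized cost as a pair; the random policy and the target cost of 10 are placeholders for your own agent and budget:

```python
# Evaluate one episode, then normalize its return and cost return.
env.set_target_cost(10.0)  # placeholder cost budget

obs, info = env.reset()
ep_ret, ep_cost, done = 0.0, 0.0, False
while not done:
    action = env.action_space.sample()  # replace with your policy
    obs, reward, terminal, timeout, info = env.step(action)
    ep_ret += reward
    ep_cost += info["cost"]
    done = terminal or timeout

norm_ret, norm_cost = env.get_normalized_score(ep_ret, ep_cost)
print(f"normalized return {norm_ret:.3f}, normalized cost {norm_cost:.3f}")
```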
## License
All datasets are licensed under the [Creative Commons Attribution 4.0 License (CC BY)](https://creativecommons.org/licenses/by/4.0/), and code is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.html).