dsrl

- Name: dsrl
- Version: 0.1.0
- Home page: https://github.com/liuzuxin/dsrl
- Summary: Datasets for Offline Safe Reinforcement Learning
- Upload time: 2023-06-14 05:08:20
- Author: DSRL contributors
- Requires Python: >=3.8
- License: Apache
- Keywords: datasets for offline safe reinforcement learning

<div align="center">
  <a href="http://www.offline-saferl.org"><img width="300px" height="auto" src="https://github.com/liuzuxin/dsrl/raw/main/docs/dsrl-logo.png"></a>
</div>

<br/>

<div align="center">

  <a>![Python 3.8+](https://img.shields.io/badge/Python-3.8%2B-brightgreen.svg)</a>
  [![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](#license)

</div>

---


**DSRL (Datasets for Safe Reinforcement Learning)** provides a rich collection of datasets specifically designed for offline Safe Reinforcement Learning (RL). Created with the objective of fostering progress in offline safe RL research, DSRL bridges a crucial gap in the availability of safety-centric public benchmarks and datasets. 

DSRL provides:

1. **Diverse datasets:** 38 datasets across different safe RL environments and difficulty levels in SafetyGymnasium, BulletSafetyGym, and MetaDrive, all prepared with safety considerations.
2. **Consistent API with D4RL:** For easy use and evaluation of offline learning methods.
3. **Data post-processing filters:** Allowing alteration of data density, noise level, and reward distributions to simulate various data collection conditions.

This package is a part of a comprehensive benchmarking suite that includes [FSRL](https://github.com/liuzuxin/fsrl) and [OSRL](https://github.com/liuzuxin/osrl) and aims to promote advancements in the development and evaluation of safe learning algorithms.

To learn more, please visit our [project website](http://www.offline-saferl.org).

<!-- To learn more, please visit our [project website](http://www.offline-saferl.org) or refer to our [documentation](./docs). -->

## Installation

Pull this repo and install:
```bash
git clone https://github.com/liuzuxin/DSRL.git
cd DSRL
# install bullet_safety_gym only (by default)
pip install -e .
# install mujoco-based safety_gymnasium
pip install -e .[mujoco]
# install metadrive
pip install -e .[metadrive]
# install all at once
pip install -e .[all]
```

## How to use DSRL
DSRL uses the [Gymnasium](https://gymnasium.farama.org/) API. Tasks are created via the `gymnasium.make` function. Each task is associated with a fixed offline dataset, which can be obtained with the `env.get_dataset()` method. This method returns a dictionary with:
- `observations`: An N × obs_dim array of observations.
- `next_observations`: An N × obs_dim array of next observations.
- `actions`: An N × act_dim array of actions.
- `rewards`: An N-dimensional array of rewards.
- `costs`: An N-dimensional array of costs.
- `terminals`: An N-dimensional array of episode termination flags. This is true when an episode ends due to a termination condition such as falling over.
- `timeouts`: An N-dimensional array of timeout flags. This is true when an episode ends due to reaching the maximum episode length. Together with `terminals`, these flags mark episode boundaries (see the sketch below).
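
Because the arrays are flat, the two flag arrays are all you need to split the dataset back into per-episode trajectories. Here is a minimal sketch of that reconstruction; the `split_into_episodes` helper is illustrative, not part of the DSRL API:

```python
import numpy as np

def split_into_episodes(dataset):
    """Split flat N-step arrays into per-episode segments.

    An episode ends wherever either `terminals` or `timeouts` is true.
    Assumes every value in `dataset` is an array with N leading entries.
    """
    done = np.logical_or(dataset["terminals"], dataset["timeouts"])
    ends = np.where(done)[0]  # indices of the final step of each episode
    episodes, start = [], 0
    for end in ends:
        episodes.append({k: v[start:end + 1] for k, v in dataset.items()})
        start = end + 1
    return episodes

# Example: undiscounted cost of the first episode
# ep_cost = split_into_episodes(dataset)[0]["costs"].sum()
```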

The usage is similar to [D4RL](https://github.com/Farama-Foundation/D4RL). Here is an example:

```python
import gymnasium as gym
import dsrl

# Create the environment
env = gym.make('OfflineCarCircle-v0')

# Each task is associated with a dataset
# dataset contains observations, next_observations, actions, rewards, costs, terminals, timeouts
dataset = env.get_dataset()
print(dataset['observations']) # An N x obs_dim Numpy array of observations

# dsrl abides by the Gymnasium interface
obs, info = env.reset()
obs, reward, terminal, timeout, info = env.step(env.action_space.sample())
cost = info["cost"]

# Apply dataset filters [optional]
# dataset = env.pre_process_data(dataset, filter_cfgs)
```

Datasets are automatically downloaded to the `~/.dsrl/datasets` directory when `get_dataset()` is called. If you would like to change the location of this directory, set the `$DSRL_DATASET_DIR` environment variable to the directory of your choosing, or pass the dataset filepath directly to the `get_dataset` method.
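
For example, to store datasets under a custom location instead of the default (the path below is just an illustration):

```bash
# Redirect DSRL's dataset cache away from ~/.dsrl/datasets
export DSRL_DATASET_DIR=/data/dsrl_datasets
```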

You can run the following example scripts to play with the offline datasets of all the supported environments:

``` bash
python examples/run_mujoco.py --agent [your_agent] --task [your_task]
python examples/run_bullet.py --agent [your_agent] --task [your_task]
python examples/run_metadrive.py --road [your_road] --traffic [your_traffic] 
```

### Normalizing Scores
- Set the target cost with the `env.set_target_cost(target_cost)` function, where `target_cost` is the undiscounted sum of costs over an episode.
- Use the `env.get_normalized_score(returns, cost_returns)` function to compute the normalized reward and cost for an episode, where `returns` and `cost_returns` are the undiscounted sums of rewards and costs of that episode (see the sketch below).
- The min and max reference returns for each task are stored in `dsrl/infos.py`.
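
Here is a minimal sketch of a full evaluation loop using these two functions, assuming the `OfflineCarCircle-v0` task from the example above and a hypothetical target cost of 20:

```python
import gymnasium as gym
import dsrl

env = gym.make('OfflineCarCircle-v0')
env.set_target_cost(20)  # hypothetical cost threshold, chosen for illustration

# Roll out one episode and accumulate undiscounted reward and cost
obs, info = env.reset()
ep_ret, ep_cost, done = 0.0, 0.0, False
while not done:
    obs, reward, terminal, timeout, info = env.step(env.action_space.sample())
    ep_ret += reward
    ep_cost += info["cost"]
    done = terminal or timeout

# Normalize against the reference returns stored in dsrl/infos.py
norm_ret, norm_cost = env.get_normalized_score(ep_ret, ep_cost)
print(f"normalized return: {norm_ret:.3f}, normalized cost: {norm_cost:.3f}")
```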


## License

All datasets are licensed under the [Creative Commons Attribution 4.0 License (CC BY)](https://creativecommons.org/licenses/by/4.0/), and code is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.html).

            
