# Matrix MDP
[![Downloads](https://pepy.tech/badge/matrix-mdp-gym)](https://pepy.tech/project/matrix-mdp-gym)
Easily generate an MDP from transition and reward matrices.
Want to learn more about the story behind this repo? Check out the blog post [here](https://www.paul-festor.com/post/i-created-a-python-library)!
## Installation
Install the package from PyPI with the following command:
```bash
pip install matrix-mdp-gym
```
## Usage
```python
import gymnasium as gym
import matrix_mdp
env = gym.make('matrix_mdp/MatrixMDP-v0')
```
## Environment documentation
### Description
A flexible environment providing a Gymnasium API for discrete MDPs with `N_s` states and `N_a` actions, given:
- An initial state distribution vector $P_0(S)$
- A transition probability matrix $P(S' | S, A)$
- A reward matrix $R(S', S, A)$ giving the reward for reaching state $S'$ after taking action $A$ in state $S$
### Action Space
The action is a `ndarray` with shape `(1,)` representing the index of the action to execute.
### Observation Space
The observation is a `ndarray` with shape `(1,)` representing the index of the state the agent is in.
### Rewards
The reward function is defined according to the reward matrix given at the creation of the environment.
### Starting State
The starting state is a random state sampled from $P_0$.
### Episode Truncation
The episode truncates when a terminal state is reached.
Terminal states are inferred from the transition probability matrix: a state $s$ is terminal when it has no outgoing probability mass, i.e.
$\sum_{s' \in S} \sum_{a \in A} P(s' | s, a) = 0$
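For illustration, this terminal-state check can be sketched in plain NumPy on a toy transition matrix (hypothetical values, using the `p[s', s, a]` layout described in the Arguments section below):

```python
import numpy as np

# Toy 3-state, 2-action transition matrix p[s', s, a] = P(s' | s, a).
n_states, n_actions = 3, 2
p = np.zeros((n_states, n_states, n_actions))
p[1, 0, :] = 1.0  # state 0 always moves to state 1
p[2, 1, :] = 1.0  # state 1 always moves to state 2
# state 2 has no outgoing probability mass, so it is terminal

# A state s is terminal when the sum over s' and a of P(s' | s, a) is zero.
terminal = p.sum(axis=(0, 2)) == 0
print(terminal.tolist())  # [False, False, True]
```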
### Arguments
- `p_0`: `ndarray` of shape `(n_states,)` representing the initial state probability distribution.
- `p`: `ndarray` of shape `(n_states, n_states, n_actions)` representing the transition dynamics $P(S' | S, A)$.
- `r`: `ndarray` of shape `(n_states, n_states, n_actions)` representing the reward matrix.
```python
import gymnasium as gym
import matrix_mdp

# p_0, p and r are ndarrays with the shapes described above
env = gym.make('matrix_mdp/MatrixMDP-v0', p_0=p_0, p=p, r=r)
```
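As a sketch of how the argument arrays fit together, the matrices for a toy two-state chain might be built as follows (the values are hypothetical, chosen only to illustrate the shapes and the normalization constraint):

```python
import numpy as np

n_states, n_actions = 2, 2

# Initial distribution p_0(s): always start in state 0.
p_0 = np.array([1.0, 0.0])

# Transition dynamics p[s', s, a] = P(s' | s, a).
p = np.zeros((n_states, n_states, n_actions))
p[1, 0, 0] = 1.0  # action 0 in state 0 reaches state 1
p[0, 0, 1] = 1.0  # action 1 in state 0 loops back to state 0
# state 1 has no outgoing mass, so it is terminal

# Reward r[s', s, a]: +1 for reaching state 1 via action 0 from state 0.
r = np.zeros((n_states, n_states, n_actions))
r[1, 0, 0] = 1.0

# Sanity check: outgoing probabilities from the non-terminal state sum to 1.
assert np.allclose(p[:, 0, :].sum(axis=0), 1.0)
```

These arrays can then be passed to `gym.make('matrix_mdp/MatrixMDP-v0', p_0=p_0, p=p, r=r)`.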
### Version History
* `v0`: Initial version release
## Acknowledgements
Thanks to [Will Dudley](https://github.com/WillDudley) for his help on learning how to put a Python package together.