pymops 1.0.5

- Home page: https://github.com/awolseid/pymops
- Summary: A multi-agent simulation-based optimization package for power scheduling
- Upload time: 2023-10-18 15:48:52
- Author: Awol Seid Ebrie and Young Jin Kim
- Requires Python: >=3.9
- License: MIT
- Keywords: economic dispatch, power scheduling, reinforcement learning, unit commitment
- Requirements: numpy, pandas, scipy, torch, tqdm

# `pymops`: A multi-agent simulation-based optimization package for power scheduling

## About

Power scheduling is an NP-hard optimization problem characterized by high dimensionality, a combinatorial nature, non-convex, non-smooth, and discontinuous properties, and multiple constraints over multiple periods.

- Two sequential tasks:
  - Unit Commitment
  - Load Dispatch:
    - Economic Load Dispatch
    - Environmental Load Dispatch

Power scheduling aims to determine an optimal load dispatch schedule that simultaneously minimizes different conflicting objectives, particularly economic costs and environmental emissions.

`pymops` is an open-source Python package developed for solving single- to tri-objective optimization in power scheduling problems. The package is built on a novel multi-agent simulation environment, where power-generating units are represented as agents. The agents are heterogeneous, each with multiple conflicting objectives. The scheduling dynamics are simulated using Markov Decision Processes (MDPs), which are used to train a deep reinforcement learning model for solving the optimization problem. 



### Objective Function

The general multi-objective function is formulated by combining different conflicting objectives via a hybrid approach that uses both weighting hyperparameters and unit-specific cost-to-emission conversion factors:

$$\cal \Phi(C,E)=\sum\limits_{t=1}^{24}\sum\limits_{i=1}^n[\omega_0C_{ti}+\sum\limits_{h=1}^m\omega_h\eta_{ih}E_{ti}^{(h)}]$$

where $\eta_i$ denotes the cost-to-emission conversion parameter of unit $i$, defined as
$$\eta_i = \exp\left[\frac{\nabla C^{on}(p_i)/\nabla E^{on}(p_i)}{\max\left[\nabla C^{on}(p_i)/\nabla E^{on}(p_i);\forall i\right]-\min\left[\nabla C^{on}(p_i)/\nabla E^{on}(p_i);\forall i\right]}\right];\forall i$$
and $\omega_h$, $h=0,1,\dots,m$, is the weight hyperparameter associated with objective $h$ ($h=0$ corresponding to the economic cost).
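As a concrete illustration of how the terms combine, the minimal sketch below evaluates $\Phi$ from pre-computed hourly costs and emissions; the array names (`C`, `E`, `w`, `eta`) are illustrative and not part of the package API.

```python
import numpy as np

def combined_objective(C, E, w, eta):
    """C: (24, n) costs; E: (m, 24, n) emissions; w: (m+1,) weights; eta: (n, m) conversion factors."""
    m = E.shape[0]
    total = w[0] * C.sum()                            # economic term, weighted by omega_0
    for h in range(m):                                # emission types h = 1, ..., m
        total += w[h + 1] * (eta[:, h] * E[h]).sum()  # weighted, converted emission terms
    return total
```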

**Economic Cost Functions**:

$$\cal C_{ti}=z_{ti}C^{on}(p_{ti})+z_{ti}(1-z_{t-1,i})C_{ti}^{su}+(1-z_{ti})z_{t-1,i}C_{ti}^{sd};\forall i,t$$ 
where the production cost of an online unit is
$$\cal C^{on}(p_{ti})=a_i^cp_{ti}^2+b_i^cp_{ti}+c_i^c+|d_i^c \sin[e^c_i(p_{i}^{min}+p_{ti})]|;\forall i,t$$

**Environmental Emission Functions**: 
$$\cal E_{ti}=z_{ti}E^{on}(p_{ti})+z_{ti}(1-z_{t-1,i})E_{ti}^{su}+(1-z_{ti})z_{t-1,i}E_{ti}^{sd};\forall i,t$$
where the emission of an online unit is
$$\cal E^{on}(p_{ti})=a_i^ep_{ti}^2+b_i^ep_{ti}+c_i^e+d_i^e\exp(e^e_ip_{ti});\forall i,t$$
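For reference, the per-unit production cost and emission functions above can be sketched as follows; the coefficient names are illustrative, and these helper functions are not part of the package API.

```python
import numpy as np

def production_cost(p, a, b, c, d, e, p_min):
    # Quadratic fuel cost plus the absolute-sine valve-point term
    return a * p**2 + b * p + c + abs(d * np.sin(e * (p_min + p)))

def emission(p, a, b, c, d, e):
    # Quadratic emission plus the exponential term
    return a * p**2 + b * p + c + d * np.exp(e * p)
```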



| Constraints                              | Specification                            |
| ---------------------------------------- | ---------------------------------------- |
| Minimum and maximum power capacities:    | $\cal z_{ti}p_{i}^{min}\le p_{ti}\le z_{ti}p_{i}^{max}$ |
| Maximum ramp-down and ramp-up rates:     | $\cal z_{t-1,i}p_{t-1,i}-z_{ti}p_{ti}\le p_{i}^{down}$ and $\cal z_{ti}p_{ti}-z_{t-1,i}p_{t-1,i}\le p_{i}^{up}$ |
| Minimum operating (online/offline) durations: | $\cal tt_{ti}^{ON}\ge tt_{i}^{ON}$ and $\cal tt_{ti}^{OFF}\ge tt_{i}^{OFF}$ |
| Power supply and demand balance:         | $\cal \sum\limits_{i=1}^nz_{ti}p_{ti}=d_t$ |
| Minimum available reserve:               | $\cal \sum\limits_{i=1}^nz_{ti}p_{ti}^{max}\ge (1+ r) d_t$ |
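For example, the supply-demand balance and reserve constraints can be checked for a candidate hourly commitment as in the hypothetical sketch below (array names are illustrative; the environment performs such corrections automatically).

```python
import numpy as np

def satisfies_demand_and_reserve(z_t, p_t, p_max, d_t, r):
    """z_t: ON/OFF statuses; p_t: dispatched outputs; p_max: capacities; d_t: demand; r: reserve rate."""
    supply_ok  = np.isclose((z_t * p_t).sum(), d_t)       # supply-demand balance
    reserve_ok = (z_t * p_max).sum() >= (1 + r) * d_t     # minimum spinning reserve
    return bool(supply_ok and reserve_ok)
```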




### The Multi-Agent Reinforcement Learning (MARL) Framework
The MARL framework is expressed in terms of a state space $\cal S$, an action space $\cal A$, a transition (probability) function $\cal P$, and a reward function $\cal R$.

- **Planning Horizon**: The scheduling horizon is a day divided into hourly periods.
  - **Timestep/Period**: Each hour of the day is considered a timestep.

  - **Episode**: One complete cycle of determining unit commitments and load dispatches for a day.

- **Simulation Environment**: Custom MARL simulation environment, structurally similar to OpenAI Gym.
  - Supports mono-objective to tri-objective scheduling problems (cost, CO2, and SO2).

  - Ramp rate constraints and valve point effects are taken into account.

- **Agents**: The generating units are represented as multiple agents.
  - The agents are heterogeneous (different generating-unit-specific characteristics).
  - Each agent has multiple conflicting objectives.
  - The agents are cooperative RL agents:
    - Agents collaborate to satisfy the demand at each period/timestep.

    - Agents also strive to minimize the multi-objective function over the entire planning horizon.

- **State Space**: Consists of the timestep, minimum and maximum capacities, operating (online/offline) durations, and the demand to be satisfied.

- **Action Space**: The commitment statuses (ON/OFF) of all agents.

- **Transition Function**: The probability of transitioning from the current state to the next state (no closed-form expression).
  - Agent decisions that violate any constraint are automatically corrected by the environment.

  - The environment also adjusts for both excesses and shortages of power supply.

- **Reward Function**: Agents receive a common reward, defined as the inverse of the average of the normalized values of all objectives (see the sketch after this list).


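A minimal sketch of this common-reward idea, assuming per-objective minimum and maximum values are available for normalization (the exact normalization used inside the package is not shown here):

```python
import numpy as np

def common_reward(objective_values, obj_min, obj_max):
    """All arguments are arrays of length equal to the number of objectives."""
    normalized = (objective_values - obj_min) / (obj_max - obj_min + 1e-12)
    return 1.0 / (normalized.mean() + 1e-12)   # inverse of the average normalized objective
```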

The multi-objective power scheduling (MOPS) dynamics can be simulated as a 4-tuple $\cal (S,A,P,R)$ MDP:
- The simulated MDPs are the input for the deep RL model.

- The deep RL model predicts the decisions (actions) of the agents.

- The predicted actions are the input for the transition function of the environment (a full episode loop is sketched in the usage examples below).



## Installation

The simulation environment can be installed using `pip`:

```
pip install pymops
```

Or it can be cloned from the GitHub repository and installed:

```
git clone https://github.com/awolseid/pymops.git
cd pymops
pip install .
```

### Import package

```python
import pymops
from pymops.environ import SimEnv
```

### Create simulation environment

```python
env = SimEnv(
    supply_df = default_supply_df,  # Units' profile dataframe
    demand_df = default_demand_df,  # Demand profile dataframe
    SR = 0.0,          # Proportion of spinning reserve, in [0, 1]
    RR = "Yes",        # Ramp rate constraints: "yes" or "no" (default None = "no")
    VPE = None,        # Valve point effects: "yes" or "no" (default None = "no")
    n_objs = None,     # Objectives: "tri" for tri-objective (default None = bi-objective)
    w = None,          # Weight in [0, 1] for bi-objective, or a list such as [0.2, 0.3, 0.5] for tri-objective
    duplicates = None  # Number of duplicates: duplicate units and adjust demands proportionally
)
```

#### Reset environment

```python
initial_flat_state, initial_dict_state = env.reset()
```

#### Get current state

```python
flat_state, dict_state = env.get_current_state()
```

#### Execute decision (action) of agents

```python
import numpy as np

action_vec = np.array([1, 1, 0, 1, 0, 0, 0, 0, 0, 0])  # ON/OFF decision for each of the 10 units
flat_next_state, reward, done, next_state_dict, dispatch_info = env.step(action_vec)
```
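Putting these calls together, one full episode can be simulated as in the minimal sketch below; the random policy is only a placeholder for the trained RL agents, and the 10-unit action vector assumes the default supply profile shown above.

```python
import numpy as np

flat_state, dict_state = env.reset()
done = False
while not done:
    # Placeholder policy: random ON/OFF decisions for 10 units (replace with a trained model)
    action_vec = np.random.randint(0, 2, size=10)
    flat_state, reward, done, dict_state, dispatch_info = env.step(action_vec)
```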

## Developing and training a (customized) model

### Import packages

```python
from pymops.define_dqn import DQNet
from pymops.madqn import DQNAgents
from pymops.replaymemory import ReplayMemory
from pymops.schedules import get_schedules
```

### Define model

```python
model_0 = DQNet(env, 64)
print(model_0)
```

### Create instance

```python
RL_agents = DQNAgents(
    environ = env,
    model = model_0,
    epsilon_max = 1.0,
    epsilon_min = 0.1,
    epsilon_decay = 0.99,
    lr = 0.001
)
```
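A common interpretation of the exploration parameters above is an exponentially decaying epsilon-greedy schedule. The sketch below illustrates that interpretation only; it is not necessarily the exact schedule implemented inside `DQNAgents`.

```python
# Illustrative epsilon-greedy decay (assumed interpretation of epsilon_max/min/decay)
epsilon_max, epsilon_min, epsilon_decay = 1.0, 0.1, 0.99

epsilon = epsilon_max
for episode in range(500):
    # ... act epsilon-greedily and learn from a sampled batch ...
    epsilon = max(epsilon_min, epsilon * epsilon_decay)
```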

### Replay memory

```python
memory = ReplayMemory(environ = env, buffer_size = 64)
```

### Train model

```python
training_results_df = RL_agents.train(memory = memory, batch_size = 64, num_episodes = 500)
```

### Get schedule solutions

```python
cost, emis, CO2, SO2, schedules_df = get_schedules(environ = env, trained_agents = RL_agents)
schedules_df
```
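Assuming `schedules_df` is returned as a pandas DataFrame (as its name suggests), it can, for example, be saved for later analysis:

```python
# Persist the resulting schedule (assumes schedules_df is a pandas DataFrame)
schedules_df.to_csv("schedules.csv", index=False)
```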

### Contact Information
For any questions, issues, suggestions, or collaboration opportunities, contact: awolseid@pukyong.ac.kr or youngk@pknu.ac.kr.


### Citation

Users should cite the following resources. 

- Code Ocean Reproducible Capsule: https://codeocean.com/capsule/0242917/tree
  - **Ebrie, A.S.**; **Kim, Y.J.** (2023). pymops: *A multi-agent reinforcement learning simulation environment for multi-objective optimization in power scheduling* [Software Code]. https://doi.org/10.24433/CO.9235622.v1
- **[Article](https://www.mdpi.com/1996-1073/16/16/5920)** produced from the very first version of the package:
  - **Ebrie, A.S.**; **Paik, C.**; **Chung, Y.**; **Kim, Y.J.** (2023). *Environment-Friendly Power Scheduling Based on Deep Contextual Reinforcement Learning*. *Energies*, 16, 5920. https://doi.org/10.3390/en16165920

            
