# stubborn

- **Name:** stubborn
- **Version:** 0.0.2
- **Home page:** https://github.com/cool-RR/stubborn
- **Summary:** An Environment for Evaluating Stubbornness between Agents with Aligned Incentives
- **Author:** Ram Rachum
- **Upload time:** 2023-04-23 20:22:46
- **Requirements:** click==8.0.4, frozendict==2.3.7, more-itertools==9.1.0, numpy==1.23.5, pandas==2.0.0, plotly==5.14.1, pyyaml==6.0, ray==2.1.0, tensorflow==2.12.0
# Stubborn: An Environment for Evaluating Stubbornness between Agents with Aligned Incentives

<!--* [Video](http://r.rachum.com/stubborn-workshop-video)-->
<!--* [Deck](http://r.rachum.com/stubborn-deck)-->

*Stubborn* is an experiment in the field of [multi-agent reinforcement learning](https://en.wikipedia.org/wiki/Multi-agent_reinforcement_learning). Its goal is to see whether reinforcement learning agents can learn to communicate important information to each other by fighting with each other, even though they are "on the same side". By running the experiment and generating plots with the commands documented below, you can replicate the results shown in our paper. By modifying the environment rules as defined in the code, you can extend the experiment to investigate this scenario in other ways.
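The environment rules live in the package's source. As a hedged illustration of the *kind* of structure you would be modifying — not the actual *Stubborn* implementation, and with all names here hypothetical — a minimal two-agent, common-payoff environment might look like this:

```python
import random


class MiniCooperativeEnv:
    """Toy two-agent environment with fully aligned incentives.

    Hypothetical sketch, not the real Stubborn environment: each agent
    privately observes a noisy hint about which of two doors pays off,
    then both agents vote.  Both receive the *same* reward, so neither
    can gain at the other's expense.
    """

    def __init__(self, hint_accuracy: float = 0.8) -> None:
        self.hint_accuracy = hint_accuracy
        self.good_door = None

    def reset(self) -> tuple:
        # Pick the rewarding door and give each agent a private, noisy hint.
        self.good_door = random.randrange(2)
        return (self._hint(), self._hint())

    def _hint(self) -> int:
        # With probability `hint_accuracy`, the hint points at the good door.
        if random.random() < self.hint_accuracy:
            return self.good_door
        return 1 - self.good_door

    def step(self, vote_a: int, vote_b: int) -> tuple:
        # Shared reward: +1 if the agents agree on the good door, 0 otherwise.
        # Identical payoffs model the fully-aligned-incentives setting.
        reward = 1.0 if vote_a == vote_b == self.good_door else 0.0
        return (reward, reward)
```

Agents that simply follow their own hints will sometimes disagree; the interesting question *Stubborn* asks is what happens when one of them refuses to back down.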

*Stubborn* will be presented at the [Workshop on Rebellion and Disobedience in AI](https://sites.google.com/view/rad-ai/) at [The International Conference on Autonomous Agents and Multiagent Systems](https://aamas2023.soton.ac.uk/). Read the [full paper](http://r.rachum.com/stubborn-paper). Abstract:

> Recent research in multi-agent reinforcement learning (MARL) has shown success in learning social behavior and cooperation. Social dilemmas between agents in mixed-sum settings have been studied extensively, but there is little research into social dilemmas in fully cooperative settings, where agents have no prospect of gaining reward at another agent’s expense.
>
> While fully-aligned interests are conducive to cooperation between agents, they do not guarantee it. We propose a measure of "stubbornness" between agents that aims to capture the human social behavior from which it takes its name: a disagreement that is gradually escalating and potentially disastrous. We would like to promote research into the tendency of agents to be stubborn, the reactions of counterpart agents, and the resulting social dynamics.
>
> In this paper we present Stubborn, an environment for evaluating stubbornness between agents with fully-aligned incentives. In our preliminary results, the agents learn to use their partner’s stubbornness as a signal for improving the choices that they make in the environment. [Continue reading...](http://r.rachum.com/stubborn-paper)


## Installation

```shell
python3 -m venv "${HOME}/stubborn_env"
source "${HOME}/stubborn_env/bin/activate"
pip3 install stubborn
```


## Documentation

Show the list of available commands:

```shell
python3 -m stubborn --help
```

Show arguments and options for a specific command:

```shell
python3 -m stubborn run --help
```


## Basic usage

### Running

Run the *Stubborn* experiment, training agents and evaluating their performance:

```shell
python3 -m stubborn run
```

### Plotting

Two plot commands are available. By default, each draws a plot for your most recent run.

Draw a plot showing the rewards of both agents as they learn:

```shell
python3 -m stubborn plot-reward
```

![plot-reward](misc/images/plot-reward.png)
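If you prefer to post-process run outputs yourself rather than use the built-in plot command, the smoothing step can be sketched with the standard library alone. This is a hedged sketch: the file layout and the `reward` column name below are assumptions for illustration, not the package's actual output format.

```python
import csv
from collections import deque


def moving_average(values, window: int = 10):
    """Trailing moving average, commonly used to smooth noisy reward curves."""
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out


def load_rewards(path: str, column: str = "reward"):
    """Read one reward column from a CSV of per-episode results.

    The path and column name are hypothetical; adapt them to the files
    your runs actually produce.
    """
    with open(path, newline="") as f:
        return [float(row[column]) for row in csv.DictReader(f)]
```

For example, `moving_average([1, 2, 3, 4], window=2)` yields `[1.0, 1.5, 2.5, 3.5]`.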

Draw a plot showing the insistence of one agent as a function of the other agent's stubbornness, defined as $\zeta_{n,d}$ in the paper:

```shell
python3 -m stubborn plot-insistence
```

![plot-insistence](misc/images/plot-insistence.png)



## Citing

If you use *Stubborn* in your research, please cite the accompanying paper:

```bibtex
@article{Rachum2023Stubborn,
  title={Stubborn: An Environment for Evaluating Stubbornness between Agents with Aligned Incentives},
  author={Rachum, Ram and Nakar, Yonatan and Mirsky, Reuth},
  year = {2023},
  journal = {Proceedings of the Workshop on Rebellion and Disobedience in AI at The International Conference on Autonomous Agents and Multiagent Systems}
}
```
