| Field | Value |
| --- | --- |
| Name | sweagent |
| Version | 0.0.1 |
| home_page | https://swe-agent.com |
| Summary | The official SWE-agent package - an open source Agent Computer Interface for running language models as software engineers |
| upload_time | 2024-04-02 06:37:42 |
| maintainer | None |
| docs_url | None |
| author | John Yang |
| requires_python | >=3.9 |
| license | None |
| keywords | nlp, agents, code |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
<p align="center">
<a href="https://www.swe-agent.com/">
<img src="assets/swe-agent-banner.png" alt="swe-agent.com" />
</a>
</p>
<p align="center">
<a href="https://www.swe-agent.com"><strong>Website & Demo</strong></a> |
<a href="https://discord.gg/AVEFbBn2rH"><strong>Discord</strong></a> |
<strong>Paper [coming April 10th]</strong>
</p>
## 👋 Overview <a name="overview"></a>
SWE-agent automatically turns bug reports from real GitHub repos into pull requests, using an LM such as GPT-4.
On the full [SWE-bench](https://github.com/princeton-nlp/SWE-bench) test set, SWE-agent resolves **12.29%** of issues, the state-of-the-art result.
### ✨ Agent-Computer Interface (ACI) <a name="aci"></a>
We accomplish these results by designing simple LM-centric commands and specially-built input and output formats that make it easier for the LM to browse the repository, view, edit, and execute code files. We call this an **Agent-Computer Interface** (ACI), and we built the SWE-agent repository to make it easy to iterate on ACI design for repository-level coding agents.
Just as typical language model usage requires good prompt engineering, good ACI design leads to much better results when using agents. As we show in our paper, a baseline agent without a well-tuned ACI does much worse than SWE-agent.
SWE-agent contains features that we discovered to be immensely helpful during the agent-computer interface design process (an illustrative sketch of these conventions follows the list):
1. We add a linter that runs when an edit command is issued, and we do not let the edit command go through if the code isn't syntactically correct.
2. We supply the agent with a specially-built file viewer, instead of having it just `cat` files. We found that this file viewer works best when displaying just 100 lines in each turn. The file editor that we built has commands for scrolling up and down and for performing a search within the file.
3. We supply the agent with a specially-built full-directory string-searching command. We found that it was important for this tool to succinctly list the matches: we simply list each file that had at least one match. Showing the model more context about each match proved to be too confusing for the model.
4. When commands have an empty output, we return a message saying "Your command ran successfully and did not produce any output."
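To make these conventions concrete, here is a minimal, hypothetical Python sketch of the output behaviours described above. The function names, the viewer header format, and the search summary line are illustrative assumptions, not SWE-agent's actual code; only the empty-output message in item 4 is quoted from the tool itself.
```python
# Hypothetical sketch of the ACI output conventions (not SWE-agent's implementation).
from pathlib import Path

WINDOW = 100  # the viewer shows roughly 100 lines per turn (see item 2)


def format_command_output(output: str) -> str:
    """Replace empty output with an explicit confirmation message (item 4)."""
    if output.strip() == "":
        return "Your command ran successfully and did not produce any output."
    return output


def open_window(path: str, first_line: int = 1) -> str:
    """Show a fixed-size window of a file instead of `cat`-ing it whole (item 2)."""
    lines = Path(path).read_text().splitlines()
    window = lines[first_line - 1 : first_line - 1 + WINDOW]
    header = (
        f"[File: {path} ({len(lines)} lines total), "
        f"showing lines {first_line}-{first_line + len(window) - 1}]"
    )
    return "\n".join([header, *window])


def search_dir(term: str, directory: str = ".") -> str:
    """List only the files containing the term, not every matching line (item 3)."""
    hits = sorted(
        str(p)
        for p in Path(directory).rglob("*")
        if p.is_file() and term in p.read_text(errors="ignore")
    )
    if not hits:
        return format_command_output("")
    return f'Found matches for "{term}" in {len(hits)} file(s):\n' + "\n".join(hits)
```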
Read our paper for more details.
```
@misc{yang2024sweagent,
title={SWE-agent: Agent Computer Interfaces Enable Software Engineering Language Models},
author={John Yang and Carlos E. Jimenez and Alexander Wettig and Shunyu Yao and Karthik Narasimhan and Ofir Press},
year={2024},
}
```
## 🚀 Setup <a name="setup"></a>
1. [Install Docker](https://docs.docker.com/engine/install/), then start Docker locally.
2. [Install Miniconda](https://docs.anaconda.com/free/miniconda/miniconda-install/), then create the `swe-agent` environment with `conda env create -f environment.yml`.
3. Activate using `conda activate swe-agent`.
4. Run `./setup.sh` to create the `intercode-swe` docker image.
5. Create a `keys.cfg` file at the root of this repository and fill in the following:
```
OPENAI_API_KEY: 'OpenAI API Key Here if using OpenAI Model (optional)'
ANTHROPIC_API_KEY: 'Anthropic API Key Here if using Anthropic Model (optional)'
GITHUB_TOKEN: 'GitHub Token Here (required)'
```
See the following links for tutorials on obtaining [Anthropic](https://docs.anthropic.com/claude/reference/getting-started-with-the-api), [OpenAI](https://platform.openai.com/docs/quickstart/step-2-set-up-your-api-key), and GitHub tokens.
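The `KEY: 'value'` lines shown above happen to be valid YAML, so if you want to sanity-check your `keys.cfg` before running anything, a small standalone snippet can load and verify it. This is a hypothetical convenience check, not how SWE-agent reads the file:
```python
# Standalone sanity check for keys.cfg (hypothetical helper, not SWE-agent code).
import yaml  # pip install pyyaml

with open("keys.cfg") as f:
    keys = yaml.safe_load(f)  # the KEY: 'value' lines parse as a flat mapping

assert keys.get("GITHUB_TOKEN"), "GITHUB_TOKEN is required"
assert keys.get("OPENAI_API_KEY") or keys.get("ANTHROPIC_API_KEY"), (
    "Provide an OpenAI or Anthropic API key for the model you plan to use"
)
print("keys.cfg looks complete")
```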
## 💽 Usage <a name="usage"></a>
There are two steps to the SWE-agent pipeline. First, SWE-agent takes an input GitHub issue and returns a pull request that attempts to fix it; we call that step *inference*. The second step (only available for issues in the SWE-bench set) is to *evaluate* the pull request to verify that it has indeed fixed the issue.
### 👩‍💻 Inference <a name="inference"></a>
**Inference on *any* GitHub Issue**: Using this script, you can run SWE-agent on any GitHub issue!
```
python run.py --model_name gpt4 \
--data_path https://github.com/pvlib/pvlib-python/issues/1603
```
**Inference on SWE-bench**: Run SWE-agent on [SWE-bench Lite](https://www.swebench.com/lite.html) and generate patches.
```
python run.py --model_name gpt4 \
    --per_instance_cost_limit 2.00 \
    --config_file ./config/default.yaml
```
If you'd like to run on a *single* issue from SWE-bench, use the `--instance_filter` option as follows:
```
python run.py --model_name gpt4 \
--instance_filter marshmallow-code__marshmallow-1359
```
* See the [`scripts/`](scripts/) folder for details about using `run.py`.
* See the [`config/`](config/) folder for details about how you can define your own configuration!
* See the [`agent/`](agent/) folder for details about the logic behind configuration-based workflows.
* See the [`intercode/`](intercode/) folder for details about the `SWEEnv` environment (interface + implementation).
* See the [`trajectories/`](trajectories) folder for details about the output of `run.py`.
### 🧪 Evaluation <a name="evaluation"></a>
This step is only available for issues from the SWE-bench set. To evaluate generated pull requests:
```
cd evaluation/
./run_eval.sh <predictions_path>
```
Replace `<predictions_path>` with the path to the model's predictions, which should be generated from the *Inference* step. The `<predictions_path>` argument should look like `../trajectories/<username>/<model>-<dataset>-<hyperparams>/all_preds.jsonl`.
* See the [`evaluation/`](evaluation/) folder for details about how evaluation works.
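Putting the two steps together, a minimal end-to-end driver might look like the sketch below. It only chains the commands documented above; the predictions path is illustrative (it depends on your username, model, and settings), so substitute the actual `all_preds.jsonl` that `run.py` produced.
```python
# Hypothetical end-to-end driver: inference on SWE-bench Lite, then evaluation.
import subprocess

# 1. Inference (same flags as in the Usage section above).
subprocess.run(
    [
        "python", "run.py",
        "--model_name", "gpt4",
        "--per_instance_cost_limit", "2.00",
        "--config_file", "./config/default.yaml",
    ],
    check=True,
)

# 2. Evaluation, run from the evaluation/ directory. Replace the placeholder
#    segments with the directory that run.py actually created.
predictions = "../trajectories/<username>/<model>-<dataset>-<hyperparams>/all_preds.jsonl"
subprocess.run(["./run_eval.sh", predictions], cwd="evaluation", check=True)
```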
## 💫 Contributions <a name="contributions"></a>
- If you'd like to ask questions, learn about upcoming features, and participate in future development, join our [Discord community](https://discord.gg/AVEFbBn2rH)!
- If you'd like to contribute to the codebase, we welcome [issues](https://github.com/princeton-nlp/SWE-agent/issues) and [pull requests](https://github.com/princeton-nlp/SWE-agent/pulls)!
- If you'd like to see a post or tutorial about some topic, please let us know via an [issue](https://github.com/princeton-nlp/SWE-agent/issues).
## 🪪 License <a name="license"></a>
MIT. Check `LICENSE`.
Raw data

```json
{
"_id": null,
"home_page": "https://swe-agent.com",
"name": "sweagent",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "nlp, agents, code",
"author": "John Yang",
"author_email": "byjohnyang@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/55/74/b76c1f7f0f6de2199279d30b4c2a104ad3307bce10b4debe5de6d0a8c22d/sweagent-0.0.1.tar.gz",
"platform": null,
"description": "<p align=\"center\">\n <a href=\"https://www.swe-agent.com/\">\n <img src=\"assets/swe-agent-banner.png\" alt=\"swe-agent.com\" />\n </a>\n</p>\n\n\n<p align=\"center\">\n <a href=\"https://www.swe-agent.com\"><strong>Website & Demo</strong></a> | \n <a href=\"https://discord.gg/AVEFbBn2rH\"><strong>Discord</strong></a> | \n <strong>Paper [coming April 10th]</strong>\n</p>\n\n\n## \ud83d\udc4b Overview <a name=\"overview\"></a>\nSWE-agent automatically turns bug reports from real GitHub repos into pull requests, using an LM such as GPT-4. \n\nOn the full [SWE-bench](https://github.com/princeton-nlp/SWE-bench) test set, SWE-agent resolves **12.29%** of issues, achieving the state-of-the-art result on the full test set.\n\n\n### \u2728 Agent-Computer Interface (ACI) <a name=\"aci\"></a>\nWe accomplish these results by designing simple LM-centric commands and specially-built input and output formats to make it easier for the LM to browse the repository, view, edit and execute code files. We call this **Agent-Computer Interface** (ACI) and build the SWE-agent repository to make it easy to iterate on ACI design for repository-level coding agents. \n\nJust like typical language model use requires good prompt engineering, good ACI design leads to much better results when using agents. As we show in our paper, a baseline agent without a well-tuned ACI does much worse than SWE-agent.\n\nSWE-agent contains features that we discovered to be immensly helpful during the agent-computer interface design process:\n1. We add a linter that runs when an edit command is issued, and do not let the edit command go through if the code isn't syntactically correct.\n2. We supply the agent with a special-built file viewer, instead of having it just ```cat``` files. We found that this file viewer works best when displaying just 100 lines in each turn. The file editor that we built has commands for scrolling up and down and for performing a search within the file.\n3. We supply the agent with a special-built full-directory string searching command. We found that it was important for this tool to succintly list the matches- we simply list each file that had at least one match. Showing the model more context about each match proved to be too confusing for the model. \n4. When commands have an empty output we return a message saying \"Your command ran successfully and did not produce any output.\"\n\nRead our paper for more details.\n\n```\n@misc{yang2024sweagent,\n title={SWE-agent: Agent Computer Interfaces Enable Software Engineering Language Models}, \n author={John Yang and Carlos E. Jimenez and Alexander Wettig and Shunyu Yao and Karthik Narasimhan and Ofir Press},\n year={2024},\n}\n```\n\n## \ud83d\ude80 Setup <a name=\"setup\"></a>\n1. [Install Docker](https://docs.docker.com/engine/install/), then start Docker locally.\n2. [Install Miniconda](https://docs.anaconda.com/free/miniconda/miniconda-install/), then create the `swe-agent` environment with `conda env create -f environment.yml`\n3. Activate using `conda activate swe-agent`.\n4. Run `./setup.sh` to create the `intercode-swe` docker image.\n5. 
Create a `keys.cfg` file at the root of this repository and fill in the following:\n```\nOPENAI_API_KEY: 'OpenAI API Key Here if using OpenAI Model (optional)'\nANTHROPIC_API_KEY: 'Anthropic API Key Here if using Anthropic Model (optional)'\nGITHUB_TOKEN: 'GitHub Token Here (required)'\n```\nSee the following links for tutorials on obtaining [Anthropic](https://docs.anthropic.com/claude/reference/getting-started-with-the-api), [OpenAI](https://platform.openai.com/docs/quickstart/step-2-set-up-your-api-key), and [Github]() tokens.\n\n## \ud83d\udcbd Usage <a name=\"usage\"></a>\nThere are two steps to the SWE-agent pipeline. First SWE-agent takes an input GitHub issue and returns a pull request that attempts to fix it. We call that step *inference*. The second step (only available for issues in the SWE-bench set) is to *evaluate* the pull request to verify that it has indeed fixed the issue. \n\n### \ud83d\udc69\u200d\ud83d\udcbb Inference <a name=\"inference\"></a>\n**Inference on *any* GitHub Issue**: Using this script, you can run SWE-agent on any GitHub issue!\n```\npython run.py --model_name gpt4 \\\n --data_path https://github.com/pvlib/pvlib-python/issues/1603\n```\n\n**Inference on SWE-bench**: Run SWE-agent on [SWE-bench Lite](https://www.swebench.com/lite.html) and generate patches.\n```\npython run.py --model_name gpt4 \\\n --per_instance_cost_limit 2.00 \\\n --config_file ./config/default.yaml \\\n --per_instance_cost_limit 2.00\n```\n\nIf you'd like to run on a *single* issue from SWE-bench, use the `--instance_filter` option as follows:\n```\npython run.py --model_name gpt4 \\\n --instance_filter marshmallow-code__marshmallow-1359\n```\n* See the [`scripts/`](scripts/) folder for details about using `run.py`.\n* See the [`config/`](config/) folder for details about how you can define your own configuration!\n* See the [`agents/`](agent/) folder for details about the logic behind configuration based workflows.\n* See the [`intercode/`](intercode/) folder for details about the `SWEEnv` environment (interface + implementation).\n* See the [`trajectories/`](trajectories) folder for details about the output of `run.py`.\n\n### \ud83e\uddea Evaluation <a name=\"evaluation\"></a>\nThis step is only available for issues from the SWE-bench set. To evaluate generated pull requests:\n```\ncd evaluation/\n./run_eval.sh <predictions_path>\n```\nReplace `<predictions_path>` with the path to the model's predictions, which should be generated from the *Inference* step. The `<predictions_path>` arguments should look like `../trajectories/<username>/<model>-<dataset>-<hyperparams>/all_preds.jsonl`\n* See the [`evaluation/`](evaluation/) folder for details about how evaluation works.\n\n\n## \ud83d\udcab Contributions <a name=\"contributions\"></a>\n- If you'd like to ask questions, learn about upcoming features, and participate in future development, join our [Discord community](https://discord.gg/AVEFbBn2rH)!\n- If you'd like to contribute to the codebase, we welcome [issues](https://github.com/princeton-nlp/SWE-agent/issues) and [pull requests](https://github.com/princeton-nlp/SWE-agent/pulls)!\n- If you'd like to see a post or tutorial about some topic, please let us know via an [issue](https://github.com/princeton-nlp/SWE-agent/issues).\n\n## \ud83e\udeaa License <a name=\"license\"></a>\nMIT. Check `LICENSE`.\n",
"bugtrack_url": null,
"license": null,
"summary": "The official SWE-agent package - an open source Agent Computer Interface for running language models as software engineers",
"version": "0.0.1",
"project_urls": {
"Bug Reports": "http://github.com/princeton-nlp/SWE-agent/issues",
"Documentation": "https://github.com/princeton-nlp/SWE-agent",
"Homepage": "https://swe-agent.com",
"Source Code": "http://github.com/princeton-nlp/SWE-agent",
"Website": "https://sweagent.com"
},
"split_keywords": [
"nlp",
" agents",
" code"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9fa3e5f2ebd4fe1900da9a41a5571277561580c6d5741307444839e6ecb41354",
"md5": "9601dabce2bccb180a16a5ebcf644048",
"sha256": "c9804233c229f10a9f0a17b3aca8462b621d04852b510dd07dfd5585ccee642d"
},
"downloads": -1,
"filename": "sweagent-0.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9601dabce2bccb180a16a5ebcf644048",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 41055,
"upload_time": "2024-04-02T06:37:40",
"upload_time_iso_8601": "2024-04-02T06:37:40.831116Z",
"url": "https://files.pythonhosted.org/packages/9f/a3/e5f2ebd4fe1900da9a41a5571277561580c6d5741307444839e6ecb41354/sweagent-0.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5574b76c1f7f0f6de2199279d30b4c2a104ad3307bce10b4debe5de6d0a8c22d",
"md5": "8993aae0d1453c502158f00f67a70c14",
"sha256": "17f6465b4da9cc0b3efbb071d4edf6025f078d74cff8b49c0e9fe7826621cff5"
},
"downloads": -1,
"filename": "sweagent-0.0.1.tar.gz",
"has_sig": false,
"md5_digest": "8993aae0d1453c502158f00f67a70c14",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 40009,
"upload_time": "2024-04-02T06:37:42",
"upload_time_iso_8601": "2024-04-02T06:37:42.665589Z",
"url": "https://files.pythonhosted.org/packages/55/74/b76c1f7f0f6de2199279d30b4c2a104ad3307bce10b4debe5de6d0a8c22d/sweagent-0.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-02 06:37:42",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "princeton-nlp",
"github_project": "SWE-agent",
"github_not_found": true,
"lcname": "sweagent"
}
```