# Agent Gym

[Discord](https://discord.gg/swarms) [YouTube](https://www.youtube.com/@kyegomez3242) [LinkedIn](https://www.linkedin.com/in/kye-g-38759a207/) [X](https://x.com/kyegomezb)
Convert any model into an R1-like reasoning agent. Agent Gym builds on TRL, Hugging Face, and various other libraries. It is a work in progress; the goal is to make it easy to train any model into a reasoning agent.
Sources:
- [Open R1 Blog](https://huggingface.co/blog/open-r1)
- [GRPO Documentation from TRL](https://huggingface.co/docs/trl/main/en/grpo_trainer)
- [Hugging Face Docs](https://huggingface.co/docs/transformers/main/en/index)
## Installation
```bash
pip3 install -U agentgym
```
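Before installing, it can help to confirm that the environment matches what the package metadata declares: `requires_python` is `<4.0,>=3.10`, and the listed requirements are unpinned. The following is a minimal preflight sketch, not part of `agentgym`; the helper names `missing_packages` and `python_supported` are hypothetical and exist only for this example.

```python
import sys
from importlib.util import find_spec

# Dependency names copied from the agentgym 0.0.2 package metadata.
REQUIRED = ["accelerate", "peft", "transformers", "datasets", "loguru",
            "trl", "torch", "wandb", "tqdm", "pydantic"]


def missing_packages(names):
    """Return the names that cannot be located as importable top-level modules."""
    return [name for name in names if find_spec(name) is None]


def python_supported(version=sys.version_info):
    """Return True when the interpreter falls in the declared >=3.10,<4.0 range."""
    return (3, 10) <= tuple(version[:2]) < (4, 0)


if __name__ == "__main__":
    print("Python OK:", python_supported())
    print("Missing:", missing_packages(REQUIRED) or "none")
```

Packages reported as missing are normally pulled in by `pip3 install -U agentgym`, so the check is mostly useful after installation or when debugging import errors.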
## Usage
```python
from agentgym.r1_pipeline import R1Pipeline, SFTConfig
# Base model, Hugging Face dataset ID, and SFT settings (outputs go to /tmp).
r1_pipeline = R1Pipeline(sft_model="gpt2", sft_dataset="stanfordnlp/imdb", sft_args=SFTConfig(output_dir="/tmp"))
# Run the pipeline.
r1_pipeline.run()
```
## Architecture
The pipeline has two training stages:
- SFT: Supervised Fine-Tuning
- GRPO: Group Relative Policy Optimization
model -> sft -> grpo -> reasoning model
```mermaid
graph TD;
A[model] --> B[sft]
B --> C[grpo]
C --> D[reasoning model]
```
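To make the GRPO stage more concrete, here is a dependency-free sketch of its core idea: several completions are sampled for the same prompt, each is scored by a reward function, and every reward is normalized against the mean and standard deviation of its own group. This is an illustration only, not the code `agentgym` runs. The `<think>`/`<answer>` template, the names `format_reward` and `group_relative_advantages`, and the `eps` value are assumptions made for this example; the reward function is merely shaped like the callables described in the TRL GRPO documentation, and real implementations may use a different standard-deviation estimator.

```python
import re
from statistics import mean, pstdev

# Hypothetical response template: reasoning in <think>, result in <answer>.
FORMAT = re.compile(r"^<think>.*?</think>\s*<answer>.*?</answer>$", re.DOTALL)


def format_reward(completions, **kwargs):
    """Score 1.0 for completions that follow the template, 0.0 otherwise."""
    return [1.0 if FORMAT.match(text.strip()) else 0.0 for text in completions]


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group: four sampled completions for the same prompt.
group = [
    "<think>2 + 2 is 4</think><answer>4</answer>",
    "The answer is 4.",
    "<think>guessing</think><answer>5</answer>",
    "4",
]
rewards = format_reward(group)
advantages = group_relative_advantages(rewards)
```

Completions that follow the template end up with positive advantages and the others with negative ones, so the policy update favors well-formatted reasoning relative to the rest of the group. Note that this reward checks format only; the third completion scores the same as the first despite giving a wrong answer, which is why format rewards are usually paired with a correctness reward.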
## License
MIT
Raw data
{
"_id": null,
"home_page": "https://github.com/The-Swarm-Corporation/AgentGym",
"name": "agentgym",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.10",
"maintainer_email": null,
"keywords": "artificial intelligence, deep learning, optimizers, Prompt Engineering",
"author": "Kye Gomez",
"author_email": "kye@apac.ai",
"download_url": "https://files.pythonhosted.org/packages/b9/2b/279cbbe392b6608dbf251032593b7cac7a40f8c8661d1e4206ac3182abe2/agentgym-0.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Agent Gym - Pytorch",
"version": "0.0.2",
"project_urls": {
"Documentation": "https://github.com/The-Swarm-Corporation/AgentGym",
"Homepage": "https://github.com/The-Swarm-Corporation/AgentGym",
"Repository": "https://github.com/The-Swarm-Corporation/AgentGym"
},
"split_keywords": [
"artificial intelligence",
" deep learning",
" optimizers",
" prompt engineering"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "670ae4b5639b379da18f947018aef7e31bd3ca0cd81735b1f73d4a77ce229f95",
"md5": "a3d52504fa03c6710d2bbe1024a2e5f9",
"sha256": "8f987431f6429283e5345bcddf8d9b66afc5940c3c54f3b06639c65e9c7cb022"
},
"downloads": -1,
"filename": "agentgym-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "a3d52504fa03c6710d2bbe1024a2e5f9",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.10",
"size": 9000,
"upload_time": "2025-01-29T19:19:48",
"upload_time_iso_8601": "2025-01-29T19:19:48.754641Z",
"url": "https://files.pythonhosted.org/packages/67/0a/e4b5639b379da18f947018aef7e31bd3ca0cd81735b1f73d4a77ce229f95/agentgym-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b92b279cbbe392b6608dbf251032593b7cac7a40f8c8661d1e4206ac3182abe2",
"md5": "7e2704723c0f73cb577d66afe40841de",
"sha256": "1a62ef68d0173f749dbc1eaf6c7f1e35d38d6ad2c7bb80baa3be0d4df1dcd181"
},
"downloads": -1,
"filename": "agentgym-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "7e2704723c0f73cb577d66afe40841de",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.10",
"size": 8861,
"upload_time": "2025-01-29T19:19:49",
"upload_time_iso_8601": "2025-01-29T19:19:49.795683Z",
"url": "https://files.pythonhosted.org/packages/b9/2b/279cbbe392b6608dbf251032593b7cac7a40f8c8661d1e4206ac3182abe2/agentgym-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-29 19:19:49",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "The-Swarm-Corporation",
"github_project": "AgentGym",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "accelerate",
"specs": []
},
{
"name": "peft",
"specs": []
},
{
"name": "transformers",
"specs": []
},
{
"name": "datasets",
"specs": []
},
{
"name": "loguru",
"specs": []
},
{
"name": "trl",
"specs": []
},
{
"name": "torch",
"specs": []
},
{
"name": "wandb",
"specs": []
},
{
"name": "tqdm",
"specs": []
},
{
"name": "pydantic",
"specs": []
}
],
"lcname": "agentgym"
}