agentgym


- Name: agentgym
- Version: 0.0.2
- Home page: https://github.com/The-Swarm-Corporation/AgentGym
- Summary: Agent Gym - Pytorch
- Upload time: 2025-01-29 19:19:49
- Maintainer: None
- Docs URL: None
- Author: Kye Gomez
- Requires Python: `<4.0,>=3.10`
- License: MIT
- Keywords: artificial intelligence, deep learning, optimizers, prompt engineering
- Requirements: accelerate, peft, transformers, datasets, loguru, trl, torch, wandb, tqdm, pydantic
- Travis-CI: none
- Coveralls test coverage: none

# Agent Gym
![Agent Gym](images/steps.png)


[![Join our Discord](https://img.shields.io/badge/Discord-Join%20our%20server-5865F2?style=for-the-badge&logo=discord&logoColor=white)](https://discord.gg/swarms) [![Subscribe on YouTube](https://img.shields.io/badge/YouTube-Subscribe-red?style=for-the-badge&logo=youtube&logoColor=white)](https://www.youtube.com/@kyegomez3242) [![Connect on LinkedIn](https://img.shields.io/badge/LinkedIn-Connect-blue?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/kye-g-38759a207/) [![Follow on X.com](https://img.shields.io/badge/X.com-Follow-1DA1F2?style=for-the-badge&logo=x&logoColor=white)](https://x.com/kyegomezb)

Convert any model into an R1-like reasoning agent. Agent Gym builds on TRL, the Hugging Face libraries, and various other packages. This is a work in progress; the goal is to make it easy to train any model into a reasoning agent.


Sources:

- [Open R1 Blog](https://huggingface.co/blog/open-r1)
- [GRPO Trainer documentation (TRL)](https://huggingface.co/docs/trl/main/en/grpo_trainer)
- [Hugging Face Transformers documentation](https://huggingface.co/docs/transformers/main/en/index)


## Installation

```bash
pip3 install -U agentgym
```
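
The package metadata declares `requires_python` as `<4.0,>=3.10`, so it can help to confirm the interpreter version before installing. The following is a minimal sketch; the helper name `python_supported` is illustrative and not part of agentgym.

```python
import sys


def python_supported(version_info=sys.version_info):
    """Return True if the interpreter satisfies >=3.10,<4.0 (from the package metadata)."""
    major_minor = tuple(version_info[:2])
    return (3, 10) <= major_minor < (4, 0)


if __name__ == "__main__":
    if python_supported():
        print("Python version is compatible with agentgym 0.0.2")
    else:
        print("agentgym 0.0.2 requires Python >=3.10,<4.0")
```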

## Usage

```python
from agentgym.r1_pipeline import R1Pipeline, SFTConfig

# Base model, SFT dataset, and SFT training arguments (outputs go to /tmp).
r1_pipeline = R1Pipeline(
    sft_model="gpt2",
    sft_dataset="stanfordnlp/imdb",
    sft_args=SFTConfig(output_dir="/tmp"),
)

r1_pipeline.run()
```

## Architecture

The architecture is as follows:

- SFT: Supervised Fine-Tuning
- GRPO: Group Relative Policy Optimization
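
The "group relative" part of GRPO (Group Relative Policy Optimization) refers to how completions are scored. Several completions are sampled for the same prompt, each gets a reward, and each reward is normalized against the rest of its group. The sketch below illustrates only that normalization step in plain Python. It is not AgentGym's or TRL's implementation, and the function name, the `eps` value, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize the rewards of one group of completions for a single prompt.

    Each advantage is (reward - group mean) / (group std + eps). The eps term
    avoids division by zero when every completion gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; real implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Four completions for one prompt: two rewarded, two not.
    advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
    print([round(a, 2) for a in advantages])  # [1.0, -1.0, -1.0, 1.0]
```

Completions that score above their group's mean get a positive advantage and are reinforced; those below the mean get a negative one. If all completions in a group receive the same reward, every advantage is zero and the group contributes no learning signal.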

model -> sft -> grpo -> reasoning model

```mermaid
graph TD;
    A[model] --> B[sft]
    B --> C[grpo]
    C --> D[reasoning model]
```
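
The GRPO stage in the diagram needs a reward signal to decide which completions to reinforce. TRL's GRPO trainer documentation describes custom reward functions as callables that receive the generated completions and return one float per completion. The sketch below is a hypothetical format reward in that style. It assumes plain-string completions and an R1-style layout with reasoning inside `<think>` tags followed by an answer. Neither the tag layout nor the function name comes from AgentGym.

```python
import re

# Assumed layout: non-empty reasoning inside <think>...</think>, then a non-empty answer.
THINK_PATTERN = re.compile(r"^<think>.+?</think>\s*\S.*$", re.DOTALL)


def format_reward(completions, **kwargs):
    """Return 1.0 for each completion that follows the assumed layout, else 0.0.

    Extra keyword arguments (for example prompts or dataset columns) are
    accepted and ignored.
    """
    return [1.0 if THINK_PATTERN.match(text.strip()) else 0.0 for text in completions]


if __name__ == "__main__":
    samples = ["<think>2 + 2 = 4</think> 4", "4"]
    print(format_reward(samples))  # [1.0, 0.0]
```

A format reward like this only checks structure, not correctness, so in practice it would usually be combined with a task-specific reward.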

## License
MIT

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/The-Swarm-Corporation/AgentGym",
    "name": "agentgym",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.10",
    "maintainer_email": null,
    "keywords": "artificial intelligence, deep learning, optimizers, Prompt Engineering",
    "author": "Kye Gomez",
    "author_email": "kye@apac.ai",
    "download_url": "https://files.pythonhosted.org/packages/b9/2b/279cbbe392b6608dbf251032593b7cac7a40f8c8661d1e4206ac3182abe2/agentgym-0.0.2.tar.gz",
    "platform": null,
    "description": "# Agent Gym\n![Agent Gym](images/steps.png)\n\n\n[![Join our Discord](https://img.shields.io/badge/Discord-Join%20our%20server-5865F2?style=for-the-badge&logo=discord&logoColor=white)](https://discord.gg/swarms) [![Subscribe on YouTube](https://img.shields.io/badge/YouTube-Subscribe-red?style=for-the-badge&logo=youtube&logoColor=white)](https://www.youtube.com/@kyegomez3242) [![Connect on LinkedIn](https://img.shields.io/badge/LinkedIn-Connect-blue?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/kye-g-38759a207/) [![Follow on X.com](https://img.shields.io/badge/X.com-Follow-1DA1F2?style=for-the-badge&logo=x&logoColor=white)](https://x.com/kyegomezb)\n\nConvert any model into a r1-like reasoning hyper-intelligent agent. Leverages TRL, Huggingface, and various other libraries. This is a work in progress. Our goal is to make it easy to train any model into a reasoning agent.\n\n\n- Sources:\n- [Open R1 Blog](https://huggingface.co/blog/open-r1)\n- [GRPO Documentation from trl](https://huggingface.co/docs/trl/main/en/grpo_trainer)\n- [Huggingface Docs](https://huggingface.co/docs/transformers/main/en/index)\n- [GRPO Docs](https://huggingface.co/docs/trl/main/en/grpo_trainer)\n\n\n## Installation\n\n```bash\npip3 install -U agentgym\n```\n\n## Usage\n\n```python\nfrom agentgym.r1_pipeline import R1Pipeline, SFTConfig\n\nr1_pipeline = R1Pipeline(sft_model=\"gpt2\", sft_dataset=\"stanfordnlp/imdb\", sft_args=SFTConfig(output_dir=\"/tmp\"))\n\nr1_pipeline.run()\n```\n\n## Architecture\n\nThe architecture is as follows:\n\n- SFT: Supervised Fine-Tuning\n- GRPO: Generative Reinforcement Policy Optimization\n\n-> model -> sft -> grpo -> model\n\n```mermaid\ngraph TD;\n    A[model] --> B[sft]\n    B --> C[grpo]\n    C --> D[reasoning model]\n```\n\n# License\nMIT\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Agent Gym - Pytorch",
    "version": "0.0.2",
    "project_urls": {
        "Documentation": "https://github.com/The-Swarm-Corporation/AgentGym",
        "Homepage": "https://github.com/The-Swarm-Corporation/AgentGym",
        "Repository": "https://github.com/The-Swarm-Corporation/AgentGym"
    },
    "split_keywords": [
        "artificial intelligence",
        " deep learning",
        " optimizers",
        " prompt engineering"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "670ae4b5639b379da18f947018aef7e31bd3ca0cd81735b1f73d4a77ce229f95",
                "md5": "a3d52504fa03c6710d2bbe1024a2e5f9",
                "sha256": "8f987431f6429283e5345bcddf8d9b66afc5940c3c54f3b06639c65e9c7cb022"
            },
            "downloads": -1,
            "filename": "agentgym-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a3d52504fa03c6710d2bbe1024a2e5f9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.10",
            "size": 9000,
            "upload_time": "2025-01-29T19:19:48",
            "upload_time_iso_8601": "2025-01-29T19:19:48.754641Z",
            "url": "https://files.pythonhosted.org/packages/67/0a/e4b5639b379da18f947018aef7e31bd3ca0cd81735b1f73d4a77ce229f95/agentgym-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b92b279cbbe392b6608dbf251032593b7cac7a40f8c8661d1e4206ac3182abe2",
                "md5": "7e2704723c0f73cb577d66afe40841de",
                "sha256": "1a62ef68d0173f749dbc1eaf6c7f1e35d38d6ad2c7bb80baa3be0d4df1dcd181"
            },
            "downloads": -1,
            "filename": "agentgym-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "7e2704723c0f73cb577d66afe40841de",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.10",
            "size": 8861,
            "upload_time": "2025-01-29T19:19:49",
            "upload_time_iso_8601": "2025-01-29T19:19:49.795683Z",
            "url": "https://files.pythonhosted.org/packages/b9/2b/279cbbe392b6608dbf251032593b7cac7a40f8c8661d1e4206ac3182abe2/agentgym-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-29 19:19:49",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "The-Swarm-Corporation",
    "github_project": "AgentGym",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "accelerate",
            "specs": []
        },
        {
            "name": "peft",
            "specs": []
        },
        {
            "name": "transformers",
            "specs": []
        },
        {
            "name": "datasets",
            "specs": []
        },
        {
            "name": "loguru",
            "specs": []
        },
        {
            "name": "trl",
            "specs": []
        },
        {
            "name": "torch",
            "specs": []
        },
        {
            "name": "wandb",
            "specs": []
        },
        {
            "name": "tqdm",
            "specs": []
        },
        {
            "name": "pydantic",
            "specs": []
        }
    ],
    "lcname": "agentgym"
}
        