| Field | Value |
| --- | --- |
| Name | toolbrain |
| Version | 0.1.3 |
| home_page | None |
| Summary | A framework for training LLM-powered agents to use tools more effectively using Reinforcement Learning |
| upload_time | 2025-10-07 08:37:18 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.10 |
| license | None |
| keywords | llm, agents, reinforcement-learning, tools, ai |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# ToolBrain 🧠

ToolBrain is a lightweight open-source Python library for training **agentic systems** with effective tool usage and built-in reinforcement learning.
📚 Website: [toolbrain.org](https://toolbrain.org). See also the [documentation and tutorials](docs/source/tutorials/tutorials.md).
📚 Watch the introduction [video](https://www.youtube.com/watch?v=LhYiIHTRw7E).
Support us by giving ToolBrain a ⭐ on GitHub.
## ✨ Key Features
- **🤖 Learning algorithms**: Supports [GRPO](examples/02_lightgbm_hpo_training_with_grpo/run_hpo_training.py), [DPO](examples/04_lightgbm_hpo_training_with_dpo/run_hpo_training.py), and [supervised learning](examples/05_supervised_training.py).
- **🎯 Flexible rewards**: Define your own [reward functions](examples/09_flexible_rewards.py) or use [LLM-as-judge](examples/10_llm_as_judge.py).
- **🔧 Tool management**: Scalable [retrieval](examples/06_tool_retrieval.py) for managing large tool collections.
- **📊 Knowledge distillation**: [Distill](examples/08_distillation.py) large teacher models into smaller student models for efficiency.
- **🚀 Zero-learn**: Automatically [generate training tasks](examples/03_generate_training_examples.py).
- **⚡ Efficient training**: Supports [FP16 finetuning](examples/13_hello_world_fp16.py), LoRA, Unsloth, and [BitsAndBytes](examples/12_hello_world_bitsandbytes.py) for [resource-efficient training](examples/07_email_search_agent/).
- **🧠 Multiple agent frameworks**: Supports SmolAgents and [LangChain](examples/11_train_langchain_agent.py), with more coming soon.
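To make the *Flexible rewards* item concrete, here is a minimal sketch of what a user-defined reward could look like. The name `reward_numeric_match`, its `(final_answer, gold_answer)` arguments, and the 0-to-1 scale are assumptions made for illustration. This README does not show the signature ToolBrain passes to a reward function, so check `examples/09_flexible_rewards.py` for the real interface.

```python
import re


def reward_numeric_match(final_answer: str, gold_answer: str) -> float:
    """Hypothetical reward: 1.0 if the last number in the answer equals the gold value."""
    # Pull every integer or decimal out of the agent's final answer.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", final_answer)
    if not numbers:
        return 0.0  # no number produced, so no reward
    try:
        return 1.0 if float(numbers[-1]) == float(gold_answer) else 0.0
    except ValueError:
        return 0.0  # the gold answer was not numeric


print(reward_numeric_match("The sum is 12", "12"))
print(reward_numeric_match("I could not compute it", "12"))
```

Comparing parsed numbers rather than raw strings is one possible design choice: it tolerates extra wording around the answer, at the cost of ignoring everything except the final number.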
## 🚀 Getting Started
### Prerequisites
- Python **3.10+**
### Installation
Create a conda environment (optional):
```bash
conda create --name toolbrain python=3.12
conda activate toolbrain
```
Install from PyPI:
```bash
pip install toolbrain
```
Or from the source code:
```bash
git clone https://github.com/ToolBrain/ToolBrain.git
```
Enter the cloned folder and run:
```bash
pip install .
```
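After installing, a quick sanity check can confirm that the interpreter meets the Python 3.10+ prerequisite and that the package is visible. This is a small sketch using only the standard library; the helper name `installed_version` is ours, not part of ToolBrain.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# The package metadata declares requires_python >= 3.10.
python_ok = sys.version_info >= (3, 10)
print("Python OK:", python_ok)
print("toolbrain version:", installed_version("toolbrain"))
```

If the second line prints `None`, the package is not installed in the active environment (for example, because the conda environment was not activated).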
### Run the Example
Run the complete hello-world example to see ToolBrain in action (see the `examples` folder for more advanced usage):
```bash
python examples/01_run_hello_world.py
```
This will:
- Initialize a `CodeAgent` with simple math tools
- Define a customised reward function
- Run the GRPO algorithm
## 📖 Usage Example
Here's a minimal example of how to use ToolBrain. The script demonstrates the simplified ToolBrain API:
1. Create a smolagents `CodeAgent`
2. Create a `Brain`, ToolBrain's main class
3. Train the agent with the GRPO algorithm
```python
from smolagents import tool, TransformersModel, CodeAgent
from toolbrain import Brain
from toolbrain.rewards import reward_exact_match


# --- 1. Define tools (user-defined); the reward function is imported above ---
@tool
def add(a: int, b: int) -> int:
    """
    Add two integers.

    Args:
        a (int): First addend.
        b (int): Second addend.

    Returns:
        int: Sum of a and b.
    """
    return a + b


# --- 2. Prepare training data ---
training_dataset = [
    {
        "query": "Use the add tool to calculate 5 + 7",
        "gold_answer": "12",
    }
]

# --- 3. Create the agent ---
model = TransformersModel(
    model_id="Qwen/Qwen2.5-0.5B-Instruct",  # use a bigger model for better results
    max_new_tokens=128,
)
agent = CodeAgent(
    model=model,
    tools=[add],
    max_steps=1,
)

# --- 4. Create the Brain ---
brain = Brain(
    agent,  # agent instance
    algorithm="GRPO",  # algorithm choice
    reward_func=reward_exact_match,  # any Python function can serve as the reward
)

# --- 5. Train the agent with GRPO ---
brain.train(training_dataset, num_iterations=10)
```
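For readers new to GRPO: the algorithm samples several rollouts for the same query, scores each one with the reward function, and uses the reward standardised within that group as the advantage signal. The sketch below shows only that normalisation step in plain Python. The function `group_relative_advantages` is illustrative and not part of ToolBrain's API, and ToolBrain's internals may differ (for example, in how the standard deviation is estimated).

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Standardise rewards within one group of rollouts for the same query."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    # eps avoids division by zero when every rollout gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts for "5 + 7": two answered 12 (reward 1.0), two did not (reward 0.0).
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))
```

Rollouts that beat the group average get a positive advantage and the rest a negative one, so no separate value network is needed. When all rollouts receive the same reward, every advantage is zero and that query contributes no learning signal.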
### Results
The following plot illustrates how ToolBrain enhances the tool usage accuracy of the small Qwen/Qwen2.5-0.5B-Instruct model after just 20 training steps using GRPO.

## 📄 License
This project is licensed under the MIT License - see the [LICENSE](https://opensource.org/licenses/MIT) for details.
## 🌍 Community contributions
Our vision is for **ToolBrain** to become the universal Reinforcement Learning layer for any agentic framework. Whether you build your agents with **LangChain**, **SmolAgents**, **LlamaIndex**, **AutoGen**, or a custom solution, you should be able to make them smarter with ToolBrain.
The key to this vision is our **modular Adapter architecture**. Adding support for a new framework is as simple as implementing a new adapter that translates the agent's internal state into ToolBrain's standard *Execution Trace*.
We welcome community contributions!
If you are using an agent framework not yet supported, we encourage you to build an adapter for it.
Check out our [`CONTRIBUTING.md`](./CONTRIBUTING.md) guide and the existing implementations in the [`toolbrain/adapters/`](./toolbrain/adapters/) directory to get started.
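To give a feel for the adapter idea, here is a toy sketch of translating a framework-specific record into a common trace format. Every name in it (`TraceStep`, `AgentAdapter`, `CallLogAdapter`, and the fields they carry) is hypothetical and chosen for illustration; the actual base class and Execution Trace schema live in `toolbrain/adapters/` and may look quite different.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


@dataclass
class TraceStep:
    """One hypothetical step of an execution trace."""
    tool_name: str
    tool_input: dict[str, Any]
    tool_output: Any


class AgentAdapter(Protocol):
    """Hypothetical adapter interface: run a query, return the steps taken."""

    def run(self, query: str) -> list[TraceStep]: ...


@dataclass
class CallLogAdapter:
    """Adapter for a toy 'framework' that records tool calls as tuples."""
    call_log: list[tuple[str, dict[str, Any], Any]] = field(default_factory=list)

    def run(self, query: str) -> list[TraceStep]:
        # Translate the framework-specific log into the common trace format.
        return [TraceStep(name, args, out) for name, args, out in self.call_log]


adapter = CallLogAdapter(call_log=[("add", {"a": 5, "b": 7}, 12)])
trace = adapter.run("Use the add tool to calculate 5 + 7")
print(trace)
```

The point of the pattern is that training code only ever sees the common trace type, so supporting another framework means writing one more translation class rather than touching the training loop.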
## Contributors
[Quy Minh Le](https://www.linkedin.com/in/quy-minh-le-b70218333/), Minh Sao Khue Luu, [Khanh-Tung Tran](https://www.linkedin.com/in/khanh-tung-tran-83b3541ab), Duc-Hai Nguyen, Hoang-Quoc-Viet Pham, Quan Le, [Hoang Thanh Lam](https://research.ibm.com/people/thanh-hoang) and [Harry Nguyen](https://www.ucc.ie/en/compsci/people/harrynguyen/)
---
### 🚀 Spread the Word
If you believe in ToolBrain's vision of making agent training accessible to everyone, please consider sharing it with your network!
---
## References
If you use ToolBrain, please cite [our paper](https://arxiv.org/abs/2510.00023) with the following BibTeX entry:
```
@misc{le2025toolbrainflexiblereinforcementlearning,
title={ToolBrain: A Flexible Reinforcement Learning Framework for Agentic Tools},
author={Quy Minh Le and Minh Sao Khue Luu and Khanh-Tung Tran and Duc-Hai Nguyen and Hoang-Quoc-Viet Pham and Quan Le and Hoang Thanh Lam and Hoang D. Nguyen},
year={2025},
eprint={2510.00023},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2510.00023},
}
```
**Made with ❤️ by the ToolBrain Team**
Raw data
{
"_id": null,
"home_page": null,
"name": "toolbrain",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "llm, agents, reinforcement-learning, tools, ai",
"author": null,
"author_email": "ToolBrain Team <team@toolbrain.ai>",
"download_url": "https://files.pythonhosted.org/packages/bb/97/627a431f3c7d0418dbcf85f5fe553aba8d6de6fbf81f4c94116077389a1c/toolbrain-0.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A framework for training LLM-powered agents to use tools more effectively using Reinforcement Learning",
"version": "0.1.3",
"project_urls": {
"Documentation": "https://toolbrain.readthedocs.io",
"Homepage": "https://github.com/toolbrain/toolbrain",
"Issues": "https://github.com/toolbrain/toolbrain/issues",
"Repository": "https://github.com/toolbrain/toolbrain"
},
"split_keywords": [
"llm",
" agents",
" reinforcement-learning",
" tools",
" ai"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "cc7aad045271c528842e147586ef49497a78f39eb38c81d7de6b60a32b9b31cf",
"md5": "c1bc3c7d9e6072595246ef461d91435a",
"sha256": "f35a7789844220c5ebe091c4db552a176efc0b08b22a8e0cacc57daec320ae95"
},
"downloads": -1,
"filename": "toolbrain-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "c1bc3c7d9e6072595246ef461d91435a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 60783,
"upload_time": "2025-10-07T08:37:16",
"upload_time_iso_8601": "2025-10-07T08:37:16.743427Z",
"url": "https://files.pythonhosted.org/packages/cc/7a/ad045271c528842e147586ef49497a78f39eb38c81d7de6b60a32b9b31cf/toolbrain-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "bb97627a431f3c7d0418dbcf85f5fe553aba8d6de6fbf81f4c94116077389a1c",
"md5": "bb2dcde6527c0ca6b4f31e612e50a656",
"sha256": "864794762013d6dd066749cfc8fdc1e3eb5c9abf7f109a3d688567830af80b7a"
},
"downloads": -1,
"filename": "toolbrain-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "bb2dcde6527c0ca6b4f31e612e50a656",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 53981,
"upload_time": "2025-10-07T08:37:18",
"upload_time_iso_8601": "2025-10-07T08:37:18.864506Z",
"url": "https://files.pythonhosted.org/packages/bb/97/627a431f3c7d0418dbcf85f5fe553aba8d6de6fbf81f4c94116077389a1c/toolbrain-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-07 08:37:18",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "toolbrain",
"github_project": "toolbrain",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "toolbrain"
}
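The `digests` entries in the raw data above can be used to verify a downloaded archive. The following is a minimal sketch using only the standard library; the helper name `sha256_matches` and the local file path in the usage comment are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_matches(path: Path, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's SHA-256 digest equals expected_hex."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


# Usage (assumes the sdist was downloaded to the current directory):
# sha256_matches(
#     Path("toolbrain-0.1.3.tar.gz"),
#     "864794762013d6dd066749cfc8fdc1e3eb5c9abf7f109a3d688567830af80b7a",
# )
```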