<div align="center">
<img src="https://github-production-user-asset-6210df.s3.amazonaws.com/26739999/289025203-f05733ff-6bbb-46f0-92aa-8827c59df79c.png" width="450"/>
</div>
<div align="center">
[![Open in OpenXLab](https://cdn-static.openxlab.org.cn/app-center/openxlab_app.svg)](https://openxlab.org.cn/apps/detail/mzr1996/AgentLego)
[![docs](https://img.shields.io/badge/docs-latest-blue)](https://agentlego.readthedocs.io/en/latest/)
[![PyPI](https://img.shields.io/pypi/v/agentlego)](https://pypi.org/project/agentlego)
[![license](https://img.shields.io/github/license/InternLM/agentlego.svg)](https://github.com/InternLM/agentlego/tree/main/LICENSE)
English | [简体中文](./README_zh-CN.md)
</div>
- [Introduction](#introduction)
- [Quick Starts](#quick-starts)
  - [Installation](#installation)
  - [Use tools directly](#use-tools-directly)
  - [Integrated into agent frameworks](#integrated-into-agent-frameworks)
- [Supported Tools](#supported-tools)
- [Licence](#licence)
## Introduction
<span style="color:blue"> *AgentLego* </span> is an open-source library of versatile tool APIs that extend and enhance large language model (LLM) based agents. Its highlights:
- **Rich set of tools for multimodal extensions of LLM agents**, including visual perception, image generation and editing, speech processing, and visual-language reasoning.
- **Flexible tool interface** that allows users to easily extend custom tools with arbitrary types of arguments and outputs.
- **Easy integration with LLM-based agent frameworks** such as [LangChain](https://github.com/langchain-ai/langchain), [Transformers Agents](https://huggingface.co/docs/transformers/transformers_agents), and [Lagent](https://github.com/InternLM/lagent).
- **Support for tool serving and remote access**, which is especially useful for tools with heavy ML models (e.g. ViT) or special environment requirements (e.g. GPU and CUDA).
https://github-production-user-asset-6210df.s3.amazonaws.com/26739999/289006700-2140015c-b5e0-4102-bc54-9a1b4e3db9ec.mp4
# Quick Starts
## Installation
**Install the AgentLego package**
```shell
pip install agentlego
```
**Install tool-specific dependencies**
Some tools require extra packages. Check the tool's readme file and confirm that all requirements are
satisfied.

For example, to use the `ImageDescription` tool, check the **Set up** section of its
[readme](agentlego/tools/image_text/README.md#ImageDescription) and install the requirements.
```bash
pip install -U openmim
mim install -U mmpretrain
```
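Before loading a tool, it can help to confirm that its extra packages are importable. The sketch below is not part of AgentLego: `missing_packages` is a hypothetical helper built on the standard library's `importlib.util.find_spec`, and the import name `mmpretrain` is assumed to match the package installed above.

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the top-level import names that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in import_names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import name for the ImageDescription requirements.
    missing = missing_packages(["mmpretrain"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All requirements are satisfied.")
```

An empty list means every listed package was found in the current environment.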
## Use tools directly
```Python
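from agentlego import list_tools, load_tool

# List the tools available in AgentLego.
print(list_tools())

# Load the image captioning tool on the GPU and show its description.
image_caption_tool = load_tool('ImageDescription', device='cuda')
print(image_caption_tool.description)

# Call the tool on a local image to get a caption.
image = './examples/demo.png'
caption = image_caption_tool(image)
print(caption)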
```
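The example above hard-codes `device='cuda'`. On a machine without a GPU, one option is to pick the device at run time. This is a sketch under assumptions: `pick_device` and `describe_image` are hypothetical helpers, PyTorch is assumed to be the backend, and passing `device='cpu'` to `load_tool` is assumed to be supported by the tool being loaded.

```python
def pick_device():
    """Return 'cuda' when PyTorch sees a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no CUDA backend to query.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def describe_image(image_path):
    """Caption an image with the ImageDescription tool (hypothetical wrapper)."""
    # Imported lazily so pick_device() works even without agentlego installed.
    from agentlego import load_tool

    tool = load_tool("ImageDescription", device=pick_device())
    return tool(image_path)
```

Calling `describe_image('./examples/demo.png')` then mirrors the snippet above, with the device chosen automatically.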
## Integrated into agent frameworks
- [**Lagent**](examples/lagent_example.py)
- [**Transformers Agent**](examples/hf_agent/hf_agent_example.py)
- [**VisualChatGPT**](examples/visual_chatgpt/visual_chatgpt.py)
# Supported Tools
**General ability**
- [Calculator](agentlego/tools/calculator/README.md): Calculate using a Python interpreter.
- [GoogleSearch](agentlego/tools/search/README.md): Search on Google.
**Speech related**
- [TextToSpeech](agentlego/tools/speech_text/README.md#TextToSpeech): Convert the input text into speech audio.
- [SpeechToText](agentlego/tools/speech_text/README.md#SpeechToText): Transcribe audio into text.
**Image-processing related**
- [ImageDescription](agentlego/tools/image_text/README.md#ImageDescription): Describe the input image.
- [OCR](agentlego/tools/ocr/README.md#OCR): Recognize the text in a photo.
- [VQA](agentlego/tools/vqa/README.md#VQA): Answer a question about the image.
- [HumanBodyPose](agentlego/tools/image_pose/README.md#HumanBodyPose): Estimate the pose or keypoints of humans in an image.
- [HumanFaceLandmark](agentlego/tools/image_pose/README.md#HumanFaceLandmark): Estimate the landmarks or keypoints of human faces in an image.
- [ImageToCanny](agentlego/tools/image_canny/README.md#ImageToCanny): Extract the edge image from an image.
- [ImageToDepth](agentlego/tools/image_depth/README.md#ImageToDepth): Generate the depth image of an image.
- [ImageToScribble](agentlego/tools/image_scribble/README.md#ImageToScribble): Generate a sketch scribble of an image.
- [ObjectDetection](agentlego/tools/object_detection/README.md#ObjectDetection): Detect all objects in the image.
- [TextToBbox](agentlego/tools/object_detection/README.md#TextToBbox): Detect specific objects described by the given text in the image.
- Segment Anything series
  - [SegmentAnything](agentlego/tools/segmentation/README.md#SegmentAnything): Segment all items in the image.
  - [SegmentObject](agentlego/tools/segmentation/README.md#SegmentObject): Segment specific objects in the image according to the given object name.
**AIGC related**
- [TextToImage](agentlego/tools/image_text/README.md#TextToImage): Generate an image from the input text.
- [ImageExpansion](agentlego/tools/image_editing/README.md#ImageExpansion): Expand the peripheral area of an image based on its content.
- [ObjectRemove](agentlego/tools/image_editing/README.md#ObjectRemove): Remove specific objects from the image.
- [ObjectReplace](agentlego/tools/image_editing/README.md#ObjectReplace): Replace specific objects in the image.
- [ImageStylization](agentlego/tools/image_editing/README.md#ImageStylization): Modify an image according to the instructions.
- ControlNet series
  - [CannyTextToImage](agentlego/tools/image_canny/README.md#CannyTextToImage): Generate an image from a canny edge image and a description.
  - [DepthTextToImage](agentlego/tools/image_depth/README.md#DepthTextToImage): Generate an image from a depth image and a description.
  - [PoseToImage](agentlego/tools/image_pose/README.md#PoseToImage): Generate an image from a human pose image and a description.
  - [ScribbleTextToImage](agentlego/tools/image_scribble/README.md#ScribbleTextToImage): Generate an image from a sketch scribble image and a description.
- ImageBind series
  - [AudioToImage](agentlego/tools/imagebind/README.md#AudioToImage): Generate an image according to audio.
  - [ThermalToImage](agentlego/tools/imagebind/README.md#ThermalToImage): Generate an image according to a thermal image.
  - [AudioImageToImage](agentlego/tools/imagebind/README.md#AudioImageToImage): Generate an image according to an audio clip and an image.
  - [AudioTextToImage](agentlego/tools/imagebind/README.md#AudioTextToImage): Generate an image from an audio clip and a text prompt.
# Licence
This project is released under the [Apache 2.0 license](LICENSE). Users should also ensure compliance with the licenses governing the models used in this project.
Raw data
{
"_id": null,
"home_page": "",
"name": "agentlego",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "mzr1996 <mzr1996@163.com>",
"keywords": "LLM,Multi-Modality,AI Agent",
"author": "AgentLego Contributors",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/b1/fb/49fb0cb23ceb5a486f09a56073c3b33273151a91495d8f6982436bf2c204/agentlego-0.2.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache License 2.0",
"summary": "AgentLego is a versatile tool library for enhancing LLM-based agents.",
"version": "0.2.0",
"project_urls": {
"Documentation": "https://agentlego.readthedocs.io",
"Repository": "https://github.com/InternLM/agentlego"
},
"split_keywords": [
"llm",
"multi-modality",
"ai agent"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4e8785cd1147f48213ef8970207d46b0ac01b3b5f0ff634d065afce3a5a83dee",
"md5": "6bc2122349be2b31f36a8d7756a8d4f3",
"sha256": "d8350d4e92664fb1d552d71674c84569407921370f638a672ccf2fa27d1f49ef"
},
"downloads": -1,
"filename": "agentlego-0.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "6bc2122349be2b31f36a8d7756a8d4f3",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 108144,
"upload_time": "2024-02-04T03:23:34",
"upload_time_iso_8601": "2024-02-04T03:23:34.401920Z",
"url": "https://files.pythonhosted.org/packages/4e/87/85cd1147f48213ef8970207d46b0ac01b3b5f0ff634d065afce3a5a83dee/agentlego-0.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b1fb49fb0cb23ceb5a486f09a56073c3b33273151a91495d8f6982436bf2c204",
"md5": "7d2af2be16b52b8034d3ebeeb557ef06",
"sha256": "0a64afb56878d78c9533403e33dc16699b0f93c2e08d2116ec169c6c195a2ff3"
},
"downloads": -1,
"filename": "agentlego-0.2.0.tar.gz",
"has_sig": false,
"md5_digest": "7d2af2be16b52b8034d3ebeeb557ef06",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 78025,
"upload_time": "2024-02-04T03:23:35",
"upload_time_iso_8601": "2024-02-04T03:23:35.727204Z",
"url": "https://files.pythonhosted.org/packages/b1/fb/49fb0cb23ceb5a486f09a56073c3b33273151a91495d8f6982436bf2c204/agentlego-0.2.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-02-04 03:23:35",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "InternLM",
"github_project": "agentlego",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "agentlego"
}