agentlego

Name	agentlego
Version	0.2.0
Summary	AgentLego is a versatile tool library for enhancing LLM-based agents.
upload_time	2024-02-04 03:23:35
docs_url	None
author	AgentLego Contributors
requires_python	>=3.8
license	Apache License 2.0
keywords	llm multi-modality ai agent
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

<div align="center">
<img src="https://github-production-user-asset-6210df.s3.amazonaws.com/26739999/289025203-f05733ff-6bbb-46f0-92aa-8827c59df79c.png" width="450"/>
</div>

<div align="center">

[![Open in OpenXLab](https://cdn-static.openxlab.org.cn/app-center/openxlab_app.svg)](https://openxlab.org.cn/apps/detail/mzr1996/AgentLego)
[![docs](https://img.shields.io/badge/docs-latest-blue)](https://agentlego.readthedocs.io/en/latest/)
[![PyPI](https://img.shields.io/pypi/v/agentlego)](https://pypi.org/project/agentlego)
[![license](https://img.shields.io/github/license/InternLM/agentlego.svg)](https://github.com/InternLM/agentlego/tree/main/LICENSE)

English | [简体中文](./README_zh-CN.md)

</div>

- [Introduction](#introduction)
- [Quick Starts](#quick-starts)
  - [Installation](#installation)
  - [Use tools directly](#use-tools-directly)
  - [Integrated into agent frameworks](#integrated-into-agent-frameworks)
- [Supported Tools](#supported-tools)
- [Licence](#licence)

# Introduction

<span style="color:blue"> *AgentLego* </span> is an open-source library of versatile tool APIs for extending and enhancing agents based on large language models (LLMs). Its main features are:

- **Rich set of tools for multimodal extensions of LLM agents**, including visual perception, image generation and editing, speech processing, and visual-language reasoning.
- **Flexible tool interface** that lets users easily add custom tools with arbitrary argument and output types.
- **Easy integration with LLM-based agent frameworks** such as [LangChain](https://github.com/langchain-ai/langchain), [Transformers Agents](https://huggingface.co/docs/transformers/transformers_agents), and [Lagent](https://github.com/InternLM/lagent).
- **Support for tool serving and remote access**, which is especially useful for tools with heavy ML models (e.g. ViT) or special environment requirements (e.g. GPU and CUDA).

https://github-production-user-asset-6210df.s3.amazonaws.com/26739999/289006700-2140015c-b5e0-4102-bc54-9a1b4e3db9ec.mp4

# Quick Starts

## Installation

**Install the AgentLego package**

```shell
pip install agentlego
```

**Install tool-specific dependencies**

Some tools require extra packages. Check the readme file of each tool you plan to use and confirm that all of its
requirements are satisfied.

For example, to use the `ImageDescription` tool, check the **Set up** section of its
[readme](agentlego/tools/image_text/README.md#ImageDescription) and install the requirements:

```bash
pip install -U openmim
mim install -U mmpretrain
```
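
Before loading a tool, it can help to confirm that its extra packages are actually importable. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical and not part of AgentLego, and `mmpretrain` is simply the dependency from the commands above.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# `mmpretrain` is the package installed above for the `ImageDescription` tool.
missing = missing_packages(['mmpretrain'])
if missing:
    print('Missing packages:', ', '.join(missing))
else:
    print('All required packages are installed.')
```

The check only looks up top-level module names, so it does not import the packages or verify their versions.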

## Use tools directly

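```python
from agentlego import list_tools, load_tool

print(list_tools())  # list tools in AgentLego

# Load a tool by name; `device='cuda'` runs its model on the GPU
image_caption_tool = load_tool('ImageDescription', device='cuda')
print(image_caption_tool.description)  # show the tool's description
image = './examples/demo.png'
caption = image_caption_tool(image)  # call the tool directly on an image path
```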
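
The example above hard-codes `device='cuda'`. On a machine without a GPU, one option is to pick the device at run time. The sketch below assumes that PyTorch is the backend that decides GPU availability and that `load_tool` also accepts `device='cpu'`; neither is stated in this readme. `pick_device` and `load_caption_tool` are hypothetical helper names, not part of AgentLego.

```python
def pick_device():
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # Without PyTorch we cannot probe for a GPU, so fall back to CPU.
        return 'cpu'
    return 'cuda' if torch.cuda.is_available() else 'cpu'


def load_caption_tool():
    """Load `ImageDescription` on the selected device (requires agentlego)."""
    # Imported lazily so this file still runs when agentlego is not installed.
    from agentlego import load_tool
    # Assumption: `device='cpu'` is accepted in the same way as `device='cuda'`.
    return load_tool('ImageDescription', device=pick_device())


print(pick_device())
```

Calling `load_caption_tool()` then replaces the direct `load_tool(...)` call in the example above.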

## Integrated into agent frameworks

- [**Lagent**](examples/lagent_example.py)
- [**Transformers Agent**](examples/hf_agent/hf_agent_example.py)
- [**VisualChatGPT**](examples/visual_chatgpt/visual_chatgpt.py)

# Supported Tools

**General ability**

- [Calculator](agentlego/tools/calculator/README.md): Evaluate calculations with the Python interpreter.
- [GoogleSearch](agentlego/tools/search/README.md): Search on Google.

**Speech related**

- [TextToSpeech](agentlego/tools/speech_text/README.md#TextToSpeech): Convert the input text into spoken audio.
- [SpeechToText](agentlego/tools/speech_text/README.md#SpeechToText): Transcribe audio into text.

**Image-processing related**

- [ImageDescription](agentlego/tools/image_text/README.md#ImageDescription): Describe the input image.
- [OCR](agentlego/tools/ocr/README.md#OCR): Recognize the text from a photo.
- [VQA](agentlego/tools/vqa/README.md#VQA): Answer a question about the image.
- [HumanBodyPose](agentlego/tools/image_pose/README.md#HumanBodyPose): Estimate the pose or keypoints of humans in an image.
- [HumanFaceLandmark](agentlego/tools/image_pose/README.md#HumanFaceLandmark): Estimate the landmarks or keypoints of human faces in an image.
- [ImageToCanny](agentlego/tools/image_canny/README.md#ImageToCanny): Extract the edge image from an image.
- [ImageToDepth](agentlego/tools/image_depth/README.md#ImageToDepth): Generate the depth image of an image.
- [ImageToScribble](agentlego/tools/image_scribble/README.md#ImageToScribble): Generate a sketch scribble of an image.
- [ObjectDetection](agentlego/tools/object_detection/README.md#ObjectDetection): Detect all objects in the image.
- [TextToBbox](agentlego/tools/object_detection/README.md#TextToBbox): Detect specific objects described by the given text in the image.
- Segment Anything series
  - [SegmentAnything](agentlego/tools/segmentation/README.md#SegmentAnything): Segment all items in the image.
  - [SegmentObject](agentlego/tools/segmentation/README.md#SegmentObject): Segment specific objects in the image according to the given object name.

**AIGC related**

- [TextToImage](agentlego/tools/image_text/README.md#TextToImage): Generate an image from the input text.
- [ImageExpansion](agentlego/tools/image_editing/README.md#ImageExpansion): Expand the peripheral area of an image based on its content.
- [ObjectRemove](agentlego/tools/image_editing/README.md#ObjectRemove): Remove specific objects from the image.
- [ObjectReplace](agentlego/tools/image_editing/README.md#ObjectReplace): Replace specific objects in the image.
- [ImageStylization](agentlego/tools/image_editing/README.md#ImageStylization): Modify an image according to the instructions.
- ControlNet series
  - [CannyTextToImage](agentlego/tools/image_canny/README.md#CannyTextToImage): Generate an image from a canny edge image and a description.
  - [DepthTextToImage](agentlego/tools/image_depth/README.md#DepthTextToImage): Generate an image from a depth image and a description.
  - [PoseToImage](agentlego/tools/image_pose/README.md#PoseToImage): Generate an image from a human pose image and a description.
  - [ScribbleTextToImage](agentlego/tools/image_scribble/README.md#ScribbleTextToImage): Generate an image from a sketch scribble image and a description.
- ImageBind series
  - [AudioToImage](agentlego/tools/imagebind/README.md#AudioToImage): Generate an image according to audio.
  - [ThermalToImage](agentlego/tools/imagebind/README.md#ThermalToImage): Generate an image according to a thermal image.
  - [AudioImageToImage](agentlego/tools/imagebind/README.md#AudioImageToImage): Generate an image according to an audio clip and an image.
  - [AudioTextToImage](agentlego/tools/imagebind/README.md#AudioTextToImage): Generate an image from an audio clip and a text prompt.

# Licence

This project is released under the [Apache 2.0 license](LICENSE). Users should also ensure compliance with the licenses governing the models used in this project.

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "agentlego",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "mzr1996 <mzr1996@163.com>",
    "keywords": "LLM,Multi-Modality,AI Agent",
    "author": "AgentLego Contributors",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/b1/fb/49fb0cb23ceb5a486f09a56073c3b33273151a91495d8f6982436bf2c204/agentlego-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache License 2.0",
    "summary": "AgentLego is a versatile tool library for enhancing LLM-based agents.",
    "version": "0.2.0",
    "project_urls": {
        "Documentation": "https://agentlego.readthedocs.io",
        "Repository": "https://github.com/InternLM/agentlego"
    },
    "split_keywords": [
        "llm",
        "multi-modality",
        "ai agent"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4e8785cd1147f48213ef8970207d46b0ac01b3b5f0ff634d065afce3a5a83dee",
                "md5": "6bc2122349be2b31f36a8d7756a8d4f3",
                "sha256": "d8350d4e92664fb1d552d71674c84569407921370f638a672ccf2fa27d1f49ef"
            },
            "downloads": -1,
            "filename": "agentlego-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6bc2122349be2b31f36a8d7756a8d4f3",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 108144,
            "upload_time": "2024-02-04T03:23:34",
            "upload_time_iso_8601": "2024-02-04T03:23:34.401920Z",
            "url": "https://files.pythonhosted.org/packages/4e/87/85cd1147f48213ef8970207d46b0ac01b3b5f0ff634d065afce3a5a83dee/agentlego-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b1fb49fb0cb23ceb5a486f09a56073c3b33273151a91495d8f6982436bf2c204",
                "md5": "7d2af2be16b52b8034d3ebeeb557ef06",
                "sha256": "0a64afb56878d78c9533403e33dc16699b0f93c2e08d2116ec169c6c195a2ff3"
            },
            "downloads": -1,
            "filename": "agentlego-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "7d2af2be16b52b8034d3ebeeb557ef06",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 78025,
            "upload_time": "2024-02-04T03:23:35",
            "upload_time_iso_8601": "2024-02-04T03:23:35.727204Z",
            "url": "https://files.pythonhosted.org/packages/b1/fb/49fb0cb23ceb5a486f09a56073c3b33273151a91495d8f6982436bf2c204/agentlego-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-04 03:23:35",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "InternLM",
    "github_project": "agentlego",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "agentlego"
}
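
The `digests` entries in the raw data can be used to check a downloaded archive. A minimal sketch with the standard library follows. The function names `sha256_of` and `sha256_matches` and the local file path are hypothetical; the expected digest is the `sha256` value listed above for `agentlego-0.2.0.tar.gz`.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_matches(path, expected_hex):
    """Return True when the file's SHA-256 digest equals `expected_hex`."""
    return sha256_of(path) == expected_hex.lower()


# sha256 digest of the sdist, copied from the raw data above.
EXPECTED = '0a64afb56878d78c9533403e33dc16699b0f93c2e08d2116ec169c6c195a2ff3'

# Example (assumes the sdist was downloaded into the current directory):
# print(sha256_matches('agentlego-0.2.0.tar.gz', EXPECTED))
```

Reading in chunks keeps memory use constant regardless of the archive size.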
