llama-agents

Name: llama-agents
Version: 0.0.14
Upload time: 2024-08-15 22:50:13
Maintainer: Logan Markewich
Author: Logan Markewich
Requires Python: <4.0,>=3.8.1
# 🦙 `llama-agents` 🤖

`llama-agents` is an async-first framework for building, iterating, and productionizing multi-agent systems, including multi-agent communication, distributed tool execution, human-in-the-loop, and more!

In `llama-agents`, each agent is seen as a `service`, endlessly processing incoming tasks. Each agent pulls messages from, and publishes messages to, a `message queue`.

At the top of a `llama-agents` system is the `control plane`. The control plane keeps track of ongoing tasks, which services are in the network, and also decides which service should handle the next step of a task using an `orchestrator`.

The overall system layout is pictured below.

![A basic system in llama-agents](./system_diagram.png)

## Installation

`llama-agents` can be installed with pip, and relies mainly on `llama-index-core`:

```bash
pip install llama-agents
```

If you don't already have llama-index installed, you'll also need the following to run these examples:

```bash
pip install llama-index-agent-openai llama-index-embeddings-openai
```

## Getting Started

The quickest way to get started is to take an existing agent (or agents) and wrap it into a launcher.

The example below shows a trivial setup with two agents from `llama-index`.

First, let's set up some agents and the initial components for our `llama-agents` system:

```python
from llama_agents import (
    AgentService,
    AgentOrchestrator,
    ControlPlaneServer,
    SimpleMessageQueue,
)

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI


# create an agent
def get_the_secret_fact() -> str:
    """Returns the secret fact."""
    return "The secret fact is: A baby llama is called a 'Cria'."


tool = FunctionTool.from_defaults(fn=get_the_secret_fact)

agent1 = ReActAgent.from_tools([tool], llm=OpenAI())
agent2 = ReActAgent.from_tools([], llm=OpenAI())

# create our multi-agent framework components
message_queue = SimpleMessageQueue(port=8000)
control_plane = ControlPlaneServer(
    message_queue=message_queue,
    orchestrator=AgentOrchestrator(llm=OpenAI(model="gpt-4-turbo")),
    port=8001,
)
agent_server_1 = AgentService(
    agent=agent1,
    message_queue=message_queue,
    description="Useful for getting the secret fact.",
    service_name="secret_fact_agent",
    port=8002,
)
agent_server_2 = AgentService(
    agent=agent2,
    message_queue=message_queue,
    description="Useful for getting random dumb facts.",
    service_name="dumb_fact_agent",
    port=8003,
)
```

### Local / Notebook Flow

Next, when working in a notebook or for faster iteration, we can launch our `llama-agents` system in a single-run setting, where one message is propagated through the network and returned.

```python
from llama_agents import LocalLauncher
import nest_asyncio

# needed for running in a notebook
nest_asyncio.apply()

# launch it
launcher = LocalLauncher(
    [agent_server_1, agent_server_2],
    control_plane,
    message_queue,
)
result = launcher.launch_single("What is the secret fact?")

print(f"Result: {result}")
```

<!-- prettier-ignore -->
> [!NOTE]
> `launcher.launch_single` creates a new asyncio event loop. Since Jupyter notebooks already have an event loop running, we need to use `nest_asyncio` to allow the creation of new event loops within the existing one.

As with any agentic system, it's important to consider how reliable the LLM you are using is. In general, APIs that support function calling (OpenAI, Anthropic, Mistral, etc.) are the most reliable.
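For example, to lean on function calling instead of ReAct prompting, you could swap the agents above for OpenAI function-calling agents. A minimal sketch, where the model name is just an illustration:

```python
from llama_index.agent.openai import OpenAIAgent
from llama_index.llms.openai import OpenAI

# function-calling agents tend to select tools more reliably than ReAct prompting
agent1 = OpenAIAgent.from_tools([tool], llm=OpenAI(model="gpt-4o"))
agent2 = OpenAIAgent.from_tools([], llm=OpenAI(model="gpt-4o"))
```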

### Server Flow

Once you are happy with your system, you can launch all of your services as independent processes, allowing for higher throughput and scalability.

By default, all task results are published to a specific "human" queue, so we also define a consumer to handle this result as it comes in. (In the future, this final queue will be configurable!)

To test this, you can use the server launcher in a script:

```python
from llama_agents import ServerLauncher, CallableMessageConsumer


# Additional human consumer
def handle_result(message) -> None:
    print(f"Got result:", message.data)


human_consumer = CallableMessageConsumer(
    handler=handle_result, message_type="human"
)

# Define Launcher
launcher = ServerLauncher(
    [agent_server_1, agent_server_2],
    control_plane,
    message_queue,
    additional_consumers=[human_consumer],
)

# Launch it!
launcher.launch_servers()
```

Now, since everything is a server, you need API requests to interact with it. The easiest way is to use our client and the control plane URL:

```python
from llama_agents import LlamaAgentsClient, AsyncLlamaAgentsClient

client = LlamaAgentsClient("<control plane URL>")  # e.g. http://127.0.0.1:8001
task_id = client.create_task("What is the secret fact?")
# <Wait a few seconds>
# returns TaskResult or None if not finished
result = client.get_task_result(task_id)
```
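Since the task runs in the background, a simple pattern is to poll the client until a result is available (a minimal sketch, continuing from the snippet above):

```python
import time

# poll until the task finishes; get_task_result returns None while it is still running
result = None
while result is None:
    time.sleep(2)
    result = client.get_task_result(task_id)

print(f"Result: {result}")
```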

Rather than using a client or raw `curl` requests, you can also use a built-in CLI tool to monitor and interact with your services.

In another terminal, you can run:

```bash
llama-agents monitor --control-plane-url http://127.0.0.1:8001
```

![The llama-agents monitor app](./llama_agents_monitor.png)

## Examples

You can find a host of examples in our examples folder:

- [Agentic RAG + Tool Service](https://github.com/run-llama/llama-agents/blob/main/examples/agentic_rag_toolservice.ipynb)
- [Agentic Orchestrator w/ Local Launcher](https://github.com/run-llama/llama-agents/blob/main/examples/agentic_local_single.py)
- [Agentic Orchestrator w/ Server Launcher](https://github.com/run-llama/llama-agents/blob/main/examples/agentic_server.py)
- [Agentic Orchestrator w/ Human in the Loop](https://github.com/run-llama/llama-agents/blob/main/examples/agentic_human_local_single.py)
- [Agentic Orchestrator w/ Tool Service](https://github.com/run-llama/llama-agents/blob/main/examples/agentic_toolservice_local_single.py)
- [Pipeline Orchestrator w/ Local Launcher](https://github.com/run-llama/llama-agents/blob/main/examples/pipeline_local_single.py)
- [Pipeline Orchestrator w/ Human in the Loop](https://github.com/run-llama/llama-agents/blob/main/examples/pipeline_human_local_single.py)
- [Pipeline Orchestrator w/ Agent Server As Tool](https://github.com/run-llama/llama-agents/blob/main/examples/pipeline_agent_service_tool_local_single.py)
- [Pipeline Orchestrator w/ Query Rewrite RAG](https://github.com/run-llama/llama-agents/blob/main/examples/query_rewrite_rag.ipynb)

## Components of a `llama-agents` System

In `llama-agents`, there are several key components that make up the overall system:

- `message queue` -- the message queue acts as a queue for all services and the `control plane`. It has methods for publishing messages to named queues, and delegates messages to consumers.
- `control plane` -- the control plane is the central gateway to the `llama-agents` system. It keeps track of current tasks, as well as the services that are registered to the system. It also holds the `orchestrator`.
- `orchestrator` -- this module handles incoming tasks and decides which service to send them to, as well as how to handle results from services. An orchestrator can be agentic (with an LLM making decisions), explicit (with a query pipeline defining a flow), a mix of both, or something completely custom.
- `services` -- services are where the actual work happens. A service accepts some incoming task and context, processes it, and publishes a result.
  - A `tool service` is a special service used to off-load the computation of agent tools. Agents can instead be equipped with a meta-tool that calls the tool service (see the sketch below).
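As a rough sketch of the tool-service pattern (the constructor arguments below follow the example notebooks and should be treated as assumptions; check your installed version):

```python
from llama_agents import ToolService

# run tool execution as its own service, reusing the message_queue and tool
# defined in the Getting Started example above
tool_service = ToolService(
    message_queue=message_queue,
    tools=[tool],
    running=True,
    step_interval=0.5,
)

# the tool service is launched alongside the agent services, e.g.:
# launcher = LocalLauncher([agent_server_1, tool_service], control_plane, message_queue)
# agents are then given a meta-tool (see `MetaServiceTool` in the examples)
# that forwards tool calls to this service
```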

## Low-Level API in `llama-agents`

So far, you've seen how to define components and how to launch them. However, in most production use cases, you will need to launch services manually, as well as define your own consumers!

So, here is a quick guide on exactly that!

### Launching

First, you will want to launch everything. This can be done in a single script, or you can launch things with multiple scripts per service, or on different machines, or even in docker images.

In this example, we will assume launching from a single script.

```python
import asyncio

# note: this snippet uses top-level await, so run it in a notebook cell
# or inside an async function

# the services defined in the earlier example
services = [agent_server_1, agent_server_2]

# launch the message queue
queue_task = asyncio.create_task(message_queue.launch_server())

# wait for the message queue to be ready
await asyncio.sleep(1)

# launch the control plane
control_plane_task = asyncio.create_task(control_plane.launch_server())

# wait for the control plane to be ready
await asyncio.sleep(1)

# register the control plane as a consumer, which returns a start_consuming_callable
start_consuming_callable = await control_plane.register_to_message_queue()
start_consuming_callables = [start_consuming_callable]

# register the services
control_plane_url = f"http://{control_plane.host}:{control_plane.port}"
service_tasks = []
for service in services:
    # first launch the service
    service_tasks.append(asyncio.create_task(service.launch_server()))

    # register the service to the message queue
    start_consuming_callable = await service.register_to_message_queue()
    start_consuming_callables.append(start_consuming_callable)

    # register the service to the control plane
    await service.register_to_control_plane(control_plane_url)

# start consuming!
start_consuming_tasks = []
for start_consuming_callable in start_consuming_callables:
    task = asyncio.create_task(start_consuming_callable())
    start_consuming_tasks.append(task)
```
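To keep a single-script launch running, you may then want to block on all of these tasks (a minimal sketch; a real deployment would also want graceful shutdown and signal handling):

```python
# block until the servers and consumers exit (typically they run until cancelled)
await asyncio.gather(
    queue_task,
    control_plane_task,
    *service_tasks,
    *start_consuming_tasks,
)
```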

With that done, you may want to define a consumer for the results of tasks.

By default, the results of tasks get published to a `human` message queue.

```python
from llama_agents import (
    CallableMessageConsumer,
    RemoteMessageConsumer,
    QueueMessage,
)
import asyncio


def handle_result(message: QueueMessage) -> None:
    print(message.data)


human_consumer = CallableMessageConsumer(
    handler=handle_result, message_type="human"
)


async def register_and_start_consuming():
    start_consuming_callable = await message_queue.register_consumer(
        human_consumer
    )
    await start_consuming_callable()


if __name__ == "__main__":
    asyncio.run(register_and_start_consuming())

# or, you can send the message to any URL
# human_consumer = RemoteMessageConsumer(url="some destination url")
# message_queue.register_consumer(human_consumer)
```
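If you use `RemoteMessageConsumer`, the destination just needs to be an HTTP endpoint that accepts the message payload. A minimal sketch of such an endpoint, assuming the consumer POSTs the message as JSON (the framework, path, and payload handling here are illustrative):

```python
from fastapi import FastAPI

app = FastAPI()


# hypothetical endpoint; pass its URL to RemoteMessageConsumer(url=...)
@app.post("/process_message")
async def process_message(message: dict) -> None:
    print(message.get("data"))
```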

Or, if you don't want to define a consumer, you can just use the `monitor` to observe your system's results:

```bash
llama-agents monitor --control-plane-url http://127.0.0.1:8001
```

            
