# Declarative Agent Orchestration (DAO) Framework
A modular, multi-agent orchestration framework for building sophisticated AI workflows on Databricks. While this implementation provides a complete retail AI reference architecture, the framework is designed to support any domain or use case requiring agent coordination, tool integration, and dynamic configuration.
## Overview
This project implements a LangGraph-based multi-agent orchestration framework that can:
- **Route queries** to specialized agents based on content and context
- **Coordinate multiple AI agents** working together on complex tasks
- **Integrate diverse tools** including databases, APIs, vector search, and external services
- **Support flexible orchestration patterns** (supervisor, swarm, and custom workflows)
- **Provide dynamic configuration** through YAML-based agent and tool definitions
- **Enable domain-specific specialization** while maintaining a unified interface
**Retail Reference Implementation**: This repository includes a complete retail AI system demonstrating:
- Product inventory management and search
- Customer recommendation engines
- Order tracking and management
- Product classification and information retrieval
The system uses Databricks Vector Search, Unity Catalog, and LLMs to provide accurate, context-aware responses across any domain.
## Key Features
- **Multi-Modal Interface**: CLI commands and Python API for development and deployment
- **Agent Lifecycle Management**: Create, deploy, and monitor agents programmatically
- **Vector Search Integration**: Built-in support for Databricks Vector Search with retrieval tools
- **Configuration-Driven**: YAML-based configuration with validation and IDE support
- **MLflow Integration**: Automatic model packaging, versioning, and deployment
- **Monitoring & Evaluation**: Built-in assessment and monitoring capabilities
## Architecture
### Overview
The Multi-Agent AI system is built as a component-based agent architecture that routes queries to specialized agents based on the nature of the request. This approach enables domain-specific handling while maintaining a unified interface that can be adapted to any industry or use case.

### Core Components
#### Configuration Components
All components are defined in the provided [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml) using a modular approach:
- **Schemas**: Define database and catalog structures
- **Resources**: Configure infrastructure components like LLMs, vector stores, catalogs, warehouses, and databases
- **Tools**: Define functions that agents can use to perform tasks (dictionary-based with keys as tool names)
- **Agents**: Specialized AI assistants configured for specific domains (dictionary-based with keys as agent names)
- **Guardrails**: Quality control mechanisms to ensure accurate responses
- **Retrievers**: Configuration for vector search and retrieval
- **Evaluation**: Configuration for model evaluation and testing
- **Datasets**: Configuration for training and evaluation datasets
- **App**: Overall application configuration including orchestration and logging
#### Message Processing Flow
The system uses a LangGraph-based workflow with the following key nodes:
- **Message Validation**: Validates incoming requests (`message_validation_node`)
- **Agent Routing**: Routes messages to appropriate specialized agents using supervisor or swarm patterns
- **Agent Execution**: Processes requests using specialized agents with their configured tools
- **Response Generation**: Returns structured responses to users
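
To make the flow concrete, here is a minimal, hypothetical sketch of how such a LangGraph workflow might be wired. The node functions and state fields are illustrative assumptions, not the framework's actual implementation:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END


class AgentState(TypedDict):
    messages: list
    route: str


def message_validation_node(state: AgentState) -> AgentState:
    # Reject empty requests before they reach any agent
    if not state["messages"]:
        raise ValueError("Request must contain at least one message")
    return state


def supervisor_node(state: AgentState) -> AgentState:
    # A real supervisor would call an LLM to choose the best specialized agent
    state["route"] = "general"
    return state


def general_agent_node(state: AgentState) -> AgentState:
    # A specialized agent would invoke its configured tools here
    state["messages"].append({"role": "assistant", "content": "..."})
    return state


workflow = StateGraph(AgentState)
workflow.add_node("message_validation", message_validation_node)
workflow.add_node("supervisor", supervisor_node)
workflow.add_node("general", general_agent_node)
workflow.add_edge(START, "message_validation")
workflow.add_edge("message_validation", "supervisor")
workflow.add_conditional_edges("supervisor", lambda s: s["route"], {"general": "general"})
workflow.add_edge("general", END)
graph = workflow.compile()
```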
#### Specialized Agents
Agents are dynamically configured from the provided [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml) file and can include:
- Custom LLM models and parameters
- Specific sets of available tools (Python functions, Unity Catalog functions, factory tools, MCP services)
- Domain-specific system prompts
- Guardrails for response quality
- Handoff prompts for agent coordination
### Technical Implementation
The system is implemented using:
- **LangGraph**: For workflow orchestration and state management
- **LangChain**: For LLM interactions and tool integration
- **MLflow**: For model tracking and deployment
- **Databricks**: LLM APIs, Vector Search, Unity Catalog, and Model Serving
- **Pydantic**: For configuration validation and schema management
## Prerequisites
- Python 3.12+
- Databricks workspace with access to:
- Unity Catalog
- Model Serving
- Vector Search
- Genie (optional)
- Databricks CLI configured with appropriate permissions
- Databricks model endpoints for LLMs and embeddings
## Setup
1. Clone this repository
2. Install dependencies:
```bash
# Create and activate a Python virtual environment
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies using Makefile
make install
```
3. Configure Databricks CLI with appropriate workspace access
## Quick Start
### Option 1: Using Python API (Recommended for Development)
```python
from retail_ai.config import AppConfig

# Load your configuration
config = AppConfig.from_file("config/hardware_store/supervisor_postgres.yaml")

# Create vector search infrastructure
for name, vector_store in config.resources.vector_stores.items():
    vector_store.create()

# Create and deploy your agent
config.create_agent()
config.deploy_agent()
```
### Option 2: Using CLI Commands
```bash
# Validate configuration
dao-ai validate -c config/hardware_store/supervisor_postgres.yaml
# Generate workflow diagram
dao-ai graph -o architecture.png
# Deploy using Databricks Asset Bundles
dao-ai bundle --deploy --run
# Deploy using Databricks Asset Bundles with specific configuration
dao-ai -vvvv bundle --deploy --run --target dev --config config/hardware_store/supervisor_postgres.yaml --profile DEFAULT
```
See the [Python API](#python-api) section for detailed programmatic usage, or [Command Line Interface](#command-line-interface) for CLI usage.
## Command Line Interface
The framework includes a comprehensive CLI for managing, validating, and visualizing your multi-agent system:
### Schema Generation
Generate JSON schema for configuration validation and IDE autocompletion:
```bash
dao-ai schema > schema.json
```
### Configuration Validation
Validate your configuration file for syntax and semantic correctness:
```bash
# Validate default configuration (config/hardware_store/supervisor_postgres.yaml)
dao-ai validate
# Validate specific configuration file
dao-ai validate -c config/production.yaml
```
### Graph Visualization
Generate visual representations of your agent workflow:
```bash
# Generate architecture diagram (using default config/hardware_store/supervisor_postgres.yaml)
dao-ai graph -o architecture.png
# Generate diagram from specific config
dao-ai graph -o workflow.png -c config/custom.yaml
```
### Deployment
Deploy your multi-agent system using Databricks Asset Bundles:
```bash
# Deploy the system
dao-ai bundle --deploy
# Run the deployed system
dao-ai bundle --run
# Use specific Databricks profile
dao-ai bundle --deploy --run --profile my-profile
```
### Verbose Output
Add `-v`, `-vv`, `-vvv`, or `-vvvv` flags for increasing levels of verbosity (ERROR, WARNING, INFO, DEBUG, TRACE).
## Python API
The framework provides a comprehensive Python API for programmatic access to all functionality. The main entry point is the `AppConfig` class, which provides methods for agent lifecycle management, vector search operations, and configuration utilities.
### Quick Start
```python
from retail_ai.config import AppConfig
# Load configuration from file
config = AppConfig.from_file(path="config/hardware_store/supervisor_postgres.yaml")
```
### Agent Lifecycle Management
#### Creating Agents
Package and register your multi-agent system as an MLflow model:
```python
# Create agent with default settings
config.create_agent()

# Create agent with additional requirements and code paths
config.create_agent(
    additional_pip_reqs=["custom-package==1.0.0"],
    additional_code_paths=["./custom_modules"]
)
```
#### Deploying Agents
Deploy your registered agent to a Databricks serving endpoint:
```python
# Deploy agent to serving endpoint
config.deploy_agent()
```
The deployment process:
1. Retrieves the latest model version from MLflow
2. Creates or updates a Databricks model serving endpoint
3. Configures scaling, environment variables, and permissions
4. Sets up proper authentication and resource access
### Vector Search Operations
#### Creating Vector Search Infrastructure
Create vector search endpoints and indexes from your configuration:
```python
# Access vector stores from configuration
vector_stores = config.resources.vector_stores

# Create all vector stores
for name, vector_store in vector_stores.items():
    print(f"Creating vector store: {name}")
    vector_store.create()
```
#### Using Vector Search
Query your vector search indexes for retrieval-augmented generation:
```python
# Method 1: Direct index access
from retail_ai.config import RetrieverModel

question = "What products do you have in stock?"

for name, retriever in config.retrievers.items():
    # Get the vector search index
    index = retriever.vector_store.as_index()

    # Perform similarity search
    results = index.similarity_search(
        query_text=question,
        columns=retriever.columns,
        **retriever.search_parameters.model_dump()
    )

    chunks = results.get('result', {}).get('data_array', [])
    print(f"Found {len(chunks)} relevant results")
```
```python
# Method 2: LangChain integration
from databricks_langchain import DatabricksVectorSearch

for name, retriever in config.retrievers.items():
    # Create LangChain vector store
    vector_search = DatabricksVectorSearch(
        endpoint=retriever.vector_store.endpoint.name,
        index_name=retriever.vector_store.index.full_name,
        columns=retriever.columns,
    )

    # Search using LangChain interface
    documents = vector_search.similarity_search(
        query=question,
        **retriever.search_parameters.model_dump()
    )

    print(f"Found {len(documents)} documents")
```
### Configuration Utilities
The `AppConfig` class provides helper methods to find and filter configuration components:
#### Finding Agents
```python
# Get all agents
all_agents = config.find_agents()

# Find agents with specific criteria
def has_vector_search(agent):
    return any("vector_search" in tool.name.lower() for tool in agent.tools)

vector_agents = config.find_agents(predicate=has_vector_search)
```
#### Finding Tools and Guardrails
```python
# Get all tools
all_tools = config.find_tools()

# Get all guardrails
all_guardrails = config.find_guardrails()

# Find tools by type
def is_python_tool(tool):
    return tool.function.type == "python"

python_tools = config.find_tools(predicate=is_python_tool)
```
### Visualization
Generate and save workflow diagrams:
```python
# Display graph in notebook
config.display_graph()
# Save architecture diagram
config.save_image("docs/my_architecture.png")
```
### Complete Example
See [`notebooks/05_agent_as_code_driver.py`](notebooks/05_agent_as_code_driver.py) for a complete example:
```python
from retail_ai.config import AppConfig
from pathlib import Path

# Load configuration
config = AppConfig.from_file("config/hardware_store/supervisor_postgres.yaml")

# Visualize the workflow
config.display_graph()

# Save architecture diagram
path = Path("docs") / f"{config.app.name}_architecture.png"
config.save_image(path)

# Create and deploy the agent
config.create_agent()
config.deploy_agent()
```
For vector search examples, see [`notebooks/02_provision_vector_search.py`](notebooks/02_provision_vector_search.py).
### Available Notebooks
The framework includes several example notebooks demonstrating different aspects:
| Notebook | Description | Key Methods Demonstrated |
|----------|-------------|-------------------------|
| [`01_ingest_and_transform.py`](notebooks/01_ingest_and_transform.py) | Data ingestion and transformation | Dataset creation and SQL execution |
| [`02_provision_vector_search.py`](notebooks/02_provision_vector_search.py) | Vector search setup and usage | `vector_store.create()`, `as_index()` |
| [`03_generate_evaluation_data.py`](notebooks/03_generate_evaluation_data.py) | Generate synthetic evaluation datasets | Data generation and evaluation setup |
| [`04_unity_catalog_tools.py`](notebooks/04_unity_catalog_tools.py) | Unity Catalog function deployment | SQL function creation and testing |
| [`05_agent_as_code_driver.py`](notebooks/05_agent_as_code_driver.py) | **Complete agent lifecycle** | `create_agent()`, `deploy_agent()` |
| [`06_run_evaluation.py`](notebooks/06_run_evaluation.py) | Agent evaluation and testing | Evaluation framework usage |
| [`08_run_examples.py`](notebooks/08_run_examples.py) | End-to-end example queries | Agent interaction and testing |
## Configuration
Configuration is managed through [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml). This file defines all components of the Retail AI system, including resources, tools, agents, and the overall application setup.
**Note**: The configuration file location is configurable throughout the framework. You can specify a different configuration file using the `-c` or `--config` flag in CLI commands, or by setting the appropriate parameters in the Python API.
### Basic Structure of [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml)
The [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml) is organized into several top-level keys:
```yaml
# filepath: /Users/nate/development/dao-ai/config/hardware_store/supervisor_postgres.yaml
schemas:
  # ... schema definitions ...

resources:
  # ... resource definitions (LLMs, vector stores, etc.) ...

tools:
  # ... tool definitions ...

agents:
  # ... agent definitions ...

app:
  # ... application configuration ...

# Other sections like guardrails, retrievers, evaluation, datasets
```
### Loading and Using Configuration
The configuration can be loaded and used programmatically through the `AppConfig` class:
```python
from retail_ai.config import AppConfig

# Load configuration from file
config = AppConfig.from_file("config/hardware_store/supervisor_postgres.yaml")

# Access different configuration sections
print(f"Available agents: {list(config.agents.keys())}")
print(f"Available tools: {list(config.tools.keys())}")
print(f"Vector stores: {list(config.resources.vector_stores.keys())}")

# Use configuration methods for deployment
config.create_agent()  # Package as MLflow model
config.deploy_agent()  # Deploy to serving endpoint
```
The configuration supports both CLI and programmatic workflows, with the Python API providing more flexibility for complex deployment scenarios.
### Developing and Configuring Tools
Tools are functions that agents can use to interact with external systems or perform specific tasks. They are defined under the `tools` key in [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml). Each tool has a unique name and contains a `function` specification.
There are four types of tools supported:
#### 1. Python Tools (`type: python`)
These tools directly map to Python functions. The `name` field should correspond to a function that can be imported and called directly.
**Configuration Example:**
```yaml
tools:
  my_python_tool:
    name: my_python_tool
    function:
      type: python
      name: retail_ai.tools.my_function_name
      schema: *retail_schema  # Optional schema definition
```
**Development:**
Implement the Python function in the specified module (e.g., `retail_ai/tools.py`). The function will be imported and called directly when the tool is invoked.
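As a hedged sketch, such a function might look like the following. The function name mirrors the configuration example above; the body, parameters, and return shape are illustrative assumptions:

```python
# retail_ai/tools.py (hypothetical example)
def my_function_name(sku: str) -> dict:
    """Look up basic product details for a single SKU.

    The docstring and type hints describe the tool to the calling LLM,
    so keep them accurate and descriptive.
    """
    # Replace this stub with a real lookup (database query, API call, etc.)
    return {"sku": sku, "status": "in_stock", "quantity": 42}
```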
#### 2. Factory Tools (`type: factory`)
Factory tools use factory functions that return initialized LangChain `BaseTool` instances. This is useful for tools requiring complex initialization or configuration.
**Configuration Example:**
```yaml
tools:
  vector_search_tool:
    name: vector_search
    function:
      type: factory
      name: retail_ai.tools.create_vector_search_tool
      args:
        retriever: *products_retriever
        name: product_vector_search_tool
        description: "Search for products using vector search"
```
**Development:**
Implement the factory function (e.g., `create_vector_search_tool`) in `retail_ai/tools.py`. This function should accept the specified `args` and return a fully configured `BaseTool` object.
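A minimal sketch of what such a factory might look like, assuming the `args` from the configuration above are passed as keyword arguments and reusing the retriever API shown in the Python API section (`as_index()`, `similarity_search()`); it is not the framework's actual implementation:

```python
# retail_ai/tools.py (hypothetical sketch)
from langchain_core.tools import BaseTool, StructuredTool


def create_vector_search_tool(retriever, name: str, description: str) -> BaseTool:
    """Build a retrieval tool bound to a specific retriever configuration."""

    def _search(query: str) -> str:
        # Query the retriever's vector index and return the raw matches
        index = retriever.vector_store.as_index()
        results = index.similarity_search(query_text=query, columns=retriever.columns)
        return str(results.get("result", {}).get("data_array", []))

    return StructuredTool.from_function(func=_search, name=name, description=description)
```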
#### 3. Unity Catalog Tools (`type: unity_catalog`)
These tools represent SQL functions registered in Databricks Unity Catalog. They reference functions by their Unity Catalog schema and name.
**Configuration Example:**
```yaml
tools:
  find_product_by_sku_uc_tool:
    name: find_product_by_sku_uc
    function:
      type: unity_catalog
      name: find_product_by_sku
      schema: *retail_schema
```
**Development:**
Create the corresponding SQL function in your Databricks Unity Catalog using the specified schema and function name. The tool will automatically generate the appropriate function signature and documentation.
### Developing Unity Catalog Functions
Unity Catalog functions provide the backbone for data access in the multi-agent system. The framework automatically deploys these functions from SQL DDL files during system initialization.
#### Function Deployment Configuration
Unity Catalog functions are defined in the `unity_catalog_functions` section of [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml). Each function specification includes:
- **Function metadata**: Schema and name for Unity Catalog registration
- **DDL file path**: Location of the SQL file containing the function definition
- **Test parameters**: Optional test data for function validation
**Configuration Example from [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml):**
```yaml
unity_catalog_functions:
  - function:
      schema: *retail_schema                            # Reference to schema configuration
      name: find_product_by_sku                         # Function name in Unity Catalog
    ddl: ../functions/retail/find_product_by_sku.sql    # Path to SQL DDL file
    test:                                                # Optional test configuration
      parameters:
        sku: ["00176279"]                                # Test parameters for validation
  - function:
      schema: *retail_schema
      name: find_store_inventory_by_sku
    ddl: ../functions/retail/find_store_inventory_by_sku.sql
    test:
      parameters:
        store: "35048"                                   # Multiple parameters for complex functions
        sku: ["00176279"]
```
#### SQL Function Structure
SQL files should follow this structure for proper deployment:
**File Structure Example** (`functions/retail/find_product_by_sku.sql`):
```sql
-- Function to find product details by SKU
CREATE OR REPLACE FUNCTION {catalog_name}.{schema_name}.find_product_by_sku(
  sku ARRAY<STRING> COMMENT 'One or more unique identifiers to retrieve. SKU values are between 5-8 alphanumeric characters'
)
RETURNS TABLE(
  product_id BIGINT COMMENT 'Unique identifier for each product in the catalog',
  sku STRING COMMENT 'Stock Keeping Unit - unique internal product identifier code',
  upc STRING COMMENT 'Universal Product Code - standardized barcode number for product identification',
  brand_name STRING COMMENT 'Name of the manufacturer or brand that produces the product',
  product_name STRING COMMENT 'Display name of the product as shown to customers',
  -- ... additional columns
)
READS SQL DATA
COMMENT 'Retrieves detailed information about a specific product by its SKU. This function is designed for product information retrieval in retail applications.'
RETURN
SELECT
  product_id,
  sku,
  upc,
  brand_name,
  product_name
  -- ... additional columns
FROM products
WHERE ARRAY_CONTAINS(find_product_by_sku.sku, products.sku);
```
**Key Requirements:**
- Use `{catalog_name}.{schema_name}` placeholders - these are automatically replaced during deployment
- Include comprehensive `COMMENT` attributes for all parameters and return columns
- Provide a clear function-level comment describing purpose and use cases
- Use `READS SQL DATA` for functions that query data
- Follow consistent naming conventions for parameters and return values
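
For intuition, the placeholder substitution amounts to something like the following simplified sketch; the function name and the final execution step are assumptions, not the framework's actual deployment code:

```python
from pathlib import Path


def render_ddl(ddl_path: str, catalog_name: str, schema_name: str) -> str:
    """Read a DDL file and substitute the catalog/schema placeholders."""
    ddl = Path(ddl_path).read_text()
    return ddl.format(catalog_name=catalog_name, schema_name=schema_name)


# The rendered statement can then be executed against a SQL warehouse, e.g.:
# spark.sql(render_ddl("functions/retail/find_product_by_sku.sql", "main", "retail"))
```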
#### Test Configuration
The optional `test` section allows you to define test parameters for automatic function validation:
```yaml
test:
  parameters:
    sku: ["00176279"]    # Single parameter
    # OR for multi-parameter functions:
    store: "35048"       # Multiple parameters
    sku: ["00176279"]
```
**Test Benefits:**
- **Validation**: Ensures functions work correctly after deployment
- **Documentation**: Provides example usage for other developers
- **CI/CD Integration**: Enables automated testing in deployment pipelines
**Note**: Test parameters should use realistic data from your datasets to ensure meaningful validation. The framework will execute these tests automatically during deployment to verify function correctness.
#### 4. MCP (Model Context Protocol) Tools (`type: mcp`)
MCP tools allow interaction with external services that implement the Model Context Protocol, supporting both HTTP and stdio transports.
**Configuration Example:**
```yaml
tools:
  weather_tool_mcp:
    name: weather
    function:
      type: mcp
      name: weather
      transport: streamable_http
      url: http://localhost:8000/mcp
```
**Development:**
Ensure the MCP service is running and accessible at the specified URL or command. The framework will handle the MCP protocol communication automatically.
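A hypothetical minimal server matching the configuration above, assuming the official MCP Python SDK (`mcp` package); the tool body is a stub, and defaults such as the port and `/mcp` path may vary by SDK version:

```python
# weather_mcp_server.py (hypothetical example)
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")


@mcp.tool()
def get_weather(city: str) -> str:
    """Return a short weather summary for the given city."""
    # Replace with a real weather API call
    return f"Sunny and 72F in {city}"


if __name__ == "__main__":
    # Serves the MCP endpoint over HTTP (typically at http://localhost:8000/mcp)
    mcp.run(transport="streamable-http")
```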
### Configuring New Agents
Agents are specialized AI assistants defined under the `agents` key in [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml). Each agent has a unique name and specific configuration.
**Configuration Example:**
```yaml
agents:
  general:
    name: general
    description: "General retail store assistant for home improvement and hardware store inquiries"
    model: *tool_calling_llm
    tools:
      - *find_product_details_by_description_tool
      - *vector_search_tool
    guardrails: []
    checkpointer: *checkpointer
    prompt: |
      You are a helpful retail store assistant for a home improvement and hardware store.
      You have access to search tools to find current information about products, pricing, and store policies.

      #### CRITICAL INSTRUCTION: ALWAYS USE SEARCH TOOLS FIRST
      Before answering ANY question:
      - ALWAYS use your available search tools to find the most current and accurate information
      - Search for specific details about store policies, product availability, pricing, and services
```
**Agent Configuration Fields:**
- `name`: Unique identifier for the agent
- `description`: Human-readable description of the agent's purpose
- `model`: Reference to an LLM model (using YAML anchors like `*tool_calling_llm`)
- `tools`: Array of tool references (using YAML anchors like `*search_tool`)
- `guardrails`: Array of guardrail references (can be empty `[]`)
- `checkpointer`: Reference to a checkpointer for conversation state (optional)
- `prompt`: System prompt that defines the agent's behavior and instructions
**To configure a new agent:**
1. Add a new entry under the `agents` section with a unique key
2. Define the required fields: `name`, `description`, `model`, `tools`, and `prompt`
3. Optionally configure `guardrails` and `checkpointer`
4. Reference the agent in the application configuration using YAML anchors
### Assigning Tools to Agents
Tools are assigned to agents by referencing them using YAML anchors in the agent's `tools` array. Each tool must be defined in the `tools` section with an anchor (using `&tool_name`), then referenced in the agent configuration (using `*tool_name`).
**Example:**
```yaml
tools:
  search_tool: &search_tool
    name: search
    function:
      type: factory
      name: retail_ai.tools.search_tool
      args: {}

  genie_tool: &genie_tool
    name: genie
    function:
      type: factory
      name: retail_ai.tools.create_genie_tool
      args:
        genie_room: *retail_genie_room

agents:
  general:
    name: general
    description: "General retail store assistant"
    model: *tool_calling_llm
    tools:
      - *search_tool   # Reference to the search_tool anchor
      - *genie_tool    # Reference to the genie_tool anchor
    # ... other agent configuration
```
This YAML anchor system allows for:
- **Reusability**: The same tool can be assigned to multiple agents
- **Maintainability**: Tool configuration is centralized in one place
- **Consistency**: Tools are guaranteed to have the same configuration across agents
### Assigning Agents to the Application and Configuring Orchestration
Agents are made available to the application by listing their YAML anchors (defined in the `agents:` section) within the `agents` array under the `app` section. The `app.orchestration` section defines how these agents interact.
**Orchestration Configuration:**
The `orchestration` block within the `app` section allows you to define the interaction pattern. Your current configuration primarily uses a **Supervisor** pattern.
```yaml
# filepath: /Users/nate/development/dao-ai/config/hardware_store/supervisor_postgres.yaml
# ...
# app:
#   ...
#   agents:
#     - *orders
#     - *diy
#     - *product
#     # ... other agents referenced by their anchors
#     - *general
#   orchestration:
#     supervisor:
#       model: *tool_calling_llm   # LLM for the supervisor agent
#       default_agent: *general    # Agent to handle tasks if no specific agent is chosen
#     # swarm:                     # Example of how a swarm might be configured if activated
#     #   model: *tool_calling_llm
# ...
```
**Orchestration Patterns:**
1. **Supervisor Pattern (Currently Active)**
* Your configuration defines a `supervisor` block under `app.orchestration`.
* `model`: Specifies the LLM (e.g., `*tool_calling_llm`) that the supervisor itself will use for its decision-making and routing logic.
* `default_agent`: Specifies an agent (e.g., `*general`) that the supervisor will delegate to if it cannot determine a more specialized agent from the `app.agents` list or if the query is general.
* The supervisor is responsible for receiving the initial user query, deciding which specialized agent (from the `app.agents` list) is best suited to handle it, and then passing the query to that agent. If no specific agent is a clear match, or if the query is general, it falls back to the `default_agent`.
2. **Swarm Pattern (Commented Out)**
* Your configuration includes a commented-out `swarm` block. If activated, this would imply a different interaction model.
* In a swarm, agents might collaborate more directly or work in parallel on different aspects of a query. The `model` under `swarm` would likely define the LLM used by the agents within the swarm or by a coordinating element of the swarm.
* The specific implementation of how a swarm pattern behaves would be defined in your `retail_ai/graph.py` and `retail_ai/nodes.py`.
## Integration Hooks
The DAO framework provides several hook integration points that allow you to customize agent behavior and application lifecycle. These hooks enable you to inject custom logic at key points in the system without modifying the core framework code.
### Hook Types
#### Agent-Level Hooks
**Agent hooks** are defined at the individual agent level and allow you to customize specific agent behavior:
##### `create_agent_hook`
Provides a completely custom agent implementation. When this hook is set, all other agent configuration is ignored. See: **Hook Implementation**
```yaml
agents:
  custom_agent:
    name: custom_agent
    description: "Agent with custom initialization"
    model: *tool_calling_llm
    create_agent_hook: my_package.hooks.initialize_custom_agent
    # ... other agent configuration
```
##### `pre_agent_hook`
Executed before an agent processes a message. Ideal for request preprocessing, logging, validation, or context injection. See: **Hook Implementation**
```yaml
agents:
  logging_agent:
    name: logging_agent
    description: "Agent with request logging"
    model: *tool_calling_llm
    pre_agent_hook: my_package.hooks.log_incoming_request
    # ... other agent configuration
```
##### `post_agent_hook`
Executed after an agent completes processing a message. Perfect for response post-processing, logging, metrics collection, or cleanup operations. See: **Hook Implementation**
```yaml
agents:
  analytics_agent:
    name: analytics_agent
    description: "Agent with response analytics"
    model: *tool_calling_llm
    post_agent_hook: my_package.hooks.collect_response_metrics
    # ... other agent configuration
```
#### Application-Level Hooks
**Application hooks** operate at the global application level and affect the entire system lifecycle:
##### `initialization_hooks`
Executed when the application starts up via `AppConfig.from_file()`. Use these for system initialization, resource setup, database connections, or external service configuration. See: **Hook Implementation**
```yaml
app:
  name: my_retail_app
  initialization_hooks:
    - my_package.hooks.setup_database_connections
    - my_package.hooks.initialize_external_apis
    - my_package.hooks.setup_monitoring
  # ... other app configuration
```
##### `shutdown_hooks`
Executed when the application shuts down (registered via `atexit`). Essential for cleanup operations, closing connections, saving state, or performing final logging. See: **Hook Implementation**
```yaml
app:
  name: my_retail_app
  shutdown_hooks:
    - my_package.hooks.cleanup_database_connections
    - my_package.hooks.save_session_data
    - my_package.hooks.send_shutdown_metrics
  # ... other app configuration
```
##### `message_hooks`
Executed for every message processed by the system. Useful for global logging, authentication, rate limiting, or message transformation. See: **Hook Implementation**
```yaml
app:
  name: my_retail_app
  message_hooks:
    - my_package.hooks.authenticate_user
    - my_package.hooks.apply_rate_limiting
    - my_package.hooks.transform_message_format
  # ... other app configuration
```
### Hook Implementation
Hooks can be implemented as either:
1. **Python Functions**: Direct function references
```yaml
initialization_hooks: my_package.hooks.setup_function
```
2. **Factory Functions**: Functions that return configured tools or handlers
```yaml
initialization_hooks:
  type: factory
  name: my_package.hooks.create_setup_handler
  args:
    config_param: "value"
```
3. **Hook Lists**: Multiple hooks executed in sequence
```yaml
initialization_hooks:
  - my_package.hooks.setup_database
  - my_package.hooks.setup_cache
  - my_package.hooks.setup_monitoring
```
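
As a hedged sketch, a factory-style hook (option 2 above) might be implemented like this; the names mirror the YAML example, and the signature of the returned callable is illustrative, since the exact shape the framework expects depends on the hook type it is attached to:

```python
from typing import Callable


def create_setup_handler(config_param: str) -> Callable:
    """Return an initialization hook configured with the given parameter."""

    def _setup(config) -> None:
        # Use config_param to drive whatever setup this hook performs
        print(f"Running setup with {config_param} for app {config.app.name}")

    return _setup
```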
### Hook Function Signatures
Each hook type expects specific function signatures:
#### Agent Hooks
```python
# create_agent_hook
def initialize_custom_agent(state: dict, config: dict) -> dict:
    """Custom agent initialization logic"""
    pass

# pre_agent_hook
def log_incoming_request(state: dict, config: dict) -> dict:
    """Pre-process incoming request"""
    return state

# post_agent_hook
def collect_response_metrics(state: dict, config: dict) -> dict:
    """Post-process agent response"""
    return state
```
#### Application Hooks
```python
# initialization_hooks
def setup_database_connections(config: AppConfig) -> None:
    """Initialize database connections"""
    pass

# shutdown_hooks
def cleanup_resources(config: AppConfig) -> None:
    """Clean up resources on shutdown"""
    pass

# message_hooks
def authenticate_user(state: dict, config: dict) -> dict:
    """Authenticate and authorize user requests"""
    return state
```
### Use Cases and Examples
#### Common Hook Patterns
**Logging and Monitoring**:
```python
def log_agent_performance(state: dict, config: AppConfig) -> dict:
    """Log agent response times and quality metrics"""
    start_time = state.get('start_time')
    if start_time:
        duration = time.time() - start_time
        logger.info(f"Agent response time: {duration:.2f}s")
    return state
```
**Authentication and Authorization**:
```python
def validate_user_permissions(state: dict, config: AppConfig) -> dict:
    """Validate user has permission for requested operation"""
    user_id = state.get('user_id')
    if not has_permission(user_id, state.get('operation')):
        raise UnauthorizedError("Insufficient permissions")
    return state
```
**Resource Management**:
```python
def initialize_vector_search(config: AppConfig) -> None:
    """Initialize vector search connections during startup"""
    for vs_name, vs_config in config.resources.vector_stores.items():
        vs_config.create()
        logger.info(f"Vector store {vs_name} initialized")
```
**State Enrichment**:
```python
def enrich_user_context(state: dict, config: AppConfig) -> dict:
    """Add user profile and preferences to state"""
    user_id = state.get('user_id')
    if user_id:
        user_profile = get_user_profile(user_id)
        state['user_context'] = user_profile
    return state
```
### Best Practices
1. **Keep hooks lightweight**: Avoid heavy computations that could slow down message processing
2. **Handle errors gracefully**: Use try-catch blocks to prevent hook failures from breaking the system
3. **Use appropriate hook types**: Choose agent-level vs application-level hooks based on scope
4. **Maintain state immutability**: Return modified copies of state rather than mutating in-place
5. **Log hook execution**: Include logging for troubleshooting and monitoring
6. **Test hooks independently**: Write unit tests for hook functions separate from the main application
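
For example, practices 2, 4, and 5 might be combined in a single message hook like this. It is a hedged sketch: the `logger` setup and the enrichment logic are illustrative, not part of the framework:

```python
import logging

logger = logging.getLogger(__name__)


def safe_enrich_user_context(state: dict, config: dict) -> dict:
    """Message hook that logs its execution and never breaks the pipeline."""
    logger.info("enrich_user_context hook invoked")
    try:
        enriched = dict(state)  # work on a copy rather than mutating in place
        user_id = state.get("user_id")
        if user_id:
            enriched["user_context"] = {"user_id": user_id}  # placeholder enrichment
        return enriched
    except Exception:
        logger.exception("enrich_user_context hook failed; passing state through unchanged")
        return state
```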
## Development
### Project Structure
- `retail_ai/`: Core package
- `config.py`: Pydantic configuration models with full validation
- `graph.py`: LangGraph workflow definition
- `nodes.py`: Agent node factories and implementations
- `tools.py`: Tool creation and factory functions, implementations for Python tools
- `vector_search.py`: Vector search utilities
- `state.py`: State management for conversations
- `tests/`: Test suite with configuration fixtures
- `schemas/`: JSON schemas for configuration validation
- `notebooks/`: Jupyter notebooks for setup and experimentation
- `docs/`: Documentation files, including architecture diagrams.
- `config/`: Contains [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml).
### Building the Package
```bash
# Install development dependencies
make depends
# Build the package
make install
# Run tests
make test
# Format code
make format
```
## Deployment with Databricks Bundle CLI
The agent can be deployed using the existing Databricks Bundle CLI configuration:
1. Ensure Databricks CLI is installed and configured:
```bash
pip install databricks-cli
databricks configure
```
2. Deploy using the existing `databricks.yml`:
```bash
databricks bundle deploy
```
3. Check deployment status:
```bash
databricks bundle status
```
## Usage
Once deployed, interact with the agent:
```python
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")

response = client.predict(
    endpoint="retail_ai_agent",  # Matches endpoint_name in model_config.yaml
    inputs={
        "messages": [
            {"role": "user", "content": "Can you recommend a lamp for my oak side tables?"}
        ]
    }
)

print(response["message"]["content"])
```
### Advanced Configuration
You can also pass additional configuration parameters to customize the agent's behavior:
```python
response = client.predict(
    endpoint="retail_ai_agent",
    inputs={
        "messages": [
            {"role": "user", "content": "Can you recommend a lamp for my oak side tables?"}
        ],
        "configurable": {
            "thread_id": "1",
            "user_id": "my_user_id",
            "store_num": 87887
        }
    }
)
```
The `configurable` section supports:
- **`thread_id`**: Unique identifier for conversation threading and state management
- **`user_id`**: User identifier for personalization and tracking
- **`store_num`**: Store number for location-specific recommendations and inventory
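
Because `thread_id` scopes conversation state, a follow-up question sent with the same value continues the prior thread. An illustrative continuation of the example above:

```python
followup = client.predict(
    endpoint="retail_ai_agent",
    inputs={
        "messages": [
            {"role": "user", "content": "Do you have that lamp in brushed nickel?"}
        ],
        "configurable": {
            "thread_id": "1",          # same thread_id continues the prior conversation
            "user_id": "my_user_id",
            "store_num": 87887
        }
    }
)
```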
## Customization
To customize the agent:
1. **Update [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml)**:
- Add tools in the `tools` section
- Create agents in the `agents` section
- Configure resources (LLMs, vector stores, etc.)
- Adjust orchestration patterns as described above.
2. **Implement new tools** in `retail_ai/tools.py` (for Python and Factory tools) or in Unity Catalog (for UC tools).
3. **Extend workflows** in `retail_ai/graph.py` to support the chosen orchestration patterns and agent interactions.
## Testing
```bash
# Run all tests
make test
```
## Logging
The primary log level for the application is configured in [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml) under the `app.log_level` field.
**Configuration Example:**
```yaml
# filepath: /Users/nate/development/dao-ai/config/hardware_store/supervisor_postgres.yaml
app:
  log_level: INFO  # Supported levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
  # ... other app configurations ...
```
This setting controls the verbosity of logs produced by the `retail_ai` package.
The system also includes:
- **MLflow tracing** for request tracking.
- **Structured logging** is used internally.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "dao-ai",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": "Nate Fleming <nate.fleming@databricks.com>",
"keywords": "agents, ai, databricks, langchain, langgraph, llm, multi-agent, orchestration, vector-search, workflow",
"author": null,
"author_email": "Nate Fleming <nate.fleming@databricks.com>, Nate Fleming <nate.fleming@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/c6/b6/d145eb0eb7d7860708f429c061d8a9a4ba8313e73f6a63f1cc293e02dcf0/dao_ai-0.0.2.tar.gz",
"platform": null,
"description": "# Declarative Agent Orchestration (DAO) Framework\n\nA modular, multi-agent orchestration framework for building sophisticated AI workflows on Databricks. While this implementation provides a complete retail AI reference architecture, the framework is designed to support any domain or use case requiring agent coordination, tool integration, and dynamic configuration.\n\n## Overview\n\nThis project implements a LangGraph-based multi-agent orchestration framework that can:\n\n- **Route queries** to specialized agents based on content and context\n- **Coordinate multiple AI agents** working together on complex tasks\n- **Integrate diverse tools** including databases, APIs, vector search, and external services\n- **Support flexible orchestration patterns** (supervisor, swarm, and custom workflows)\n- **Provide dynamic configuration** through YAML-based agent and tool definitions\n- **Enable domain-specific specialization** while maintaining a unified interface\n\n**Retail Reference Implementation**: This repository includes a complete retail AI system demonstrating:\n- Product inventory management and search\n- Customer recommendation engines \n- Order tracking and management\n- Product classification and information retrieval\n\nThe system uses Databricks Vector Search, Unity Catalog, and LLMs to provide accurate, context-aware responses across any domain.\n\n## Key Features\n\n- **Multi-Modal Interface**: CLI commands and Python API for development and deployment\n- **Agent Lifecycle Management**: Create, deploy, and monitor agents programmatically\n- **Vector Search Integration**: Built-in support for Databricks Vector Search with retrieval tools\n- **Configuration-Driven**: YAML-based configuration with validation and IDE support\n- **MLflow Integration**: Automatic model packaging, versioning, and deployment\n- **Monitoring & Evaluation**: Built-in assessment and monitoring capabilities\n\n## Architecture\n\n### Overview\n\nThe Multi-Agent AI system is built as a component-based agent architecture that routes queries to specialized agents based on the nature of the request. 
This approach enables domain-specific handling while maintaining a unified interface that can be adapted to any industry or use case.\n\n\n\n### Core Components\n\n#### Configuration Components\n\nAll components are defined from the provided [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml) using a modular approach:\n\n- **Schemas**: Define database and catalog structures\n- **Resources**: Configure infrastructure components like LLMs, vector stores, catalogs, warehouses, and databases\n- **Tools**: Define functions that agents can use to perform tasks (dictionary-based with keys as tool names)\n- **Agents**: Specialized AI assistants configured for specific domains (dictionary-based with keys as agent names)\n- **Guardrails**: Quality control mechanisms to ensure accurate responses\n- **Retrievers**: Configuration for vector search and retrieval\n- **Evaluation**: Configuration for model evaluation and testing\n- **Datasets**: Configuration for training and evaluation datasets\n- **App**: Overall application configuration including orchestration and logging\n\n#### Message Processing Flow\n\nThe system uses a LangGraph-based workflow with the following key nodes:\n\n- **Message Validation**: Validates incoming requests (`message_validation_node`)\n- **Agent Routing**: Routes messages to appropriate specialized agents using supervisor or swarm patterns\n- **Agent Execution**: Processes requests using specialized agents with their configured tools\n- **Response Generation**: Returns structured responses to users\n\n#### Specialized Agents\n\nAgents are dynamically configured from the provided [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml) file and can include:\n- Custom LLM models and parameters\n- Specific sets of available tools (Python functions, Unity Catalog functions, factory tools, MCP services)\n- Domain-specific system prompts\n- Guardrails for response quality\n- Handoff prompts for agent coordination\n\n### Technical Implementation\n\nThe system is implemented using:\n\n- **LangGraph**: For workflow orchestration and state management\n- **LangChain**: For LLM interactions and tool integration\n- **MLflow**: For model tracking and deployment\n- **Databricks**: LLM APIs, Vector Search, Unity Catalog, and Model Serving\n- **Pydantic**: For configuration validation and schema management\n\n## Prerequisites\n\n- Python 3.12+\n- Databricks workspace with access to:\n - Unity Catalog\n - Model Serving\n - Vector Search\n - Genie (optional)\n- Databricks CLI configured with appropriate permissions\n- Databricks model endpoints for LLMs and embeddings\n\n## Setup\n\n1. Clone this repository\n2. Install dependencies:\n\n```bash\n# Create and activate a Python virtual environment \nuv venv\nsource .venv/bin/activate # On Windows: .venv\\Scripts\\activate\n\n# Install dependencies using Makefile\nmake install\n```\n\n3. 
Configure Databricks CLI with appropriate workspace access\n\n## Quick Start\n\n### Option 1: Using Python API (Recommended for Development)\n\n```python\nfrom retail_ai.config import AppConfig\n\n# Load your configuration\nconfig = AppConfig.from_file(\"config/hardware_store/supervisor_postgres.yaml\")\n\n# Create vector search infrastructure\nfor name, vector_store in config.resources.vector_stores.items():\n vector_store.create()\n\n# Create and deploy your agent\nconfig.create_agent()\nconfig.deploy_agent()\n\n```\n\n### Option 2: Using CLI Commands\n\n```bash\n# Validate configuration\ndao-ai validate -c config/hardware_store/supervisor_postgres.yaml\n\n# Generate workflow diagram\ndao-ai graph -o architecture.png\n\n# Deploy using Databricks Asset Bundles\ndao-ai bundle --deploy --run\n\n# Deploy using Databricks Asset Bundles with specific configuration\ndao-ai -vvvv bundle --deploy --run --target dev --config config/hardware_store/supervisor_postgres.yaml --profile DEFAULT\n```\n\nSee the [Python API](#python-api) section for detailed programmatic usage, or [Command Line Interface](#command-line-interface) for CLI usage.\n\n## Command Line Interface\n\nThe framework includes a comprehensive CLI for managing, validating, and visualizing your multi-agent system:\n\n### Schema Generation\nGenerate JSON schema for configuration validation and IDE autocompletion:\n```bash\ndao-ai schema > schema.json\n```\n\n### Configuration Validation\nValidate your configuration file for syntax and semantic correctness:\n```bash\n# Validate default configuration (config/hardware_store/supervisor_postgres.yaml)\ndao-ai validate\n\n# Validate specific configuration file\ndao-ai validate -c config/production.yaml\n```\n\n### Graph Visualization\nGenerate visual representations of your agent workflow:\n```bash\n# Generate architecture diagram (using default config/hardware_store/supervisor_postgres.yaml)\ndao-ai graph -o architecture.png\n\n# Generate diagram from specific config\ndao-ai graph -o workflow.png -c config/custom.yaml\n```\n\n### Deployment\nDeploy your multi-agent system using Databricks Asset Bundles:\n```bash\n# Deploy the system\ndao-ai bundle --deploy\n\n# Run the deployed system\ndao-ai bundle --run\n\n# Use specific Databricks profile\ndao-ai bundle --deploy --run --profile my-profile\n```\n\n### Verbose Output\nAdd `-v`, `-vv`, `-vvv`, or `-vvvv` flags for increasing levels of verbosity (ERROR, WARNING, INFO, DEBUG, TRACE).\n\n## Python API\n\nThe framework provides a comprehensive Python API for programmatic access to all functionality. The main entry point is the `AppConfig` class, which provides methods for agent lifecycle management, vector search operations, and configuration utilities.\n\n### Quick Start\n\n```python\nfrom retail_ai.config import AppConfig\n\n# Load configuration from file\nconfig = AppConfig.from_file(path=\"config/hardware_store/supervisor_postgres.yaml\")\n```\n\n### Agent Lifecycle Management\n\n#### Creating Agents\nPackage and register your multi-agent system as an MLflow model:\n\n```python\n# Create agent with default settings\nconfig.create_agent()\n\n# Create agent with additional requirements and code paths\nconfig.create_agent(\n additional_pip_reqs=[\"custom-package==1.0.0\"],\n additional_code_paths=[\"./custom_modules\"]\n)\n```\n\n#### Deploying Agents\nDeploy your registered agent to a Databricks serving endpoint:\n\n```python\n# Deploy agent to serving endpoint\nconfig.deploy_agent()\n```\n\nThe deployment process:\n1. 
Retrieves the latest model version from MLflow\n2. Creates or updates a Databricks model serving endpoint\n3. Configures scaling, environment variables, and permissions\n4. Sets up proper authentication and resource access\n\n### Vector Search Operations\n\n#### Creating Vector Search Infrastructure\nCreate vector search endpoints and indexes from your configuration:\n\n```python\n# Access vector stores from configuration\nvector_stores = config.resources.vector_stores\n\n# Create all vector stores\nfor name, vector_store in vector_stores.items():\n print(f\"Creating vector store: {name}\")\n vector_store.create()\n```\n\n#### Using Vector Search\nQuery your vector search indexes for retrieval-augmented generation:\n\n```python\n# Method 1: Direct index access\nfrom retail_ai.config import RetrieverModel\n\nquestion = \"What products do you have in stock?\"\n\nfor name, retriever in config.retrievers.items():\n # Get the vector search index\n index = retriever.vector_store.as_index()\n \n # Perform similarity search\n results = index.similarity_search(\n query_text=question,\n columns=retriever.columns,\n **retriever.search_parameters.model_dump()\n )\n \n chunks = results.get('result', {}).get('data_array', [])\n print(f\"Found {len(chunks)} relevant results\")\n```\n\n```python\n# Method 2: LangChain integration\nfrom databricks_langchain import DatabricksVectorSearch\n\nfor name, retriever in config.retrievers.items():\n # Create LangChain vector store\n vector_search = DatabricksVectorSearch(\n endpoint=retriever.vector_store.endpoint.name,\n index_name=retriever.vector_store.index.full_name,\n columns=retriever.columns,\n )\n \n # Search using LangChain interface\n documents = vector_search.similarity_search(\n query=question,\n **retriever.search_parameters.model_dump()\n )\n \n print(f\"Found {len(documents)} documents\")\n```\n\n### Configuration Utilities\n\nThe `AppConfig` class provides helper methods to find and filter configuration components:\n\n#### Finding Agents\n```python\n# Get all agents\nall_agents = config.find_agents()\n\n# Find agents with specific criteria\ndef has_vector_search(agent):\n return any(\"vector_search\" in tool.name.lower() for tool in agent.tools)\n\nvector_agents = config.find_agents(predicate=has_vector_search)\n```\n\n#### Finding Tools and Guardrails\n```python\n# Get all tools\nall_tools = config.find_tools()\n\n# Get all guardrails\nall_guardrails = config.find_guardrails()\n\n# Find tools by type\ndef is_python_tool(tool):\n return tool.function.type == \"python\"\n\npython_tools = config.find_tools(predicate=is_python_tool)\n```\n\n### Visualization\n\nGenerate and save workflow diagrams:\n\n```python\n# Display graph in notebook\nconfig.display_graph()\n\n# Save architecture diagram\nconfig.save_image(\"docs/my_architecture.png\")\n```\n\n### Complete Example\n\nSee [`notebooks/05_agent_as_code_driver.py`](notebooks/05_agent_as_code_driver.py) for a complete example:\n\n```python\nfrom retail_ai.config import AppConfig\nfrom pathlib import Path\n\n# Load configuration\nconfig = AppConfig.from_file(\"config/hardware_store/supervisor_postgres.yaml\")\n\n# Visualize the workflow\nconfig.display_graph()\n\n# Save architecture diagram\npath = Path(\"docs\") / f\"{config.app.name}_architecture.png\"\nconfig.save_image(path)\n\n# Create and deploy the agent\nconfig.create_agent()\nconfig.deploy_agent()\n```\n\nFor vector search examples, see [`notebooks/02_provision_vector_search.py`](notebooks/02_provision_vector_search.py).\n\n### Available 
Notebooks\n\nThe framework includes several example notebooks demonstrating different aspects:\n\n| Notebook | Description | Key Methods Demonstrated |\n|----------|-------------|-------------------------|\n| [`01_ingest_and_transform.py`](notebooks/01_ingest_and_transform.py) | Data ingestion and transformation | Dataset creation and SQL execution |\n| [`02_provision_vector_search.py`](notebooks/02_provision_vector_search.py) | Vector search setup and usage | `vector_store.create()`, `as_index()` |\n| [`03_generate_evaluation_data.py`](notebooks/03_generate_evaluation_data.py) | Generate synthetic evaluation datasets | Data generation and evaluation setup |\n| [`04_unity_catalog_tools.py`](notebooks/04_unity_catalog_tools.py) | Unity Catalog function deployment | SQL function creation and testing |\n| [`05_agent_as_code_driver.py`](notebooks/05_agent_as_code_driver.py) | **Complete agent lifecycle** | `create_agent()`, `deploy_agent()` |\n| [`06_run_evaluation.py`](notebooks/06_run_evaluation.py) | Agent evaluation and testing | Evaluation framework usage |\n| [`08_run_examples.py`](notebooks/08_run_examples.py) | End-to-end example queries | Agent interaction and testing |\n\n## Configuration\n\nConfiguration is managed through [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml). This file defines all components of the Retail AI system, including resources, tools, agents, and the overall application setup.\n\n**Note**: The configuration file location is configurable throughout the framework. You can specify a different configuration file using the `-c` or `--config` flag in CLI commands, or by setting the appropriate parameters in the Python API.\n\n### Basic Structure of [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml)\n\nThe [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml) is organized into several top-level keys:\n\n```yaml\n# filepath: /Users/nate/development/dao-ai/config/hardware_store/supervisor_postgres.yaml\nschemas:\n # ... schema definitions ...\n\nresources:\n # ... resource definitions (LLMs, vector stores, etc.) ...\n\ntools:\n # ... tool definitions ...\n\nagents:\n # ... agent definitions ...\n\napp:\n # ... application configuration ...\n\n# Other sections like guardrails, retrievers, evaluation, datasets\n```\n\n### Loading and Using Configuration\n\nThe configuration can be loaded and used programmatically through the `AppConfig` class:\n\n```python\nfrom retail_ai.config import AppConfig\n\n# Load configuration from file\nconfig = AppConfig.from_file(\"config/hardware_store/supervisor_postgres.yaml\")\n\n# Access different configuration sections\nprint(f\"Available agents: {list(config.agents.keys())}\")\nprint(f\"Available tools: {list(config.tools.keys())}\")\nprint(f\"Vector stores: {list(config.resources.vector_stores.keys())}\")\n\n# Use configuration methods for deployment\nconfig.create_agent() # Package as MLflow model\nconfig.deploy_agent() # Deploy to serving endpoint\n```\n\nThe configuration supports both CLI and programmatic workflows, with the Python API providing more flexibility for complex deployment scenarios.\n\n### Developing and Configuring Tools\n\nTools are functions that agents can use to interact with external systems or perform specific tasks. They are defined under the `tools` key in [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml). Each tool has a unique name and contains a `function` specification.\n\nThere are four types of tools supported:\n\n#### 1. 
Python Tools (`type: python`)\n These tools directly map to Python functions. The `name` field should correspond to a function that can be imported and called directly.\n\n **Configuration Example:**\n ```yaml\n tools:\n my_python_tool:\n name: my_python_tool\n function:\n type: python\n name: retail_ai.tools.my_function_name\n schema: *retail_schema # Optional schema definition\n ```\n **Development:**\n Implement the Python function in the specified module (e.g., `retail_ai/tools.py`). The function will be imported and called directly when the tool is invoked.\n\n#### 2. Factory Tools (`type: factory`)\n Factory tools use factory functions that return initialized LangChain `BaseTool` instances. This is useful for tools requiring complex initialization or configuration.\n\n **Configuration Example:**\n ```yaml\n tools:\n vector_search_tool:\n name: vector_search\n function:\n type: factory\n name: retail_ai.tools.create_vector_search_tool\n args:\n retriever: *products_retriever\n name: product_vector_search_tool\n description: \"Search for products using vector search\"\n ```\n **Development:**\n Implement the factory function (e.g., `create_vector_search_tool`) in `retail_ai/tools.py`. This function should accept the specified `args` and return a fully configured `BaseTool` object.\n\n#### 3. Unity Catalog Tools (`type: unity_catalog`)\n These tools represent SQL functions registered in Databricks Unity Catalog. They reference functions by their Unity Catalog schema and name.\n\n **Configuration Example:**\n ```yaml\n tools:\n find_product_by_sku_uc_tool:\n name: find_product_by_sku_uc\n function:\n type: unity_catalog\n name: find_product_by_sku\n schema: *retail_schema\n ```\n **Development:**\n Create the corresponding SQL function in your Databricks Unity Catalog using the specified schema and function name. The tool will automatically generate the appropriate function signature and documentation.\n\n### Developing Unity Catalog Functions\n\nUnity Catalog functions provide the backbone for data access in the multi-agent system. The framework automatically deploys these functions from SQL DDL files during system initialization.\n\n#### Function Deployment Configuration\n\nUnity Catalog functions are defined in the `unity_catalog_functions` section of [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml). 
Each function specification includes:\n\n- **Function metadata**: Schema and name for Unity Catalog registration\n- **DDL file path**: Location of the SQL file containing the function definition\n- **Test parameters**: Optional test data for function validation\n\n**Configuration Example from [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml):**\n```yaml\nunity_catalog_functions:\n - function:\n schema: *retail_schema # Reference to schema configuration\n name: find_product_by_sku # Function name in Unity Catalog\n ddl: ../functions/retail/find_product_by_sku.sql # Path to SQL DDL file\n test: # Optional test configuration\n parameters:\n sku: [\"00176279\"] # Test parameters for validation\n - function:\n schema: *retail_schema\n name: find_store_inventory_by_sku\n ddl: ../functions/retail/find_store_inventory_by_sku.sql\n test:\n parameters:\n store: \"35048\" # Multiple parameters for complex functions\n sku: [\"00176279\"]\n```\n\n#### SQL Function Structure\n\nSQL files should follow this structure for proper deployment:\n\n**File Structure Example** (`functions/retail/find_product_by_sku.sql`):\n```sql\n-- Function to find product details by SKU\nCREATE OR REPLACE FUNCTION {catalog_name}.{schema_name}.find_product_by_sku(\n sku ARRAY<STRING> COMMENT 'One or more unique identifiers for retrieve. SKU values are between 5-8 alpha numeric characters'\n)\nRETURNS TABLE(\n product_id BIGINT COMMENT 'Unique identifier for each product in the catalog',\n sku STRING COMMENT 'Stock Keeping Unit - unique internal product identifier code',\n upc STRING COMMENT 'Universal Product Code - standardized barcode number for product identification',\n brand_name STRING COMMENT 'Name of the manufacturer or brand that produces the product',\n product_name STRING COMMENT 'Display name of the product as shown to customers',\n -- ... additional columns\n)\nREADS SQL DATA\nCOMMENT 'Retrieves detailed information about a specific product by its SKU. This function is designed for product information retrieval in retail applications.'\nRETURN \nSELECT \n product_id,\n sku,\n upc,\n brand_name,\n product_name\n -- ... additional columns\nFROM products\nWHERE ARRAY_CONTAINS(find_product_by_sku.sku, products.sku);\n```\n\n**Key Requirements:**\n- Use `{catalog_name}.{schema_name}` placeholders - these are automatically replaced during deployment\n- Include comprehensive `COMMENT` attributes for all parameters and return columns\n- Provide a clear function-level comment describing purpose and use cases\n- Use `READS SQL DATA` for functions that query data\n- Follow consistent naming conventions for parameters and return values\n\n#### Test Configuration\n\nThe optional `test` section allows you to define test parameters for automatic function validation:\n\n```yaml\ntest:\n parameters:\n sku: [\"00176279\"] # Single parameter\n # OR for multi-parameter functions:\n store: \"35048\" # Multiple parameters\n sku: [\"00176279\"]\n```\n\n**Test Benefits:**\n- **Validation**: Ensures functions work correctly after deployment\n- **Documentation**: Provides example usage for other developers\n- **CI/CD Integration**: Enables automated testing in deployment pipelines\n\n**Note**: Test parameters should use realistic data from your datasets to ensure meaningful validation. The framework will execute these tests automatically during deployment to verify function correctness.\n\n#### 4. 
#### 3. Unity Catalog Tools (`type: unity_catalog`)

These tools represent SQL functions registered in Databricks Unity Catalog. They reference functions by their Unity Catalog schema and name.

**Configuration Example:**
```yaml
tools:
  find_product_by_sku_uc_tool:
    name: find_product_by_sku_uc
    function:
      type: unity_catalog
      name: find_product_by_sku
      schema: *retail_schema
```

**Development:**
Create the corresponding SQL function in your Databricks Unity Catalog using the specified schema and function name. The tool will automatically generate the appropriate function signature and documentation.

### Developing Unity Catalog Functions

Unity Catalog functions provide the backbone for data access in the multi-agent system. The framework automatically deploys these functions from SQL DDL files during system initialization.

#### Function Deployment Configuration

Unity Catalog functions are defined in the `unity_catalog_functions` section of [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml). Each function specification includes:

- **Function metadata**: Schema and name for Unity Catalog registration
- **DDL file path**: Location of the SQL file containing the function definition
- **Test parameters**: Optional test data for function validation

**Configuration Example from [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml):**
```yaml
unity_catalog_functions:
  - function:
      schema: *retail_schema # Reference to schema configuration
      name: find_product_by_sku # Function name in Unity Catalog
    ddl: ../functions/retail/find_product_by_sku.sql # Path to SQL DDL file
    test: # Optional test configuration
      parameters:
        sku: ["00176279"] # Test parameters for validation
  - function:
      schema: *retail_schema
      name: find_store_inventory_by_sku
    ddl: ../functions/retail/find_store_inventory_by_sku.sql
    test:
      parameters:
        store: "35048" # Multiple parameters for complex functions
        sku: ["00176279"]
```

#### SQL Function Structure

SQL files should follow this structure for proper deployment:

**File Structure Example** (`functions/retail/find_product_by_sku.sql`):
```sql
-- Function to find product details by SKU
CREATE OR REPLACE FUNCTION {catalog_name}.{schema_name}.find_product_by_sku(
  sku ARRAY<STRING> COMMENT 'One or more unique identifiers used to retrieve products. SKU values are between 5 and 8 alphanumeric characters'
)
RETURNS TABLE(
  product_id BIGINT COMMENT 'Unique identifier for each product in the catalog',
  sku STRING COMMENT 'Stock Keeping Unit - unique internal product identifier code',
  upc STRING COMMENT 'Universal Product Code - standardized barcode number for product identification',
  brand_name STRING COMMENT 'Name of the manufacturer or brand that produces the product',
  product_name STRING COMMENT 'Display name of the product as shown to customers',
  -- ... additional columns
)
READS SQL DATA
COMMENT 'Retrieves detailed information about a specific product by its SKU. This function is designed for product information retrieval in retail applications.'
RETURN
SELECT
  product_id,
  sku,
  upc,
  brand_name,
  product_name
  -- ... additional columns
FROM products
WHERE ARRAY_CONTAINS(find_product_by_sku.sku, products.sku);
```

**Key Requirements:**
- Use `{catalog_name}.{schema_name}` placeholders - these are automatically replaced during deployment
- Include comprehensive `COMMENT` attributes for all parameters and return columns
- Provide a clear function-level comment describing purpose and use cases
- Use `READS SQL DATA` for functions that query data
- Follow consistent naming conventions for parameters and return values

#### Test Configuration

The optional `test` section allows you to define test parameters for automatic function validation:

```yaml
test:
  parameters:
    sku: ["00176279"]   # Single parameter
# OR, for multi-parameter functions:
test:
  parameters:
    store: "35048"      # Multiple parameters
    sku: ["00176279"]
```

**Test Benefits:**
- **Validation**: Ensures functions work correctly after deployment
- **Documentation**: Provides example usage for other developers
- **CI/CD Integration**: Enables automated testing in deployment pipelines

**Note**: Test parameters should use realistic data from your datasets to ensure meaningful validation. The framework executes these tests automatically during deployment to verify function correctness.

#### 4. MCP (Model Context Protocol) Tools (`type: mcp`)

MCP tools allow interaction with external services that implement the Model Context Protocol, supporting both HTTP and stdio transports.

**Configuration Example:**
```yaml
tools:
  weather_tool_mcp:
    name: weather
    function:
      type: mcp
      name: weather
      transport: streamable_http
      url: http://localhost:8000/mcp
```

**Development:**
Ensure the MCP service is running and accessible at the specified URL or command. The framework handles the MCP protocol communication automatically.
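For reference, a toy MCP service matching the weather example above could be sketched with the official `mcp` package's `FastMCP` helper (one of this project's dependencies). The tool body, port, and run options are assumptions and may differ between `mcp` versions; this is not part of the DAO framework.

```python
# Toy MCP server sketch (illustrative only).
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")


@mcp.tool()
def weather(city: str) -> str:
    """Hypothetical tool: return a canned forecast for a city."""
    return f"Sunny and 72F in {city}"


if __name__ == "__main__":
    # Serves the streamable HTTP transport (mounted at /mcp by default).
    mcp.run(transport="streamable-http")
```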
### Configuring New Agents

Agents are specialized AI assistants defined under the `agents` key in [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml). Each agent has a unique name and specific configuration.

**Configuration Example:**
```yaml
agents:
  general:
    name: general
    description: "General retail store assistant for home improvement and hardware store inquiries"
    model: *tool_calling_llm
    tools:
      - *find_product_details_by_description_tool
      - *vector_search_tool
    guardrails: []
    checkpointer: *checkpointer
    prompt: |
      You are a helpful retail store assistant for a home improvement and hardware store.
      You have access to search tools to find current information about products, pricing, and store policies.

      #### CRITICAL INSTRUCTION: ALWAYS USE SEARCH TOOLS FIRST
      Before answering ANY question:
      - ALWAYS use your available search tools to find the most current and accurate information
      - Search for specific details about store policies, product availability, pricing, and services
```

**Agent Configuration Fields:**
- `name`: Unique identifier for the agent
- `description`: Human-readable description of the agent's purpose
- `model`: Reference to an LLM model (using YAML anchors like `*tool_calling_llm`)
- `tools`: Array of tool references (using YAML anchors like `*search_tool`)
- `guardrails`: Array of guardrail references (can be empty `[]`)
- `checkpointer`: Reference to a checkpointer for conversation state (optional)
- `prompt`: System prompt that defines the agent's behavior and instructions

**To configure a new agent:**
1. Add a new entry under the `agents` section with a unique key
2. Define the required fields: `name`, `description`, `model`, `tools`, and `prompt`
3. Optionally configure `guardrails` and `checkpointer`
4. Reference the agent in the application configuration using YAML anchors

### Assigning Tools to Agents

Tools are assigned to agents by referencing them using YAML anchors in the agent's `tools` array. Each tool must be defined in the `tools` section with an anchor (using `&tool_name`), then referenced in the agent configuration (using `*tool_name`).

**Example:**
```yaml
tools:
  search_tool: &search_tool
    name: search
    function:
      type: factory
      name: retail_ai.tools.search_tool
      args: {}

  genie_tool: &genie_tool
    name: genie
    function:
      type: factory
      name: retail_ai.tools.create_genie_tool
      args:
        genie_room: *retail_genie_room

agents:
  general:
    name: general
    description: "General retail store assistant"
    model: *tool_calling_llm
    tools:
      - *search_tool # Reference to the search_tool anchor
      - *genie_tool # Reference to the genie_tool anchor
    # ... other agent configuration
```

This YAML anchor system allows for:
- **Reusability**: The same tool can be assigned to multiple agents
- **Maintainability**: Tool configuration is centralized in one place
- **Consistency**: Tools are guaranteed to have the same configuration across agents

### Assigning Agents to the Application and Configuring Orchestration

Agents are made available to the application by listing their YAML anchors (defined in the `agents:` section) within the `agents` array under the `app` section. The `app.orchestration` section defines how these agents interact.

**Orchestration Configuration:**

The `orchestration` block within the `app` section defines the interaction pattern. The example configuration uses a **Supervisor** pattern.

```yaml
# config/hardware_store/supervisor_postgres.yaml
# ...
# app:
#   ...
#   agents:
#     - *orders
#     - *diy
#     - *product
#     # ... other agents referenced by their anchors
#     - *general
#   orchestration:
#     supervisor:
#       model: *tool_calling_llm # LLM for the supervisor agent
#       default_agent: *general # Agent to handle tasks if no specific agent is chosen
#     # swarm: # Example of how a swarm might be configured if activated
#     #   model: *tool_calling_llm
# ...
```

**Orchestration Patterns:**

1. **Supervisor Pattern (Currently Active)**
   * The example configuration defines a `supervisor` block under `app.orchestration`.
   * `model`: Specifies the LLM (e.g., `*tool_calling_llm`) that the supervisor itself uses for its decision-making and routing logic.
   * `default_agent`: Specifies an agent (e.g., `*general`) that the supervisor delegates to if it cannot determine a more specialized agent from the `app.agents` list or if the query is general.
   * The supervisor receives the initial user query, decides which specialized agent (from the `app.agents` list) is best suited to handle it, and passes the query to that agent. If no specific agent is a clear match, or if the query is general, it falls back to the `default_agent`.

2. **Swarm Pattern (Commented Out)**
   * The example configuration includes a commented-out `swarm` block. If activated, this would imply a different interaction model.
   * In a swarm, agents collaborate more directly or work in parallel on different aspects of a query. The `model` under `swarm` defines the LLM used by the agents within the swarm or by a coordinating element of the swarm.
   * The specific behavior of the swarm pattern is defined in `retail_ai/graph.py` and `retail_ai/nodes.py`.
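For intuition, the sketch below shows how a supervisor pattern can be wired with the `langgraph-supervisor` and `databricks-langchain` packages (both project dependencies). It is illustrative only: the agent names and serving endpoint are assumptions, and the framework's actual wiring lives in `retail_ai/graph.py`.

```python
# Minimal supervisor-pattern sketch (illustrative; not the framework's graph).
from databricks_langchain import ChatDatabricks
from langgraph.prebuilt import create_react_agent
from langgraph_supervisor import create_supervisor

# Hypothetical serving endpoint name.
llm = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct")

general = create_react_agent(
    llm, tools=[], name="general",
    prompt="You are a general hardware store assistant.",
)
product = create_react_agent(
    llm, tools=[], name="product",
    prompt="You answer detailed product questions.",
)

# The supervisor LLM routes each message to one of the named agents.
workflow = create_supervisor([general, product], model=llm)
app = workflow.compile()
```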
## Integration Hooks

The DAO framework provides several hook integration points that allow you to customize agent behavior and the application lifecycle. These hooks let you inject custom logic at key points in the system without modifying the core framework code.

### Hook Types

#### Agent-Level Hooks

**Agent hooks** are defined at the individual agent level and allow you to customize the behavior of a specific agent:

##### `create_agent_hook`
Used to provide a completely custom agent implementation. When this is provided, all other agent configuration is ignored. See: **Hook Implementation**

```yaml
agents:
  custom_agent:
    name: custom_agent
    description: "Agent with custom initialization"
    model: *tool_calling_llm
    create_agent_hook: my_package.hooks.initialize_custom_agent
    # ... other agent configuration
```

##### `pre_agent_hook`
Executed before an agent processes a message. Ideal for request preprocessing, logging, validation, or context injection. See: **Hook Implementation**

```yaml
agents:
  logging_agent:
    name: logging_agent
    description: "Agent with request logging"
    model: *tool_calling_llm
    pre_agent_hook: my_package.hooks.log_incoming_request
    # ... other agent configuration
```

##### `post_agent_hook`
Executed after an agent completes processing a message. Perfect for response post-processing, logging, metrics collection, or cleanup operations. See: **Hook Implementation**

```yaml
agents:
  analytics_agent:
    name: analytics_agent
    description: "Agent with response analytics"
    model: *tool_calling_llm
    post_agent_hook: my_package.hooks.collect_response_metrics
    # ... other agent configuration
```

#### Application-Level Hooks

**Application hooks** operate at the global application level and affect the entire system lifecycle:

##### `initialization_hooks`
Executed when the application starts up via `AppConfig.from_file()`. Use these for system initialization, resource setup, database connections, or external service configuration. See: **Hook Implementation**

```yaml
app:
  name: my_retail_app
  initialization_hooks:
    - my_package.hooks.setup_database_connections
    - my_package.hooks.initialize_external_apis
    - my_package.hooks.setup_monitoring
  # ... other app configuration
```

##### `shutdown_hooks`
Executed when the application shuts down (registered via `atexit`). Essential for cleanup operations, closing connections, saving state, or performing final logging. See: **Hook Implementation**

```yaml
app:
  name: my_retail_app
  shutdown_hooks:
    - my_package.hooks.cleanup_database_connections
    - my_package.hooks.save_session_data
    - my_package.hooks.send_shutdown_metrics
  # ... other app configuration
```

##### `message_hooks`
Executed for every message processed by the system. Useful for global logging, authentication, rate limiting, or message transformation. See: **Hook Implementation**

```yaml
app:
  name: my_retail_app
  message_hooks:
    - my_package.hooks.authenticate_user
    - my_package.hooks.apply_rate_limiting
    - my_package.hooks.transform_message_format
  # ... other app configuration
```

### Hook Implementation

Hooks can be implemented in any of three forms:

1. **Python Functions**: Direct function references
   ```yaml
   initialization_hooks: my_package.hooks.setup_function
   ```

2. **Factory Functions**: Functions that return configured tools or handlers (see the sketch after this list)
   ```yaml
   initialization_hooks:
     type: factory
     name: my_package.hooks.create_setup_handler
     args:
       config_param: "value"
   ```

3. **Hook Lists**: Multiple hooks executed in sequence
   ```yaml
   initialization_hooks:
     - my_package.hooks.setup_database
     - my_package.hooks.setup_cache
     - my_package.hooks.setup_monitoring
   ```
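A factory-style hook matching the configuration above might look like the following sketch. The module path `my_package.hooks.create_setup_handler` and the `config_param` argument come from the example; the handler body and the argument it receives at runtime are assumptions.

```python
# my_package/hooks.py -- illustrative factory-hook sketch.
from typing import Any, Callable


def create_setup_handler(config_param: str) -> Callable[[Any], None]:
    """Return an initialization handler pre-configured with `config_param`."""

    def _setup(config: Any) -> None:
        # Assumed to receive the loaded application config at startup;
        # real handlers might open connections or warm caches here.
        print(f"Initializing with {config_param}")

    return _setup
```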
### Hook Function Signatures

Each hook type expects a specific function signature:

#### Agent Hooks
```python
# create_agent_hook
def initialize_custom_agent(state: dict, config: dict) -> dict:
    """Custom agent initialization logic"""
    pass

# pre_agent_hook
def log_incoming_request(state: dict, config: dict) -> dict:
    """Pre-process incoming request"""
    return state

# post_agent_hook
def collect_response_metrics(state: dict, config: dict) -> dict:
    """Post-process agent response"""
    return state
```

#### Application Hooks
```python
# initialization_hooks
def setup_database_connections(config: AppConfig) -> None:
    """Initialize database connections"""
    pass

# shutdown_hooks
def cleanup_resources(config: AppConfig) -> None:
    """Clean up resources on shutdown"""
    pass

# message_hooks
def authenticate_user(state: dict, config: dict) -> dict:
    """Authenticate and authorize user requests"""
    return state
```

### Use Cases and Examples

#### Common Hook Patterns

**Logging and Monitoring**:
```python
def log_agent_performance(state: dict, config: AppConfig) -> dict:
    """Log agent response times and quality metrics"""
    start_time = state.get('start_time')
    if start_time:
        duration = time.time() - start_time
        logger.info(f"Agent response time: {duration:.2f}s")
    return state
```

**Authentication and Authorization**:
```python
def validate_user_permissions(state: dict, config: AppConfig) -> dict:
    """Validate that the user has permission for the requested operation"""
    user_id = state.get('user_id')
    if not has_permission(user_id, state.get('operation')):
        raise UnauthorizedError("Insufficient permissions")
    return state
```

**Resource Management**:
```python
def initialize_vector_search(config: AppConfig) -> None:
    """Initialize vector search connections during startup"""
    for vs_name, vs_config in config.resources.vector_stores.items():
        vs_config.create()
        logger.info(f"Vector store {vs_name} initialized")
```

**State Enrichment**:
```python
def enrich_user_context(state: dict, config: AppConfig) -> dict:
    """Add user profile and preferences to state"""
    user_id = state.get('user_id')
    if user_id:
        user_profile = get_user_profile(user_id)
        state['user_context'] = user_profile
    return state
```

### Best Practices

1. **Keep hooks lightweight**: Avoid heavy computations that could slow down message processing
2. **Handle errors gracefully**: Use try/except blocks to prevent hook failures from breaking the system (see the wrapper sketch after this list)
3. **Use appropriate hook types**: Choose agent-level vs. application-level hooks based on scope
4. **Maintain state immutability**: Return modified copies of state rather than mutating it in place
5. **Log hook execution**: Include logging for troubleshooting and monitoring
6. **Test hooks independently**: Write unit tests for hook functions separate from the main application
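The following sketch combines practices 2, 4, and 5 in a small helper that could wrap any agent- or message-level hook. The decorator name and behavior are illustrative, not part of the framework.

```python
# Illustrative helper (not part of the DAO framework).
import logging
from functools import wraps
from typing import Callable

logger = logging.getLogger(__name__)

HookFn = Callable[[dict, dict], dict]


def safe_hook(hook: HookFn) -> HookFn:
    """Wrap a hook so failures are logged instead of breaking message processing."""

    @wraps(hook)
    def wrapper(state: dict, config: dict) -> dict:
        try:
            # Work on a shallow copy so the original state is never mutated in place.
            new_state = hook(dict(state), config)
            logger.debug("hook %s completed", hook.__name__)
            return new_state if new_state is not None else state
        except Exception:
            logger.exception("hook %s failed; continuing with original state", hook.__name__)
            return state

    return wrapper
```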
## Development

### Project Structure

- `retail_ai/`: Core package
  - `config.py`: Pydantic configuration models with full validation
  - `graph.py`: LangGraph workflow definition
  - `nodes.py`: Agent node factories and implementations
  - `tools.py`: Tool creation and factory functions, plus implementations for Python tools
  - `vector_search.py`: Vector search utilities
  - `state.py`: State management for conversations
- `tests/`: Test suite with configuration fixtures
- `schemas/`: JSON schemas for configuration validation
- `notebooks/`: Jupyter notebooks for setup and experimentation
- `docs/`: Documentation files, including architecture diagrams
- `config/`: Contains [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml)

### Building the Package

```bash
# Install development dependencies
make depends

# Build the package
make install

# Run tests
make test

# Format code
make format
```

## Deployment with Databricks Bundle CLI

The agent can be deployed using the existing Databricks Bundle CLI configuration:

1. Ensure the Databricks CLI is installed and configured (Databricks Asset Bundles require the unified Databricks CLI, not the legacy `databricks-cli` pip package):
   ```bash
   databricks configure
   ```

2. Deploy using the existing `databricks.yml`:
   ```bash
   databricks bundle deploy
   ```

3. Validate the bundle configuration:
   ```bash
   databricks bundle validate
   ```

## Usage

Once deployed, interact with the agent:

```python
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
response = client.predict(
    endpoint="retail_ai_agent",  # Matches endpoint_name in model_config.yaml
    inputs={
        "messages": [
            {"role": "user", "content": "Can you recommend a lamp for my oak side tables?"}
        ]
    }
)

print(response["message"]["content"])
```

### Advanced Configuration

You can also pass additional configuration parameters to customize the agent's behavior:

```python
response = client.predict(
    endpoint="retail_ai_agent",
    inputs={
        "messages": [
            {"role": "user", "content": "Can you recommend a lamp for my oak side tables?"}
        ],
        "configurable": {
            "thread_id": "1",
            "user_id": "my_user_id",
            "store_num": 87887
        }
    }
)
```

The `configurable` section supports:
- **`thread_id`**: Unique identifier for conversation threading and state management
- **`user_id`**: User identifier for personalization and tracking
- **`store_num`**: Store number for location-specific recommendations and inventory

## Customization

To customize the agent:

1. **Update [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml)**:
   - Add tools in the `tools` section
   - Create agents in the `agents` section
   - Configure resources (LLMs, vector stores, etc.)
   - Adjust orchestration patterns as described above
2. **Implement new tools** in `retail_ai/tools.py` (for Python and Factory tools) or in Unity Catalog (for UC tools).
3. **Extend workflows** in `retail_ai/graph.py` to support the chosen orchestration patterns and agent interactions.
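After editing the configuration, it can be useful to load and validate it locally before redeploying. The sketch below assumes the `AppConfig.from_file()` entry point mentioned in the hooks section and that `AppConfig` lives in `retail_ai/config.py`; the attribute names simply mirror the YAML layout and are assumptions.

```python
# Illustrative configuration check (assumed API; adjust to retail_ai/config.py).
from retail_ai.config import AppConfig

config = AppConfig.from_file("config/hardware_store/supervisor_postgres.yaml")

# Pydantic validation runs on load; a malformed config raises a validation error.
print(config.app.log_level)          # hypothetical attribute mirroring app.log_level
print(sorted(config.tools.keys()))   # hypothetical: tool names defined in the config
```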
## Testing

```bash
# Run all tests
make test
```

## Logging

The primary log level for the application is configured in [`model_config.yaml`](config/hardware_store/supervisor_postgres.yaml) under the `app.log_level` field.

**Configuration Example:**
```yaml
# config/hardware_store/supervisor_postgres.yaml
app:
  log_level: INFO # Supported levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
  # ... other app configurations ...
```

This setting controls the verbosity of logs produced by the `retail_ai` package.

The system also includes:
- **MLflow tracing** for request tracking
- **Structured logging** used internally

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
"bugtrack_url": null,
"license": "MIT",
"summary": "DAO AI: A modular, multi-agent orchestration framework for complex AI workflows. Supports agent handoff, tool integration, and dynamic configuration via YAML.",
"version": "0.0.2",
"project_urls": {
"Changelog": "https://github.com/natefleming/dao-ai/blob/main/CHANGELOG.md",
"Documentation": "https://natefleming.github.io/dao-ai",
"Homepage": "https://github.com/natefleming/dao-ai",
"Issues": "https://github.com/natefleming/dao-ai/issues",
"Repository": "https://github.com/natefleming/dao-ai"
},
"split_keywords": [
"agents",
" ai",
" databricks",
" langchain",
" langgraph",
" llm",
" multi-agent",
" orchestration",
" vector-search",
" workflow"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "45d58ee44ed0e4953fdc9351a88fdeb86b69e922d652ecb5a94c65cd7eb5e369",
"md5": "8d5ebbdd08df47827b8652366502cd18",
"sha256": "8e418a432245a2804163bddbdb4157940cfd64ad3a7f11d0b98ea922749157b0"
},
"downloads": -1,
"filename": "dao_ai-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "8d5ebbdd08df47827b8652366502cd18",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 63858,
"upload_time": "2025-07-10T16:46:35",
"upload_time_iso_8601": "2025-07-10T16:46:35.688981Z",
"url": "https://files.pythonhosted.org/packages/45/d5/8ee44ed0e4953fdc9351a88fdeb86b69e922d652ecb5a94c65cd7eb5e369/dao_ai-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "c6b6d145eb0eb7d7860708f429c061d8a9a4ba8313e73f6a63f1cc293e02dcf0",
"md5": "cbe779246b166d2a189dcf2aae2731da",
"sha256": "76ca892cf56e321a5c304ed517e8bdbc18368a68c727e11c35e07d18b4a405e8"
},
"downloads": -1,
"filename": "dao_ai-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "cbe779246b166d2a189dcf2aae2731da",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 19452393,
"upload_time": "2025-07-10T16:46:37",
"upload_time_iso_8601": "2025-07-10T16:46:37.331426Z",
"url": "https://files.pythonhosted.org/packages/c6/b6/d145eb0eb7d7860708f429c061d8a9a4ba8313e73f6a63f1cc293e02dcf0/dao_ai-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-10 16:46:37",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "natefleming",
"github_project": "dao-ai",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "aiohappyeyeballs",
"specs": [
[
"==",
"2.6.1"
]
]
},
{
"name": "aiohttp",
"specs": [
[
"==",
"3.12.13"
]
]
},
{
"name": "aiohttp-retry",
"specs": [
[
"==",
"2.9.1"
]
]
},
{
"name": "aiosignal",
"specs": [
[
"==",
"1.3.2"
]
]
},
{
"name": "alembic",
"specs": [
[
"==",
"1.16.2"
]
]
},
{
"name": "annotated-types",
"specs": [
[
"==",
"0.7.0"
]
]
},
{
"name": "anthropic",
"specs": [
[
"==",
"0.54.0"
]
]
},
{
"name": "anyio",
"specs": [
[
"==",
"4.9.0"
]
]
},
{
"name": "attrs",
"specs": [
[
"==",
"25.3.0"
]
]
},
{
"name": "azure-core",
"specs": [
[
"==",
"1.34.0"
]
]
},
{
"name": "azure-storage-blob",
"specs": [
[
"==",
"12.25.1"
]
]
},
{
"name": "azure-storage-file-datalake",
"specs": [
[
"==",
"12.20.0"
]
]
},
{
"name": "blinker",
"specs": [
[
"==",
"1.9.0"
]
]
},
{
"name": "boto3",
"specs": [
[
"==",
"1.38.41"
]
]
},
{
"name": "botocore",
"specs": [
[
"==",
"1.38.41"
]
]
},
{
"name": "cachetools",
"specs": [
[
"==",
"5.5.2"
]
]
},
{
"name": "certifi",
"specs": [
[
"==",
"2025.6.15"
]
]
},
{
"name": "cffi",
"specs": [
[
"==",
"1.17.1"
]
]
},
{
"name": "cfgv",
"specs": [
[
"==",
"3.4.0"
]
]
},
{
"name": "charset-normalizer",
"specs": [
[
"==",
"3.4.2"
]
]
},
{
"name": "click",
"specs": [
[
"==",
"8.2.1"
]
]
},
{
"name": "cloudpickle",
"specs": [
[
"==",
"3.1.1"
]
]
},
{
"name": "contourpy",
"specs": [
[
"==",
"1.3.2"
]
]
},
{
"name": "cryptography",
"specs": [
[
"==",
"45.0.4"
]
]
},
{
"name": "cycler",
"specs": [
[
"==",
"0.12.1"
]
]
},
{
"name": "databricks-agents",
"specs": [
[
"==",
"1.1.0"
]
]
},
{
"name": "databricks-ai-bridge",
"specs": [
[
"==",
"0.5.1"
]
]
},
{
"name": "databricks-connect",
"specs": [
[
"==",
"16.1.6"
]
]
},
{
"name": "databricks-langchain",
"specs": [
[
"==",
"0.5.1"
]
]
},
{
"name": "databricks-sdk",
"specs": [
[
"==",
"0.55.0"
]
]
},
{
"name": "databricks-vectorsearch",
"specs": [
[
"==",
"0.56"
]
]
},
{
"name": "dataclasses-json",
"specs": [
[
"==",
"0.6.7"
]
]
},
{
"name": "deprecation",
"specs": [
[
"==",
"2.1.0"
]
]
},
{
"name": "distlib",
"specs": [
[
"==",
"0.3.9"
]
]
},
{
"name": "distro",
"specs": [
[
"==",
"1.9.0"
]
]
},
{
"name": "docker",
"specs": [
[
"==",
"7.1.0"
]
]
},
{
"name": "duckduckgo-search",
"specs": [
[
"==",
"8.0.4"
]
]
},
{
"name": "dydantic",
"specs": [
[
"==",
"0.0.8"
]
]
},
{
"name": "fastapi",
"specs": [
[
"==",
"0.115.13"
]
]
},
{
"name": "filelock",
"specs": [
[
"==",
"3.18.0"
]
]
},
{
"name": "flask",
"specs": [
[
"==",
"3.1.1"
]
]
},
{
"name": "fonttools",
"specs": [
[
"==",
"4.58.4"
]
]
},
{
"name": "frozenlist",
"specs": [
[
"==",
"1.7.0"
]
]
},
{
"name": "gitdb",
"specs": [
[
"==",
"4.0.12"
]
]
},
{
"name": "gitpython",
"specs": [
[
"==",
"3.1.44"
]
]
},
{
"name": "google-api-core",
"specs": [
[
"==",
"2.25.1"
]
]
},
{
"name": "google-auth",
"specs": [
[
"==",
"2.40.3"
]
]
},
{
"name": "google-cloud-core",
"specs": [
[
"==",
"2.4.3"
]
]
},
{
"name": "google-cloud-storage",
"specs": [
[
"==",
"3.1.1"
]
]
},
{
"name": "google-crc32c",
"specs": [
[
"==",
"1.7.1"
]
]
},
{
"name": "google-resumable-media",
"specs": [
[
"==",
"2.7.2"
]
]
},
{
"name": "googleapis-common-protos",
"specs": [
[
"==",
"1.70.0"
]
]
},
{
"name": "grandalf",
"specs": [
[
"==",
"0.8"
]
]
},
{
"name": "graphene",
"specs": [
[
"==",
"3.4.3"
]
]
},
{
"name": "graphql-core",
"specs": [
[
"==",
"3.2.6"
]
]
},
{
"name": "graphql-relay",
"specs": [
[
"==",
"3.2.0"
]
]
},
{
"name": "grpcio",
"specs": [
[
"==",
"1.73.0"
]
]
},
{
"name": "grpcio-status",
"specs": [
[
"==",
"1.71.0"
]
]
},
{
"name": "gunicorn",
"specs": [
[
"==",
"23.0.0"
]
]
},
{
"name": "h11",
"specs": [
[
"==",
"0.16.0"
]
]
},
{
"name": "httpcore",
"specs": [
[
"==",
"1.0.9"
]
]
},
{
"name": "httpx",
"specs": [
[
"==",
"0.28.1"
]
]
},
{
"name": "httpx-sse",
"specs": [
[
"==",
"0.4.0"
]
]
},
{
"name": "identify",
"specs": [
[
"==",
"2.6.12"
]
]
},
{
"name": "idna",
"specs": [
[
"==",
"3.10"
]
]
},
{
"name": "importlib-metadata",
"specs": [
[
"==",
"8.7.0"
]
]
},
{
"name": "iniconfig",
"specs": [
[
"==",
"2.1.0"
]
]
},
{
"name": "isodate",
"specs": [
[
"==",
"0.7.2"
]
]
},
{
"name": "itsdangerous",
"specs": [
[
"==",
"2.2.0"
]
]
},
{
"name": "jinja2",
"specs": [
[
"==",
"3.1.6"
]
]
},
{
"name": "jiter",
"specs": [
[
"==",
"0.10.0"
]
]
},
{
"name": "jmespath",
"specs": [
[
"==",
"1.0.1"
]
]
},
{
"name": "joblib",
"specs": [
[
"==",
"1.5.1"
]
]
},
{
"name": "jsonpatch",
"specs": [
[
"==",
"1.33"
]
]
},
{
"name": "jsonpointer",
"specs": [
[
"==",
"3.0.0"
]
]
},
{
"name": "kiwisolver",
"specs": [
[
"==",
"1.4.8"
]
]
},
{
"name": "langchain",
"specs": [
[
"==",
"0.3.26"
]
]
},
{
"name": "langchain-anthropic",
"specs": [
[
"==",
"0.3.15"
]
]
},
{
"name": "langchain-community",
"specs": [
[
"==",
"0.3.26"
]
]
},
{
"name": "langchain-core",
"specs": [
[
"==",
"0.3.66"
]
]
},
{
"name": "langchain-mcp-adapters",
"specs": [
[
"==",
"0.1.7"
]
]
},
{
"name": "langchain-openai",
"specs": [
[
"==",
"0.3.24"
]
]
},
{
"name": "langchain-text-splitters",
"specs": [
[
"==",
"0.3.8"
]
]
},
{
"name": "langgraph",
"specs": [
[
"==",
"0.4.8"
]
]
},
{
"name": "langgraph-checkpoint",
"specs": [
[
"==",
"2.1.0"
]
]
},
{
"name": "langgraph-checkpoint-postgres",
"specs": [
[
"==",
"2.0.21"
]
]
},
{
"name": "langgraph-prebuilt",
"specs": [
[
"==",
"0.2.2"
]
]
},
{
"name": "langgraph-sdk",
"specs": [
[
"==",
"0.1.70"
]
]
},
{
"name": "langgraph-supervisor",
"specs": [
[
"==",
"0.0.27"
]
]
},
{
"name": "langgraph-swarm",
"specs": [
[
"==",
"0.0.11"
]
]
},
{
"name": "langmem",
"specs": [
[
"==",
"0.0.27"
]
]
},
{
"name": "langsmith",
"specs": [
[
"==",
"0.4.1"
]
]
},
{
"name": "loguru",
"specs": [
[
"==",
"0.7.3"
]
]
},
{
"name": "lxml",
"specs": [
[
"==",
"5.4.0"
]
]
},
{
"name": "mako",
"specs": [
[
"==",
"1.3.10"
]
]
},
{
"name": "markdown-it-py",
"specs": [
[
"==",
"3.0.0"
]
]
},
{
"name": "markupsafe",
"specs": [
[
"==",
"3.0.2"
]
]
},
{
"name": "marshmallow",
"specs": [
[
"==",
"3.26.1"
]
]
},
{
"name": "matplotlib",
"specs": [
[
"==",
"3.10.3"
]
]
},
{
"name": "mcp",
"specs": [
[
"==",
"1.9.4"
]
]
},
{
"name": "mdurl",
"specs": [
[
"==",
"0.1.2"
]
]
},
{
"name": "mlflow",
"specs": [
[
"==",
"3.1.1"
]
]
},
{
"name": "mlflow-skinny",
"specs": [
[
"==",
"3.1.1"
]
]
},
{
"name": "multidict",
"specs": [
[
"==",
"6.5.0"
]
]
},
{
"name": "mypy",
"specs": [
[
"==",
"1.16.1"
]
]
},
{
"name": "mypy-extensions",
"specs": [
[
"==",
"1.1.0"
]
]
},
{
"name": "nest-asyncio",
"specs": [
[
"==",
"1.6.0"
]
]
},
{
"name": "nodeenv",
"specs": [
[
"==",
"1.9.1"
]
]
},
{
"name": "numpy",
"specs": [
[
"==",
"1.26.4"
]
]
},
{
"name": "openai",
"specs": [
[
"==",
"1.90.0"
]
]
},
{
"name": "openevals",
"specs": [
[
"==",
"0.1.0"
]
]
},
{
"name": "opentelemetry-api",
"specs": [
[
"==",
"1.34.1"
]
]
},
{
"name": "opentelemetry-sdk",
"specs": [
[
"==",
"1.34.1"
]
]
},
{
"name": "opentelemetry-semantic-conventions",
"specs": [
[
"==",
"0.55b1"
]
]
},
{
"name": "orjson",
"specs": [
[
"==",
"3.10.18"
]
]
},
{
"name": "ormsgpack",
"specs": [
[
"==",
"1.10.0"
]
]
},
{
"name": "packaging",
"specs": [
[
"==",
"24.2"
]
]
},
{
"name": "pandas",
"specs": [
[
"==",
"2.3.0"
]
]
},
{
"name": "pathspec",
"specs": [
[
"==",
"0.12.1"
]
]
},
{
"name": "pillow",
"specs": [
[
"==",
"11.2.1"
]
]
},
{
"name": "platformdirs",
"specs": [
[
"==",
"4.3.8"
]
]
},
{
"name": "pluggy",
"specs": [
[
"==",
"1.6.0"
]
]
},
{
"name": "pre-commit",
"specs": [
[
"==",
"4.2.0"
]
]
},
{
"name": "primp",
"specs": [
[
"==",
"0.15.0"
]
]
},
{
"name": "propcache",
"specs": [
[
"==",
"0.3.2"
]
]
},
{
"name": "proto-plus",
"specs": [
[
"==",
"1.26.1"
]
]
},
{
"name": "protobuf",
"specs": [
[
"==",
"5.29.5"
]
]
},
{
"name": "psycopg",
"specs": [
[
"==",
"3.2.9"
]
]
},
{
"name": "psycopg-binary",
"specs": [
[
"==",
"3.2.9"
]
]
},
{
"name": "psycopg-pool",
"specs": [
[
"==",
"3.2.6"
]
]
},
{
"name": "py4j",
"specs": [
[
"==",
"0.10.9.7"
]
]
},
{
"name": "pyarrow",
"specs": [
[
"==",
"20.0.0"
]
]
},
{
"name": "pyasn1",
"specs": [
[
"==",
"0.6.1"
]
]
},
{
"name": "pyasn1-modules",
"specs": [
[
"==",
"0.4.2"
]
]
},
{
"name": "pycparser",
"specs": [
[
"==",
"2.22"
]
]
},
{
"name": "pydantic",
"specs": [
[
"==",
"2.11.7"
]
]
},
{
"name": "pydantic-core",
"specs": [
[
"==",
"2.33.2"
]
]
},
{
"name": "pydantic-settings",
"specs": [
[
"==",
"2.10.0"
]
]
},
{
"name": "pygments",
"specs": [
[
"==",
"2.19.2"
]
]
},
{
"name": "pyparsing",
"specs": [
[
"==",
"3.2.3"
]
]
},
{
"name": "pytest",
"specs": [
[
"==",
"8.4.1"
]
]
},
{
"name": "python-dateutil",
"specs": [
[
"==",
"2.9.0.post0"
]
]
},
{
"name": "python-dotenv",
"specs": [
[
"==",
"1.1.0"
]
]
},
{
"name": "python-multipart",
"specs": [
[
"==",
"0.0.20"
]
]
},
{
"name": "pytz",
"specs": [
[
"==",
"2025.2"
]
]
},
{
"name": "pyyaml",
"specs": [
[
"==",
"6.0.2"
]
]
},
{
"name": "regex",
"specs": [
[
"==",
"2024.11.6"
]
]
},
{
"name": "requests",
"specs": [
[
"==",
"2.32.4"
]
]
},
{
"name": "requests-toolbelt",
"specs": [
[
"==",
"1.0.0"
]
]
},
{
"name": "rich",
"specs": [
[
"==",
"14.0.0"
]
]
},
{
"name": "rsa",
"specs": [
[
"==",
"4.9.1"
]
]
},
{
"name": "ruff",
"specs": [
[
"==",
"0.12.0"
]
]
},
{
"name": "s3transfer",
"specs": [
[
"==",
"0.13.0"
]
]
},
{
"name": "scikit-learn",
"specs": [
[
"==",
"1.7.0"
]
]
},
{
"name": "scipy",
"specs": [
[
"==",
"1.13.1"
]
]
},
{
"name": "setuptools",
"specs": [
[
"==",
"80.9.0"
]
]
},
{
"name": "six",
"specs": [
[
"==",
"1.17.0"
]
]
},
{
"name": "smmap",
"specs": [
[
"==",
"5.0.2"
]
]
},
{
"name": "sniffio",
"specs": [
[
"==",
"1.3.1"
]
]
},
{
"name": "sqlalchemy",
"specs": [
[
"==",
"2.0.41"
]
]
},
{
"name": "sqlparse",
"specs": [
[
"==",
"0.5.3"
]
]
},
{
"name": "sse-starlette",
"specs": [
[
"==",
"2.3.6"
]
]
},
{
"name": "starlette",
"specs": [
[
"==",
"0.46.2"
]
]
},
{
"name": "tabulate",
"specs": [
[
"==",
"0.9.0"
]
]
},
{
"name": "tenacity",
"specs": [
[
"==",
"9.1.2"
]
]
},
{
"name": "threadpoolctl",
"specs": [
[
"==",
"3.6.0"
]
]
},
{
"name": "tiktoken",
"specs": [
[
"==",
"0.9.0"
]
]
},
{
"name": "tqdm",
"specs": [
[
"==",
"4.67.1"
]
]
},
{
"name": "trustcall",
"specs": [
[
"==",
"0.0.39"
]
]
},
{
"name": "typing-extensions",
"specs": [
[
"==",
"4.14.0"
]
]
},
{
"name": "typing-inspect",
"specs": [
[
"==",
"0.9.0"
]
]
},
{
"name": "typing-inspection",
"specs": [
[
"==",
"0.4.1"
]
]
},
{
"name": "tzdata",
"specs": [
[
"==",
"2025.2"
]
]
},
{
"name": "unitycatalog-ai",
"specs": [
[
"==",
"0.3.1"
]
]
},
{
"name": "unitycatalog-client",
"specs": [
[
"==",
"0.3.0"
]
]
},
{
"name": "unitycatalog-langchain",
"specs": [
[
"==",
"0.2.0"
]
]
},
{
"name": "urllib3",
"specs": [
[
"==",
"2.5.0"
]
]
},
{
"name": "uvicorn",
"specs": [
[
"==",
"0.34.3"
]
]
},
{
"name": "virtualenv",
"specs": [
[
"==",
"20.31.2"
]
]
},
{
"name": "werkzeug",
"specs": [
[
"==",
"3.1.3"
]
]
},
{
"name": "xxhash",
"specs": [
[
"==",
"3.5.0"
]
]
},
{
"name": "yarl",
"specs": [
[
"==",
"1.20.1"
]
]
},
{
"name": "zipp",
"specs": [
[
"==",
"3.23.0"
]
]
},
{
"name": "zstandard",
"specs": [
[
"==",
"0.23.0"
]
]
}
],
"lcname": "dao-ai"
}