From: aidotengineer
AI copilot systems for cloud architecture require grounded reasoning rather than mere automation to navigate increasing complexity in cloud systems, tools, constraints, and expectations [00:00:07]. These systems need to understand, debate, justify, and plan solutions [00:00:44]. Cloud architecture is not just technical but also cognitive, involving constant negotiation of trade-offs based on requirements, time, and available resources [00:00:56]. Capturing the scattered and implicit context architects rely on for decision-making is crucial for AI [00:01:21].
Challenges at the Intersection of AI and Architecture
Key challenges in solving architecture design problems with AI include:
- Requirement Understanding [00:01:46]: Identifying the source, format, important pieces, and scope of requirements [00:01:48].
- Architecture Identification [00:02:05]: Understanding the functionalities of various components within an architecture [00:02:08].
- Architecture Recommendation [00:02:27]: Providing recommendations that match requirements or improve the architecture based on best practices [00:02:31].
More specific AI-related challenges arise from:
- Mixing Semantic and Graph Context [00:02:54]: Requirements are typically textual, while architecture is graph data; effectively integrating these sources for higher-level reasoning is vital [00:02:59].
- Complex Reasoning Scenarios [00:03:20]: Handling vague, broad, and complex questions that require breakdown and proper planning [00:03:23].
- Evaluation and Feedback [00:03:42]: Providing feedback to large AI systems with many moving parts [00:03:45].
Grounding AI Agents in Specific Context
To enable AI agents to reason effectively, providing proper context about architecture is essential [00:04:12]. Approaches for grounding agents include:
- Semantic Enrichment of Architecture Data [00:04:40]: Collecting relevant semantic information for each component to make it more searchable [00:04:43].
- Graph-Enhanced Component Search [00:04:57]: Utilizing graph algorithms to retrieve the correct information from an architecture when searching for components [00:05:06].
- Early-Score Enrichment of Requirement Documents [00:05:22]: Scoring documents based on important concepts to facilitate faster retrieval of relevant information [00:05:27].
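The early-score enrichment idea above can be sketched as a small ingestion-time step. This is a hypothetical illustration, not the speaker's implementation: each requirement document is scored against a fixed set of important concepts when it is stored, so retrieval can rank documents without re-reading them. The concept list and document contents are invented for the example.

```python
from dataclasses import dataclass, field

# Illustrative concept list; a real system would derive this from the domain.
IMPORTANT_CONCEPTS = {"latency", "encryption", "scalability", "availability"}

@dataclass
class RequirementDoc:
    doc_id: str
    text: str
    score: float = field(default=0.0)

def enrich_with_score(doc: RequirementDoc) -> RequirementDoc:
    """Score = fraction of important concepts mentioned in the document."""
    words = set(doc.text.lower().split())
    doc.score = len(words & IMPORTANT_CONCEPTS) / len(IMPORTANT_CONCEPTS)
    return doc

def top_documents(docs, k=2):
    """Retrieval can now sort on the precomputed score instead of re-scanning text."""
    return sorted(docs, key=lambda d: d.score, reverse=True)[:k]

docs = [
    enrich_with_score(RequirementDoc("r1", "system must ensure low latency and high availability")),
    enrich_with_score(RequirementDoc("r2", "use encryption at rest")),
    enrich_with_score(RequirementDoc("r3", "team standup notes")),
]
best = top_documents(docs)
```

The point of scoring at enrichment time rather than query time is exactly the "faster retrieval" trade-off mentioned above: ranking becomes a sort over precomputed numbers.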
Learnings from Grounding
- Semantic grounding improves reasoning but has limitations in scalability and producing detailed responses [00:06:09].
- Prompt design is critical for effective soft grounding, guiding the agent on what to focus on and retrieve [00:06:29].
- Graph memory supports continuity, allowing agents to connect different nodes in a graph and add context for proper reasoning [00:06:47].
- Initial designs used vector DBs for architecture retrieval, but these showed limitations for graph data, leading to a shift towards graph-based searches and knowledge graphs [00:07:16].
- Structuring requirements using templates helps with fast retrieval and structuring business needs, but can lose context in larger searches [00:08:47].
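The shift from vector lookup to graph-based search can be illustrated with a minimal sketch. Component names and edges here are hypothetical; the idea is that matching a component also walks its graph neighborhood, so the agent receives connected context rather than a single isolated node, which is what a pure embedding match would return.

```python
from collections import deque

# Toy architecture graph: component -> downstream dependencies (illustrative).
ARCHITECTURE = {
    "api-gateway": ["auth-service", "order-service"],
    "auth-service": ["user-db"],
    "order-service": ["order-db", "queue"],
    "queue": ["worker"],
    "user-db": [], "order-db": [], "worker": [],
}

def neighborhood(start: str, depth: int = 2) -> set:
    """Breadth-first walk up to `depth` hops from the matched component,
    returning the component plus its connected context."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue
        for nxt in ARCHITECTURE.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return seen

context = neighborhood("order-service")
```

A vector DB would return "order-service" alone; the graph walk also surfaces "queue" and "worker", the kind of continuity the graph-memory learning points to.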
Addressing Complex Reasoning Scenarios with Multiagent Orchestration
Architecture design involves conflicting goals, trade-offs, and debates [00:10:07]. AI agents need to collaborate, argue, and converge on justified recommendations [00:10:19].
Key strategies for managing complex reasoning scenarios involve:
- Multiagent orchestration with Role-Specific Agents [00:10:28]: Building a multiagent system that allows multiple agents to work together and possess specific properties [00:10:30].
- Structured Message Format [00:10:53]: Using structured messages (e.g., XML-based formats) rather than free-form text to facilitate better workflows and agent interactions [00:11:08].
- Conversation Management [00:11:25]: Isolating conversations between agents to manage memory, prevent wasted tokens, and avoid hallucination (here, the observed effect where an overloaded memory leads to poorer results) [00:11:28].
- Cloning of Agents for parallel processing [00:12:05]: Duplicating agents for specific tasks to speed up processes, requiring careful memory management [00:12:09].
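The structured-message and conversation-isolation strategies above can be sketched together. This is a hypothetical shape, not the talk's actual implementation: messages carry explicit fields instead of free text, and each agent pair gets its own isolated history so tokens from one exchange never leak into another.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass(frozen=True)
class AgentMessage:
    sender: str
    recipient: str
    intent: str      # e.g. "request", "proposal", "critique" (illustrative values)
    content: str

class ConversationManager:
    """Keeps one isolated thread per unordered (sender, recipient) pair."""
    def __init__(self):
        self._threads = defaultdict(list)

    def send(self, msg: AgentMessage):
        key = frozenset((msg.sender, msg.recipient))
        self._threads[key].append(msg)

    def history(self, a: str, b: str):
        return list(self._threads[frozenset((a, b))])

cm = ConversationManager()
cm.send(AgentMessage("chief", "staff-1", "request", "list recommendations"))
cm.send(AgentMessage("staff-1", "chief", "proposal", "add a message queue"))
cm.send(AgentMessage("chief", "staff-2", "request", "list recommendations"))
```

Because each thread is separate, the chief/staff-2 conversation never carries the chief/staff-1 tokens, which is the memory-management point above.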
Learnings from Complex Reasoning
- Structured outputs significantly improve clarity and control in programming, despite potential concerns about reducing reasoning abilities [00:13:34].
- Dynamic planning and execution in AI: Multiagent systems should be allowed to resolve trade-offs dynamically rather than executing static plans, leading to higher creativity in planning and results [00:13:10].
- Successful agent orchestration requires control flow; simply letting agents work together without guidance is insufficient [00:13:50].
K.O.’s Multi-Agent AI Copilot System
K.O. employs a multiagent system for architecture recommendations [00:14:14]. The system generates recommendations categorized by areas such as messaging and queuing and API integration, breaking each recommendation into descriptions, target states, gap analysis, and recommended actions [00:14:31].
The system’s agents include:
- Chief Architect [00:15:17]: Oversees and coordinates higher-level tasks [00:15:19].
- 10 Staff Architects [00:15:23]: Each specialized in a domain (e.g., infrastructure, API, IAM) [00:15:27].
- Requirement Retriever [00:15:36]: Accesses requirements data [00:15:39].
- Architecture Retriever [00:15:45]: Understands the current architecture state and can answer questions about components [00:15:47].
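The agent roster above can be captured as a simple role registry. The domain names beyond infrastructure, API, and IAM are invented to fill out the ten staff architects the talk mentions; only the counts and the chief/staff/retriever split come from the source.

```python
from dataclasses import dataclass

@dataclass
class AgentRole:
    name: str
    responsibility: str

# Three domains come from the talk; the remaining seven are illustrative.
DOMAINS = ["infrastructure", "api", "iam", "data", "networking",
           "security", "messaging", "compute", "storage", "observability"]

ROLES = (
    [AgentRole("chief-architect", "oversee and coordinate higher-level tasks")]
    + [AgentRole(f"staff-architect-{d}", f"recommend within the {d} domain") for d in DOMAINS]
    + [AgentRole("requirement-retriever", "answer questions about requirements"),
       AgentRole("architecture-retriever", "answer questions about the current architecture state")]
)
```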
Multi-Agent System Workflow
The multiagent system workflow consists of three main sequential tasks to generate recommendations:
1. List Generation [00:16:11]: The Chief Architect requests possible recommendations from Staff Architects [00:17:15]. Staff Architects, in parallel, query the Architecture State Agent and Requirements Agent multiple times [00:17:23].
2. Conflict Resolution [00:16:20]: The Chief Architect reviews the generated list of recommendations for conflicts or redundancies, pruning the list [00:16:22].
3. Design Proposal [00:16:49]: Staff Architects generate full design proposals for each recommendation topic, including gap analysis and proposed improvements [00:16:52]. During this step, cloning of agents occurs, with each staff architect cloned for the number of recommendations it needs to generate [00:18:15]. Each clone has access to past history but maintains its own separate current history [00:18:26].
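The cloning behavior in the design-proposal step, shared past history but separate current history, can be sketched as follows. Class and topic names are hypothetical; the sketch only illustrates the history-sharing contract described above.

```python
class StaffArchitect:
    def __init__(self, name, past_history=None):
        self.name = name
        # Shared reference: clones see the same past history (read-only by convention).
        self.past_history = past_history if past_history is not None else []
        # Private: each clone accumulates its own current history.
        self.current_history = []

    def clone_for(self, topic):
        """One clone per recommendation topic, as in the talk's workflow."""
        return StaffArchitect(f"{self.name}:{topic}", self.past_history)

    def work_on(self, note):
        self.current_history.append(note)

base = StaffArchitect("staff-architect-api", past_history=["requirements gathered"])
clones = [base.clone_for(t) for t in ["rate-limiting", "versioning"]]
clones[0].work_on("draft gap analysis")
```

Sharing past history keeps clones grounded in the same context, while separate current histories let them work on topics in parallel without cross-contaminating memory.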
Evaluation and Feedback
To determine if a recommendation is good and to monitor the many rounds of conversations within the multiagent system, a feedback loop with human scoring and structured feedback is necessary [00:19:03].
Learnings from Evaluation
- Human evaluation is the most effective method, especially in early stages of developing AI agents and agentic workflows from scratch [00:19:32]. LLM evaluations are good but often lack the specific insights needed for improvements [00:19:42].
- An internal human evaluation tool (e.g., “Eagle Eye”) helps review specific cases, extracted requirements, agent conversations, and generated recommendations for relevance, visibility, and clarity scoring [00:19:55].
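The relevance/visibility/clarity scoring mentioned above lends itself to a structured record per reviewed case. This is a hypothetical data shape, not the internal Eagle Eye tool: fixed numeric dimensions per case, aggregated to track the system over time.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class ReviewScore:
    case_id: str
    relevance: int   # 1-5, assigned by a human reviewer
    visibility: int  # 1-5
    clarity: int     # 1-5

    def overall(self) -> float:
        return mean([self.relevance, self.visibility, self.clarity])

# Illustrative reviews; values are invented.
reviews = [
    ReviewScore("case-1", relevance=4, visibility=5, clarity=3),
    ReviewScore("case-2", relevance=2, visibility=3, clarity=4),
]
avg_relevance = mean(r.relevance for r in reviews)
```

Keeping feedback structured like this is what makes it actionable: per-dimension averages show where the system is weak, which free-text LLM judgments often do not.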
- Confidence is not correctness [00:20:38].
- Evaluation must be integrated into system design from the outset, not added later, to ensure continuous assessment as the system evolves [00:20:59]. Hallucinations can occur in agent interactions, highlighting the importance of monitoring [00:22:05].
Key Takeaways
Building an AI copilot is about designing a system that can reason, not just generate answers or provide assistance [00:23:00]. This involves handling vast amounts of data (thousands to millions of components, large document corpuses) to answer questions from diverse stakeholders [00:23:20].
Effective AI agents and agentic workflows require:
- Defined roles [00:24:04].
- Structured workflows [00:24:04].
- Robust memory management [00:24:06].
- Structured outputs [00:24:06].
Continuous experimentation is vital to identify effective patterns based on existing data [00:24:16]. Graphs are becoming increasingly important in design, especially for capturing memory and maintaining context for the AI [00:24:31]. The level of autonomy given to each agent is a key learning area [00:24:46]. Frameworks like LangGraph are being explored for building agent workflows, often managed by a higher-level system [00:24:59].