From: aidotengineer
While Large Language Models (LLMs) like InstructGPT marked a significant leap in language understanding and instruction following in 2022, even 2025-era models like GPT-4.1 still struggle with complex instructions [00:00:00]. The limitation stems from users trying to pack all context, constraints, and requirements into a single prompt, which often proves insufficient even for seemingly simple tasks like instruction following [00:01:04]. This is where AI agents come into play [00:01:21].
Agents do not solely rely on prompting; they incorporate planning [00:01:24].
Defining an AI Agent
From an engineering perspective, the precise definition of an AI agent is less critical than its functional effectiveness [00:01:45]. Several concepts are often referred to as agents or agentic workflows:
- LLM as a Router: This involves a routing model that directs a query to a specialized LLM [00:01:55].
- Function Calling: LLMs are provided with a list of external tools to interact with APIs, the internet (like Google search), or other systems [00:02:08]. The Model Context Protocol (MCP), introduced by Anthropic, standardizes this concept [00:02:24].
- ReAct (Reason and Act): A popular agentic framework that works with any language model. It involves a cycle of “thought, then act upon that thought, and observe” [00:02:39]. While effective, ReAct processes steps one at a time, without a look-ahead to the entire plan [00:02:58] (see the sketch below).
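To make the loop concrete, here is a minimal, framework-agnostic sketch of a ReAct-style agent. The `call_llm` completion function and the `search` tool are hypothetical placeholders, not part of any framework named above; the point is that each step is decided only from the transcript so far, with no look-ahead.

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model provider here")

def search(query: str) -> str:
    raise NotImplementedError("plug in a real search API here")

TOOLS = {"search": search}

def react(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # The model emits a Thought plus either an Action or a final Answer.
        step = call_llm(transcript + "Thought:")
        transcript += f"Thought:{step}\n"
        if "Answer:" in step:
            return step.split("Answer:", 1)[1].strip()
        if "Action:" in step:
            # Expected format: "Action: search[some query]"
            name, arg = step.split("Action:", 1)[1].strip().split("[", 1)
            observation = TOOLS[name.strip()](arg.rstrip("]"))
            transcript += f"Observation: {observation}\n"
    return "No answer within the step budget."
```

Note that there is no plan object anywhere: the model only ever sees history, which is exactly the limitation planning-based agents address.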
The Necessity of Planning in Agentic Workflows
Every AI agent ultimately needs to incorporate planning [00:03:15]. Planning is the process of figuring out the steps required to achieve a specific goal [00:03:20]. It becomes essential for complex tasks that are not straightforward and that require parallelization and explainability [00:03:24]. This contrasts with ReAct, where the sequence of thoughts is visible but the overall “why” behind the progression is not [00:03:36].
Planning can be implemented using:
- Form-based planners: such as TextBL or Magentic-One by Microsoft [00:03:43] (a sketch of the idea follows this list).
- Code-based planners: Examples include smolagents from Hugging Face [00:03:50].
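As a rough illustration of the form-based flavor, the sketch below asks the model once, up front, for a whole plan as structured JSON. The schema, and the reuse of the hypothetical `call_llm` from the ReAct sketch, are illustrative assumptions rather than the API of any framework named above.

```python
import json

PLAN_SCHEMA = '{"steps": [{"id": 1, "action": "...", "depends_on": []}]}'

def plan(goal: str) -> list[dict]:
    # One up-front call yields the entire plan, unlike ReAct's
    # interleaved thought/act/observe loop.
    prompt = (
        "Break the goal into steps. Reply with JSON matching this schema:\n"
        f"{PLAN_SCHEMA}\nGoal: {goal}"
    )
    return json.loads(call_llm(prompt))["steps"]
```

Code-based planners like smolagents go a step further and let the model write the plan directly as executable code.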
Dynamic Planning and Smart Execution
A key aspect of effective planning is dynamic planning, which allows for replanning in the middle of a task [00:03:57]. This lets the system reassess whether the current plan is still optimal or whether a new path should be taken [00:04:09]. A minimal replanning loop is sketched below.
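Here is one hedged way to express that loop, building on the `plan` sketch above; `execute` and `plan_is_stale` are hypothetical helpers standing in for a real step executor and a real plan-quality check.

```python
def execute(step: dict) -> str:
    raise NotImplementedError("run the step's action here")

def plan_is_stale(goal: str, done: list, remaining: list) -> bool:
    # Hypothetical check (e.g. an LLM judge or a rule-based validator)
    # deciding whether the remaining plan still fits the goal.
    return False

def run_with_replanning(goal: str) -> list:
    steps = plan(goal)  # initial plan, from the earlier sketch
    done = []
    while steps:
        step = steps.pop(0)
        done.append((step, execute(step)))
        # Mid-task reassessment: replan if the rest of the plan looks wrong.
        if plan_is_stale(goal, done, steps):
            steps = plan(f"{goal}\nAlready completed: {done}")
    return done
```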
For efficiency, agents integrate a “smart execution engine” alongside the planner [00:04:17]. An execution engine can:
- Analyze dependencies between steps, enabling parallel execution [00:04:24] (sketched after this list).
- Manage trade-offs between speed and cost, for instance by using branch prediction to make the overall system faster [00:04:29].
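As a minimal sketch of the dependency-analysis idea, assume plan steps carry the `depends_on` field from the planner sketch above; then every step whose dependencies are satisfied can run concurrently. The scheduling below (plain `asyncio`) is an illustrative stand-in for a real execution engine's machinery.

```python
import asyncio

async def run_step(step: dict) -> str:
    # Placeholder for actually executing a step's action.
    await asyncio.sleep(0)
    return f"result of step {step['id']}"

async def run_plan(steps: list[dict]) -> dict[int, str]:
    results: dict[int, str] = {}
    pending = list(steps)
    while pending:
        # Every step whose dependencies are all done can run now, in parallel.
        ready = [s for s in pending if all(d in results for d in s["depends_on"])]
        if not ready:
            raise ValueError("dependency cycle in plan")
        outs = await asyncio.gather(*(run_step(s) for s in ready))
        for s, out in zip(ready, outs):
            results[s["id"]] = out
        pending = [s for s in pending if s["id"] not in results]
    return results

# Example: step 3 waits for 1 and 2, which run in parallel.
steps = [
    {"id": 1, "depends_on": []},
    {"id": 2, "depends_on": []},
    {"id": 3, "depends_on": [1, 2]},
]
print(asyncio.run(run_plan(steps)))
```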
AI21 Maestro: An Agentic Framework Example
AI21 Labs has developed an agentic framework called AI21 Maestro, which combines a planner and a smart execution engine [00:01:32] [00:04:40].
In a simplified instruction-following task, Maestro separates the prompt (context and task) from explicit requirements (e.g., length, tone, brand mentions) [00:04:53]. This separation makes validation easier, since each requirement can be checked on its own [00:05:12]; a toy illustration follows.
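Once requirements live outside the prompt as structured data, each one can be verified mechanically against a draft. The requirement set and checkers below are invented for illustration and are not Maestro's actual schema.

```python
# Each explicit requirement becomes an independent, checkable predicate.
requirements = {
    "max_words": lambda text: len(text.split()) <= 150,
    "mentions_brand": lambda text: "Acme" in text,
    "no_exclamations": lambda text: "!" not in text,
}

def unmet(text: str) -> list[str]:
    """Return the names of requirements the draft fails."""
    return [name for name, check in requirements.items() if not check(text)]

draft = "Acme makes reliable widgets for every workshop."
print(unmet(draft))  # -> [] when all requirements pass
```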
The process involves an execution tree or graph:
- At each step, the planner and execution engine select several candidate solutions [00:05:15].
- Only the most promising candidates are pursued, fixed, and improved [00:05:26].
Techniques used within the execution engine include (combined in the sketch after this list):
- Best of N: Sampling multiple generations from an LLM (often with high temperature) or using different LLMs [00:05:36].
- Candidate Discarding: Ditching unpromising candidates early and focusing on the best ones based on a predefined budget [00:05:49].
- Validation and Iteration: Iteratively fixing and refining outputs [00:05:59].
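Putting the three techniques together in one hedged sketch: sample N candidates, discard all but the top few under a budget, then iteratively repair the survivors against the `unmet` checker from the earlier sketch. The `generate` and `fix` functions are hypothetical LLM calls.

```python
def generate(prompt: str, temperature: float = 1.0) -> str:
    raise NotImplementedError("sample one completion from an LLM")

def fix(draft: str, failures: list[str]) -> str:
    raise NotImplementedError("ask an LLM to repair the listed failures")

def best_of_n(prompt: str, n: int = 8, keep: int = 2, rounds: int = 3) -> str:
    # Best of N: many high-temperature samples (or samples from different LLMs).
    candidates = [generate(prompt, temperature=1.0) for _ in range(n)]
    # Candidate discarding: keep only the most promising, per the budget.
    candidates.sort(key=lambda c: len(unmet(c)))  # fewer failed requirements is better
    candidates = candidates[:keep]
    # Validation and iteration: repair survivors until they pass or budget runs out.
    for _ in range(rounds):
        candidates = [c if not unmet(c) else fix(c, unmet(c)) for c in candidates]
        if any(not unmet(c) for c in candidates):
            break
    return min(candidates, key=lambda c: len(unmet(c)))
```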
The execution engine can also track expected cost, latency, and success probability, allowing the planner to choose the most appropriate path [00:06:15]. Finally, a “reduce” step combines or selects the best results into a complete answer [00:06:30]. A toy version of both ideas follows.
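In this hedged toy version, each candidate execution path carries estimated cost, latency, and success probability, the planner picks one under a success target, and a final reduce step selects the best result. The numbers and the selection policy are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Path:
    name: str
    cost: float        # expected dollars
    latency: float     # expected seconds
    p_success: float   # expected chance of meeting all requirements

paths = [
    Path("single_call", cost=0.01, latency=2.0, p_success=0.55),
    Path("best_of_8",   cost=0.08, latency=6.0, p_success=0.90),
]

def choose(paths: list[Path], target: float = 0.8) -> Path:
    # One possible policy: cheapest path that meets the success target.
    viable = [p for p in paths if p.p_success >= target]
    return min(viable or paths, key=lambda p: p.cost)

def reduce_results(results: list[str]) -> str:
    # "Reduce": select (or combine) the best result into one final answer;
    # reuses unmet() from the validation sketch above as the quality signal.
    return min(results, key=lambda r: len(unmet(r)))

print(choose(paths).name)  # -> "best_of_8"
```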
Benefits and Challenges
AI agents employing planning and smart execution engines demonstrate significant improvements in instruction following and requirement satisfaction compared to single LLM calls [00:06:42] [00:07:02]. The higher quality, however, comes at the price of increased runtime and spend [00:07:10].
Conclusion
LLMs alone are not always sufficient, even for basic tasks like instruction following [00:07:21]. The approach to building effective AI agents should be incremental:
- Start simple with single LLM calls if they suffice [00:07:30].
- Incorporate tools or ReAct if the task complexity increases [00:07:35].
- For highly complex tasks, adopting a planning and execution engine framework is necessary to achieve desired quality and performance [00:07:43].