From: aidotengineer
AI development has progressed from simple features to complex, autonomous systems. Initially, building with AI meant creating simple features like summarization, classification, and extraction, which felt like magic just a few years ago but are now considered basic capabilities [00:01:07]. As products matured, the approach became more sophisticated: orchestrating multiple model calls in predefined control flows, termed “workflows” [00:01:22]. These workflows allowed builders to trade off cost and latency for better performance [00:01:32].
The Rise of Agentic Systems
Workflows are seen as the beginning of agentic systems [00:01:39]. Modern models are increasingly capable, leading to the emergence of domain-specific agents in production [00:01:42]. Unlike workflows, agents can determine their own trajectory and operate almost independently based on environmental feedback [00:01:52].
The future of agentic systems in production is still unfolding [00:02:04]. Single agents could become more general-purpose and capable, or we might see collaboration and delegation in multi-agent settings [00:02:09]. A key trend is that as these systems are given more agency, they become more useful and capable, though this also increases the cost, latency, and consequences of errors [00:02:17].
When to Build an Agent
It’s crucial not to build agents for every task [00:02:31]. Agents are best suited for scaling complex and valuable tasks, not as a drop-in upgrade for every use case [00:02:38]. Workflows remain a practical way to deliver value for simpler, more defined tasks [00:02:51].
Consider the following checklist for building an agent:
- Complexity of Task: Agents excel in ambiguous problem spaces [00:03:01]. If a decision tree can be easily mapped, building it explicitly is more cost-effective and offers greater control [00:03:08].
- Value of Task: Agent exploration can consume many tokens, so the task must justify the cost [00:03:21]. High-volume, low-budget tasks are often better served by workflows [00:03:33].
- Critical Capabilities: Ensure there are no significant bottlenecks in the agent’s trajectory [00:04:02]. Bottlenecks can multiply cost and latency, suggesting a need to reduce scope or simplify the task [00:04:18].
- Cost of Error and Discovery: High-stakes or hard-to-discover errors make it difficult to trust agents with autonomy [00:04:32]. Mitigation through limited scope or human-in-the-loop involvement can also limit scalability [00:04:44].
Coding is cited as a prime use case for agents due to its complexity, high value, compatibility with existing tools, and verifiable output through unit tests and CI [00:05:00].
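Part of what makes coding agent-friendly is that correctness can be checked automatically. A minimal sketch of that feedback signal in Python, assuming a hypothetical repository with a pytest suite:

```python
import subprocess

def run_tests(repo_dir: str) -> tuple[bool, str]:
    """Run the repo's test suite and return (passed, output) as agent feedback."""
    result = subprocess.run(
        ["python", "-m", "pytest", "-q"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        timeout=300,
    )
    return result.returncode == 0, result.stdout + result.stderr

# After the agent proposes a change, verify it before accepting:
passed, output = run_tests("path/to/repo")  # hypothetical path
feedback = "Tests passed." if passed else f"Tests failed:\n{output}"
# `feedback` goes back into the agent's context for the next iteration.
```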
Components of an Agent
Agents are fundamentally models using tools in a loop [00:05:51]. The three core components defining an agent are:
- Environment: The system in which the agent operates [00:06:02].
- Tools: Provide an interface for the agent to take action and receive feedback [00:06:06].
- System Prompt: Defines the agent’s goals, constraints, and ideal behavior within the environment [00:06:11].
Keeping these components simple is crucial for iteration speed, as complexity upfront can hinder development [00:06:27]. Optimizations should come after the basic behaviors are established [00:07:58].
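To make the “models using tools in a loop” framing concrete, here is a minimal sketch in Python using the Anthropic SDK. The `read_file` tool and `run_tool` helper are hypothetical stand-ins for a real environment, and a production loop would add stop conditions and error handling:

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set

# Hypothetical tool definition; real tools need carefully written descriptions.
tools = [{
    "name": "read_file",
    "description": "Read a file from the working directory and return its contents.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}]

def run_tool(name: str, args: dict) -> str:
    """Execute the tool against the environment; stubbed out in this sketch."""
    return f"(stub result for {name} with {args})"

messages = [{"role": "user", "content": "Summarize the README in this repo."}]
while True:
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # any tool-capable model
        max_tokens=1024,
        # The system prompt defines goals, constraints, and ideal behavior.
        system="You are a coding agent. Use tools to inspect the repository.",
        tools=tools,
        messages=messages,
    )
    if response.stop_reason != "tool_use":
        break  # the model is done; its answer is in response.content
    # Feed every tool result back so the loop continues with fresh feedback.
    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": block.id,
         "content": run_tool(block.name, block.input)}
        for block in response.content if block.type == "tool_use"
    ]})
```

The loop is the whole architecture: the environment answers through tool results, and the model decides its own next step each iteration.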
Thinking Like an Agent
To effectively develop AI agents and agentic workflows, it’s vital to “think like your agents” [00:08:06]. Agents operate on a very limited context, typically 10 to 20k tokens [00:08:33]. Understanding their world view within this constrained context helps bridge the gap between human and agent understanding [00:08:51].
For instance, a computer-use agent only receives a static screenshot and a brief description [00:09:06]. During inference and tool execution, the agent is effectively operating the computer blind for several seconds, with no immediate feedback [00:09:30]. This perspective highlights the need for crucial context, such as screen resolution, recommended actions, and limitations, to guide exploration and avoid unnecessary steps [00:10:10].
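One way to supply that missing context is to state it directly in the system prompt rather than forcing the agent to discover it by trial and error. A minimal sketch, with hypothetical values:

```python
# Hypothetical context for a computer-use agent; stating it up front
# saves exploratory turns the agent would otherwise spend blind.
SYSTEM_PROMPT = """\
You are operating a computer through screenshots and mouse/keyboard actions.

Environment:
- Screen resolution: 1280x800; coordinates are (x, y) pixels from the top left.
- Each action takes several seconds; you will not see the result until the
  next screenshot arrives, so avoid speculative clicks.

Recommended actions:
- Prefer keyboard shortcuts over clicking through nested menus.
- If a screenshot looks mid-transition, wait and request a fresh one.

Limitations:
- You cannot see videos or animations, only static screenshots.
"""
```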
AI models can also help in understanding agent behavior. Developers can ask a model like Claude to do the following (see the sketch after this list):
- Assess if system prompt instructions are ambiguous or make sense [00:10:47].
- Determine if the agent knows how to use a tool based on its description, and if it needs more or fewer parameters [00:10:52].
- Explain its decision-making process by providing its entire trajectory and asking for suggestions for better decisions [00:11:02].
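As a concrete example of the second point, a tool definition can be handed to Claude for critique before the agent ever runs. A minimal sketch using the Anthropic Python SDK, with a deliberately thin, hypothetical tool description:

```python
import json
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set

# Hypothetical tool description we want the model to critique.
tool = {
    "name": "search_orders",
    "description": "Search orders.",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # any capable model works
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": (
            "Here is a tool definition an agent will use:\n"
            f"{json.dumps(tool, indent=2)}\n\n"
            "Is the description clear enough to use the tool correctly? "
            "Are any parameters missing, unnecessary, or ambiguous?"
        ),
    }],
)
print(response.content[0].text)
```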
Future Musings and Open Questions
Several key areas and open questions remain for the future of AI agents and multi-agent systems:
- Budget Awareness: Agents need to become more budget-aware regarding cost and latency, unlike workflows, where control is clearer [00:11:47]. Defining and enforcing budgets for time, money, and tokens is a crucial open question [00:12:04].
- Self-Evolving Tools: The concept of agents designing and improving their own tool ergonomics is emerging [00:12:14]. This meta-tool capability would make agents more general-purpose, as they could adapt tools for specific use cases [00:12:28].
- Multi-Agent Collaboration and Communication: There is strong conviction that more multi-agent collaborations will appear in production by the end of the year [00:12:40]. These systems offer benefits like parallelization, clear separation of concerns, and sub-agents protecting the main agent’s context window [00:12:46].
A significant open question is how these agents will communicate with each other [00:12:59]. Current systems often operate within a rigid frame of synchronous user-assistant interactions [00:13:03]. Expanding to asynchronous communication, with richer roles that let agents communicate with and recognize one another, is vital for the future of multi-agent systems [00:13:13].
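What asynchronous agent-to-agent communication might look like is still open; one possible shape is a mailbox pattern, sketched below with asyncio queues (all agent names and tasks here are hypothetical):

```python
import asyncio

async def sub_agent(name: str, task: str, outbox: asyncio.Queue) -> None:
    """A sub-agent works in its own context and reports back asynchronously."""
    await asyncio.sleep(1)  # stand-in for model calls and tool use
    await outbox.put({"from": name, "result": f"finished: {task}"})

async def main_agent() -> None:
    inbox: asyncio.Queue = asyncio.Queue()
    tasks = {"researcher": "survey the docs", "coder": "draft the patch"}
    workers = [
        asyncio.create_task(sub_agent(name, task, inbox))
        for name, task in tasks.items()
    ]
    # The main agent consumes results as they arrive, in any order,
    # without each sub-agent's working context polluting its own.
    for _ in workers:
        message = await inbox.get()
        print(f"main agent got from {message['from']}: {message['result']}")

asyncio.run(main_agent())
```

The mailbox decouples the main agent from its sub-agents: none of them blocks on a synchronous turn, which is exactly the constraint the rigid user-assistant frame imposes today.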