From: redpointai
Amjad Masad, founder of Replit, is notably bullish on the future of AI agents and their role in software development [00:00:23]. Although AI is changing software engineering rapidly, he is surprised at how slowly it has been integrated into everyday technology compared with his expectations [00:24:41].
Current State and Capabilities
Currently, AI models like GPT-4 exhibit some agentic capabilities, sometimes accidentally [00:41:03]. The real power of Large Language Models (LLMs) lies in interpolating between different data distributions [00:11:53]. For example, they can write a rap song in Shakespeare’s style [00:11:58], and chaining such models together can produce a complete AI-generated piece of art [00:12:25].
However, the efficacy of agents, especially complex ones, still needs significant improvement [00:40:43]. Background agentic workflows are currently expensive with models like GPT-4, making them cost-prohibitive for most consumers and developers [00:40:30].
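The chaining idea mentioned above can be sketched in a few lines. This is only an illustration: the three stage functions are hypothetical stand-ins that format strings, where a real pipeline would call a model at each step, and none of the names come from the talk.

```python
from typing import Callable, List

# A stage takes text in and returns text out. In a real pipeline each stage
# would wrap a model call; here they are plain string functions (assumption).
Stage = Callable[[str], str]


def chain(stages: List[Stage]) -> Stage:
    """Compose stages so the output of each one feeds the next."""
    def run(prompt: str) -> str:
        result = prompt
        for stage in stages:
            result = stage(result)
        return result
    return run


# Hypothetical stand-ins for model calls.
def write_lyrics(topic: str) -> str:
    return f"a rap song about {topic}"


def restyle_as_shakespeare(text: str) -> str:
    return f"{text}, rewritten in Shakespeare's style"


def describe_cover_art(text: str) -> str:
    return f"cover art for: {text}"


pipeline = chain([write_lyrics, restyle_as_shakespeare, describe_cover_art])
print(pipeline("debugging at midnight"))
```

Because every stage shares the same text-in, text-out shape, stages can be reordered or swapped for real model calls without changing `chain` itself.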
Replit’s Approach to AI Agents
Replit runs research projects on training more effective agentic models [00:40:55]. Its focus is on delivering value to customers, but it is willing to take on research and training where necessary, with the aim of being an “applied AI company” [00:56:18].
Amjad believes that entrepreneurs who want to build with agents shouldn’t wait for “AI Gods” to drop APIs [00:48:22]. Instead, they should try to reach product-market fit with existing tools, even if that means paying for an expensive model like GPT-4 while prototyping [00:48:48].
Future Milestones for AI Agents
Amjad outlines several key milestones for the advancement of AI agents:
- Reliable Task Execution: Agents should be able to follow a bulleted list of actions without going “off the rails” or requiring “insane amounts of Chain of Thought and recursive debugging” [00:49:38].
- Dependable Function Calls: Commercial models need a non-hacky and dependable way to perform function calls. Currently, function calls work in 90% of cases, but the remaining 10% fail catastrophically, which makes them unreliable for critical financial or legal workflows [00:50:05].
- High Pull Request Acceptance Rate: For tasks like converting issues to pull requests (as seen with startups like Sweep.dev), an acceptance rate of 80% or 90% would indicate significant improvement in agent capabilities [00:50:50].
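A bit of arithmetic shows why the 90% figure in the milestones above is a problem for multi-step agents. The sketch below assumes each function call succeeds independently with the same probability, which is a simplification and not something stated in the talk; the pull request counts in the last line are made-up numbers for illustration.

```python
def chain_success_probability(per_call_success: float, calls: int) -> float:
    """Probability that every call in a sequence succeeds.

    Assumes calls fail independently of each other (a simplifying assumption).
    """
    if not 0.0 <= per_call_success <= 1.0:
        raise ValueError("per_call_success must be between 0 and 1")
    if calls < 0:
        raise ValueError("calls must be non-negative")
    return per_call_success ** calls


def acceptance_rate(accepted: int, opened: int) -> float:
    """Fraction of agent-opened pull requests that were accepted."""
    if opened <= 0:
        raise ValueError("opened must be positive")
    if not 0 <= accepted <= opened:
        raise ValueError("accepted must be between 0 and opened")
    return accepted / opened


# How a 90% per-call success rate compounds over longer workflows.
for calls in (1, 5, 10, 20):
    p = chain_success_probability(0.9, calls)
    print(f"{calls:>2} calls: {p:.1%} chance that all succeed")

# Hypothetical counts: 17 of 20 agent-opened pull requests accepted.
print(f"acceptance rate: {acceptance_rate(17, 20):.0%}")
```

Under that independence assumption, a workflow of ten calls completes cleanly only about 35% of the time, which is one way to see why per-call reliability has to be much higher before agents can be trusted with critical workflows.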
Amjad predicts that some version of agentic workflows will start happening this year [00:48:01]. He views agents as the “next big thing” beyond multimodal AI, which he sees as an incremental improvement rather than a “ChatGPT moment” [00:46:56]. That LLMs turned out to have agent capabilities at all was somewhat surprising to him [00:41:19], as he had initially thought agents would require “action transformers” [00:41:35] or large action models (LAMs) [00:42:08].
Economic and Strategic Considerations
The cost of inference, especially for complex or recursive calls, remains a significant factor [00:40:00]. Running models continuously in the background, for example in Continuous Integration/Continuous Deployment (CI/CD) pipelines, would currently be extremely expensive [00:46:04].
“I would have expected now we would have some kind of LLMs doing things in the background.” [00:46:26]
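To make the cost concern concrete, here is a rough estimate of what a multi-step agent run might cost. Everything numeric in it is a placeholder: the prices are not taken from the talk or from any vendor's price list, and the model of an agent that re-sends its growing context on every step is an assumption about how such a loop might work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Price per 1,000 tokens, in dollars (placeholder values only)."""
    input_per_1k: float
    output_per_1k: float


def call_cost(pricing: ModelPricing, input_tokens: int, output_tokens: int) -> float:
    """Cost of a single model call."""
    return (input_tokens / 1000) * pricing.input_per_1k + (
        output_tokens / 1000
    ) * pricing.output_per_1k


def agent_run_cost(
    pricing: ModelPricing,
    steps: int,
    base_input_tokens: int,
    output_tokens_per_step: int,
) -> float:
    """Cost of an agent loop that re-sends its accumulated context each step.

    Assumption: each step's output is appended to the context, so the input
    grows linearly and the total cost grows faster than the step count.
    """
    total = 0.0
    context_tokens = base_input_tokens
    for _ in range(steps):
        total += call_cost(pricing, context_tokens, output_tokens_per_step)
        context_tokens += output_tokens_per_step
    return total


# Hypothetical prices and token counts, chosen only to show the shape of the curve.
pricing = ModelPricing(input_per_1k=0.01, output_per_1k=0.03)
for steps in (1, 10, 50):
    cost = agent_run_cost(pricing, steps, base_input_tokens=1000, output_tokens_per_step=500)
    print(f"{steps:>2} steps: ${cost:.2f}")
```

Multiplying a per-run figure like this by every commit or every CI job is what makes always-on background agents hard to justify until per-token prices fall.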
The increasing affordability of models, such as GPT-3.5’s price reduction, is making it more rational for companies not to train their own models for all use cases [00:27:47]. However, for companies like Replit, building their own models was crucial due to specific latency and cost characteristics desired for integration into a free product [00:28:07].
Ultimately, the future of AI agents and their widespread adoption depends on reduced costs and improved reliability, potentially leading to scenarios where agents fix bugs or perform tasks while developers are away [01:07:12].