The future of software engineering with AI

From: redpointai

Omid Mad, founder of Replit and an early hire at Codecademy, shared his insights on the evolving landscape of software development and the impact of AI on coding in a discussion with Jacob Efron and Pat Chase. Replit is a company focused on bringing the next billion developers into the space and is at the forefront of using AI in various ways [00:00:13].

Learning to Code in the AI Era

Omid’s advice for those starting to learn coding today remains consistent with his pre-AI beliefs: the best way to learn is by making things [00:01:00]. He argues that traditional academic methods, which involve learning random facts that may never be applied, are not effective for most people [00:01:43]. Instead, people learn by setting a goal and acquiring knowledge along the way to achieve it [00:01:27].

AI, particularly Large Language Models (LLMs), takes this “learning by doing” approach to its logical extreme [00:02:56]. With an LLM-powered editor, users can get something running in as little as five minutes, bypassing the drudgery of installations and configurations [00:03:03]. Seeing results right away provides immediate “dopamine hits,” which encourage further experimentation and bigger projects [00:03:39].

The Bifurcation of Software Engineering

Omid predicts that software engineering will bifurcate into two distinct paths, widening the existing gap between front-end and back-end roles [00:04:41]:

Product Creator/Entrepreneur: This role involves focusing on making something, acquiring customers, and getting users, with less emphasis on traditional coding skills [00:04:51]. The majority of the time will be spent prompting and iterating on prompts, though debugging code will still be necessary for a period [00:05:08].
Traditional Software Engineer: This path remains largely unchanged, focusing on building cloud infrastructure, data pipelines, and backend systems [00:05:48]. For this path, a computer science degree is still relevant [00:06:10].

For those who simply want to be a builder, starting directly with building is encouraged [00:06:17].

Replit’s AI Integration Strategy

Replit’s approach to AI integration is distinctive:

Embedding AI, Not as an Add-on

Replit initially offered AI features as an add-on called “Ghostwriter,” but later renamed it “Replit AI” to reflect its complete integration into the product [00:06:36]. Omid believes the current era of “co-pilots” as add-ons is transitional, and companies relying on such add-on revenue should be concerned [00:07:23]. Replit aims to make every interaction with its product AI-powered, including as part of the free plan [00:08:10]. This internal structural decision encourages designers to start with AI in their workflows, rather than considering AI as an optional addition [00:08:31].

Key AI Features

Replit AI currently offers:

Code Suggestions: Similar to co-pilots, providing suggestions as users type code [00:08:51]. Replit trains its own models for this, prioritizing speed and power [00:09:11].
File Generation: Users can prompt the AI to generate an entire file based on context [00:09:32].
AI Debug Button: When an error occurs in the console, a button appears that opens the AI chat, pre-prompted with the error and relevant context for debugging [00:09:51].
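The AI Debug button flow described above can be pictured as assembling a chat prompt from the console error plus the code around the failing line. The following is a minimal sketch under stated assumptions, not Replit's implementation: the function name `build_debug_prompt`, the prompt wording, and the `context` window size are all hypothetical.

```python
def build_debug_prompt(error_text, source_lines, error_line, context=3):
    """Assemble a chat prompt from a console error and nearby code.

    Hypothetical helper for illustration only. `error_line` is 1-indexed;
    `context` is how many lines to include on each side of it.
    """
    start = max(error_line - 1 - context, 0)
    end = min(error_line + context, len(source_lines))
    # Prefix each line with its 1-indexed line number for readability.
    snippet = "\n".join(
        f"{n + 1:>4} | {source_lines[n]}" for n in range(start, end)
    )
    return (
        "The following error appeared in the console:\n"
        f"{error_text}\n\n"
        f"Relevant code (around line {error_line}):\n"
        f"{snippet}\n\n"
        "Explain the likely cause and suggest a fix."
    )
```

The returned string would be sent as the first message of the chat, so the user does not have to copy the error or the code by hand.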

Decision to Build Proprietary Models

Replit chose to build its own models, including a 3B parameter model, for several reasons [00:27:10]:

Latency: Commercial models did not meet Replit’s strict low-latency requirements for an interactive coding environment [00:28:07].
Cost: Building and deploying small models (like their 3B model, which cost around $100K to train) was more affordable than relying solely on commercial APIs, especially for offering AI as part of the free experience [00:28:43].
Strategic Advantage: Developing internal AI talent and having control over the model’s characteristics was crucial for Replit to be an “AI-native” company [00:29:45].

Despite building their own core models, Replit also uses commercial models for other use cases, such as general-purpose chat features [00:30:10].

Understanding AI Model Capabilities (Data & Training)

Omid takes a deliberately reductive view of LLMs: a model is a function of its training data, a compression of that data [00:11:36]. Their power lies in interpolating between different data distributions, such as writing a rap song in the style of Shakespeare [00:11:53].

Importance of Data Quality and Diversity

To understand a model’s capabilities, one must understand the data it was fed [00:12:49]. Key aspects of data quality include:

Size and Compute: More tokens and more compute generally lead to better models [00:16:11].
Diversity and Freshness: A greater diversity and freshness of tokens are beneficial [00:16:23].
Avoiding “Bad” Data: Training on minified JavaScript, for example, can negatively impact a model [00:16:48].
Emulating Best Programmers: GPT models essentially emulate human thinking by guessing the next token, so training on data generated by the best programmers yields higher quality models [00:17:01].
Application Code vs. Infrastructure Code: While GitHub is rich in high-quality infrastructure code (libraries), there’s a “poverty of application code” [00:18:02]. Replit shines here as many users write applications on their platform, providing valuable data [00:18:14].
Coding-Adjacent Reasoning: Even non-coding data, like scientific data or legal text, has been shown to improve code generation abilities by enhancing reasoning [00:19:17]. Omid expects 2-3 more years of increased coding capabilities [00:15:34].
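The point about avoiding “bad” data such as minified JavaScript can be illustrated with a simple corpus filter. This is a minimal sketch assuming a line-length heuristic. The thresholds and the function names `looks_minified` and `filter_corpus` are hypothetical and are not taken from Replit's data pipeline.

```python
def looks_minified(source, max_avg_line_len=200, max_line_len=1000):
    """Heuristic: minified code tends to have very long lines.

    Thresholds are illustrative assumptions, not tuned values.
    """
    lines = [ln for ln in source.splitlines() if ln.strip()]
    if not lines:
        return False
    avg_len = sum(len(ln) for ln in lines) / len(lines)
    longest = max(len(ln) for ln in lines)
    return avg_len > max_avg_line_len or longest > max_line_len


def filter_corpus(files):
    """Keep only files that do not look minified.

    `files` maps a path to its source text.
    """
    return {path: src for path, src in files.items() if not looks_minified(src)}
```

A real pipeline would likely combine several signals (file extension, deduplication, license checks). The sketch only shows the shape of one quality filter.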

Post-Training Mechanisms

These mechanisms refine a foundation model’s capabilities for specific downstream tasks [00:13:15]. Examples include instruction fine-tuning, RLHF (Reinforcement Learning from Human Feedback), and DPO (Direct Preference Optimization) [00:13:03].
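As a concrete instance of one of these mechanisms, the DPO objective for a single preference pair can be written in a few lines. This is a sketch of the standard published DPO loss, not anything specific to Replit's training setup. The function and argument names are illustrative, and the inputs are assumed to be summed log-probabilities of each full response under the policy and under a frozen reference model.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    loss = -log(sigmoid(beta * margin)), where margin is how much more
    the policy prefers the chosen response than the reference model does.
    """
    margin = (policy_logp_chosen - ref_logp_chosen) - (
        policy_logp_rejected - ref_logp_rejected
    )
    # -log(sigmoid(x)) is equal to log(1 + exp(-x)).
    return math.log1p(math.exp(-beta * margin))
```

When the policy matches the reference model the margin is zero and the loss equals log 2. The loss falls as the policy shifts probability toward the preferred response.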

Limits of “Open Source” Models

Omid argues that many “open source” models are not truly open source because their training cannot be reproduced [00:31:48]. Users are dependent on the goodwill of companies like Meta to continue releasing new versions [00:32:27]. The training data for models like GPT-4 includes vast amounts of internet code data plus hundreds of millions of dollars spent on contractors to annotate coding data [00:14:24].

A significant security risk exists when the training process and data of an LLM are not transparent. Backdoors that activate only under specific conditions can be hidden in a model, and without access to the “source code” (the data), they cannot be recovered or inspected [00:36:17]. This points toward a future in which it becomes increasingly difficult to tell reality from fiction or AI-generated content (“hyperreality”) [00:37:05].

Impact of AI on Different Engineer Skill Levels

Initially, Omid believed that AI disproportionately benefited beginners due to the significantly increased return on investment for learning to code [00:18:45]. He cited examples of Replit users going from zero to building successful applications and companies in months [00:19:00]. Studies, like one on BCG consultants, also indicated that AI benefited lower-rated consultants more than advanced ones [00:19:27].

However, Omid suggests that this might change once people are properly trained to use AI effectively [00:19:51]. Advanced users, who possess both coding skills and proficiency in prompt engineering techniques (e.g., Chain of Thought), will likely see even greater benefits [00:20:10]. Younger generations, due to their brain plasticity, are naturally better at adapting to and building mental models for what LLMs can do [00:22:05]. Replit observed that younger users “rolled with it” when AI suggestions appeared, while older, more established users were initially jarred [00:22:35]. This is analogous to the calculator’s introduction in education [00:23:01].

Organizational Structure for AI

Replit structures its teams horizontally, integrating AI across all products rather than in vertical silos [00:24:12]. Omid believes this makes business sense, as AI will touch every aspect of software development [00:24:30]. He is surprised by the slow pace of AI integration into everyday technology and corporate structures, despite its rapid advancement [00:24:38].

The Future of the AI Coding Market

Microsoft’s Dominance

The “default pessimistic assumption” is that Microsoft will dominate the AI coding market due to its existing install base, enterprise relationships, sales team, and leadership [00:51:26].

Emergence of Specialized Companies

However, Omid is optimistic about a new wave of specialized companies focusing on different parts of the software development stack or specific coding workflows [00:52:03]. For example, companies specializing solely in generating tests (like Codium) could do well [00:53:02].

New Platforms and Holistic Approaches

Companies like Replit, which offer a holistic approach by providing a cloud development environment with AI integrated throughout the entire stack, can build more ambitious AI products, including agentic workflows [00:52:36]. This contrasts with tools like GitHub Copilot, which are often locked into an editor and lack broader context (e.g., other repos, Git history, Jira tickets) [00:52:20].

Code Generation Startups

For pure code generation startups, the challenge is that larger models like GPT-5 could leapfrog them, outpacing specialized models that cost millions to train [00:54:41]. While Code Llama has shown strong performance on benchmarks, its “vibes” (practical usability) may not yet match proprietary models [00:54:58]. Replit, as an applied AI company, is willing to work with any model that delivers value to its users [00:56:12].

The Future of AI Agents

Omid believes AI agents represent the next significant leap, more profound than multimodal AI (which he sees as incremental) [00:46:56].

Cost as a Barrier: Agentic workflows, especially with models like GPT-4, are currently very expensive, making them cost-prohibitive for most consumers and developers, particularly for background tasks like refactoring code and running tests [00:40:00].
Accidental Agentic Capabilities: LLMs like GPT-4 have demonstrated unexpected agentic capabilities [00:41:03], although Omid initially thought dedicated “action transformers” would be necessary [00:41:31].
Milestones for Agents: A key milestone for agents will be their ability to reliably follow a bulleted list of actions without going “off the rails” or requiring extensive chain-of-thought prompting [00:49:32]. The reliability of function calls is also crucial, as current catastrophic failures make them unsuitable for sensitive workflows [00:50:05]. Metrics like pull request acceptance rates (e.g., Sweep.dev’s 30%) indicate progress, but Omid wants to see this reach 80-90% reliability [00:50:43].
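One common way to contain unreliable function calls is to validate each model-proposed call before executing it and to stop the plan at the first invalid step. The sketch below illustrates that idea under stated assumptions. The tool names in `TOOLS`, the call format (`{"name": ..., "args": {...}}`), and the helpers `validate_call` and `run_plan` are all hypothetical and do not come from any particular agent framework.

```python
# Hypothetical tool registry: tool name -> required argument names.
TOOLS = {
    "run_tests": {"path"},
    "edit_file": {"path", "patch"},
}


def validate_call(call):
    """Return an error message for an invalid call, or None if it is valid."""
    name = call.get("name")
    args = call.get("args")
    if name not in TOOLS:
        return f"unknown tool: {name!r}"
    if not isinstance(args, dict):
        return "args must be a dict"
    missing = TOOLS[name] - set(args)
    extra = set(args) - TOOLS[name]
    if missing:
        return f"missing arguments: {sorted(missing)}"
    if extra:
        return f"unexpected arguments: {sorted(extra)}"
    return None


def run_plan(calls, execute):
    """Execute calls in order, halting at the first call that fails validation.

    Returns (results_so_far, error_or_None).
    """
    results = []
    for call in calls:
        error = validate_call(call)
        if error is not None:
            return results, error
        results.append(execute(call))
    return results, None
```

Halting instead of skipping the bad step keeps a single malformed call from cascading into later actions, which matters most in the sensitive workflows mentioned above.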

Omid expects background agentic workflows to start appearing this year, potentially leading to a “ChatGPT moment” for agents [00:48:01]. Entrepreneurs should be willing to “walk through walls” and make agents work even if expensive, while more established companies might wait and see [00:48:27].

AI Product Pricing Strategies

Replit adopts a value-based pricing strategy, assuming AI will become table stakes, rather than a cost-plus approach based on current AI expenses [00:42:41]. They project future reductions in model costs and improvements in inference efficiency, while also considering the competitive landscape [00:43:08].

Omid also predicts that usage-based pricing will become more prevalent [00:44:24]. This is because a pure subscription SaaS model can be unsustainable when power users incur significant costs from AI model usage [00:44:43]. Replit offers bundles with AI, compute, and storage, allowing for overages [00:45:15]. This model is particularly important when AI models are integrated into CI/CD pipelines, running automatically and incurring costs even when developers are not actively working [00:46:04].
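The bundle-plus-overage model can be made concrete with a small billing calculation. This is a sketch with made-up numbers. The `Plan` fields, the notion of a single “usage unit,” and the prices used in the example are assumptions for illustration, not Replit's actual pricing.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    base_price: float      # flat monthly fee for the bundle
    included_units: int    # usage units covered by the bundle
    overage_price: float   # price per unit beyond the included amount


def monthly_bill(plan, units_used):
    """Flat fee plus a per-unit charge for usage above the bundle."""
    overage_units = max(units_used - plan.included_units, 0)
    return plan.base_price + overage_units * plan.overage_price
```

With a hypothetical `Plan(base_price=20.0, included_units=1000, overage_price=0.01)`, a light user pays only the flat fee, while a user (or an automated CI/CD job) consuming 1,500 units pays for the extra 500. This is how the model keeps heavy AI usage from being subsidized by the subscription alone.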

Overhyped and Underhyped AI Trends

Overhyped: Chatbots, as many things should not be chatbots [00:57:51].
Underhyped: Using LLMs as part of everyday systems and backend call chains [00:58:00].
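To make the “underhyped” case concrete, here is a minimal sketch of an LLM used as one step in a backend call chain, with no chat interface involved. Everything in it is an assumption for illustration: the ticket-routing task, the label set, and the `complete` parameter, which stands in for whatever model client a system actually uses and is simply any function that maps a prompt string to a response string.

```python
ALLOWED_LABELS = {"billing", "bug", "other"}


def route_ticket(text, complete):
    """Classify a support ticket by calling a model as an ordinary function.

    `complete` is a hypothetical stand-in for a model client. The model
    output is validated and falls back to "other" when it is not a known
    label, so downstream code always receives a well-formed value.
    """
    prompt = (
        "Classify the support ticket as one of: billing, bug, other.\n"
        "Answer with the label only.\n\n"
        f"Ticket: {text}"
    )
    label = complete(prompt).strip().lower()
    return label if label in ALLOWED_LABELS else "other"
```

Injecting `complete` as a parameter keeps the step testable with a stub and leaves the choice of model open.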

Key Learnings from Building LLM Features at Replit

Latency Matters: The biggest surprise was how crucial latency is. A two-to-three-second response completely changes the user experience compared to 300 milliseconds [00:58:27].
Initially Flawed Features: Inline actions, which leverage cursor context for highly contextual suggestions within the IDE, were initially a “flop” because they weren’t exposed well to users [00:59:07]. After prompting users with UI patterns, adoption grew [00:59:30].

Exciting AI Startups/Companies (Outside Replit)

OpenAI: Admired for its ambition and willingness to pursue seemingly random ventures like university partnerships, robotics, and self-driving [00:59:56]. Sam Altman’s ability to “spin a whole lot of plates” is impressive [01:00:27].
Perplexity: Highly bullish on Perplexity due to its strong engineering competency and ability to zoom ahead of competitors in search [01:00:57].

Impact on Engineering Team Size

Omid believes that in five years, companies could achieve current outcomes with one-tenth the number of engineers [01:01:43]. Over ten years, he envisions a “1000x” improvement, leading to a significant reduction in company size due to AI’s impact on programming jobs [01:01:55]. While the number of “software creators” will continue to grow, the role might be called something else, much like “movie stars” became “creators” on platforms like TikTok [01:02:12].
