AI infrastructure and developer tools

From: redpointai

This article draws insights from a crossover discussion between Unsupervised Learning and Latent Space, a technical newsletter and podcast focused on AI infrastructure, tooling, and product for AI engineers [00:00:16]. The conversation covers key developments, trends, and challenges in the rapidly evolving AI landscape [00:01:06].

Key Surprises in the AI Landscape

The past year has brought several surprises in AI:

Model Advancements: The rapid progress of models, particularly the release of reasoning models (like DeepSeek), was unexpected. It came shortly after Ilya Sutskever's widely discussed suggestion that “scaling is dead,” so the mood swung from “so over” to “so back” within a short period [00:01:14]. The timing, particularly with DeepSeek’s progress, felt “suspiciously neat”: pre-training seemed to tap out just as inference-time compute became the new scaling law [00:02:08].
Open Source Adoption: There was a surprising lack of enterprise adoption for open-source models, despite their capabilities [00:02:25]. Estimates placed open-source model usage in enterprises at about 5% and decreasing, as companies prioritize powerful models for use case discovery [00:03:00].
Open Source Catch-Up: Conversely, the speed at which open source (specifically DeepSeek) caught up with closed-source reasoning models was unexpected. The value proposition of model companies, offering exclusive access to models for product building, now seems to have a much shorter exclusivity period [00:03:33]. It’s noted that “open source” in this context often refers to specific companies like DeepSeek choosing to release their models, rather than a broad “team open source” effort [00:04:11].
Low-Code Builders and AI: It was surprising that low-code builders (e.g., Zapier, Airtable, Retool, Notion) did not capture the AI builder market despite having the DNA and distribution [00:07:51]. This is attributed to a potential mindset difference: existing companies focused on improving their baseline, while new AI builders started from “whole cloth” without prior preconceptions [00:09:09].

Overhyped vs. Underhyped in AI

Overhyped

Agent frameworks: These are seen as overhyped due to the rapid flux in workloads, making it hard to build stable frameworks [00:10:43]. The current state is compared to the “jQuery era,” suggesting a need for a more stable, “React”-like approach [00:12:21]. Some argue it might even be too early for frameworks, with focus needed on protocols rather than frameworks [00:11:46].
New Model Training Companies: Despite initial thoughts that the era of new model training companies was passing, many continue to emerge [00:16:16]. While not impossible, it’s questioned what unique value they bring beyond pursuing AGI, which cannot be achieved by all [00:16:35]. The trend points towards general-purpose models, with specialized models likely requiring unique datasets (e.g., in robotics, biology, material science) [00:17:18]. The focus seems to be on whether these models will deliver on the promise of increased top-line revenue rather than just cost-cutting [00:32:04].

Underhyped

Memory (Stateful AI): The lack of true “memory memory” (beyond conversational memory) in agents from major labs like OpenAI and Cohere is surprising [00:14:54]. The ability to store knowledge graphs about users, exceeding context length, is a significant opportunity for smarter agents that can learn on the job [00:15:04]. Stateful AI is seen as potentially interesting to VCs due to its resemblance to databases [00:16:07].
Private Cloud Compute (PCC): Apple’s architectural work in extending on-device security guarantees to the cloud (PCC) is considered underhyped [00:12:44]. It addresses the challenge of running large LLMs in multi-tenant cloud environments while still needing single-tenant guarantees, especially since obtaining enough GPUs for private VPCs is difficult [00:13:43].
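
The stateful memory idea described above (storing a knowledge graph about a user outside the context window and pulling in only what is relevant) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `UserMemory` class, the triple format, and the `build_prompt` helper are hypothetical and do not reflect how any lab or product mentioned in the discussion implements memory.

```python
from collections import defaultdict


class UserMemory:
    """Toy per-user knowledge graph of (subject, relation, object) triples.

    Hypothetical sketch: facts live outside the model's context window and
    only the ones touching the current topic are retrieved.
    """

    def __init__(self):
        self._by_entity = defaultdict(set)

    def remember(self, subject, relation, obj):
        # Index the fact under both endpoints so either can retrieve it.
        triple = (subject, relation, obj)
        self._by_entity[subject].add(triple)
        self._by_entity[obj].add(triple)

    def recall(self, entities, limit=5):
        # Return at most `limit` facts that mention any requested entity.
        hits = set()
        for entity in entities:
            hits |= self._by_entity.get(entity, set())
        return sorted(hits)[:limit]


def build_prompt(memory, question, entities):
    # Only the recalled facts are placed in the prompt, not the whole graph.
    facts = memory.recall(entities)
    lines = [f"- {s} {r} {o}" for s, r, o in facts]
    return "Known facts:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


memory = UserMemory()
memory.remember("alice", "works_at", "Acme")
memory.remember("alice", "prefers", "morning meetings")
memory.remember("bob", "works_at", "Initech")
prompt = build_prompt(memory, "When should we schedule the call?", ["alice"])
```

The design point is that the store can grow without bound while the prompt stays small, since `limit` caps how many facts are injected per request. A real system would need ranking, conflict resolution, and forgetting, none of which are shown here.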

Product Market Fit in AI Applications

Current areas showing genuine product market fit:

Coding Agents: Products like GitHub Copilot and Cursor are examples [00:26:17].
Support Agents (Customer Support): Companies like Sierra and Decagon are leading this space, addressing a significant cost center for businesses [00:30:40].
Deep Research: Services that provide long-term agentic reporting, exemplified by Perplexity and OpenAI’s deep research features, are gaining traction [00:26:42].
Voice AI: Applications that don’t require 100% precision and recall (e.g., scheduling intake for home services) can provide significant value by improving efficiency even with partial effectiveness, as businesses often miss a high percentage of calls [00:33:17].

Emerging areas with potential for product market fit:

Screen Sharing Assistance: AI watching user work and offering assistance [00:35:56].
Outbound Sales: AI assisting with proactive sales efforts [00:36:03].
Hiring/Recruiting: AI applications for the recruiting side [00:36:13].
Personalized Education/Teaching: AI-powered personalized learning experiences [00:36:16].
Finance: Various finance-specific use cases [00:36:31].
Personal AI: Though harder to monetize, personal AI assistants are an area of interest [00:36:35].

Defensibility at the Application Layer

Defensibility at the application layer is crucial for AI product success [00:39:07]. Key factors include:

Network Effects: Prioritizing and building out multiplayer experiences can create strong network effects, as demonstrated by Chai Research (a Character.AI competitor) [00:39:16]. Chai Research, despite not owning its models, built a network of users submitting models, creating a robust marketplace [00:40:18].
Brand: Becoming a recognized brand synonymous with a category quickly (within 6-9 months) grants significant market access and allows for premium pricing, even if competitors offer similar technical capabilities [00:41:17].
Velocity and User Experience: The true defensibility lies in a company’s ability to execute “a thousand small things” that create a delightful user experience and design [00:42:04]. High product velocity is essential, as new models present an existential event every 3-6 months; companies must be first to figure out how to leverage them [00:42:12]. This approach is similar to traditional application SaaS companies. [00:42:30]

The AI Infrastructure Space

The most interesting areas in AI infrastructure are generally not the “bare metal” infrastructure (like GPU serving) but rather the “infra around models” [00:42:55]. This includes:

LLM OS: A conceptual framework for the operating system layer around LLMs [00:42:55].
Code Execution: As seen with companies like E2B [00:43:12].
Memory: As discussed earlier, building stateful AI solutions [00:43:19].
Search: Companies focused on enhancing search capabilities with AI [00:43:21].
Cybersecurity: Applying AI on the defense side for areas like email security, identity, and rethinking practices like red teaming using AI [00:43:38].
Semantics vs. Syntax: AI models’ ability to infer semantics from code or other data (understanding “what this code is trying to do”) goes beyond simple syntax rules, offering a new dimension for security and other applications [00:44:07].

While GPU serving and data centers are critical, they are capital-intensive and often reduce to a “cost plus” business model, which is less attractive compared to applications that charge for utility [00:46:48].

The Role of Major Labs in Infrastructure

OpenAI and other large research labs are increasingly encompassing parts of the developer and infrastructure categories by offering these as APIs. For example, OpenAI’s search capability is now available as an API, competing with startups [00:45:31]. The question remains whether these labs will prioritize being API companies or product companies, and whether their bundled offerings will outweigh best-of-breed independent solutions [00:46:18].

Challenges in AI Infrastructure

Certain areas face challenges for investment or widespread adoption:

Finetuning Companies: Standalone finetuning companies struggle to scale as a big business unless wrapped into a broader enterprise AI service offering [00:47:56].
AI DevOps/AIOps: While there’s a theoretical opportunity for AI to self-heal and manage codebases, it hasn’t fully materialized yet [00:48:20]. However, some believe it’s an interesting near-term opportunity: as the technology improves, it could improve metrics such as Mean Time to Resolution (MTTR) [00:49:00].
Voice Real-time Infrastructure: While hot and interesting, its ultimate market size is still unclear [00:48:40].

Unanswered Questions with Broad Implications

RL on Non-Verifiable Domains: Can Reinforcement Learning (RL) be successfully applied to non-verifiable domains like law (contracts, documents) or marketing/sales (simulating outbound conversations)? [00:50:22] If not, autonomous AI agents might be limited to verifiable domains, while non-verifiable ones would still need copilots with human oversight [00:50:46].
Hardware Scaling and GPU Dominance: How will the AI ecosystem scale with the “rule of nines” (each order of magnitude increase in reliability requires an order of magnitude increase in compute)? [00:51:36] Will Nvidia continue its dominance, or will competitors like AWS (with their Trainium/Inferentia chips), AMD, Microsoft, and Facebook successfully challenge it to increase GPU availability? [00:52:00]
Agent Authentication: A critical emerging question is how agents will authenticate themselves when accessing websites on behalf of users, indicating that it’s an agent and not the user directly [00:54:41]. This points to a need for a “new SSO” for agents [00:55:08].
