From: aidotengineer
AI applications, particularly those involving agents and complex workflows, place new demands on infrastructure that traditional Web 2.0 paradigms and existing serverless providers struggle to meet [02:01:06]. The shift from short, milliseconds-long requests to processes that run for minutes or even hours exposes significant limitations in current serverless offerings [02:22:00].
The Evolution of AI Workflows and Infrastructure Demands
Traditionally, Web 2.0 services involved simple API requests, database interactions, and quick returns within tens of milliseconds [01:27:54]. AI applications operate differently:
- Extended Runtimes: AI applications often have critical-path (P1) runtimes of several seconds at best, and even that only with fast models or prompt caching [01:39:10]. Workflows can stretch from 30 seconds to several minutes [00:59:05].
- Data Engineering Shift: As AI engineers build workflows from non-deterministic code, they increasingly become data engineers, focused on getting the right context into prompts [01:05:00]. This involves extensive LLM processing to ingest data from sources like user inboxes or GitHub [01:13:00].
- Reliability Challenges: LLM applications are built on “shoddy foundations,” making it difficult to build reliable apps due to frequent outages and rate limits from dependencies like OpenAI [02:06:00]. Traffic patterns can be extremely bursty, exacerbating rate-limit issues [02:56:00] (a common mitigation is sketched below).
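The talk does not prescribe a specific fix for bursty traffic, but the usual mitigation is retrying transient failures with exponential backoff. A minimal TypeScript sketch, assuming the OpenAI chat completions REST endpoint; the model name and retry limits are illustrative:

```typescript
// Retry a completion call on 429 (rate limit) and 5xx (outage) responses,
// backing off exponentially with jitter between attempts.
async function completionWithRetry(
  prompt: string,
  maxAttempts = 5,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify({
        model: "gpt-4o-mini",
        messages: [{ role: "user", content: prompt }],
      }),
    });

    if (res.ok) {
      const data = await res.json();
      return data.choices[0].message.content;
    }

    // Only retry transient failures: rate limits and server errors.
    if (res.status !== 429 && res.status < 500) {
      throw new Error(`Non-retryable error: ${res.status}`);
    }

    // Respect Retry-After when present; otherwise back off exponentially.
    const retryAfter = Number(res.headers.get("retry-after")) * 1000;
    const backoff = retryAfter || 2 ** attempt * 1000 + Math.random() * 250;
    await new Promise((r) => setTimeout(r, backoff));
  }
  throw new Error("Rate-limited after all retry attempts");
}
```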
Key Limitations of Existing Serverless Platforms
Existing serverless providers, while convenient for quick prototypes (like a Next.js chatbot template), are not well-suited for long-running workflows [04:04:00]:
- Timeout Limits: Most serverless functions time out after approximately 5 minutes [04:08:00].
- HTTP Request Restrictions: Some providers limit outgoing HTTP requests [04:11:00].
- Lack of Native Streaming Support: Streaming is typically bolted on at the application layer rather than being a native part of the infrastructure [04:18:00]. This is crucial for applications that run for multiple minutes [04:29:00].
- Absence of Resumability: Current serverless platforms generally do not offer native resumability [04:34:00]. This means if a user refreshes the page or navigates away, the workflow context is lost [06:08:00].
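To make the resumability gap concrete, here is a minimal client-side sketch using Server-Sent Events. This is exactly the kind of application-layer bolt-on the talk criticizes: the browser replays `Last-Event-ID` automatically on network reconnects, but after a full page refresh the last-seen id must be carried explicitly. The endpoint path and storage key are hypothetical:

```typescript
// Resume a workflow's event stream after a page refresh. The server is
// assumed to assign monotonically increasing event ids and to replay
// everything after the id passed in `lastEventId`.
function subscribe(workflowId: string, render: (event: unknown) => void) {
  const key = `wf:${workflowId}:lastEventId`;
  const lastId = localStorage.getItem(key) ?? "0";
  const source = new EventSource(
    `/api/workflows/${workflowId}/events?lastEventId=${lastId}`,
  );
  source.onmessage = (event) => {
    localStorage.setItem(key, event.lastEventId); // checkpoint progress
    render(JSON.parse(event.data));
  };
  return source;
}
```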
Impact on User Experience and Development
These limitations create significant friction for developers and a poor experience for users:
- Development Friction: Full-stack AI engineers avoid experimenting with workflows longer than five minutes, because exceeding serverless provider limits forces deep infrastructure changes [06:26:00].
- User Engagement: For long-running agentic processes, like content generation, users need constant engagement through intermediate status updates and the ability to resume without losing context [06:52:00].
- Onboarding Challenges: For processes like data ingestion, making users wait 10 minutes for completion increases fall-off in the funnel [05:25:00]. Background processing with real-time status updates is necessary [05:29:00] (see the sketch after this list).
- Error Handling: If an intermediate error occurs, users become frustrated if they have to wait 5 minutes to get back to the same point [07:04:00].
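A sketch of the background-processing pattern from the onboarding point above: the API returns a job id immediately and exposes intermediate status, rather than blocking a request for the full ingestion. In-memory state stands in for a real queue and durable store, and the routes and step names are illustrative:

```typescript
import express from "express";
import { randomUUID } from "node:crypto";

type JobStatus = { step: string; pct: number };
const jobs = new Map<string, JobStatus>();

// Stand-in for real ingestion work; updates status as each step completes.
async function runIngestion(jobId: string) {
  const steps = ["fetching", "parsing", "embedding", "done"];
  for (let i = 0; i < steps.length; i++) {
    jobs.set(jobId, { step: steps[i], pct: (i / (steps.length - 1)) * 100 });
    await new Promise((r) => setTimeout(r, 1000));
  }
}

const app = express();

// Kick off ingestion and return immediately with a 202 + job id.
app.post("/api/ingest", (_req, res) => {
  const jobId = randomUUID();
  jobs.set(jobId, { step: "queued", pct: 0 });
  void runIngestion(jobId); // fire and forget; a real system uses a worker
  res.status(202).json({ jobId });
});

// The client polls (or streams) intermediate status updates.
app.get("/api/ingest/:jobId/status", (req, res) => {
  res.json(jobs.get(req.params.jobId) ?? { step: "unknown", pct: 0 });
});

app.listen(3000);
```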
Addressing the Challenges
Traditional tools for long-running workflows and data engineering, such as SQS, Airflow, or Temporal, do exist, but they are largely designed for Java engineers and are not ideal for TypeScript or full-stack engineers [03:31:00]. A robust solution for agentic workflows requires planning for a future that is inherently long-running [12:19:00]. This includes keeping the compute and API layers separate and leveraging technologies like Redis streams for resumability [12:39:00], as sketched below.
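A minimal sketch of that separation, assuming the ioredis client: the compute layer appends each workflow event to a Redis stream, and the API layer replays from whatever entry id the client last acknowledged, so a page refresh or a redeploy picks up exactly where it left off. The key scheme and payload shape are illustrative:

```typescript
import Redis from "ioredis";

const redis = new Redis();

// Compute layer: emit each intermediate step as a durable stream entry.
export async function emitStep(workflowId: string, step: object) {
  await redis.xadd(`wf:${workflowId}`, "*", "data", JSON.stringify(step));
}

// API layer: replay everything after `lastSeenId` ("0" = from the start),
// then block up to 15 s waiting for new entries.
export async function readSteps(workflowId: string, lastSeenId = "0") {
  const result = await redis.xread(
    "BLOCK", 15000, "STREAMS", `wf:${workflowId}`, lastSeenId,
  );
  if (!result) return { events: [], lastSeenId }; // timed out, nothing new
  const [, entries] = result[0];
  return {
    events: entries.map(([, fields]) => JSON.parse(fields[1])),
    lastSeenId: entries[entries.length - 1][0], // client checkpoints this id
  };
}
```

Because the events live in Redis rather than in the serverless function's memory, the API layer can be redeployed or scaled independently of in-flight workflows, which is the point of keeping the two layers separate.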
It's easy to get started with basic prototypes, but building reliable and resilient [[longrunning_agents_and_failure_resilience | long-running AI workflows]] correctly in a serverless environment, especially when considering continuous deployment and worker draining, is very challenging [13:04:00].