Building reliable AI systems using Effect TypeScript

From: aidotengineer

Building reliable AI systems, especially those interacting directly with end-users and relying on large language models (LLMs) in production, presents significant challenges due to uncertain conditions [00:00:07]. Effect, a TypeScript library, is designed to manage this complexity by enabling the creation of robust, type-safe, and composable systems [00:00:16].

Why Effect for AI and LLM Systems?

While the TypeScript language provides a strong foundation, it falls short when dealing with specific challenges inherent in AI system development [00:00:26]. These challenges include:

Unreliable APIs [00:00:29]
Complex dependencies between systems [00:00:32]
Non-deterministic model outputs [00:00:33]
Long-running workflows [00:00:36]

Effect provides the necessary tools to confidently handle these situations as AI platforms evolve [00:00:36].

Key Features of Effect for Building Reliable AI Systems

Effect offers several features that contribute to the reliability and maintainability of AI systems:

Strong type guarantees across the stack [00:00:44]
Powerful composition primitives [00:00:48]
Built-in concurrency, streaming, interruptions, and retry mechanisms [00:00:50]
Structured error modeling [00:00:54]
A clean dependency injection system that facilitates easier testing and modularity [00:00:57]
Easy observability via OpenTelemetry [00:01:02]

Together, these features lead to AI and LLM code that is more stable, testable, and maintainable at scale [00:01:07]. Effect can also be adopted gradually in existing codebases, where it feels like a natural extension of TypeScript [00:01:14].
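To make "structured error modeling" concrete, here is a minimal sketch in plain TypeScript. It deliberately does not use Effect's API; it only borrows the `_tag` discriminant convention. The names `LlmError`, `ModelResult`, `callModel`, and `describeError` are hypothetical, and the model call is a stub keyed on the prompt text.

```typescript
// Plain-TypeScript sketch of structured error modeling (not Effect's API).
// All names are hypothetical and exist only for illustration.

type LlmError =
  | { _tag: "RateLimited"; retryAfterMs: number }
  | { _tag: "Timeout"; afterMs: number }
  | { _tag: "InvalidOutput"; raw: string };

type ModelResult<A, E> =
  | { _tag: "Ok"; value: A }
  | { _tag: "Err"; error: E };

// Stub model call: the failure modes are visible in the return type.
function callModel(prompt: string): ModelResult<string, LlmError> {
  if (prompt === "busy") {
    return { _tag: "Err", error: { _tag: "RateLimited", retryAfterMs: 500 } };
  }
  if (prompt === "slow") {
    return { _tag: "Err", error: { _tag: "Timeout", afterMs: 30000 } };
  }
  if (prompt === "garbled") {
    return { _tag: "Err", error: { _tag: "InvalidOutput", raw: "<<<" } };
  }
  return { _tag: "Ok", value: "echo: " + prompt };
}

// Exhaustive handling: adding a new LlmError variant makes the `never`
// assignment below a compile-time error until the new case is handled.
function describeError(error: LlmError): string {
  switch (error._tag) {
    case "RateLimited":
      return "rate limited, retry in " + error.retryAfterMs + "ms";
    case "Timeout":
      return "timed out after " + error.afterMs + "ms";
    case "InvalidOutput":
      return "model returned unusable output";
    default: {
      const unreachable: never = error;
      return unreachable;
    }
  }
}
```

The point of the sketch is that callers cannot ignore a failure mode by accident: the compiler forces a decision for every variant.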

Architecture and Implementation at 14.ai

At 14.ai, Effect is used across the entire stack of the company's AI-native customer support platform [00:01:23]:

Internal RPC Server: Handles application logic, built on Effect RPC and a modified TanStack Query [00:01:39].
Public API Server: Utilizes Effect HTTP with auto-generated OpenAPI documentation from annotated schemas [00:01:47].
Data Processing Engine: Syncs and processes data from CRM, documents, and databases for real-time analytics and reporting [00:01:53].
Agent Workflows: Written in a custom DSL built on Effect, allowing for a mix of deterministic and non-deterministic behavior [00:02:02].
Database: PostgreSQL is used for both data and vector storage, with Effect SQL handling queries [00:02:09].
Effect Schemas: Model everything, providing runtime validation, encoding/decoding, type-safe input/output handling, and auto-generated documentation [00:02:15].
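The schema-first idea in the last item can be sketched without any library: untrusted input is decoded once at the boundary into either a fully typed value or a list of issues. This is a hand-rolled illustration, not Effect Schema's API, and `TicketInput`, `Decoded`, and `decodeTicket` are hypothetical names.

```typescript
// Plain-TypeScript sketch of schema-style decoding (not Effect Schema's API).
// All names are hypothetical.

type Decoded<A> =
  | { _tag: "Success"; value: A }
  | { _tag: "Failure"; issues: string[] };

interface TicketInput {
  customerId: string;
  message: string;
  priority: "low" | "high";
}

// Validate untrusted input and collect every issue found, so callers get
// either a typed value or a complete list of problems.
function decodeTicket(input: unknown): Decoded<TicketInput> {
  if (typeof input !== "object" || input === null) {
    return { _tag: "Failure", issues: ["expected an object"] };
  }
  const record = input as { [key: string]: unknown };
  const customerId = record["customerId"];
  const message = record["message"];
  const priority = record["priority"];

  const issues: string[] = [];
  if (typeof customerId !== "string") issues.push("customerId: expected string");
  if (typeof message !== "string") issues.push("message: expected string");
  if (priority !== "low" && priority !== "high") {
    issues.push('priority: expected "low" or "high"');
  }

  if (
    typeof customerId === "string" &&
    typeof message === "string" &&
    (priority === "low" || priority === "high")
  ) {
    return { _tag: "Success", value: { customerId, message, priority } };
  }
  return { _tag: "Failure", issues };
}
```

A schema library generates this kind of decoder, plus encoders and documentation, from a single declaration; the sketch only shows the shape of the result a caller works with.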

Agent Design and Workflow with Effect

AI agents at 14.ai are essentially planners that take user input, devise a plan, choose appropriate actions, workflows, or sub-agents, execute them, and repeat until a task is complete [00:02:28].

Actions: Small, focused units of execution, such as fetching payment information or searching logs, comparable to tool calls [00:02:40].
Workflows: Deterministic multi-step processes, like cancelling a subscription, which might involve collecting reasons, offering retention options, checking eligibility, and then performing the cancellation [00:02:51].
Sub-agents: Group related actions and workflows into larger domain-specific modules, such as a billing agent or a log retrieval agent [00:03:03].
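The plan, act, repeat loop described above can be sketched roughly as follows. This is an assumption-laden toy: the planner is a deterministic stub standing in for an LLM, and `Step`, `AgentState`, `agentActions`, `planNext`, and `runAgent` are hypothetical names, not 14.ai's implementation.

```typescript
// Sketch of a plan-act-repeat agent loop. In a real system an LLM would
// choose the next step; here the planner is a stub. All names hypothetical.

type Step =
  | { _tag: "RunAction"; action: string }
  | { _tag: "Done"; answer: string };

interface AgentState {
  userInput: string;
  observations: string[];
}

// Actions are small, focused units of execution, comparable to tool calls.
const agentActions: { [actionName: string]: (state: AgentState) => string } = {
  fetchPaymentInfo: () => "last payment: failed",
  searchLogs: () => "log: card declined",
};

// Stub planner: deterministic so the example is reproducible.
function planNext(state: AgentState): Step {
  if (state.observations.length === 0) {
    return { _tag: "RunAction", action: "fetchPaymentInfo" };
  }
  if (state.observations.length === 1) {
    return { _tag: "RunAction", action: "searchLogs" };
  }
  return { _tag: "Done", answer: state.observations.join("; ") };
}

// Plan, execute, record the observation, and repeat until the planner
// reports completion or the step budget runs out.
function runAgent(userInput: string, maxSteps: number): string {
  const state: AgentState = { userInput, observations: [] };
  for (let i = 0; i < maxSteps; i++) {
    const step = planNext(state);
    if (step._tag === "Done") return step.answer;
    const action = agentActions[step.action];
    if (action === undefined) return "unknown action: " + step.action;
    state.observations.push(action(state));
  }
  return "step budget exhausted";
}
```

The explicit step budget is a design choice of this sketch: it keeps a non-deterministic planner from looping forever.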

To model this complexity, 14.ai built a domain-specific language (DSL) for workflows on top of Effect’s pipe-based functional style [00:03:13]. The DSL lets branching, sequencing, retries, state transitions, and memory be expressed clearly and composably [00:03:22].
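The source does not show the DSL itself, so the following is only a guess at its flavor: a few combinators in plain TypeScript that sequence and branch over workflow state, applied to the subscription-cancellation example. It uses neither Effect's `pipe` nor 14.ai's actual DSL, and every name (`CancelState`, `WorkflowStep`, `sequence`, `branch`, and the individual steps) is hypothetical.

```typescript
// Sketch of a combinator-style workflow DSL (not Effect's `pipe`, and not
// 14.ai's actual DSL). All names are hypothetical.

interface CancelState {
  reason: string | null;
  eligible: boolean;
  retained: boolean;
  log: string[];
}

type WorkflowStep = (state: CancelState) => CancelState;

// Sequencing: run steps left to right, threading state through each one.
function sequence(...steps: WorkflowStep[]): WorkflowStep {
  return (initial) =>
    steps.reduce<CancelState>((state, step) => step(state), initial);
}

// Branching: choose the next step based on the current state.
function branch(
  condition: (state: CancelState) => boolean,
  onTrue: WorkflowStep,
  onFalse: WorkflowStep
): WorkflowStep {
  return (state) => (condition(state) ? onTrue(state) : onFalse(state));
}

// Individual steps return a new state instead of mutating the old one.
const collectReason: WorkflowStep = (state) => ({
  ...state,
  reason: "too expensive",
  log: state.log.concat("collected reason"),
});
const offerRetention: WorkflowStep = (state) => ({
  ...state,
  retained: false, // stub: the customer declines the offer
  log: state.log.concat("offered retention"),
});
const checkEligibility: WorkflowStep = (state) => ({
  ...state,
  eligible: true,
  log: state.log.concat("checked eligibility"),
});
const performCancellation: WorkflowStep = (state) => ({
  ...state,
  log: state.log.concat("subscription cancelled"),
});
const rejectCancellation: WorkflowStep = (state) => ({
  ...state,
  log: state.log.concat("cancellation rejected"),
});

const cancelSubscription = sequence(
  collectReason,
  offerRetention,
  checkEligibility,
  branch((s) => s.eligible && !s.retained, performCancellation, rejectCancellation)
);
```

Because every step has the same type, workflows compose: a whole `sequence` can itself be used as one step inside a larger workflow.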

Reliability Mechanisms

For mission-critical systems, reliability is paramount [00:03:32]:

LLM Provider Fallbacks: If one LLM provider fails, the system automatically falls back to another with similar performance characteristics (e.g., from GPT-4o mini to Gemini 2.0 Flash for tool calling) [00:03:36].
Retry Policies: Modeled with Effect, these policies track state to avoid retrying failed providers [00:03:47].
Token Stream Duplication: For streamed answers to end-users, token streams are duplicated, sending one directly to the user and another for internal storage (e.g., analytics) [00:03:55]. Effect facilitates this process [00:04:05].
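The first two mechanisms, provider fallback and a retry policy that remembers which providers have failed, can be sketched as below. This is plain synchronous TypeScript for brevity, not Effect's retry or scheduling API; `LlmProvider`, `createFallbackClient`, and the provider ids used with it are hypothetical.

```typescript
// Sketch of provider fallback with a stateful retry policy. Plain
// TypeScript, not Effect's API; all names are hypothetical.

interface LlmProvider {
  id: string;
  complete: (prompt: string) => string; // throws on failure
}

// Tries providers in order, retries each a bounded number of times, and
// remembers failed providers so later calls skip them entirely.
function createFallbackClient(
  providers: LlmProvider[],
  maxAttemptsPerProvider: number
) {
  const failedIds: string[] = [];
  return {
    failedIds,
    complete(prompt: string): string {
      const errors: string[] = [];
      for (const provider of providers) {
        if (failedIds.indexOf(provider.id) !== -1) continue;
        for (let attempt = 1; attempt <= maxAttemptsPerProvider; attempt++) {
          try {
            return provider.complete(prompt);
          } catch (error) {
            // Keep the failure visible instead of discarding it.
            errors.push(provider.id + " attempt " + attempt + ": " + String(error));
          }
        }
        failedIds.push(provider.id);
      }
      throw new Error("all providers failed: " + errors.join(" | "));
    },
  };
}
```

A production version would also need a way to clear the failed list after a cool-down period; that is left out here to keep the sketch short.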

Testing and Developer Experience

Effect significantly enhances testing and developer experience:

Dependency Injection (DI): Heavily used for mocking LLM providers and simulating failure scenarios [00:04:09]. This approach allows swapping service providers with mock versions without affecting system internals [00:04:14].
Schema-Centric Design: Input, output, and error types are defined upfront with powerful encoding/decoding built-in [00:04:32]. This provides strong type safety and automatically documented schemas [00:04:40].
Modular and Composable Services: Services are provided at the entry point of systems, allowing for easy composition, overriding behavior, or swapping implementations [00:04:50]. Dependencies are guaranteed at compile time via the type level [00:04:59].
Strong Guard Rails: Effect helps prevent common mistakes, allowing engineers new to TypeScript to become productive quickly [00:05:14]. Once past the initial learning curve, it becomes harder to fall into bad patterns [00:05:23].
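The dependency-injection point can be illustrated with a small sketch: application logic receives an LLM service rather than constructing one, so a test can pass in a canned or failing implementation. This uses a plain object of services, not Effect's service and layer system, and `LlmService`, `Services`, `answerCustomer`, `cannedLlm`, and `failingLlm` are hypothetical names.

```typescript
// Sketch of dependency injection for testing, in plain TypeScript rather
// than Effect's service and layer system. All names are hypothetical.

interface LlmService {
  complete: (prompt: string) => string; // throws on provider failure
}

interface Services {
  llm: LlmService;
}

// Application logic declares what it needs and never constructs a provider.
// Omitting `llm` from the services object is a compile-time error.
function answerCustomer(services: Services, question: string): string {
  try {
    return services.llm.complete("Answer briefly: " + question);
  } catch {
    // Deliberate, visible degradation path for the end user.
    return "Sorry, something went wrong. A human agent will follow up.";
  }
}

// Test doubles: one canned response, one simulated outage.
const cannedLlm: LlmService = {
  complete: () => "You can reset it from Settings.",
};
const failingLlm: LlmService = {
  complete: () => {
    throw new Error("simulated provider outage");
  },
};
```

In a test, `answerCustomer({ llm: failingLlm }, "...")` exercises the failure path without touching a real provider, which is the property the talk attributes to Effect's DI.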

Lessons Learned

Effect is powerful, but using it well requires discipline [00:05:31].

Error Handling: It is easy to catch errors too broadly upstream, where the handler is out of sight, and silently lose important failures [00:05:46].
Dependency Injection at Scale: Dependency injection is great in principle, but tracing where a service is provided, especially across multiple layers or subsystems, can become hard to follow [00:05:57].
Learning Curve: Effect has a significant ecosystem of concepts and tools that can be overwhelming initially [00:06:10]. However, once the initial hurdle is overcome, the benefits compound [00:06:17].
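The error-handling pitfall from the first lesson can be shown with a small contrast. The sketch uses plain TypeScript tagged results rather than Effect's API, and `FetchError`, `Outcome`, `loadPlan`, `planOrFree`, and `planOrFreeStrict` are hypothetical names.

```typescript
// Sketch of the "silently swallowed error" pitfall, using plain TypeScript
// tagged results instead of Effect's API. All names are hypothetical.

type FetchError =
  | { _tag: "NotFound" }
  | { _tag: "DatabaseDown"; detail: string };

type Outcome<A> =
  | { _tag: "Ok"; value: A }
  | { _tag: "Err"; error: FetchError };

// Stub lookup with two distinct failure modes.
function loadPlan(customerId: string): Outcome<string> {
  if (customerId === "missing") {
    return { _tag: "Err", error: { _tag: "NotFound" } };
  }
  if (customerId === "outage") {
    return {
      _tag: "Err",
      error: { _tag: "DatabaseDown", detail: "connection refused" },
    };
  }
  return { _tag: "Ok", value: "pro" };
}

// Risky: a catch-all default turns every failure, including an outage,
// into a plausible-looking value.
function planOrFree(customerId: string): string {
  const outcome = loadPlan(customerId);
  return outcome._tag === "Ok" ? outcome.value : "free";
}

// Safer: recover only from the failure you expect and let the rest surface.
function planOrFreeStrict(customerId: string): Outcome<string> {
  const outcome = loadPlan(customerId);
  if (outcome._tag === "Err" && outcome.error._tag === "NotFound") {
    return { _tag: "Ok", value: "free" };
  }
  return outcome;
}
```

With the catch-all version, a database outage downgrades a paying customer to "free" without any signal; the strict version keeps the outage visible to whoever calls it.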

Ultimately, Effect helps build scalable AI systems that are predictable and resilient, but it is not magic and still requires careful thought [00:06:27].

Conclusion

Effect allows for incremental adoption, starting with a single service or endpoint [00:06:37]. It is particularly useful for LLM and AI-based systems where reliability and coping with non-determinism are crucial [00:06:47]. Effect provides tools for predictable and observable systems [00:06:53].

It brings the rigor of functional programming into real-world TypeScript in a practical way for production use [00:07:01]. One does not need to be a functional programming purist to gain significant value; starting small allows benefits to build up over time [00:07:07].
