Building AI sandboxes

From: aidotengineer

AI sandboxes are emerging as a critical component for the advancement of AI agents, providing a secure and capable environment for code execution and computer use [00:00:09]. Abhishek, the solo founder and developer of Arachis, an open-source code execution and computer use sandboxing service for AI agents, explains why such sandboxes are needed and how they are built [00:00:04] [00:00:12].

Why AI Sandboxes Are Needed

AI models, such as GPT-4, leverage tool calling (like search or code execution) during inference to generate smarter replies [00:00:59]. These tool calls necessitate secure execution environments [00:01:07]. Beyond inference, AI sandboxes are essential for:

Reinforcement Learning (RL): Running reward functions at scale during the training phase [00:01:12].
Enhanced Agent Capabilities: A full Linux sandbox allows agents to perform advanced tasks. For instance, during code generation, agents can debug entire applications using Linux commands like ps or lsof to monitor execution and fix issues [00:01:21]. This enables agents to backtrack, replan, and work towards a goal effectively [00:01:32].
Security: Agent-generated code is akin to running untrusted code from sources like GitHub or Stack Overflow [00:01:40]. Such code can be buggy or malicious, potentially gaining root access and compromising user or client data [00:01:47]. Sandboxes provide the necessary isolation and lockdown [00:01:53].

Examples like Manus AI demonstrate how a sandbox enables complex tasks without extensive prompting or frameworks, leveraging the Linux knowledge the agent acquired during pre-training [00:02:05] [00:02:20].

Introducing Arachis

Arachis is an open-source solution designed to spawn and manage AI sandboxes for code execution and computer use [00:02:42]. It offers:

Security: Built with MicroVMs [00:03:17].
Customization: Fully customizable [00:02:45].
Self-hosting: Can be self-hosted [00:02:45].
Backtracking: Supports snapshot and restore for agents to backtrack [00:02:53].
Speed: Boots in less than 7 seconds, with ongoing efforts to reduce this to under 1 second [00:03:54] [00:04:02].
Port Forwarding: Handles port forwarding for easy access to code execution or browser use via public URLs [00:04:14].
Computer Use: Pre-installed Chrome and VNC server for GUI access [00:04:31].
API: A deliberately simple API with Python and Go clients and an OpenAPI-compatible specification [00:05:07].
Configurability: Leverages Docker tooling for customizing binaries and packages within the sandbox [00:05:23].

Core Components of an AI Sandbox (as exemplified by Arachis)

1. MicroVM-Based Secure Code Execution

Security is paramount for AI sandboxes [00:03:20]. Arachis uses MicroVMs as its runtime environment [00:03:25].

Understanding Linux Sandboxing Options:

Linux Execution Model: Threads are the smallest unit of execution, and a process is a logical grouping of one or more threads that share resources. The kernel provides privileged access to hardware via system calls [00:08:16].
Containers: Package an app’s dependencies with its logic, enabling arbitrary user code execution [00:10:15]. On Linux, containers are collections of namespaces (e.g., process, mount, net) that abstract resources, giving the container a bounded view of the system [00:10:32]. Cgroups control resource access (CPU, memory) [00:11:41].
- Container Security Flaw: Containers run as native processes on the kernel [00:12:20]. A kernel vulnerability can allow malicious processes to gain root access, compromising the host [00:12:33].
- Mitigation (Jailing): Reducing the attack surface by restricting Linux capabilities (caps) and system calls using techniques like seccomp filters. Libraries like minijail assist in this [00:13:26]. However, jailing has limits and can still be bypassed [00:14:31].
Virtualization (VMs): Provide stronger isolation by running a guest user space and guest kernel separate from the host kernel [00:14:49]. This significantly reduces the attack surface compared to containers [00:15:10].
- Linux Virtualization: A Virtual Machine Monitor (VMM) process (e.g., QEMU, crosvm, Firecracker) interacts with /dev/kvm, a Linux kernel device exposing the processor’s virtualization stack [00:15:47]. When a VM needs host resources (disk, net), it performs a “VM exit” to the VMM, which then communicates with the host kernel and returns the response with a “VM resume” [00:16:56]. Frequent VM exits can impact performance [00:17:28].
MicroVMs: A lighter-weight, security-first approach to VMs [00:20:29].
- Security: Often written in memory-safe languages like Rust (e.g., crosvm) to prevent memory-related bugs in emulated devices [00:18:40]. They also jail emulated devices separately, limiting the scope of compromise [00:19:11].
- “Micro”: Refers to the smaller VMM process [00:20:26]. They support fewer architectures and only major emulated devices, leading to less code, faster boot times, and lower memory consumption [00:20:03].
- Arachis’s Choice: Arachis opts for MicroVMs (specifically Cloud Hypervisor) due to their enhanced security, fast boot times, and support for features like snapshotting, hot-plugging devices, and GPU support [00:20:54] [00:22:42]. This choice addresses the need for multi-tenant untrusted code execution without compromising data [00:21:11].

2. Storage and File System

Sandboxes need to create, read, and write files, but the root file system (root FS) must be protected from deletion or corruption by buggy or malicious code [00:24:37].

OverlayFS: Arachis uses a shared, read-only base layer (root FS) across sandboxes [00:25:03]. On top of this, each sandbox gets its own read-write layer where new files are created [00:25:14].
Snapshotting: When a sandbox is snapshotted, only the read-write layer is persisted, optimizing storage and sharing [00:25:31]. The sandbox itself sees a regular Linux file system, with the OverlayFS magic handled underneath [00:25:54].
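
The layering described above can be illustrated with a small in-memory model. This is only a sketch of OverlayFS-style semantics (shared read-only base, private read-write layer, deletions recorded as whiteouts), not Arachis code; the class name `OverlaySandboxFS` and its methods are hypothetical.

```python
class OverlaySandboxFS:
    """Toy model of overlay semantics: shared read-only base + private upper layer."""

    def __init__(self, base):
        self._base = base        # shared lower layer; this class never mutates it
        self._upper = {}         # per-sandbox read-write layer
        self._whiteouts = set()  # base paths deleted inside this sandbox only

    def read(self, path):
        # The upper layer shadows the base; a whiteout hides a base file.
        if path in self._whiteouts:
            raise FileNotFoundError(path)
        if path in self._upper:
            return self._upper[path]
        if path in self._base:
            return self._base[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        # All writes land in the private layer, never in the shared base.
        self._whiteouts.discard(path)
        self._upper[path] = data

    def delete(self, path):
        self.read(path)  # raises FileNotFoundError if the path is not visible
        self._upper.pop(path, None)
        if path in self._base:
            self._whiteouts.add(path)

    def snapshot(self):
        # Only sandbox-specific state is persisted; the base is shared, not copied.
        return {"upper": dict(self._upper), "whiteouts": set(self._whiteouts)}


base = {"/bin/sh": b"shell", "/etc/hostname": b"sandbox"}
a = OverlaySandboxFS(base)
b = OverlaySandboxFS(base)
a.write("/etc/hostname", b"agent-a")
a.delete("/bin/sh")
```

In this model, sandbox `a` can overwrite or delete anything it sees while sandbox `b` and the base stay untouched, which is the property that protects the root FS.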

3. Networking

Each sandbox requires networking for external actions or tool calls [00:27:00].

Isolated Networking: Every Arachis sandbox runs in a virtual machine with its own isolated networking setup [00:27:12].
Tap Device: Each sandbox receives a unique virtual networking interface called a “tap device” [00:27:21].
Linux Bridge: All tap devices connect to a Linux bridge on the host server [00:27:34].
Port Forwarding: Arachis handles port forwarding from the host to the sandbox’s code server or VNC server, eliminating the need for manual iptables or firewall configuration [00:27:44].
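
To make the tap, bridge, and port-forwarding pieces concrete, the sketch below builds the kind of standard `ip` and `iptables` commands that such a setup typically involves. It only constructs and prints the commands; it does not run them. The device names, addresses, and ports are hypothetical, and the talk does not say which exact commands or mechanism Arachis uses internally.

```python
import ipaddress
import shlex


def tap_setup_commands(tap, bridge):
    """Commands to create a tap device and attach it to an existing Linux bridge."""
    return [
        ["ip", "tuntap", "add", "dev", tap, "mode", "tap"],
        ["ip", "link", "set", "dev", tap, "master", bridge],
        ["ip", "link", "set", "dev", tap, "up"],
    ]


def port_forward_command(host_port, guest_ip, guest_port):
    """A DNAT rule forwarding a host TCP port to a service inside the guest."""
    ipaddress.ip_address(guest_ip)  # raises ValueError on a malformed address
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-p", "tcp", "--dport", str(host_port),
        "-j", "DNAT", "--to-destination", f"{guest_ip}:{guest_port}",
    ]


# Hypothetical values for one sandbox exposing a VNC server on guest port 5901.
commands = tap_setup_commands("tap0", "br0")
commands.append(port_forward_command(3000, "10.20.1.2", 5901))
for cmd in commands:
    print(shlex.join(cmd))
```

Running these by hand for every sandbox is the manual work that the port-forwarding feature is described as removing.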

4. Customization with Docker Tooling

Arachis allows full control over the sandbox environment.

Dockerfiles: Users can keep their existing Docker commands and modify a Dockerfile to customize which binaries and packages are installed in the sandbox [00:29:29].
Pre-installed Tools: Default sandboxes include standard packages, Chrome (booted via systemd), NodeJS, npm, and Python to make agents productive out-of-the-box [00:29:50].

5. Code Execution Server

AI sandboxes bundle a code execution server [00:31:20].

Files API: Allows uploading and downloading files to and from the sandbox [00:31:27].
Command API: Executes commands within the sandbox and returns output or errors in JSON format [00:31:39].
Security: Running this server within a guest VM provides confidence against escapes, unlike running it directly on the host [00:31:52].
Browser Access: Chrome is pre-installed, and port forwarding to the VNC server allows direct GUI access [00:32:06].
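
A command endpoint of this kind can be approximated in a few lines. The following is a minimal sketch of the idea only (run a command, return output or errors as JSON); the function name `run_command` and the JSON field names are assumptions, not the actual Arachis code server or its response schema.

```python
import json
import subprocess
import sys


def run_command(argv, timeout=30.0):
    """Run a command and return a JSON string with its output, errors, and exit code."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return json.dumps({"output": "", "error": f"timed out after {timeout}s", "exit_code": None})
    except OSError as exc:
        # Raised, for example, when the binary does not exist.
        return json.dumps({"output": "", "error": str(exc), "exit_code": None})
    return json.dumps({"output": proc.stdout, "error": proc.stderr, "exit_code": proc.returncode})


# Use the current Python interpreter so the example does not depend on a shell.
reply = json.loads(run_command([sys.executable, "-c", "print('hello from the sandbox')"]))
print(reply["output"].strip(), reply["exit_code"])
```

A server like this executes whatever it is handed, which is why the isolation boundary has to come from the guest VM around it rather than from the server itself.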

6. Snapshotting and Restore

Snapshotting is a crucial feature for agent reliability, especially for multi-step workflows [00:32:26].

Motivation: Agents often fail during complex, multi-stage tasks [00:32:39]. Snapshotting allows agents to backtrack to a last good checkpoint, replan, and retry without starting from scratch [00:33:01]. This enables more reliable execution of higher-order complex tasks and parallel exploration of paths [00:33:23].
Functionality: Arachis saves the entire running state of a sandbox, including guest memory and the read-write part of the OverlayFS, so created files, spawned processes, and open windows are all preserved [00:33:30].
Process: Snapshotting involves four steps [00:34:39]:
1. Pause the VM via the VMM’s pause API [00:34:44].
2. Call the snapshot API to dump guest memory [00:34:50].
3. Manually persist the read-write OverlayFS layer (thin disk) [00:34:57].
4. Resume the VM via the VMM to continue execution [00:35:08].
Future Work: Plans to move to btrfs for native support of incremental snapshots [00:39:27].
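
The four steps can be sketched against an in-memory stand-in for a VM. `FakeVM`, `take_snapshot`, and `restore_snapshot` are hypothetical names used for illustration; a real implementation would call the VMM's pause, snapshot, and resume APIs and copy the thin disk on the host. Resuming in a `finally` block is a design choice of this sketch, so that a failed snapshot does not leave the sandbox frozen.

```python
import copy


class FakeVM:
    """In-memory stand-in for a VMM-managed sandbox; not a real VMM client."""

    def __init__(self):
        self.paused = False
        self.memory = {}    # stands in for guest memory
        self.rw_layer = {}  # stands in for the read-write overlay layer

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def dump_memory(self):
        if not self.paused:
            raise RuntimeError("VM must be paused before dumping memory")
        return copy.deepcopy(self.memory)


def take_snapshot(vm):
    vm.pause()                                 # step 1: pause the VM
    try:
        memory = vm.dump_memory()              # step 2: dump guest memory
        rw_layer = copy.deepcopy(vm.rw_layer)  # step 3: persist the read-write layer
    finally:
        vm.resume()                            # step 4: resume, even if a step failed
    return {"memory": memory, "rw_layer": rw_layer}


def restore_snapshot(snapshot):
    vm = FakeVM()
    vm.memory = copy.deepcopy(snapshot["memory"])
    vm.rw_layer = copy.deepcopy(snapshot["rw_layer"])
    return vm


vm = FakeVM()
vm.memory["counter"] = 1
vm.rw_layer["/app/index.html"] = "v1"
checkpoint = take_snapshot(vm)
vm.rw_layer["/app/index.html"] = "v2 with dark mode"
restored = restore_snapshot(checkpoint)
```

Here `restored` sees the `v1` file while the original `vm` keeps running with its later change, which mirrors the backtracking behavior the talk describes.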

High-Level Architecture of Arachis

The Arachis architecture features a REST server that spawns and manages MicroVM sandboxes [00:05:40]. Each sandbox runs a VNC server and a code server, with port forwarding handled to expose these services [00:05:47]. Clients interact via a Go CLI (Arachis CLI), a Python SDK, or an MCP server, and an OpenAPI-compatible YAML file allows clients to be generated in any language [00:05:58] [00:07:36]. The system is tied to Linux because it relies on /dev/kvm, the Linux virtualization device [00:06:19].
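
Because of the KVM dependency, a host can be checked before attempting to self-host. The snippet below is a small sketch of such a pre-flight check; the function name `kvm_available` is hypothetical and is not part of Arachis.

```python
import os
import platform


def kvm_available(device="/dev/kvm"):
    """Return True if this is a Linux host where the KVM device is readable and writable."""
    if platform.system() != "Linux":
        return False
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)


if kvm_available():
    print("KVM is usable; MicroVM-based sandboxes can run on this host.")
else:
    print("KVM is not usable here (non-Linux host, missing device, or insufficient permissions).")
```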

Using the Arachis API

Arachis provides a simple API [00:06:36]:

VM Management: A /vms resource to start, stop, or delete VMs [00:06:41].
Snapshots: A /snapshots resource under each VM for taking snapshots [00:06:49].
Command Execution: A /command resource for executing commands [00:06:56].
File Management: A /files API to upload and download files [00:07:00].
Health Check: A /health endpoint for monitoring the REST server [00:07:05].

The Python SDK simplifies usage, allowing users to start sandboxes, run commands, create snapshots, and restore checkpoints with straightforward API calls [00:35:54].
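
For readers who prefer raw HTTP over the SDK, the sketch below shows how requests against the resources listed above might be assembled with the standard library. The base URL, the VM name, the nesting of `/snapshots` under a VM, and the JSON payload are all assumptions made for illustration; the OpenAPI file shipped with the project is the authoritative source for routes and schemas.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://127.0.0.1:7000"  # hypothetical address of the REST server


def endpoint(base_url, *segments):
    """Join path segments onto the base URL, percent-encoding each segment."""
    return base_url.rstrip("/") + "/" + "/".join(quote(s, safe="") for s in segments)


def build_request(method, url, payload=None):
    """Build (but do not send) an HTTP request with an optional JSON body."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(url, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


health = build_request("GET", endpoint(BASE_URL, "health"))
snap = build_request(
    "POST",
    endpoint(BASE_URL, "vms", "demo-vm", "snapshots"),  # assumed route shape
    {"name": "before-dark-mode"},                        # assumed payload
)

# Sending is left commented out because it needs a running server:
# with urllib.request.urlopen(health, timeout=5) as resp:
#     print(resp.status)
```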

Demo: Google Docs Clone

A demonstration showcased Claude Desktop creating a Google Docs clone using Arachis via its MCP server [00:37:04].

Claude piped commands into the Linux sandbox to build the collaborative app [00:37:41].
A snapshot of the initial version was taken [00:37:51].
A dark mode feature was added, demonstrating the agent’s ability to modify the application [00:37:58].
The sandbox was then restored to the previous snapshot without the dark mode, illustrating the backtracking capability [00:38:20]. This demo highlighted the ability to build end-to-end applications and achieve truly collaborative experiences, unlike client-side code generation tools [00:38:42].

Ongoing Work

Arachis development focuses on:

Boot Time Reduction: Aiming for boot times under 1 second [00:39:14].
Improved Snapshotting: Moving to btrfs for better support for incremental snapshots and persistence [00:39:24].
Dynamic Resource Management: Implementing dynamic memory management, ballooning, or hot-plugging/removal of memory at runtime to pack more sandboxes onto a single server [00:39:35].
