System Architecture

Tidedge Serverless is designed as a distributed system with clear separation between global coordination, datacenter-local control, and worker nodes. This architecture enables scaling from a single laptop to thousands of nodes across multiple regions.

System Roles

Global Control Plane

Stores global app metadata only: repo, tag, build info, config, environment, and domain mappings. Never stores VM runtime state.

DC Control Plane

3-5 backplane nodes running scheduler, autoscaler, and network controller. Stores all VM metadata for that datacenter only.

Edgelets (Workers)

Up to 4,000 per DC. Pull app layers, start/stop VMs, track lifecycle, allocate IPs, and report state via the backplane.

Load Balancers

Receive requests, resolve app to service, look up endpoints in the DC backplane, trigger cold-start if needed, and route to VMs in the same DC. Workers and LBs only see their own DC.

+-------------+    scale.up     +----------+    start VM     +-------------+
|   CLI/LB    |---------------->| Backplane|<--------------->|   Edgelet   |
|             |                 |          |                 |             |
| sf instance |    vm.ready     |          |   vm.starting   |   FC/QEMU   |
|    start    |<----------------|          |---------------->|     VMs     |
+-------------+                 +----------+                 +-------------+
       |                                                            |
       |                       console.stdout                       |
       +------------------------------------------------------------+
                               (if --output)

Request Flow

Here's what happens when a request arrives for a service that isn't running yet (cold start scenario):

  1. Request arrives at the Load Balancer
     The LB receives an HTTP request for api.prod.localhost.openiap.io.

  2. LB checks the DC backplane for endpoints
     No endpoints found → cold-start needed. The LB queries the Global Control Plane to verify that the service exists.

  3. LB publishes a cold-start event
     It publishes to dc.{id}.service.request with service and version info.

  4. Scheduler creates an assignment
     The scheduler picks a worker node with capacity, validates that the variant exists, and creates an assignment in the DC backplane.

  5. Worker boots a VM
     The worker sees the new assignment, allocates a VM ID, pulls OCI layers from CAS, and boots the VM with Firecracker or QEMU.

  6. VM reports ready
     The guest proxy signals readiness via the backplane, and the worker registers the endpoint in the Dynamic Backplane.

  7. LB routes the request
     The LB watches the endpoints bucket, sees the new endpoint appear, updates its routing table, and forwards the request.
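
A minimal sketch of the LB-side logic for steps 1-3, in Go. The Backplane interface and its method names below are illustrative assumptions for this example, not the platform's actual API:

    // Sketch of the load balancer's cold-start decision (steps 1-3 above).
    // The Backplane interface and its method names are assumptions for this
    // example, not the platform's real API.
    package lb

    import (
        "errors"
        "fmt"
    )

    type Backplane interface {
        LookupEndpoints(dc, service string) ([]string, error) // DC backplane
        ServiceExists(service, version string) (bool, error)  // Global Control Plane
        Publish(subject string, payload []byte) error         // event bus
    }

    func routeOrColdStart(bp Backplane, dc, service, version string) (string, error) {
        // Step 2: check the DC backplane for live endpoints.
        endpoints, err := bp.LookupEndpoints(dc, service)
        if err != nil {
            return "", err
        }
        if len(endpoints) > 0 {
            return endpoints[0], nil // warm path: route immediately
        }

        // No endpoints: verify the service exists globally before cold-starting.
        ok, err := bp.ServiceExists(service, version)
        if err != nil {
            return "", err
        }
        if !ok {
            return "", errors.New("unknown service: " + service)
        }

        // Step 3: publish the cold-start event on the DC subject.
        subject := fmt.Sprintf("dc.%s.service.request", dc)
        payload := []byte(fmt.Sprintf(`{"service":%q,"version":%q}`, service, version))
        if err := bp.Publish(subject, payload); err != nil {
            return "", err
        }
        return "", errors.New("cold start triggered; endpoint not ready yet")
    }

The warm path returns an endpoint immediately; on the cold path the real LB then waits for the new endpoint to appear in the Dynamic Backplane before forwarding the request (step 7).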

VM Lifecycle

VMs go through several phases from cold start to handling requests. The overlay filesystem ensures immutable, reproducible execution.

Boot Pipeline

  1. Layer preparation: Kernel and rootfs prepared as CAS objects
  2. Page mapping: Pages mapped to avoid unnecessary disk seeks
  3. Hypervisor invocation: Minimal configuration for fast boot
  4. Init startup: Jumps directly to application entry point
  5. Socket prebind: For most apps, socket is already open (skips binding)
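
To make step 3 concrete, here is a rough sketch of booting a VM from a single Firecracker config file. The kernel and rootfs paths, boot args, and sizing are placeholders, not Tidedge's actual values:

    // Illustrative only: boots a VM with a minimal Firecracker config file.
    // Kernel/rootfs paths, boot args, and sizing are placeholders; this is
    // not the platform's actual boot code.
    package main

    import (
        "os"
        "os/exec"
    )

    // Minimal Firecracker config: one kernel, one read-only root drive, and a
    // small machine. This matches Firecracker's --config-file JSON format.
    const fcConfig = `{
      "boot-source": {
        "kernel_image_path": "/var/lib/tidedge/cas/kernel.bin",
        "boot_args": "console=ttyS0 reboot=k panic=1"
      },
      "drives": [
        {
          "drive_id": "rootfs",
          "path_on_host": "/var/lib/tidedge/cas/rootfs.ext4",
          "is_root_device": true,
          "is_read_only": true
        }
      ],
      "machine-config": { "vcpu_count": 1, "mem_size_mib": 128 }
    }`

    func main() {
        if err := os.WriteFile("/tmp/vm42.json", []byte(fcConfig), 0o600); err != nil {
            panic(err)
        }
        // --no-api skips the REST socket; the VM boots straight from the file.
        cmd := exec.Command("firecracker", "--no-api", "--config-file", "/tmp/vm42.json")
        cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
        if err := cmd.Run(); err != nil {
            panic(err)
        }
    }

Keeping the whole definition in one file avoids a round of API calls before boot, which is the "minimal configuration for fast boot" idea in step 3.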

Overlay Filesystem

+---------------------------------------------+
| Ephemeral Overlay (tmpfs)                   |  <- Changes lost on restart
+---------------------------------------------+
| App Layer (nodetest:0.0.5)                  |
+---------------------------------------------+
| Framework Layer (node22:latest)             |
+---------------------------------------------+
| Distro Layer (alpine:latest)                |
+---------------------------------------------+
| Kernel + Initramfs                          |
+---------------------------------------------+
        |
        +---> /data (persistent volume)   <- Only explicit volumes persist

The base layers are mounted read-only. A tmpfs scratch layer captures any writes. Restart the VM and you get the exact same state - only mounted volumes persist.
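
A minimal sketch of assembling such a stack with a standard Linux overlayfs mount; the layer and mount paths are hypothetical, and the real worker/guest-proxy logic is not shown:

    // Illustrative overlayfs assembly: read-only image layers below, a tmpfs
    // scratch layer for writes on top. All paths are placeholders.
    package main

    import (
        "os"
        "syscall"
    )

    func mountOverlay() error {
        for _, d := range []string{"/run/scratch", "/rootfs"} {
            if err := os.MkdirAll(d, 0o755); err != nil {
                return err
            }
        }
        // Scratch space for writes: lost when the VM restarts.
        if err := syscall.Mount("tmpfs", "/run/scratch", "tmpfs", 0, "size=256m"); err != nil {
            return err
        }
        for _, d := range []string{"/run/scratch/upper", "/run/scratch/work"} {
            if err := os.MkdirAll(d, 0o755); err != nil {
                return err
            }
        }
        // Lower layers (app, framework, distro) stay read-only; writes land in tmpfs.
        opts := "lowerdir=/layers/app:/layers/node22:/layers/alpine," +
            "upperdir=/run/scratch/upper,workdir=/run/scratch/work"
        return syscall.Mount("overlay", "/rootfs", "overlay", 0, opts)
    }

    func main() {
        if err := mountOverlay(); err != nil {
            panic(err)
        }
    }

Discarding the tmpfs upper layer on restart is what gives the "exact same state" guarantee; anything that must survive has to live on an explicitly mounted volume such as /data.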

Guest Proxy

The guest proxy runs inside every VM as the orchestrator for mounts, command execution, and status reporting. It's called exactly twice:

  • Pool initialization: Mounts base layers when VM first boots into warm pool
  • Application attachment: Mounts app layer when first request arrives
  • Subsequent requests: Go directly to the application (no guest proxy involvement)

Data Storage Model

All state is stored in the backplane, organized into three tiers based on scope and persistence requirements.

Tier                 Persistence   Contents
Global Backplane     Persistent    Stacks, Variants, Aliases, Stages, Variables, Cluster config
DC Backplane         Persistent    Services, Nodes, Assignments, Volume claims, Layer cache
Dynamic Backplane    Ephemeral     Instance endpoints, Operations, Resource events, Warm pools
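
As an illustration, registering an instance endpoint in the Dynamic Backplane could look like the following, assuming the backplane exposes a NATS JetStream-style key/value API; the bucket name, key, and TTL are made up for the sketch:

    // Illustrative only: writing an ephemeral endpoint into the Dynamic
    // Backplane, assuming a NATS JetStream-style key/value API. Bucket and
    // key names are invented for this sketch.
    package main

    import (
        "time"

        "github.com/nats-io/nats.go"
    )

    func main() {
        nc, err := nats.Connect(nats.DefaultURL)
        if err != nil {
            panic(err)
        }
        defer nc.Drain()

        js, err := nc.JetStream()
        if err != nil {
            panic(err)
        }

        // Ephemeral tier: a short TTL lets stale entries fall out on their own.
        kv, err := js.CreateKeyValue(&nats.KeyValueConfig{
            Bucket: "dc1-endpoints",
            TTL:    30 * time.Second,
        })
        if err != nil {
            panic(err)
        }

        // Key: service + VM ID; value: where the LB should route.
        if _, err := kv.Put("api.vm-42", []byte("172.30.0.17:8080")); err != nil {
            panic(err)
        }
    }

The TTL here is part of the sketch: it illustrates why the ephemeral tier needs no explicit cleanup, not how Tidedge actually expires entries.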

Event Messaging

The backplane is also used as an event bus for coordination:

  • Cold-start triggers
  • VM ready notifications
  • Scale up/down commands
  • Console output streaming
  • Telemetry bursts

Subject Pattern

dc.{dc-id}.{event}.{service}.{vm-id}

# Examples:
dc.dc1.service.request      # Request-triggered cold start
dc.dc1.vm.ready             # VM ready (worker -> LB)
dc.dc1.scale.up             # Scale up request
dc.dc1.console.42.stdout    # Console streaming
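
A sketch of publishing and consuming these subjects, assuming a NATS-style client; the JSON payload shape is invented for the example:

    // Illustrative pub/sub over the subjects above, assuming a NATS-style
    // client. The payload shape is a guess for this sketch, not the real schema.
    package main

    import (
        "log"
        "time"

        "github.com/nats-io/nats.go"
    )

    func main() {
        nc, err := nats.Connect(nats.DefaultURL)
        if err != nil {
            log.Fatal(err)
        }
        defer nc.Drain()

        // LB side: react to VM-ready notifications in this DC.
        if _, err := nc.Subscribe("dc.dc1.vm.ready", func(m *nats.Msg) {
            log.Printf("vm ready: %s", m.Data)
        }); err != nil {
            log.Fatal(err)
        }

        // LB side: trigger a cold start for a service with no endpoints.
        if err := nc.Publish("dc.dc1.service.request",
            []byte(`{"service":"api","version":"0.0.5"}`)); err != nil {
            log.Fatal(err)
        }
        nc.Flush()

        time.Sleep(time.Second) // give the handler a moment to run (sketch only)
    }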

Networking

The platform uses two separate networks: one for infrastructure VMs and one for application VMs.

+------------------------------------------------------------+
| Host                                                        |
|                                                             |
|  STATIC NETWORK (infra)     DYNAMIC NETWORK (apps)          |
|                                                             |
|  +---------+                +---------+  +---------+        |
|  |  stap1  |                |  tap2   |  |  tap3   | ...    |
|  +----+----+                +----+----+  +----+----+        |
|       |                          |            |             |
|  +----+-----+              +-----+------------+-----+       |
|  |sf-static |              |          br0           |       |
|  |172.31.0.1|              |       172.30.0.1       |       |
|  +----+-----+              +-----+------------------+       |
|       |                          |                          |
|       +------------+-------------+                          |
|                    |                                        |
|        iptables NAT + MASQUERADE                            |
+--------------------+----------------------------------------+
                     |
                 Internet

Hypervisor Networking

Hypervisor    Mode         VM Access
Firecracker   TAP Bridge   Direct via 172.30.0.x:PORT
QEMU          SLIRP        Port forward via localhost:HOST_PORT
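
For illustration only, the routing layer's choice of dial address under the two modes reduces to something like the sketch below; the struct fields and values are invented, not the real data model:

    // Sketch: deriving the dial address per hypervisor mode.
    // Struct fields and sample values are illustrative only.
    package main

    import "fmt"

    type Endpoint struct {
        Hypervisor string // "firecracker" or "qemu"
        GuestIP    string // TAP bridge address (Firecracker)
        GuestPort  int    // port the app listens on
        HostPort   int    // SLIRP port forward on the host (QEMU)
    }

    func dialAddr(e Endpoint) string {
        switch e.Hypervisor {
        case "firecracker":
            // Directly reachable over the br0 / 172.30.0.x network.
            return fmt.Sprintf("%s:%d", e.GuestIP, e.GuestPort)
        default:
            // QEMU SLIRP: traffic goes through a host port forward.
            return fmt.Sprintf("localhost:%d", e.HostPort)
        }
    }

    func main() {
        fmt.Println(dialAddr(Endpoint{Hypervisor: "firecracker", GuestIP: "172.30.0.17", GuestPort: 8080}))
        fmt.Println(dialAddr(Endpoint{Hypervisor: "qemu", HostPort: 42817}))
    }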

Cluster Mode (VXLAN)

In cluster mode, VXLAN overlay networking enables cross-host VM communication. Each stack gets its own isolated VNI for network segmentation:

  • VNI 1: Infrastructure VMs (static services)
  • VNI 100+: Each stack gets a unique VNI for isolation

This means services within a stack can communicate directly, but different stacks are network-isolated from each other by default.
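
A rough sketch of provisioning one stack's VNI with standard iproute2 commands; the interface naming, UDP port, and underlay device are placeholder assumptions, not the platform's actual network controller:

    // Illustrative VXLAN provisioning for one stack, shelling out to iproute2.
    // Interface names, VNI offset, and underlay device are placeholders.
    package main

    import (
        "fmt"
        "os/exec"
    )

    func run(args ...string) error {
        out, err := exec.Command(args[0], args[1:]...).CombinedOutput()
        if err != nil {
            return fmt.Errorf("%v: %s", err, out)
        }
        return nil
    }

    // provisionStackVNI creates an isolated overlay segment for one stack:
    // VNI = 100 + stack index, as described above.
    func provisionStackVNI(stackIndex int, underlayDev string) error {
        vni := 100 + stackIndex
        dev := fmt.Sprintf("vxlan%d", vni)

        // VXLAN interface on the underlay NIC, standard UDP port 4789.
        if err := run("ip", "link", "add", dev, "type", "vxlan",
            "id", fmt.Sprint(vni), "dev", underlayDev, "dstport", "4789"); err != nil {
            return err
        }
        return run("ip", "link", "set", dev, "up")
    }

    func main() {
        if err := provisionStackVNI(1, "eth0"); err != nil {
            panic(err)
        }
    }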

Local Development Mode

For local development, a single binary runs everything: embedded backplane, worker, scheduler, and load balancer. The same code paths are used, just in a single process.

Differences from Production

  • IPAM uses a single global port pool
  • No VXLAN (not needed on single machine)
  • VM IDs allocated via file-based counter
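
As a sketch of that last point, a file-based counter only needs an exclusive lock around a read-increment-write; the counter path below is a placeholder:

    // Illustrative file-based VM ID allocator for single-host mode.
    // The counter path is a placeholder; error handling is simplified.
    package main

    import (
        "fmt"
        "os"
        "strconv"
        "strings"
        "syscall"
    )

    func nextVMID(path string) (int, error) {
        f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0o644)
        if err != nil {
            return 0, err
        }
        defer f.Close()

        // Exclusive lock so concurrent starts never hand out the same ID.
        if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
            return 0, err
        }
        defer syscall.Flock(int(f.Fd()), syscall.LOCK_UN)

        buf, err := os.ReadFile(path)
        if err != nil {
            return 0, err
        }
        id, _ := strconv.Atoi(strings.TrimSpace(string(buf))) // empty file -> 0

        id++
        if err := f.Truncate(0); err != nil {
            return 0, err
        }
        if _, err := f.WriteAt([]byte(strconv.Itoa(id)), 0); err != nil {
            return 0, err
        }
        return id, nil
    }

    func main() {
        id, err := nextVMID("/tmp/tidedge-vmid")
        if err != nil {
            panic(err)
        }
        fmt.Println("allocated VM ID:", id)
    }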

What Stays the Same

  • Same layer resolution
  • Same VM boot process
  • Same overlay filesystem
  • Same guest proxy

This means you can develop and test locally with confidence that it will work the same way in production.