System Architecture

Tidedge Serverless is designed as a distributed system with clear separation between global coordination, datacenter-local control, and worker nodes. This architecture enables scaling from a single laptop to thousands of nodes across multiple regions.

System Roles

Global Control Plane

Stores global app metadata only: repo, tag, build info, config, environment, and domain mappings. Never stores VM runtime state.

DC Control Plane

3-5 backplane nodes running scheduler, autoscaler, and network controller. Stores all VM metadata for that datacenter only.

Edgelets (Workers)

Up to 4,000 per DC. Pull app layers, start/stop VMs, track lifecycle, allocate IPs, and report state via the backplane.

Load Balancers

Receive requests, resolve app to service, look up endpoints in the DC backplane, trigger cold-start if needed, and route to VMs in the same DC. Workers and LBs only see their own DC.

+-------------+    scale.up     +----------+    start VM     +-------------+
|   CLI/LB    |---------------->| Backplane|<--------------->|   Edgelet   |
|             |                 |          |                 |             |
| sf instance |    vm.ready     |          |   vm.starting   |   FC/QEMU   |
|    start    |<----------------|          |---------------->|     VMs     |
+-------------+                 +----------+                 +-------------+
       |                                                            |
       |                       console.stdout                       |
       +------------------------------------------------------------+
                               (if --output)

Request Flow

Here's what happens when a request arrives for a service that isn't running yet (cold start scenario):

  1. Request arrives at the Load Balancer
     The LB receives an HTTP request for api.prod.localhost.openiap.io.

  2. LB checks the DC backplane for endpoints
     No endpoints found → cold-start needed. The LB queries the Global Control Plane to verify that the service exists.

  3. LB publishes a cold-start event
     It publishes to dc.{id}.service.request with service and version info.

  4. Scheduler creates an assignment
     The scheduler picks a worker node with capacity, validates that the variant exists, and creates an assignment in the DC backplane.

  5. Worker boots a VM
     The worker sees the new assignment, allocates a VM ID, pulls OCI layers from CAS, and boots the VM with Firecracker or QEMU.

  6. VM reports ready
     The guest proxy signals readiness via the backplane, and the worker registers the endpoint in the Dynamic Backplane.

  7. LB routes the request
     The LB watches the endpoints bucket, sees the new endpoint appear, updates its routing table, and forwards the request.
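
A minimal sketch of the LB-side logic for steps 1-3, in Go. The Backplane interface and its method names below are illustrative assumptions for this example, not the platform's actual API:

    // Sketch of the load balancer's cold-start decision (steps 1-3 above).
    // The Backplane interface and its method names are assumptions for this
    // example, not the platform's real API.
    package lb

    import (
        "errors"
        "fmt"
    )

    type Backplane interface {
        LookupEndpoints(dc, service string) ([]string, error) // DC backplane
        ServiceExists(service, version string) (bool, error)  // Global Control Plane
        Publish(subject string, payload []byte) error         // event bus
    }

    func routeOrColdStart(bp Backplane, dc, service, version string) (string, error) {
        // Step 2: check the DC backplane for live endpoints.
        endpoints, err := bp.LookupEndpoints(dc, service)
        if err != nil {
            return "", err
        }
        if len(endpoints) > 0 {
            return endpoints[0], nil // warm path: route immediately
        }

        // No endpoints: verify the service exists globally before cold-starting.
        ok, err := bp.ServiceExists(service, version)
        if err != nil {
            return "", err
        }
        if !ok {
            return "", errors.New("unknown service: " + service)
        }

        // Step 3: publish the cold-start event on the DC subject.
        subject := fmt.Sprintf("dc.%s.service.request", dc)
        payload := []byte(fmt.Sprintf(`{"service":%q,"version":%q}`, service, version))
        if err := bp.Publish(subject, payload); err != nil {
            return "", err
        }
        return "", errors.New("cold start triggered; endpoint not ready yet")
    }

The warm path returns an endpoint immediately; on the cold path the real LB then waits for the new endpoint to appear in the Dynamic Backplane before forwarding the request (step 7).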

VM Lifecycle

VMs go through several phases from cold start to handling requests. The overlay filesystem ensures immutable, reproducible execution.

Boot Pipeline

  1. Layer preparation: Kernel and rootfs prepared as CAS objects
  2. Page mapping: Pages mapped to avoid unnecessary disk seeks
  3. Hypervisor invocation: Minimal configuration for fast boot
  4. Init startup: Jumps directly to application entry point
  5. Socket prebind: For most apps, socket is already open (skips binding)
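
To make step 3 concrete, here is a rough sketch of booting a VM from a single Firecracker config file. The kernel and rootfs paths, boot args, and sizing are placeholders, not Tidedge's actual values:

    // Illustrative only: boots a VM with a minimal Firecracker config file.
    // Kernel/rootfs paths, boot args, and sizing are placeholders; this is
    // not the platform's actual boot code.
    package main

    import (
        "os"
        "os/exec"
    )

    // Minimal Firecracker config: one kernel, one read-only root drive, and a
    // small machine. This matches Firecracker's --config-file JSON format.
    const fcConfig = `{
      "boot-source": {
        "kernel_image_path": "/var/lib/tidedge/cas/kernel.bin",
        "boot_args": "console=ttyS0 reboot=k panic=1"
      },
      "drives": [
        {
          "drive_id": "rootfs",
          "path_on_host": "/var/lib/tidedge/cas/rootfs.ext4",
          "is_root_device": true,
          "is_read_only": true
        }
      ],
      "machine-config": { "vcpu_count": 1, "mem_size_mib": 128 }
    }`

    func main() {
        if err := os.WriteFile("/tmp/vm42.json", []byte(fcConfig), 0o600); err != nil {
            panic(err)
        }
        // --no-api skips the REST socket; the VM boots straight from the file.
        cmd := exec.Command("firecracker", "--no-api", "--config-file", "/tmp/vm42.json")
        cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
        if err := cmd.Run(); err != nil {
            panic(err)
        }
    }

Keeping the whole definition in one file avoids a round of API calls before boot, which is the "minimal configuration for fast boot" idea in step 3.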

Overlay Filesystem

+---------------------------------------------+
| Ephemeral Overlay (tmpfs)                   |  <- Changes lost on restart
+---------------------------------------------+
| App Layer (nodetest:0.0.5)                  |
+---------------------------------------------+
| Framework Layer (node22:latest)             |
+---------------------------------------------+
| Distro Layer (alpine:latest)                |
+---------------------------------------------+
| Kernel + Initramfs                          |
+---------------------------------------------+
        |
        +---> /data (persistent volume)   <- Only explicit volumes persist

The base layers are mounted read-only. A tmpfs scratch layer captures any writes. Restart the VM and you get the exact same state - only mounted volumes persist.
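
A minimal sketch of assembling such a stack with a standard Linux overlayfs mount; the layer and mount paths are hypothetical, and the real worker/guest-proxy logic is not shown:

    // Illustrative overlayfs assembly: read-only image layers below, a tmpfs
    // scratch layer for writes on top. All paths are placeholders.
    package main

    import (
        "os"
        "syscall"
    )

    func mountOverlay() error {
        for _, d := range []string{"/run/scratch", "/rootfs"} {
            if err := os.MkdirAll(d, 0o755); err != nil {
                return err
            }
        }
        // Scratch space for writes: lost when the VM restarts.
        if err := syscall.Mount("tmpfs", "/run/scratch", "tmpfs", 0, "size=256m"); err != nil {
            return err
        }
        for _, d := range []string{"/run/scratch/upper", "/run/scratch/work"} {
            if err := os.MkdirAll(d, 0o755); err != nil {
                return err
            }
        }
        // Lower layers (app, framework, distro) stay read-only; writes land in tmpfs.
        opts := "lowerdir=/layers/app:/layers/node22:/layers/alpine," +
            "upperdir=/run/scratch/upper,workdir=/run/scratch/work"
        return syscall.Mount("overlay", "/rootfs", "overlay", 0, opts)
    }

    func main() {
        if err := mountOverlay(); err != nil {
            panic(err)
        }
    }

Discarding the tmpfs upper layer on restart is what gives the "exact same state" guarantee; anything that must survive has to live on an explicitly mounted volume such as /data.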

Guest Proxy

The guest proxy runs inside every VM as the orchestrator for mounts, command execution, and status reporting. It's called exactly twice:

  • Pool initialization: Mounts base layers when VM first boots into warm pool
  • Application attachment: Mounts app layer when first request arrives
  • Subsequent requests: Go directly to the application (no guest proxy involvement)

Data Storage Model

All state is stored in the backplane, organized into three tiers based on scope and persistence requirements.

Tier                 Persistence   Contents
Global Backplane     Persistent    Stacks, Variants, Aliases, Stages, Variables, Cluster config
DC Backplane         Persistent    Services, Nodes, Assignments, Volume claims, Layer cache
Dynamic Backplane    Ephemeral     Instance endpoints, Operations, Resource events, Warm pools
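
As an illustration, registering an instance endpoint in the Dynamic Backplane could look like the following, assuming the backplane exposes a NATS JetStream-style key/value API; the bucket name, key, and TTL are made up for the sketch:

    // Illustrative only: writing an ephemeral endpoint into the Dynamic
    // Backplane, assuming a NATS JetStream-style key/value API. Bucket and
    // key names are invented for this sketch.
    package main

    import (
        "time"

        "github.com/nats-io/nats.go"
    )

    func main() {
        nc, err := nats.Connect(nats.DefaultURL)
        if err != nil {
            panic(err)
        }
        defer nc.Drain()

        js, err := nc.JetStream()
        if err != nil {
            panic(err)
        }

        // Ephemeral tier: a short TTL lets stale entries fall out on their own.
        kv, err := js.CreateKeyValue(&nats.KeyValueConfig{
            Bucket: "dc1-endpoints",
            TTL:    30 * time.Second,
        })
        if err != nil {
            panic(err)
        }

        // Key: service + VM ID; value: where the LB should route.
        if _, err := kv.Put("api.vm-42", []byte("172.30.0.17:8080")); err != nil {
            panic(err)
        }
    }

The TTL here is part of the sketch: it illustrates why the ephemeral tier needs no explicit cleanup, not how Tidedge actually expires entries.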

Event Messaging

The backplane is also used as an event bus for coordination:

  • Cold-start triggers
  • VM ready notifications
  • Scale up/down commands
  • Console output streaming
  • Telemetry bursts

Subject Pattern

dc.{dc-id}.{event}.{service}.{vm-id}

# Examples:
dc.dc1.service.request      # Request-triggered cold start
dc.dc1.vm.ready             # VM ready (worker -> LB)
dc.dc1.scale.up             # Scale up request
dc.dc1.console.42.stdout    # Console streaming
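
A sketch of publishing and consuming these subjects, assuming a NATS-style client; the JSON payload shape is invented for the example:

    // Illustrative pub/sub over the subjects above, assuming a NATS-style
    // client. The payload shape is a guess for this sketch, not the real schema.
    package main

    import (
        "log"
        "time"

        "github.com/nats-io/nats.go"
    )

    func main() {
        nc, err := nats.Connect(nats.DefaultURL)
        if err != nil {
            log.Fatal(err)
        }
        defer nc.Drain()

        // LB side: react to VM-ready notifications in this DC.
        if _, err := nc.Subscribe("dc.dc1.vm.ready", func(m *nats.Msg) {
            log.Printf("vm ready: %s", m.Data)
        }); err != nil {
            log.Fatal(err)
        }

        // LB side: trigger a cold start for a service with no endpoints.
        if err := nc.Publish("dc.dc1.service.request",
            []byte(`{"service":"api","version":"0.0.5"}`)); err != nil {
            log.Fatal(err)
        }
        nc.Flush()

        time.Sleep(time.Second) // give the handler a moment to run (sketch only)
    }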

Networking

The platform uses two separate networks: one for infrastructure VMs and one for application VMs.

+------------------------------------------------------------+
| Host                                                        |
|                                                             |
|  STATIC NETWORK (infra)     DYNAMIC NETWORK (apps)          |
|                                                             |
|  +---------+                +---------+  +---------+        |
|  |  stap1  |                |  tap2   |  |  tap3   | ...    |
|  +----+----+                +----+----+  +----+----+        |
|       |                          |            |             |
|  +----+-----+              +-----+------------+-----+       |
|  |sf-static |              |          br0           |       |
|  |172.31.0.1|              |       172.30.0.1       |       |
|  +----+-----+              +-----+------------------+       |
|       |                          |                          |
|       +------------+-------------+                          |
|                    |                                        |
|        iptables NAT + MASQUERADE                            |
+--------------------+----------------------------------------+
                     |
                 Internet

Hypervisor Networking

Hypervisor    Mode         VM Access
Firecracker   TAP Bridge   Direct via 172.30.0.x:PORT
QEMU          SLIRP        Port forward via localhost:HOST_PORT
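
For illustration only, the routing layer's choice of dial address under the two modes reduces to something like the sketch below; the struct fields and values are invented, not the real data model:

    // Sketch: deriving the dial address per hypervisor mode.
    // Struct fields and sample values are illustrative only.
    package main

    import "fmt"

    type Endpoint struct {
        Hypervisor string // "firecracker" or "qemu"
        GuestIP    string // TAP bridge address (Firecracker)
        GuestPort  int    // port the app listens on
        HostPort   int    // SLIRP port forward on the host (QEMU)
    }

    func dialAddr(e Endpoint) string {
        switch e.Hypervisor {
        case "firecracker":
            // Directly reachable over the br0 / 172.30.0.x network.
            return fmt.Sprintf("%s:%d", e.GuestIP, e.GuestPort)
        default:
            // QEMU SLIRP: traffic goes through a host port forward.
            return fmt.Sprintf("localhost:%d", e.HostPort)
        }
    }

    func main() {
        fmt.Println(dialAddr(Endpoint{Hypervisor: "firecracker", GuestIP: "172.30.0.17", GuestPort: 8080}))
        fmt.Println(dialAddr(Endpoint{Hypervisor: "qemu", HostPort: 42817}))
    }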

Cluster Mode (VXLAN)

In cluster mode, VXLAN overlay networking enables cross-host VM communication. Each stack gets its own isolated VNI for network segmentation:

  • VNI 1: Infrastructure VMs (static services)
  • VNI 100+: Each stack gets a unique VNI for isolation

This means services within a stack can communicate directly, but different stacks are network-isolated from each other by default.
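
A rough sketch of provisioning one stack's VNI with standard iproute2 commands; the interface naming, UDP port, and underlay device are placeholder assumptions, not the platform's actual network controller:

    // Illustrative VXLAN provisioning for one stack, shelling out to iproute2.
    // Interface names, VNI offset, and underlay device are placeholders.
    package main

    import (
        "fmt"
        "os/exec"
    )

    func run(args ...string) error {
        out, err := exec.Command(args[0], args[1:]...).CombinedOutput()
        if err != nil {
            return fmt.Errorf("%v: %s", err, out)
        }
        return nil
    }

    // provisionStackVNI creates an isolated overlay segment for one stack:
    // VNI = 100 + stack index, as described above.
    func provisionStackVNI(stackIndex int, underlayDev string) error {
        vni := 100 + stackIndex
        dev := fmt.Sprintf("vxlan%d", vni)

        // VXLAN interface on the underlay NIC, standard UDP port 4789.
        if err := run("ip", "link", "add", dev, "type", "vxlan",
            "id", fmt.Sprint(vni), "dev", underlayDev, "dstport", "4789"); err != nil {
            return err
        }
        return run("ip", "link", "set", dev, "up")
    }

    func main() {
        if err := provisionStackVNI(1, "eth0"); err != nil {
            panic(err)
        }
    }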

Local Development Mode

For local development, a single binary runs everything: embedded backplane, worker, scheduler, and load balancer. The same code paths are used, just in a single process.

Differences from Production

  • IPAM uses a single global port pool
  • No VXLAN (not needed on single machine)
  • VM IDs allocated via file-based counter
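
As a sketch of that last point, a file-based counter only needs an exclusive lock around a read-increment-write; the counter path below is a placeholder:

    // Illustrative file-based VM ID allocator for single-host mode.
    // The counter path is a placeholder; error handling is simplified.
    package main

    import (
        "fmt"
        "os"
        "strconv"
        "strings"
        "syscall"
    )

    func nextVMID(path string) (int, error) {
        f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0o644)
        if err != nil {
            return 0, err
        }
        defer f.Close()

        // Exclusive lock so concurrent starts never hand out the same ID.
        if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX); err != nil {
            return 0, err
        }
        defer syscall.Flock(int(f.Fd()), syscall.LOCK_UN)

        buf, err := os.ReadFile(path)
        if err != nil {
            return 0, err
        }
        id, _ := strconv.Atoi(strings.TrimSpace(string(buf))) // empty file -> 0

        id++
        if err := f.Truncate(0); err != nil {
            return 0, err
        }
        if _, err := f.WriteAt([]byte(strconv.Itoa(id)), 0); err != nil {
            return 0, err
        }
        return id, nil
    }

    func main() {
        id, err := nextVMID("/tmp/tidedge-vmid")
        if err != nil {
            panic(err)
        }
        fmt.Println("allocated VM ID:", id)
    }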

What Stays the Same

  • Same layer resolution
  • Same VM boot process
  • Same overlay filesystem
  • Same guest proxy

This means you can develop and test locally with confidence that it will work the same way in production.