source: docs/guide/concepts/Service Architecture.md

Service Architecture

What It Is

FortrOS runs services using a siloed design: org services don't import FortrOS crates, don't call gossip or CRDT methods, and don't know they're running on FortrOS. A service reads files from its scratch directory and writes files back to it. The orchestration layer (maintainer, reconciler) handles everything else: pulling images, setting up encrypted scratch, syncing data, starting the process, collecting output.

This is the architectural axis: services are isolated from FortrOS internals by a filesystem boundary, not by an API contract.

Why It Matters

If services import FortrOS libraries, they become coupled to FortrOS's internal APIs. A change to the gossip protocol requires updating every service. A bug in a service's CRDT usage could corrupt org state. Testing requires a full FortrOS environment.

With the siloed design, services are portable: they can run on any Linux system with the right files in the right places. Testing is trivial (create a directory with test inputs, run the binary, check outputs). Updating FortrOS internals doesn't touch service code. The attack surface is smaller -- a compromised service can only read/write its own scratch, not call internal APIs.

The Siloed Design Principle

A service interacts with FortrOS through exactly three interfaces:

  1. Scratch directory (filesystem): The service reads inputs and writes outputs to its encrypted scratch volume. It doesn't know that the volume is LUKS-encrypted, that it's backed by a sparse file, or that the key came from the key service. It sees a normal directory.

  2. WireGuard overlay (network): The service listens on its overlay address for client connections. It doesn't know the network is a WireGuard mesh or that conn_auth is enforced. It sees a normal network interface.

  3. s6 supervision (process lifecycle): The service is started, supervised, and restarted by s6. It signals readiness via notification-fd. It doesn't know about the reconciler or workload CRDTs.

The service DOES NOT:

  • Import fortros-state, fortros-gossip, fortros-storage, or any other FortrOS crate
  • Call the placement API to store/retrieve data
  • Access the CRDT state trees
  • Know which node it's running on or which generation is active
  • Manage its own encryption keys
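To make the boundary concrete, here is a minimal sketch of what a siloed service looks like from the inside. The scratch path and file names are illustrative, not documented FortrOS paths; the point is that the binary's entire world is one directory tree.

```rust
use std::fs;
use std::path::Path;

// A siloed service: read inputs from scratch, write outputs to scratch,
// exit. No FortrOS crates, no gossip, no CRDTs -- the orchestration layer
// pre-loads the inputs and collects the outputs after the process exits.
fn run_service(scratch: &Path) -> std::io::Result<()> {
    let input = fs::read_to_string(scratch.join("input/request.txt"))?;
    let result = process(&input);
    fs::create_dir_all(scratch.join("output"))?;
    fs::write(scratch.join("output/result.txt"), result)
}

// Placeholder business logic -- the only service-specific part.
fn process(input: &str) -> String {
    input.trim().to_uppercase()
}

fn main() -> std::io::Result<()> {
    // Demo against a throwaway directory; in production the orchestrator
    // would mount the encrypted scratch volume and pass its path in.
    let scratch = std::env::temp_dir().join("svc-scratch-demo");
    fs::create_dir_all(scratch.join("input"))?;
    fs::write(scratch.join("input/request.txt"), "hello fortros\n")?;
    run_service(&scratch)
}
```

Because the service only touches the filesystem, the same binary runs unmodified on any Linux box where someone puts the right files in the right places.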

Example: The Build Service

The build service compiles FortrOS binaries and assembles node images:

Inputs (pre-loaded to scratch by orchestrator):

  • Rust workspace source
  • Base rootfs (Buildroot kernel + rootfs)
  • Node overlay (s6 services, configs)
  • Cargo registry cache
  • Build configuration

Process: Compile workspace, apply overlay, pack initramfs, compute hash.

Outputs (left on scratch for orchestrator to collect):

  • Compiled binaries
  • Assembled image (kernel + initramfs)
  • Image hash (SHA-256)

The build service doesn't push to shard storage. It doesn't sign images (that's the provisioner's responsibility). It doesn't know about gossip or CRDTs. It reads from scratch, writes to scratch, exits. The orchestrator collects outputs after the service exits.
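The build service's output step follows the same pattern. A sketch of leaving an image and its hash on scratch for collection -- paths are illustrative, and std's `DefaultHasher` stands in for the SHA-256 digest a real build service would compute (e.g. via a hashing crate):

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::{Hash, Hasher};
use std::path::Path;

// Leave the assembled image and its content hash side by side on scratch.
// The service then exits; the orchestrator collects both files and hands
// the hash to the provisioner for signing -- the builder never signs.
fn emit_outputs(scratch: &Path, image: &[u8]) -> std::io::Result<()> {
    let out = scratch.join("output");
    fs::create_dir_all(&out)?;
    fs::write(out.join("node.img"), image)?;
    fs::write(out.join("node.img.hash"), content_hash(image))
}

// Stand-in digest: a real build service would compute SHA-256 here.
fn content_hash(bytes: &[u8]) -> String {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    format!("{:016x}", h.finish())
}

fn main() -> std::io::Result<()> {
    let scratch = std::env::temp_dir().join("build-scratch-demo");
    emit_outputs(&scratch, b"kernel+initramfs bytes")
}
```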

Example: The Provisioner

The provisioner handles enrollment and boot image serving. It reads boot images from scratch (pre-loaded from shard storage by the orchestrator), serves them over TLS to enrolling nodes, and manages enrollment state. It doesn't know its boot images came from erasure-coded shard storage. It doesn't know enrollment confirmations propagate via CRDT gossip. It communicates with the maintainer via IPC.

Workload Lifecycle

Orchestrator vs Reconciler

Two components manage workloads, at different levels:

Orchestrator: Manages tier 1 node services (baked into the image, managed by s6-rc). Reads manifests from the generation image, generates s6-rc service definitions, manages s6 lifecycle. A platform binary in the base image. Handles: pull image -> create scratch -> sync data -> generate s6-rc source -> start.

Reconciler: Manages tier 2-4 workloads (pulled at runtime). A separate binary implementing level-triggered reconciliation. Reads desired state from the maintainer (CRDTs via IPC), compares to running state, converges. Handles containers (unshare + pivot_root) and VMs (cloud-hypervisor REST API). Reports observed state back to maintainer.

The separation: tier 1 must exist before the reconciler can start (the reconciler IS a tier 1 service). Tier 2-4 are managed by the reconciler after it's running.
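The reconciler's level-triggered loop can be sketched as a pure diff between desired and observed state. Names and types here are hypothetical; the real component reads desired state from the maintainer over IPC rather than taking maps as arguments:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Action {
    Start(String),
    Stop(String),
}

// Level-triggered reconciliation: compare desired state (service -> image
// generation) against running state and emit the actions that converge
// them. Re-running on already-converged state is a no-op, which is what
// makes the loop safe to repeat on any trigger.
fn reconcile(desired: &HashMap<String, u32>, running: &HashMap<String, u32>) -> Vec<Action> {
    let mut actions = Vec::new();
    for (name, gen) in desired {
        match running.get(name) {
            None => actions.push(Action::Start(name.clone())),
            Some(r) if r != gen => {
                // Wrong generation running: restart with the desired one.
                actions.push(Action::Stop(name.clone()));
                actions.push(Action::Start(name.clone()));
            }
            Some(_) => {} // already converged
        }
    }
    for name in running.keys() {
        if !desired.contains_key(name) {
            actions.push(Action::Stop(name.clone()));
        }
    }
    actions
}

fn main() {
    let desired = HashMap::from([("web".to_string(), 2u32)]);
    let running = HashMap::from([("web".to_string(), 1u32)]);
    println!("{:?}", reconcile(&desired, &running));
}
```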

Service Manifest

Services declare their requirements in a manifest:

name = "my-service"
tier = "org-container"
image = "org:my-service:1.0"
scratch_mb = 1024
scratch_class = "persistent"
network = "overlay"
cpus = 1
memory_mb = 512

The manifest is configuration, not code. It lives in the org's CRDT state or on disk. Any node can read it and start the service. No binary coupling between the manifest and the service binary.
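Since the manifest is plain key = value text, any node can turn it into a startable spec with no shared code. A minimal std-only parser sketch (field names follow the example above; a real implementation would likely use a proper TOML parser and cover every field):

```rust
use std::collections::HashMap;

// A few fields from the manifest, parsed into a typed spec.
#[derive(Debug, PartialEq)]
struct Manifest {
    name: String,
    tier: String,
    image: String,
    scratch_mb: u64,
}

// Naive key = value parsing: no binary coupling between the manifest
// text and the service binary it describes.
fn parse_manifest(text: &str) -> Option<Manifest> {
    let mut kv = HashMap::new();
    for line in text.lines() {
        if let Some((k, v)) = line.split_once('=') {
            kv.insert(k.trim().to_string(), v.trim().trim_matches('"').to_string());
        }
    }
    Some(Manifest {
        name: kv.get("name")?.clone(),
        tier: kv.get("tier")?.clone(),
        image: kv.get("image")?.clone(),
        scratch_mb: kv.get("scratch_mb")?.parse().ok()?,
    })
}

fn main() {
    let text = "name = \"my-service\"\ntier = \"org-container\"\nimage = \"org:my-service:1.0\"\nscratch_mb = 1024";
    println!("{:?}", parse_manifest(text));
}
```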

Image Attestation

The supply chain for service images:

  1. Build service compiles and produces an image hash (SHA-256)
  2. Provisioner signs the hash with the org CA key
  3. Nodes verify the signature before running the image
  4. The build service doesn't sign (separation of concerns: building and signing are different trust domains)

This prevents a compromised build service from producing a malicious image that the org would trust -- the provisioner must independently sign it.
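The build/sign separation can be sketched as two functions that never share a key. Everything here is a stand-in: std's `DefaultHasher` substitutes for SHA-256, and a keyed digest substitutes for the real public-key signature the provisioner would produce with the org CA key:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in digest (a real implementation would use SHA-256).
fn digest(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

// Step 1: the build service only produces an image and its hash.
// It holds no signing key.
fn build(image: &[u8]) -> (Vec<u8>, u64) {
    (image.to_vec(), digest(image))
}

// Step 2: the provisioner "signs" the hash with the org key. A keyed
// digest stands in for a real asymmetric signature; the point is that
// this key lives in a different trust domain than the builder.
fn sign(org_key: u64, image_hash: u64) -> u64 {
    let mut msg = org_key.to_le_bytes().to_vec();
    msg.extend_from_slice(&image_hash.to_le_bytes());
    digest(&msg)
}

// Step 3: a node re-hashes the image and checks the signature before
// running it, so a malicious image from a compromised builder fails here.
fn verify(org_key: u64, image: &[u8], sig: u64) -> bool {
    sign(org_key, digest(image)) == sig
}

fn main() {
    let (image, hash) = build(b"kernel+initramfs");
    let sig = sign(42, hash);
    println!("verified: {}", verify(42, &image, sig));
}
```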

How FortrOS Uses It

  • All org services follow this pattern: provisioner, build service, key service, and any user-deployed services
  • Testing is simple: Create scratch directory, add test inputs, run binary, check outputs. No FortrOS environment needed.
  • Migration is transparent: The reconciler stops the service on one node, syncs scratch data, starts on another. The service doesn't know.
  • Encryption is transparent: Per-service encryption key from the key service, applied by the reconciler at scratch setup. Service sees plaintext files.
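Transparent migration falls out of the same boundary: because a service's whole world is its scratch tree, moving the service means moving that tree. A sketch of the sync step (a real reconciler would stream this over the network rather than copy locally):

```rust
use std::fs;
use std::io;
use std::path::Path;

// Copy a scratch tree byte-for-byte to its new location. Since the
// service never references anything outside this directory, nothing
// else needs to move -- the service restarts on the target node and
// sees the same files it left behind.
fn sync_scratch(src: &Path, dst: &Path) -> io::Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            sync_scratch(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let base = std::env::temp_dir().join("scratch-sync-demo");
    let (src, dst) = (base.join("node-a"), base.join("node-b"));
    fs::create_dir_all(src.join("output"))?;
    fs::write(src.join("output/result.txt"), "state")?;
    sync_scratch(&src, &dst)
}
```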

FortrOS Services

Each core service has its own guide page describing its role and design:

Links