Org Bootstrap
What It Is
Org bootstrap is how a FortrOS org is created from nothing -- no existing nodes, no existing state, no existing trust chain. A dedicated bootstrapper (not a regular node) creates the org's identity, builds the first images, provisions the first nodes, and then expires. The bootstrapper is a midwife, not a member.
Why It Matters
Every distributed system has a chicken-and-egg problem. The org manages itself, but someone has to create it. Nodes need an org to join, but the org needs nodes to exist. The bootstrapper breaks this cycle: it creates the org's cryptographic identity, builds images that have the org baked in, and provisions nodes that are org members from first boot.
The bootstrapper is deliberately separate from the regular node lifecycle.
The create-org capability only exists on the bootstrapper -- never on the
preboot, never on the main OS. A FortrOS node cannot create an org, switch
orgs, or re-bootstrap. It is bound to its org from the moment its image is
built. The only way to change a machine's org is to reprovision the hardware
with a new image from a different bootstrapper.
How It Works
The Admin Tool
The admin tool (fortros-admin) is a standalone Rust binary with a web UI.
It runs on the admin's existing device -- a workstation, laptop, or VPS.
During Phase 1 bootstrap it operates as the founding FortrOS node of
the org: it holds a WG identity, a signing identity, and a member
record in the CRDT. Nodes that enroll cannot tell it apart from any
later-joined host. It leaves the org via revocation during the
Phase 2->3 migration (see "Admin Tool Migration"), not destruction.
Earlier designs placed this founding role in an ephemeral provisioner
(tmpfs on a short-lived QEMU VM). That was rejected because running
ephemeral in a VM was operationally painful, and because real org
creators need durable infrastructure anyway to configure reverse
tunnels / DNS / TLS. Doing Phase 1 on the admin's existing robust
machine is strictly better than doing it on a throwaway VM. The
ephemeral-provisioner binary still exists; its seed-founding-org
subcommand is the idempotent primitive that writes the founding
HostRecord + host_cert.json into ~/.fortros-admin/org/.
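That idempotent primitive might look like the following -- a minimal Python sketch for brevity (the real binary is Rust; the record fields and return convention here are illustrative assumptions, not the actual FortrOS schema):

```python
import json
import tempfile
from pathlib import Path

def seed_founding_org(org_dir: Path, host_record: dict) -> bool:
    """Idempotently seed the founding HostRecord + host_cert.json.
    Returns True if this call created the org state, False if it already
    existed -- re-running is a no-op, never an overwrite."""
    cert_path = org_dir / "host_cert.json"
    if cert_path.exists():
        return False
    org_dir.mkdir(parents=True, exist_ok=True)
    cert_path.write_text(json.dumps(host_record, indent=2))
    return True

org_dir = Path(tempfile.mkdtemp()) / "org"  # stand-in for ~/.fortros-admin/org/
record = {"host_id": "founding-admin"}      # illustrative fields
assert seed_founding_org(org_dir, record) is True   # first run seeds
assert seed_founding_org(org_dir, record) is False  # second run: no-op
```

The check-before-write makes the subcommand safe to re-run mid-failure: a crashed bootstrap can simply be retried without clobbering an already-seeded org.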
After Phase 2->3 migration completes and the admin's founding role is revoked, the admin tool becomes what earlier guide revisions described: a YubiKey-only thin client to the org-hosted admin UI. It no longer runs the maintainer, no longer joins gossip, no longer has an org identity on disk. That is the steady-state end shape, not the bootstrap shape.
The admin tool serves a web UI (htmx + server-rendered HTML, no npm/node dependency chain) on localhost. The admin uses a browser for:
- Setup wizard: Create org (CA keypair, overlay prefix, name). Configure connectivity (public subdomain, own domain, LAN-only, onion). Verify DNS/TLS works. Build images (preboot UKI, node generation).
- Invite manager: Create/revoke bootstrapper images. Each invite is a per-image Ed25519 signing keypair (see "Bootstrapper as Invite Link" below). Track use counts, expiry, which invite enrolled which node.
- Provisioner endpoint: Serves TLS on port 7443 for preboot auth and image delivery. Can run directly (LAN) or through a reverse tunnel (Cloudflare, onion service) for WAN provisioning.
- Node dashboard: Live view of enrolled nodes, gossip state, storage health. WebSocket updates for real-time status.
Org identity storage: The admin tool stores org identity in
~/.fortros-admin/org/ (or a configured path). Opening the tool with an
existing org loads it. Creating a new org requires explicit confirmation to
prevent accidental double-org ("This will create a NEW org. Your existing
org 'home-lab' has 5 nodes. Are you sure?").
No npm, no node_modules. The web UI uses htmx (single vendored JS file, auditable) with server-rendered HTML from the Rust backend. CSS via Tailwind standalone CLI or hand-written. The entire frontend attack surface is: one Rust binary + static HTML/CSS + one JS file. This is deliberate -- the admin tool is a high-value target and the supply chain must be auditable.
Admin Tool Migration
The admin tool bootstraps the org, but the org should eventually host its own admin interface. The migration path:
- Phase 1 (bootstrap): Admin tool runs on the workstation and acts as the founding node. It holds the org CA key on disk, is a normal member of the CRDT, runs the maintainer, and serves the enrollment endpoint (local or tunneled). From an enrolling node's perspective it is indistinguishable from any other FortrOS host.
- Phase 2 (self-hosting): Once nodes are up, the admin UI becomes a workload on the org (tier 2 container). The org hosts its own admin endpoint. The workstation admin tool connects to the org-hosted UI as a thin client, but the founding-node identity on the workstation is still active -- other nodes still treat it as a peer.
- Phase 3 (admin revoked, workstation optional): The founding-member record is added to revoked_peers via a signed revocation operation. That entry gossips across the org; all nodes stop accepting conn_auth from the admin's WG + signing pubkeys. The admin workstation retires its local identity (keys deleted, ~/.fortros-admin/org moved to an offline backup for disaster recovery). From this point on the admin is purely a YubiKey/CAC thin client to the org-hosted UI.
The Phase 2->3 transition is a one-way decommissioning. It is explicit (an admin command, not automatic), gated on preflight checks (at least one replacement lighthouse is healthy, the admin web UI migration is verified), and destructive to the workstation's in-cluster identity by design.
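A preflight gate of this shape reduces to a short list of blocking checks. A Python sketch (FortrOS itself is Rust; the field names and the exact check list are illustrative, not the actual FortrOS preflight set):

```python
from dataclasses import dataclass

@dataclass
class OrgView:
    """Snapshot of org health as seen by the admin tool (illustrative fields)."""
    healthy_lighthouses: int   # healthy lighthouses excluding the admin workstation
    admin_ui_migrated: bool    # org-hosted admin UI verified reachable

def preflight_decommission(view: OrgView) -> list:
    """Return blocking problems; the decommission command refuses to run
    unless this list is empty."""
    problems = []
    if view.healthy_lighthouses < 1:
        problems.append("no healthy replacement lighthouse")
    if not view.admin_ui_migrated:
        problems.append("org-hosted admin UI not verified")
    return problems

# Explicit admin command, gated on a clean preflight:
assert preflight_decommission(OrgView(healthy_lighthouses=2, admin_ui_migrated=True)) == []
```

Returning the full list of problems (rather than failing on the first) lets the command report everything the admin must fix before the one-way step.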
After the migration completes, the org's admin UI runs inside FortrOS -- a compromised workstation can't compromise the org because the workstation no longer holds org keys. Authentication is YubiKey-based, not workstation-credential-based.
create-org is NOT available on the org-hosted UI. A running org cannot
create another org. The create-org capability exists only in the
standalone admin tool. This prevents accidental or malicious org creation
on production nodes.
Managed Services
For managed services (a provider administering customer orgs), the pattern is: the customer enrolls the provider's admin YubiKey into their org with an admin role. The provider's admin tool connects to the customer's org endpoint. Each org is its own trust domain.
- Onboarding: Customer invites provider's YubiKey via the org admin UI
- Day-to-day: Provider's admin tool connects to N customer orgs independently (like N browser tabs to N Proxmox instances)
- Offboarding: Customer revokes the provider's YubiKey. Access gone, no residue, no shared secrets to rotate
No special multi-org protocol. Each org's admin API is the same. The provider's admin tool aggregates views locally but never crosses org boundaries. The customer retains full control over who has access.
Connectivity Options
The admin tool's setup wizard guides the admin through connectivity:
| Option | How it works | Who it's for |
|---|---|---|
| Public subdomain | Cloudflare Tunnel, auto-provisioned DNS | Homelab, easiest setup |
| Own domain | Admin points DNS, wizard verifies TLS | Small business |
| LAN only | DHCP/PXE on local network, no internet needed | Air-gapped, lab |
| Onion service | Tor hidden service, no public DNS | High threat model |
| VPS endpoint | Admin tool runs on VPS directly | Cloud-first deployment |
All options use the same provisioner protocol (TLS + conn_auth on port 7443). Only the transport layer changes. The admin picks based on their infrastructure and threat model.
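Because only the transport changes, the connectivity choice reduces to picking the endpoint a bootstrapper dials; the protocol underneath is fixed. A minimal sketch (Python for brevity; the hostname patterns are examples, not fixed FortrOS names):

```python
PROVISIONER_PORT = 7443  # same TLS + conn_auth protocol for every option

def provisioner_endpoint(option: str, value: str) -> str:
    """Map a connectivity option to the endpoint a bootstrapper dials.
    Hostname patterns here are illustrative."""
    hosts = {
        "public_subdomain": f"{value}.example-tunnel.net",  # Cloudflare Tunnel
        "own_domain": value,                                # admin-managed DNS
        "lan_only": value,                                  # local address, no internet
        "onion": f"{value}.onion",                          # Tor hidden service
        "vps": value,                                       # VPS public address
    }
    return f"https://{hosts[option]}:{PROVISIONER_PORT}"

assert provisioner_endpoint("own_domain", "fortros.example.com") == "https://fortros.example.com:7443"
```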
Bootstrapper as Invite Link
Each bootstrapper image has a unique Ed25519 keypair baked in at creation time. The org stores the public key. When a bootstrapper connects to the provisioner, it runs the conn_auth handshake -- the provisioner issues a 32-byte challenge, the bootstrapper signs with its baked-in Ed25519 key, and the provisioner verifies against the org's known invite list before serving anything.
This makes each bootstrapper image an individually controllable invite:
- Multiple active at once. Like Discord invite links, an org can have several bootstrapper images in circulation -- one per datacenter, one per admin, one per provisioning event.
- Individually revocable. Revoke one bootstrapper without affecting others. A lost USB stick with a bootstrapper image is one dead invite, not a backdoor to the org.
- Use-limited. A bootstrapper can be configured for single use, N uses, or unlimited. The org tracks use count in its CRDT state.
- Expirable. Each bootstrapper has a creation timestamp and TTL. Expired bootstrappers are rejected even if the key hasn't been explicitly revoked.
- Auditable. The org records which bootstrapper enrolled which node. Full lineage from invite creation to node enrollment.
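The invite checks above reduce to a small amount of pure logic. A Python sketch (FortrOS is Rust; the field names are illustrative assumptions, and the Ed25519 signature over the 32-byte challenge is assumed to have already verified against the invite's stored public key):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Invite:
    """Org-side record for one bootstrapper image (illustrative fields)."""
    pubkey: bytes                   # the image's baked-in Ed25519 public key
    created_at: float               # creation timestamp, seconds
    ttl: float                      # lifetime, seconds
    max_uses: Optional[int] = None  # None = unlimited
    uses: int = 0
    revoked: bool = False

def check_invite(inv: Invite, now: float) -> str:
    """Decide whether a conn_auth attempt from this invite may proceed.
    (Signature verification is assumed to have passed already.)"""
    if inv.revoked:
        return "rejected: revoked"
    if now > inv.created_at + inv.ttl:
        return "rejected: expired"
    if inv.max_uses is not None and inv.uses >= inv.max_uses:
        return "rejected: use limit reached"
    inv.uses += 1  # tracked in CRDT state in the real system
    return "ok"

single_use = Invite(pubkey=b"\x00" * 32, created_at=0.0, ttl=3600.0, max_uses=1)
print(check_invite(single_use, now=100.0))   # ok
print(check_invite(single_use, now=200.0))   # rejected: use limit reached
print(check_invite(single_use, now=9999.0))  # rejected: expired
```

Note that even an "ok" here only authorizes downloading the preboot UKI; enrollment still requires the YubiKey touch.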
A compromised bootstrapper private key lets an attacker download the preboot UKI -- but they still can't enroll without a YubiKey touch. The bootstrapper proves "this image was authorized to start provisioning." The YubiKey proves "a human authorized this specific enrollment."
Contrast with preboot identity: The preboot UKI is upgradeable code on the ESP. Rolling upgrades replace it. If the preboot had a per-image signing key (like the bootstrapper), upgrading the preboot would change the node's identity. Instead, the preboot's identity lives in the TPM (NV storage: preboot_secret + Ed25519 signing key). Swapping the UKI = upgrade. Swapping the TPM = new identity. The bootstrapper is disposable; the preboot is permanent.
Bootstrapper Expiry
The bootstrapper does not persist:
- It holds no org state on disk (everything is in memory)
- Its signing keypair is unique to that image (revoked when the image expires)
- When it shuts down (or its TTL expires), it is no longer part of the org
- Booting it again would create a different org (new CA, new prefix, new identity). The bootstrapper doesn't remember the previous org.
This is intentional. The bootstrapper is a single-use tool for the birth of an org. The org's ongoing operation is entirely handled by its own nodes.
First Node
The first node provisioned by the bootstrapper is a regular FortrOS node --
it goes through the normal boot chain (preboot auth, LUKS unlock, kexec,
s6-rc services). It's not special. It doesn't have create-org capability.
It's just the first member.
What makes the first-node moment unique:
- It's alone. Gossip mesh is one member. Storage has no one to replicate to.
- The bootstrapper is its provisioner. Once the first node's own provisioner starts, it can enroll subsequent nodes without the bootstrapper.
- Storage starts local. All shards are on one node until others join.
Second Node and Beyond
The second node is enrolled by either the bootstrapper (if it's still running) or by the first node's provisioner. From node 2 onward:
- Gossip is peer-to-peer (2+ members, no central server)
- CRDT state replicates (both nodes have full org state)
- Storage shards distribute (see below)
- The org is self-sustaining -- the bootstrapper is no longer needed
Each additional node follows the same enrollment process. Any existing node can enroll new nodes. The org grows without the bootstrapper's involvement.
Shard Distribution During Growth
With a 3-of-5 erasure coding policy, shard placement evolves as the org grows:
Node 1 alone: All 5 shards (3 data + 2 parity) are created and stored locally. The org has the data and full redundancy, just no physical distribution.
Node 2 joins: The placement service distributes shards to satisfy the policy. Node 2 receives at least 3 shards (the minimum K needed for reconstruction). Node 1 keeps all 5. No re-encoding -- the same shards are copied. Both nodes can independently reconstruct the data.
Node 3+ joins: Shards continue distributing. The placement service scores nodes by topology (don't put all shards in the same rack) and health. More nodes = more physical distribution = better failure tolerance.
Lazy trimming: Nodes don't aggressively delete extra shards. Node 1 keeps all 5 shards until it needs the disk space. When space pressure hits, the placement service drops shards on the node with the most copies, down to the minimum needed to maintain the replication policy across the org. Extra copies are treated as free redundancy, not waste.
No re-encoding happens during growth. The 5 shards created on node 1 are the same 5 shards that eventually live across 5 different nodes. The data is never re-split or re-coded -- only copied and eventually trimmed.
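The growth behavior can be modeled in a few lines (a Python sketch; the real placement service also scores topology and health, which this omits, and the node names and min_copies policy are illustrative):

```python
K, N = 3, 5  # 3-of-5 policy: any K of the N shards reconstruct the data

def place_on_join(placement: dict, new_node: str) -> None:
    """Copy the K least-replicated shards to a joining node -- copies only,
    never re-encoding."""
    scarcest = sorted(placement, key=lambda s: len(placement[s]))[:K]
    for shard in scarcest:
        placement[shard].add(new_node)

def lazy_trim(placement: dict, node: str, min_copies: int = 1) -> None:
    """Under disk pressure, drop this node's copies of shards that remain
    replicated at least min_copies times elsewhere. Until then, extra
    copies stay as free redundancy."""
    for shard, holders in placement.items():
        if node in holders and len(holders) > min_copies:
            holders.discard(node)

placement = {i: {"node1"} for i in range(N)}   # node 1 alone holds all 5 shards
place_on_join(placement, "node2")              # node 2 receives K = 3 shards
assert sum("node2" in h for h in placement.values()) == K
lazy_trim(placement, "node1")                  # space pressure hits node 1
assert all(len(h) >= 1 for h in placement.values())  # every shard still held
```

The shard identities never change across these steps -- placement is a map from the same 5 shards to a growing set of holders, which is why no re-encoding is ever needed.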
Key Decisions
Admin tool is not a node. The create-org capability exists only in the
standalone admin tool, never on the preboot or main OS. A running FortrOS
node cannot create an org. The admin tool is a separate binary with a
separate lifecycle. Once the org is self-hosting its admin UI, the standalone
tool is only needed for disaster recovery or creating new orgs.
Web UI, not CLI. The admin tool serves a web interface because the target audience (homelab users, small business admins) expects guided setup, not terminal commands. The web stack is deliberately minimal (htmx + Rust backend, no npm) to keep the supply chain auditable.
Admin UI migrates to the org. The workstation-hosted admin tool is temporary scaffolding. Once nodes are up, the org hosts its own admin UI. This prevents a compromised workstation from being a persistent threat to the org.
Self-signed CA, not externally anchored. The org's CA is self-generated and self-signed. No dependency on external PKI. The org IS its own root of trust. Losing the CA private key is catastrophic -- it's stored in shard storage (encrypted by the key service) and backed up to admin YubiKeys/CACs.
Org identity is baked into images. The preboot UKI contains the org's CA public key and gateway addresses in its initramfs. A node built for Org A cannot join Org B. Changing orgs requires reprovisioning the hardware with a new image from Org B's bootstrapper.
Lazy shard trimming. Extra shard copies are kept until disk space is needed. This provides free redundancy during the growth phase and avoids unnecessary I/O for trimming copies that aren't hurting anything.
WAN Provisioning
The bootstrapper can provision nodes over the internet, not just the local network. The flow:
- Admin creates a bootstrapper image for a remote site
- The bootstrapper's gateway domain points to the org's public endpoint (e.g., fortros.example.com via Cloudflare Tunnel)
- Remote hardware boots a generic bootstrapper from a USB stick (or PXE)
- Bootstrapper does conn_auth to the org endpoint, downloads the preboot UKI
- Bootstrapper kexecs into the preboot
- Preboot authenticates to the org over WAN (same TLS + conn_auth flow)
- Admin touches YubiKey to authorize enrollment
- Node enrolls, joins gossip over WireGuard overlay
The local network provides DHCP and internet access. Everything else happens over the org's TLS endpoint. No VPN, no port forwarding, no site-to-site tunnel needed. The bootstrapper's signing key is the authorization to begin. The YubiKey is the authorization to complete.
Links
- 03 Trust and Identity -- How enrollment works
- 08 Cluster Formation -- Gossip and CRDT convergence
- Erasure Coding -- Shard distribution and encoding
- Content-Addressed Storage -- Chunk store underneath erasure coding
- 04 Disk Encryption -- LUKS and generation secrets during bootstrap
- Portable Node -- Live USB as a bootstrapper medium