
ADR-054: Controller Topology by Resource Kind

| Attribute | Value |
|---|---|
| Status | Proposed |
| Date | 2026-06-12 |
| Deciders | Architecture Team |
| Extends | ADR-047 (Generic Reconciliation Framework) |
| Supersedes | service topology of ADR-016, ADR-017 (the controller boundaries, not the reconcile-through-controller principle) |
| Related ADRs | ADR-002 (scheduler role re-scoped, §2.3), ADR-035 (scheduler role re-scoped), ADR-016, ADR-017, ADR-046 (Host = generalized worker, §2.1), ADR-050, ADR-052, ADR-059 (form-controller owns Form sync) |

Rev 2 (2026-06-16). The original §2.1 folded worker-controller into pod-controller and named the content-sync owner content-controller. This revision splits Host into its own host-controller (Host = the generalized CmlWorker, the runtime substrate a Pod binds to; see ADR-046) and renames the content-sync owner to form-controller to match the synced unit of ADR-059. Target reconcilers are now session- / form- / pod- / host-controller + CPA + SE.


1. Context

The current services (lablet-controller, worker-controller, resource-scheduler, scenario-engine, CPA) were carved along the old model: a single LabletSession, CML workers, and a scheduler. With the generalized tree (ADR-036/045/046), the Definition/Instance duality (ADR-050), and the imported content taxonomy (ADR-052), those boundaries no longer match the resource kinds we reconcile. We want service boundaries that align with the per-type managers of ADR-047, with one owner per resource kind, rather than boundaries per delivery profile (lablet / practicelab / expert) or per platform (cml / proxmox / vmware).

2. Decision

2.1 Split by resource kind

| Controller | Owns (reconciles) | Replaces / absorbs |
|---|---|---|
| CPA | Control plane: session-manager front, scheduling intent, unified dashboard; seeds the seed catalogue/config (inert, no reconcile). | none |
| session-controller | Session + SessionPart (ordering, gating, part lifecycle). | the session part of lablet-controller. |
| form-controller | Form sync (Mosaic → RustFS → LDS + SE fan-out): the single synced content_package unit (ADR-059); the surrounding taxonomy is inert catalogue metadata. | the content-sync part of lablet-controller. |
| pod-controller | PodInstance (any PodType): the workload (ADR-046). | the pod part of lablet-controller. |
| host-controller | Host (any HostType) with host adapters for cml_on_aws / proxmox / vmware inside (ADR-046): the runtime substrate a Pod binds to; the generalized CmlWorker. | worker-controller. |
| scenario-engine | Job + Report (untimed automation instances). | unchanged in role. |

flowchart TB
    subgraph CP["Control plane"]
        CPA["CPA (control-plane-api)<br/>session manager front + unified dashboard"]
    end
    subgraph Reconcilers["Resource-kind controllers"]
        SC["session-controller<br/>Session + SessionPart"]
        FC["form-controller<br/>Form sync (content_package)"]
        PC["pod-controller<br/>PodInstance (workload)"]
        HC["host-controller<br/>Host (+ adapters:<br/>cml_on_aws / proxmox / vmware)"]
    end
    subgraph Auto["Automation"]
        SE["scenario-engine<br/>Job + Report"]
    end
    CPA -->|desired_status| SC
    CPA -->|desired sync| FC
    SC -->|desired_status| PC
    PC -->|desired_status| HC
    SC -->|workflow phase| SE
    FC -->|Form synced| CPA
    FC -->|pod ref / content| PC
    PC -->|status| SC
    HC -->|status| PC
    SE -->|CloudEvent result| SC
    SC -->|status| CPA
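
The ownership table and the flowchart above can be read as two small lookup structures: a map from resource kind to its single owning controller, and a set of edges in which desired_status travels down and status travels back up. The sketch below is one way to write that reading down in Python. The names `OWNERS`, `INTENT_DOWN`, `STATUS_UP`, `owner_of`, and `status_mirrors_intent` are hypothetical and are not part of the ADR-047 framework; the sync edges between CPA and form-controller are left out to keep the example short.

```python
# Hypothetical sketch: all names here are illustrative, not the ADR-047 API.

# Resource kind -> owning controller, transcribed from the table in section 2.1.
OWNERS = {
    "Session": "session-controller",
    "SessionPart": "session-controller",
    "Form": "form-controller",
    "PodInstance": "pod-controller",
    "Host": "host-controller",
    "Job": "scenario-engine",
    "Report": "scenario-engine",
}

# Edges that carry desired_status (parent -> child), from the flowchart.
INTENT_DOWN = [
    ("CPA", "session-controller"),
    ("session-controller", "pod-controller"),
    ("pod-controller", "host-controller"),
]

# Edges that carry status (child -> parent), from the flowchart.
STATUS_UP = [
    ("host-controller", "pod-controller"),
    ("pod-controller", "session-controller"),
    ("session-controller", "CPA"),
]


def owner_of(kind: str) -> str:
    """Return the single controller that reconciles the given resource kind."""
    try:
        return OWNERS[kind]
    except KeyError:
        raise ValueError(f"no controller owns resource kind {kind!r}") from None


def status_mirrors_intent() -> bool:
    """True when every desired_status edge has a matching reverse status edge."""
    reversed_intent = {(child, parent) for parent, child in INTENT_DOWN}
    return reversed_intent == set(STATUS_UP)
```

Because `OWNERS` is a plain dictionary keyed by kind, a kind cannot be assigned to two controllers, which is the "one owner per resource kind" property the decision relies on.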

2.2 Why resource-kind (not profile or platform)

  • Profile controllers (lablet / practicelab / expert) would duplicate the same reconcile logic per delivery kind, which is exactly what ADR-047 unifies. Profiles are data (session_type), not services.
  • Platform controllers (cml / proxmox / vmware) would fragment Host reconciliation; the platform difference is a host adapter detail (ADR-046), so it lives inside host-controller, not as separate services. Pod (workload) and Host (platform) are separate kinds (a pod binds to a host), so they get separate controllers, not one merged service.
  • Resource-kind controllers map 1:1 to per-type managers, keep intent-down/status-up clean, and let each service own one desired_status contract.
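
To illustrate the second point, here is a minimal Python sketch of a single host-controller that selects a platform adapter by host type instead of running one service per platform. Everything in it is an assumption made for illustration: the `HostAdapter` protocol, the adapter classes, the `Host` fields, and the `"running"` status value are hypothetical and do not come from ADR-046 or ADR-047.

```python
# Hypothetical sketch: class, field, and status names are illustrative only.
from dataclasses import dataclass
from typing import Protocol


class HostAdapter(Protocol):
    """Platform-specific operations that host-controller delegates to."""

    def provision(self, host_name: str) -> str: ...


class CmlOnAwsAdapter:
    def provision(self, host_name: str) -> str:
        return f"cml_on_aws: provisioned {host_name}"


class ProxmoxAdapter:
    def provision(self, host_name: str) -> str:
        return f"proxmox: provisioned {host_name}"


class VmwareAdapter:
    def provision(self, host_name: str) -> str:
        return f"vmware: provisioned {host_name}"


@dataclass(frozen=True)
class Host:
    name: str
    host_type: str        # "cml_on_aws", "proxmox", or "vmware"
    desired_status: str   # assumed value set; only "running" triggers work here


class HostController:
    """One controller for the Host kind; platforms differ only by adapter."""

    def __init__(self) -> None:
        self._adapters: dict[str, HostAdapter] = {
            "cml_on_aws": CmlOnAwsAdapter(),
            "proxmox": ProxmoxAdapter(),
            "vmware": VmwareAdapter(),
        }

    def reconcile(self, host: Host) -> str:
        adapter = self._adapters.get(host.host_type)
        if adapter is None:
            raise ValueError(f"no adapter for host type {host.host_type!r}")
        if host.desired_status == "running":
            return adapter.provision(host.name)
        return f"{host.host_type}: no action for {host.name}"
```

Adding a platform in this shape means registering one more adapter in `HostController`, with no new service and no change to the desired_status contract.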

2.3 Scheduling

resource-scheduler's timeslot booking/allocation (ADR-002) folds into the control plane as the intent producer (it sets desired_status + Timeslot); it is not a reconciler of a resource kind.
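
As a rough illustration of "intent producer, not reconciler", the Python sketch below books a timeslot by writing desired_status and a Timeslot onto an intent record and then stops; acting on that record is left to the controllers. The `SessionIntent` record, the `book` function, and the `"pending"` / `"scheduled"` status values are hypothetical names chosen for the example and are not taken from ADR-002.

```python
# Hypothetical sketch: record shape and status values are assumptions.
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Timeslot:
    start: datetime
    end: datetime


@dataclass(frozen=True)
class SessionIntent:
    session_id: str
    desired_status: str = "pending"
    timeslot: Optional[Timeslot] = None


def book(intent: SessionIntent, start: datetime, duration: timedelta) -> SessionIntent:
    """Produce intent only: set desired_status and Timeslot, reconcile nothing."""
    if duration <= timedelta(0):
        raise ValueError("duration must be positive")
    slot = Timeslot(start=start, end=start + duration)
    # Return a new record; the session-controller would observe and act on it.
    return replace(intent, desired_status="scheduled", timeslot=slot)
```

The function returns a new record and has no side effects on sessions, pods, or hosts, which mirrors the split between producing intent in the control plane and reconciling it in a resource-kind controller.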

3. Consequences

Positive: service boundaries match resource kinds and per-type managers; platform variety is an adapter concern, not a service explosion; the content-sync, session, and pod concerns currently tangled in lablet-controller separate cleanly.

Negative / trade-offs: a real re-map of existing services: split lablet-controller into session- / form- / pod-controller, and rename worker-controller → host-controller (generalized beyond cml_on_aws); since this is local-only with no migration window, the cut is clean but touches deployment, Makefiles, and the workspace layout.