Architecture Decision Records (ADRs)
This directory contains Architecture Decision Records (ADRs) for the Lablet Cloud Manager's Lablet Resource Manager expansion.
ADR Index
Reading the supersession column.
`→ ADR-NNN` means superseded (fully or partially) by that ADR; `← ADR-NNN` means it supersedes that ADR. The current-model cluster (ADR-036, 044–054) is still Proposed: it is the north star the Solution Design docs describe, pending ratification. See the supersession chain below.
| ADR | Title | Status | Date | Supersession |
|---|---|---|---|---|
| ADR-001 | API-Centric State Management | Accepted | 2026-01-15 | – |
| ADR-002 | Separate Resource Scheduler Service | Accepted | 2026-01-15 | role re-scoped by → 054 |
| ADR-003 | CloudEvents for External Integration | Accepted | 2026-01-15 | – |
| ADR-004 | Port Allocation per Worker | Accepted | 2026-01-15 | – |
| ADR-005 | Dual State Store Architecture (etcd + MongoDB) | Accepted | 2026-01-16 | – |
| ADR-006 | Resource Scheduler High Availability Coordination | Accepted | 2026-01-16 | – |
| ADR-007 | Worker Template Seeding and Management | Accepted | 2026-01-15 | – |
| ADR-008 | Worker Draining State for Scale-Down | Accepted | 2026-01-16 | – |
| ADR-009 | Shared Core Package Architecture | Accepted | 2026-01-16 | – |
| ADR-010 | Service Unification on Neuroglia Framework | Accepted | 2026-01-17 | – |
| ADR-011 | APScheduler Removal and Controller Migration | Accepted | 2026-01-19 | – |
| ADR-012 | Dynamic Region Configuration | Accepted | 2026-01-19 | – |
| ADR-013 | SSE Protocol Improvements | Accepted | 2026-01-19 | – |
| ADR-014 | Worker Orphan Detection and Garbage Collection | Accepted | 2026-02-06 | – |
| ADR-015 | Control Plane API Must Not Call AWS EC2 | Accepted | 2026-02-06 | – |
| ADR-016 | License Operations via Worker-Controller | Accepted (Partially Superseded) | 2026-02-07 | topology → 054 |
| ADR-017 | Lab Operations via Lablet-Controller | Accepted (Partially Superseded) | 2026-02-07 | topology → 054 |
| ADR-018 | Lab Delivery System (LDS) Integration | Accepted | 2025-02-10 | – |
| ADR-019 | LabRecord as Independent AggregateRoot | Accepted (Partially Superseded) | 2026-02-10 | binding → 020 |
| ADR-020 | Session Entity Model Redesign | Accepted (Partially Superseded) | 2026-02-18 | ← 019 §binding · state machine → 045 |
| ADR-021 | Child Entity Architecture for Session Tracking | Accepted (Partially Superseded) | 2026-02-18 | part model → 045 |
| ADR-022 | CloudEvent Ingestion via Lablet-Controller | Accepted | 2026-02-18 | – |
| ADR-023 | Content Sync Trigger via Reactive etcd Watch | Accepted | 2026-02-25 | – |
| ADR-024 | Content Package Storage in RustFS | Accepted | 2026-02-25 | – |
| ADR-025 | Content Metadata Storage in MongoDB | Accepted | 2026-02-25 | – |
| ADR-026 | Extensible Upstream Notifier Pattern (Deferred) | Accepted | 2026-02-25 | – |
| ADR-027 | Version Auto-Increment on Content Change | Accepted | 2026-02-25 | – |
| ADR-028 | LabletDefinition Initial Status (PENDING_SYNC) | Accepted | 2026-02-25 | generalised by → 059 |
| ADR-029 | Port Template Extraction from CML YAML | Accepted | 2026-02-25 | – |
| ADR-030 | Resource & Port Observation – "Learn from Live" | Accepted | 2026-02-28 | – |
| ADR-031 | Checkpoint-Based Instantiation Pipeline | Accepted | 2026-03-02 | – |
| ADR-032 | Port Allocation as LabRecord Topology Concern | Accepted | 2026-03-02 | – |
| ADR-033 | CML Node Tag Sync with Allocated Ports | Accepted | 2026-03-02 | – |
| ADR-034 | Pipeline Executor & Lifecycle Phase Handlers | Proposed | 2026-03-02 | – |
| ADR-035 | Legacy SchedulerService Removal | Accepted | 2026-03-04 | role re-scoped by → 054 |
| ADR-036 | Resource Management Abstraction Layer | Accepted | 2026-03-10 | extended by 050 |
| ADR-037 | Timeslot Management | Accepted | 2026-03-10 | – |
| ADR-038 | Step Handler Registry & Reconciler Decomposition | Accepted | 2026-03-18 | extended by 047 |
| ADR-039 | SSE Race Condition Fix | Accepted | 2026-04-10 | – |
| ADR-040 | LDS CloudEvent Direct Ingestion via CPA | Accepted | 2026-04-10 | – |
| ADR-041 | WebSocket-Based CML Worker Monitoring | Proposed | 2026-05-20 | – |
| ADR-042 | CommandHandlerBase Dependency Simplification | Proposed | 2026-06-01 | – |
| ADR-043 | Startup State Reconciliation and Discovery Separation | Accepted | 2026-06-04 | – |
| ADR-044 | ScenarioEngine – Pod Automation as a Separate Service | Proposed (Rev 2) | 2026-06-05 | ← Rev 1 (in-process design) |
| ADR-045 | Multi-part Session / Part Model with Selector-Resolved Content | Proposed | 2026-06-12 | ← 020, 021 (session state machine → part level) |
| ADR-046 | Host Abstraction and PodType / HostType Split | Proposed | 2026-06-12 | extends 036 |
| ADR-047 | Generic Reconciliation Framework with Per-Type Managers | Proposed | 2026-06-12 | extends 036, 038 |
| ADR-048 | Unified Resource Dashboard and Shared lcm-core UI Components | Proposed | 2026-06-12 | – |
| ADR-049 | Unified Workflow DSL for Lifecycle / Step / Task Definitions | Proposed | 2026-06-12 | inline tasks body & validation → 057; data-flow → 058 |
| ADR-050 | Definition/Instance Duality and Two-Tier Instance Layering | Proposed | 2026-06-12 | extends 036; Form row partially → 059 |
| ADR-051 | Provisioning Sources and Asymmetric Definition Lifecycle | Proposed | 2026-06-12 | extends 050 |
| ADR-052 | Content-Authoring Taxonomy Import and Form Delivery | Proposed | 2026-06-12 | extends 050, 051; Form-delivery stance → 059 |
| ADR-053 | Authorization Policy Model Port | Proposed | 2026-06-12 | extends 050 |
| ADR-054 | Controller Topology by Resource Kind | Proposed (Rev 2) | 2026-06-12 | ← topology of 016, 017; re-scopes 002, 035; Rev 2 adds form-/host-controller (059) |
| ADR-055 | Per-Resource-Kind Lifecycle State Machines | Proposed | 2026-06-13 | extends 047, 050 |
| ADR-056 | ADR Lifecycle & Supersession Conventions | Proposed | 2026-06-13 | – |
| ADR-057 | Content-Driven Lifecycle DSL – Primitives, Phases & scenarioFunctions | Proposed | 2026-06-13 | ← inline tasks body of 049 §2.1 & task-type list of 044 §2.8; extends 049, 044 |
| ADR-058 | Lifecycle Data-Flow & Variable Scopes | Proposed | 2026-06-13 | extends 057 |
| ADR-059 | Form as First-Class Synced Resource | Proposed | 2026-06-16 | ← Form-delivery of 052 & Form row of 050; generalises 028; extends 051; related 046, 054 |
Title note: ADR-044's filename is `ADR-044-content-driven-lifecycle-engine.md` (and the mkdocs nav label still reads "Content-Driven Lifecycle Engine"), but its current H1 is "ScenarioEngine – Pod Automation as a Separate Service" (Rev 2). The table above uses the current H1; the filename is retained to avoid breaking links.
Supersession chain
flowchart LR
A019[ADR-019 LabRecord] -->|binding| A020[ADR-020 Session Model]
A020 -->|state machine| A045[ADR-045 Multi-part]
A021[ADR-021 Child Entities] -->|part model| A045
A016[ADR-016 License ops] -->|topology| A054[ADR-054 Controller Topology]
A017[ADR-017 Lab ops] -->|topology| A054
A002[ADR-002 Scheduler] -.->|role re-scoped| A054
A035[ADR-035 Scheduler removal] -.->|role re-scoped| A054
A044R1[ADR-044 Rev 1 in-process] -->|→| A044[ADR-044 Rev 2 SE service]
A049[ADR-049 Unified DSL] -->|tasks body & validation| A057[ADR-057 Lifecycle DSL]
A044 -->|task-type list| A057
A057 -->|extends| A058[ADR-058 Data-flow scopes]
A052[ADR-052 Content taxonomy] -->|Form delivery| A059[ADR-059 Form as Resource]
A050[ADR-050 Def/Instance duality] -.->|Form row| A059
A028[ADR-028 Definition status] -.->|generalised| A059
A059 -.->|form-/host-controller| A054
classDef superseded fill:#fde68a,stroke:#b45309;
classDef current fill:#a7f3d0,stroke:#047857;
class A019,A020,A021,A016,A017,A002,A035,A044R1,A049,A052,A050,A028 superseded;
class A045,A054,A044,A057,A058,A059 current;
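The chain can also be queried programmatically. The following is a minimal sketch, not part of the project tooling: it transcribes only the labelled "supersedes" arrows from the flowchart (the ADR-044 revision edge, the "extends" edge and the dotted re-scoping edges are left out), and the names `SUPERSEDES`, `superseded_by` and `successors` are hypothetical.

```python
from collections import defaultdict

# (older ADR, newer ADR, superseded aspect), transcribed from the flowchart above.
SUPERSEDES = [
    ("ADR-019", "ADR-020", "binding"),
    ("ADR-020", "ADR-045", "state machine"),
    ("ADR-021", "ADR-045", "part model"),
    ("ADR-016", "ADR-054", "topology"),
    ("ADR-017", "ADR-054", "topology"),
    ("ADR-049", "ADR-057", "tasks body & validation"),
    ("ADR-044", "ADR-057", "task-type list"),
    ("ADR-052", "ADR-059", "Form delivery"),
]


def superseded_by(edges):
    """Map each older ADR to the (newer ADR, aspect) pairs that supersede it."""
    index = defaultdict(list)
    for older, newer, aspect in edges:
        index[older].append((newer, aspect))
    return dict(index)


def successors(adr, edges):
    """Follow supersession edges transitively, returning ADRs in discovery order."""
    index = superseded_by(edges)
    seen, stack = [], [adr]
    while stack:
        for newer, _aspect in index.get(stack.pop(), []):
            if newer not in seen:
                seen.append(newer)
                stack.append(newer)
    return seen
```

For example, `successors("ADR-019", SUPERSEDES)` walks the binding edge to ADR-020 and then the state-machine edge to ADR-045, which is a quick way to find the ADR to read for the current model.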
Status Definitions
| Status | Meaning |
|---|---|
| Proposed | Under discussion, not yet approved |
| Accepted | Decision made and should be followed |
| Superseded | Replaced by another ADR |
| Deprecated | No longer relevant |
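The index also uses qualified statuses such as "Accepted (Partially Superseded)" and "Proposed (Rev 2)". A small sketch of how a Status cell could be split into one of the four base statuses plus an optional qualifier is shown below; `AdrStatus` and `parse_status` are hypothetical names, not existing project code.

```python
import re
from enum import Enum
from typing import Optional, Tuple


class AdrStatus(Enum):
    """The four base statuses from the table above."""
    PROPOSED = "Proposed"
    ACCEPTED = "Accepted"
    SUPERSEDED = "Superseded"
    DEPRECATED = "Deprecated"


# Matches "Accepted", "Proposed (Rev 2)", "Accepted (Partially Superseded)".
_STATUS_CELL = re.compile(r"^(?P<base>\w+)(?:\s*\((?P<qualifier>[^)]*)\))?$")


def parse_status(cell: str) -> Tuple[AdrStatus, Optional[str]]:
    """Split a Status cell into its base status and optional qualifier."""
    match = _STATUS_CELL.match(cell.strip())
    if match is None:
        raise ValueError(f"unrecognised status cell: {cell!r}")
    # AdrStatus(...) raises ValueError for a base status not defined above.
    return AdrStatus(match["base"]), match["qualifier"]
```

A check like this could run in CI over the index table so that a misspelled or undefined status is caught before it is published.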
ADR Template
When creating new ADRs, use this template:
# ADR-NNN: Title
| Attribute | Value |
|-----------|-------|
| **Status** | Proposed |
| **Date** | YYYY-MM-DD |
| **Deciders** | Team/Person |
| **Related ADRs** | Links to related ADRs |
## Context
What is the issue that we're seeing that is motivating this decision or change?
## Decision
What is the change that we're proposing and/or doing?
## Rationale
Why is this decision being made? What alternatives were considered?
## Consequences
### Positive
- What becomes easier or possible?
### Negative
- What becomes harder or impossible?
### Risks
- What could go wrong?
## Implementation Notes
Technical details, code examples, configuration.
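Scaffolding a new ADR from this template can be automated. The sketch below is illustrative only: the helper names (`slugify`, `render_adr`, `write_adr`) are hypothetical, the section prompts are omitted for brevity, and the `ADR-NNN-slug.md` filename pattern is an assumption inferred from the ADR-044 filename mentioned above.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional

# Skeleton of the template above; section bodies are left empty for the author.
ADR_TEMPLATE = """\
# ADR-{number:03d}: {title}

| Attribute | Value |
|-----------|-------|
| **Status** | Proposed |
| **Date** | {day} |
| **Deciders** | {deciders} |
| **Related ADRs** | {related} |

## Context

## Decision

## Rationale

## Consequences

### Positive

### Negative

### Risks

## Implementation Notes
"""


def slugify(title: str) -> str:
    """Lower-case the title and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def render_adr(number: int, title: str, deciders: str,
               related: str = "-", today: Optional[date] = None) -> str:
    """Fill the template; new ADRs always start as Proposed."""
    day = (today or date.today()).isoformat()
    return ADR_TEMPLATE.format(number=number, title=title, day=day,
                               deciders=deciders, related=related)


def write_adr(directory: str, number: int, title: str, deciders: str) -> Path:
    """Write the rendered ADR, refusing to overwrite an existing file."""
    path = Path(directory) / f"ADR-{number:03d}-{slugify(title)}.md"
    if path.exists():
        raise FileExistsError(path)
    path.write_text(render_adr(number, title, deciders), encoding="utf-8")
    return path
```

Refusing to overwrite keeps an accidental reuse of an ADR number from clobbering an existing record; the new row still has to be added to the index table by hand.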
Dependency Graph
ADR-001 (API-Centric)
├── ADR-002 (Scheduler) ─────┐
│       └── ADR-006 (HA) ────┤
│                            │
├── ADR-005 (State Store) ───┘
│       └── ADR-006 (HA)
│
└── ADR-013 (SSE Improvements)
        └── no controller-direct Redis
ADR-003 (CloudEvents)
└── ADR-004 (Ports)
ADR-007 (Templates) – standalone
ADR-008 (Draining)
└── ADR-002 (Scheduler)
ADR-009 (Shared Core)
└── ADR-010 (Neuroglia Unification)
ADR-010 (Neuroglia Unification)
├── ADR-011 (APScheduler Removal)
└── ADR-012 (Dynamic Region Config)
ADR-011 (APScheduler Removal)
└── controller-based execution replaces jobs
ADR-012 (Dynamic Region Config)
└── SystemSettings + WorkerReconciler._run_discovery_loop()
ADR-013 (SSE Improvements)
└── batching, filtering, extended events
ADR-018 (LDS Integration)
├── ADR-017 (Lab Operations via Lablet-Controller)
├── ADR-020 (Session Entity Model) – amends terminology
└── ADR-022 (CloudEvent Ingestion) – amends routing
ADR-019 (LabRecord)
└── ADR-020 (Session Entity Model) – supersedes binding model
ADR-020 (Session Entity Model)
├── ADR-018 (LDS Integration)
├── ADR-019 (LabRecord) – partially supersedes
└── ADR-021 (Child Entities)
ADR-021 (Child Entity Architecture)
├── ADR-020 (Session Entity Model)
└── ADR-022 (CloudEvent Ingestion)
ADR-022 (CloudEvent Ingestion)
├── ADR-003 (CloudEvents)
├── ADR-015 (Control Plane API No External Calls)
└── ADR-018 (LDS Integration) – amends §7
# Content Synchronization cluster (ADR-023–028)
ADR-023 (Content Sync Trigger)
├── ADR-005 (Dual State Store) – extends etcd key namespace
├── ADR-015 (CPA No External Calls)
├── ADR-017 (Lab Operations) – extends reconciliation pattern
├── ADR-024 (Package Storage)
├── ADR-025 (Content Metadata)
└── ADR-026 (Upstream Notifier)
ADR-024 (Package Storage in RustFS)
└── ADR-025 (Content Metadata) – complementary
ADR-025 (Content Metadata in MongoDB)
├── ADR-005 (Dual State Store)
└── ADR-024 (Package Storage) – complementary
ADR-026 (Upstream Notifier Pattern)
└── ADR-018 (LDS Integration)
ADR-027 (Version Auto-Increment)
├── ADR-023 (Content Sync Trigger)
└── ADR-028 (Definition Initial Status)
ADR-028 (Definition Initial Status)
├── ADR-023 (Content Sync Trigger)
└── ADR-027 (Version Auto-Increment)
ADR-029 (Port Template Extraction)
├── ADR-025 (Content Metadata Storage)
└── ADR-028 (Definition Initial Status)
ADR-030 (Resource & Port Observation – Learn from Live)
├── ADR-004 (Port Allocation per Worker)
├── ADR-017 (Lab Operations via Lablet-Controller)
├── ADR-020 (Session Entity Model)
└── ADR-029 (Port Template Extraction)
# Instantiation Pipeline cluster (ADR-031–033)
ADR-031 (Checkpoint Pipeline)
├── ADR-004 (Port Allocation per Worker)
├── ADR-017 (Lab Operations via Lablet-Controller)
├── ADR-020 (Session Entity Model)
├── ADR-029 (Port Template Extraction)
├── ADR-030 (Resource Observation)
├── ADR-032 (Port Allocation on LabRecord)
└── ADR-033 (CML Node Tag Sync)
ADR-032 (Port Allocation as LabRecord Topology)
├── ADR-004 (Port Allocation per Worker)
├── ADR-019 (LabRecord as AggregateRoot)
├── ADR-020 (Session Entity Model)
├── ADR-029 (Port Template Extraction)
├── ADR-031 (Checkpoint Pipeline)
└── ADR-033 (CML Node Tag Sync)
ADR-033 (CML Node Tag Sync)
├── ADR-004 (Port Allocation per Worker)
├── ADR-017 (Lab Operations via Lablet-Controller)
├── ADR-029 (Port Template Extraction)
├── ADR-031 (Checkpoint Pipeline)
└── ADR-032 (Port Allocation on LabRecord)
ADR-039 (SSE Race Condition Fix)
├── ADR-013 (SSE Protocol Improvements)
└── ADR-001 (API-Centric State Management)
ADR-040 (LDS CloudEvent Direct Ingestion via CPA)
├── ADR-003 (CloudEvents)
├── ADR-015 (CPA No External Calls)
├── ADR-018 (LDS Integration)
└── ADR-022 (CloudEvent Ingestion) – amends (dual routing)
# Worker / runtime hardening (ADR-041–043)
ADR-041 (WebSocket-Based CML Worker Monitoring)
└── ADR-013 (SSE Protocol Improvements)
ADR-042 (CommandHandlerBase Dependency Simplification)
└── ADR-010 (Service Unification on Neuroglia)
ADR-043 (Startup State Reconciliation and Discovery Separation)
├── ADR-012 (Dynamic Region Configuration)
└── ADR-014 (Worker Orphan Detection)
# Generalized resource-plane cluster (current model – ADR-036 + 044–054, all Proposed)
ADR-036 (Resource Management Abstraction Layer) – layered state base
├── ADR-037 (Timeslot Management)
├── ADR-046 (Host / PodType/HostType Split) – extends
├── ADR-047 (Generic Reconciliation Framework) – extends (with ADR-038)
└── ADR-050 (Definition/Instance Duality) – extends
ADR-044 (ScenarioEngine – Pod Automation as a Separate Service, Rev 2)
├── supersedes Rev 1 (in-process design)
└── ADR-049 (Unified Workflow DSL) – job/step description
ADR-045 (Multi-part Session / Part Model)
├── ADR-036 (Resource Abstraction)
├── ADR-020 (Session Entity Model) – supersedes (state machine → part level)
├── ADR-021 (Child Entity Architecture) – supersedes (part model)
├── ADR-046 (Host / Type split)
└── ADR-047 (Generic Reconciliation)
ADR-047 (Generic Reconciliation Framework)
├── ADR-036 (Resource Abstraction)
├── ADR-038 (Step Handler Registry) – extends
└── ADR-054 (Controller Topology) – maps managers → services
ADR-050 (Definition/Instance Duality)
├── ADR-036 (Resource Abstraction) – extends
├── ADR-051 (Provisioning Sources) – extends
├── ADR-052 (Content-Authoring Taxonomy) – extends
└── ADR-053 (Authorization Policy Model) – extends
ADR-051 (Provisioning Sources)
├── ADR-050 (Definition/Instance Duality)
└── ADR-023–028 (Content Sync cluster) – reconciles content_package source
ADR-052 (Content-Authoring Taxonomy)
├── ADR-050 / ADR-051
├── ADR-044 (Content-driven lifecycle / SE)
└── ADR-045 (Multi-part – supplies the Forms parts select)
ADR-053 (Authorization Policy Model)
├── ADR-050 (Definition/Instance Duality)
└── ADR-001 (API-Centric State Management)
ADR-054 (Controller Topology by Resource Kind)
├── ADR-047 (Per-type managers) – extends
├── ADR-016 (License ops) – supersedes topology
├── ADR-017 (Lab ops) – supersedes topology
├── ADR-002 (Resource Scheduler) – re-scopes role
├── ADR-035 (Legacy Scheduler Removal) – re-scopes role
└── ADR-046 (Host adapters live inside pod-controller)
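The trees above list each ADR with the ADRs it relates to, so the same ADR appears under several roots and some pairs reference each other. As a rough sketch, assuming the tree is transcribed into a plain dictionary (only the Instantiation Pipeline cluster is copied here, and `RELATED`, `referenced_by` and `mutual_pairs` are hypothetical names), reverse lookups and mutual references can be computed like this:

```python
# Root ADR -> ADRs listed beneath it (Instantiation Pipeline cluster only).
RELATED = {
    "ADR-031": ["ADR-004", "ADR-017", "ADR-020", "ADR-029",
                "ADR-030", "ADR-032", "ADR-033"],
    "ADR-032": ["ADR-004", "ADR-019", "ADR-020", "ADR-029",
                "ADR-031", "ADR-033"],
    "ADR-033": ["ADR-004", "ADR-017", "ADR-029", "ADR-031", "ADR-032"],
}


def referenced_by(graph, target):
    """Return the roots whose tree lists `target`, sorted by ADR id."""
    return sorted(adr for adr, related in graph.items() if target in related)


def mutual_pairs(graph):
    """Return pairs of ADRs that each list the other, as sorted tuples."""
    pairs = set()
    for adr, related in graph.items():
        for other in related:
            if adr in graph.get(other, []):
                pairs.add(tuple(sorted((adr, other))))
    return sorted(pairs)
```

Because the graph contains mutual references, it is not a strict dependency order; a reverse lookup such as `referenced_by(RELATED, "ADR-017")` is mainly useful for finding which ADRs to revisit when one of them changes.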