Architecture Decision Records (ADRs)

This directory contains Architecture Decision Records (ADRs) for the Lablet Cloud Manager's Lablet Resource Manager expansion.

ADR Index

| ADR | Title | Status | Date |
|-----|-------|--------|------|
| ADR-001 | API-Centric State Management | Accepted | 2026-01-15 |
| ADR-002 | Separate Resource Scheduler Service | Accepted | 2026-01-15 |
| ADR-003 | CloudEvents for External Integration | Accepted | 2026-01-15 |
| ADR-004 | Port Allocation per Worker | Accepted | 2026-01-15 |
| ADR-005 | Dual State Store Architecture (etcd + MongoDB) | Accepted | 2026-01-16 |
| ADR-006 | Resource Scheduler High Availability Coordination | Accepted | 2026-01-16 |
| ADR-007 | Worker Template Seeding and Management | Accepted | 2026-01-15 |
| ADR-008 | Worker Draining State for Scale-Down | Accepted | 2026-01-16 |
| ADR-009 | Shared Core Package Architecture | Accepted | 2026-01-16 |
| ADR-010 | Service Unification on Neuroglia Framework | Accepted | 2026-01-17 |
| ADR-011 | APScheduler Removal and Controller Migration | Accepted | 2026-01-19 |
| ADR-012 | Dynamic Region Configuration | Accepted | 2026-01-19 |
| ADR-013 | SSE Protocol Improvements | Accepted | 2026-01-19 |
| ADR-014 | Worker Orphan Detection and Garbage Collection | Accepted | 2026-02-06 |
| ADR-015 | Control Plane API Must Not Call AWS EC2 | Accepted | 2026-02-06 |
| ADR-016 | License Operations via Worker-Controller | Accepted | 2026-02-06 |
| ADR-017 | Lab Operations via Lablet-Controller | Accepted | 2026-02-06 |
| ADR-018 | Lab Delivery System (LDS) Integration | Accepted | 2025-02-10 |
| ADR-019 | LabRecord as Independent AggregateRoot | Accepted (Partially Superseded) | 2026-02-10 |
| ADR-020 | Session Entity Model Redesign | Accepted | 2026-02-18 |
| ADR-021 | Child Entity Architecture for Session Tracking | Accepted | 2026-02-18 |
| ADR-022 | CloudEvent Ingestion via Lablet-Controller | Accepted | 2026-02-18 |
| ADR-023 | Content Sync Trigger via Reactive etcd Watch | Accepted | 2026-02-25 |
| ADR-024 | Content Package Storage in RustFS | Accepted | 2026-02-25 |
| ADR-025 | Content Metadata Storage in MongoDB | Accepted | 2026-02-25 |
| ADR-026 | Extensible Upstream Notifier Pattern (Deferred) | Accepted | 2026-02-25 |
| ADR-027 | Version Auto-Increment on Content Change | Accepted | 2026-02-25 |
| ADR-028 | LabletDefinition Initial Status (PENDING_SYNC) | Accepted | 2026-02-25 |
| ADR-029 | Port Template Extraction from CML YAML | Accepted | 2026-02-25 |
| ADR-030 | Resource & Port Observation - "Learn from Live" | Accepted | 2026-02-28 |
| ADR-031 | Checkpoint-Based Instantiation Pipeline | Accepted | 2026-03-02 |
| ADR-032 | Port Allocation as LabRecord Topology Concern | Accepted | 2026-03-02 |
| ADR-033 | CML Node Tag Sync with Allocated Ports | Accepted | 2026-03-02 |
| ADR-034 | Pipeline Executor & Lifecycle Phase Handlers | Proposed | 2026-03-02 |
| ADR-035 | Legacy SchedulerService Removal | Accepted | 2026-03-04 |
| ADR-036 | Resource Management Abstraction Layer | Accepted | 2026-03-10 |
| ADR-037 | Timeslot Management | Accepted | 2026-03-10 |
| ADR-038 | Step Handler Registry & Reconciler Decomposition | Accepted | 2026-03-18 |
| ADR-039 | SSE Race Condition Fix | Accepted | 2026-04-10 |
| ADR-040 | LDS CloudEvent Direct Ingestion via CPA | Accepted | 2026-04-10 |

Status Definitions

| Status | Meaning |
|--------|---------|
| Proposed | Under discussion, not yet approved |
| Accepted | Decision made and should be followed |
| Superseded | Replaced by another ADR |
| Deprecated | No longer relevant |

ADR Template

When creating new ADRs, use this template:

# ADR-NNN: Title

| Attribute | Value |
|-----------|-------|
| **Status** | Proposed |
| **Date** | YYYY-MM-DD |
| **Deciders** | Team/Person |
| **Related ADRs** | Links to related ADRs |

## Context

What is the issue that we're seeing that is motivating this decision or change?

## Decision

What is the change that we're proposing and/or doing?

## Rationale

Why is this decision being made? What alternatives were considered?

## Consequences

### Positive
- What becomes easier or possible?

### Negative
- What becomes harder or impossible?

### Risks
- What could go wrong?

## Implementation Notes

Technical details, code examples, configuration.
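
As a rough illustration of how the template above might be applied, the sketch below picks the next free ADR number and renders a skeleton in Proposed status. The helper names (`next_adr_number`, `slugify`, `render_adr`) and the `ADR-NNN-slug.md` file naming are assumptions made for this example, not conventions stated in this directory.

```python
import re
from datetime import date

# Mirrors the template shown above; the fields in braces are filled in by render_adr().
ADR_TEMPLATE = """\
# ADR-{number:03d}: {title}

| Attribute | Value |
|-----------|-------|
| **Status** | Proposed |
| **Date** | {today} |
| **Deciders** | {deciders} |
| **Related ADRs** | {related} |

## Context

## Decision

## Rationale

## Consequences

### Positive

### Negative

### Risks

## Implementation Notes
"""


def next_adr_number(existing_names):
    """Return one more than the highest ADR-NNN prefix found in the given names."""
    numbers = []
    for name in existing_names:
        match = re.match(r"ADR-(\d{3})", name)
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers, default=0) + 1


def slugify(title):
    """Lowercase the title and collapse runs of non-alphanumerics into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def render_adr(number, title, deciders="Team/Person", related="None", today=None):
    """Return (file_name, markdown_text) for a new ADR in Proposed status."""
    today = today or date.today().isoformat()
    text = ADR_TEMPLATE.format(
        number=number, title=title, today=today, deciders=deciders, related=related
    )
    # Hypothetical naming scheme; adjust to whatever the directory actually uses.
    file_name = f"ADR-{number:03d}-{slugify(title)}.md"
    return file_name, text
```

Calling `render_adr(next_adr_number(names), "Example Decision")` with the existing file names returns a file name and Markdown body. Writing the file and adding a row to the ADR Index are left as manual steps.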

Dependency Graph

ADR-001 (API-Centric)
    ├── ADR-002 (Scheduler) ─────┐
    │       └── ADR-006 (HA) ◄───┤
    │                            │
    ├── ADR-005 (State Store) ◄──┘
    │       └── ADR-006 (HA)
    │
    └── ADR-013 (SSE Improvements)
            └── no controller-direct Redis

ADR-003 (CloudEvents)
    └── ADR-004 (Ports)

ADR-007 (Templates) ← standalone

ADR-008 (Draining)
    └── ADR-002 (Scheduler)

ADR-009 (Shared Core)
    └── ADR-010 (Neuroglia Unification)

ADR-010 (Neuroglia Unification)
    ├── ADR-011 (APScheduler Removal)
    └── ADR-012 (Dynamic Region Config)

ADR-011 (APScheduler Removal)
    └── controller-based execution replaces jobs

ADR-012 (Dynamic Region Config)
    └── SystemSettings + WorkerReconciler._run_discovery_loop()

ADR-013 (SSE Improvements)
    └── batching, filtering, extended events

ADR-018 (LDS Integration)
    ├── ADR-017 (Lab Operations via Lablet-Controller)
    ├── ADR-020 (Session Entity Model) ← amends terminology
    └── ADR-022 (CloudEvent Ingestion) ← amends routing

ADR-019 (LabRecord)
    └── ADR-020 (Session Entity Model) ← supersedes binding model

ADR-020 (Session Entity Model)
    ├── ADR-018 (LDS Integration)
    ├── ADR-019 (LabRecord) ← partially supersedes
    └── ADR-021 (Child Entities)

ADR-021 (Child Entity Architecture)
    ├── ADR-020 (Session Entity Model)
    └── ADR-022 (CloudEvent Ingestion)

ADR-022 (CloudEvent Ingestion)
    ├── ADR-003 (CloudEvents)
    ├── ADR-015 (Control Plane API No External Calls)
    └── ADR-018 (LDS Integration) ← amends §7

# Content Synchronization cluster (ADR-023–028)
ADR-023 (Content Sync Trigger)
    ├── ADR-005 (Dual State Store) ← extends etcd key namespace
    ├── ADR-015 (CPA No External Calls)
    ├── ADR-017 (Lab Operations) ← extends reconciliation pattern
    ├── ADR-024 (Package Storage)
    ├── ADR-025 (Content Metadata)
    └── ADR-026 (Upstream Notifier)

ADR-024 (Package Storage in RustFS)
    └── ADR-025 (Content Metadata) ← complementary

ADR-025 (Content Metadata in MongoDB)
    ├── ADR-005 (Dual State Store)
    └── ADR-024 (Package Storage) ← complementary

ADR-026 (Upstream Notifier Pattern)
    └── ADR-018 (LDS Integration)

ADR-027 (Version Auto-Increment)
    ├── ADR-023 (Content Sync Trigger)
    └── ADR-028 (Definition Initial Status)

ADR-028 (Definition Initial Status)
    ├── ADR-023 (Content Sync Trigger)
    └── ADR-027 (Version Auto-Increment)

ADR-029 (Port Template Extraction)
    ├── ADR-025 (Content Metadata Storage)
    └── ADR-028 (Definition Initial Status)

ADR-030 (Resource & Port Observation - Learn from Live)
    ├── ADR-004 (Port Allocation per Worker)
    ├── ADR-017 (Lab Operations via Lablet-Controller)
    ├── ADR-020 (Session Entity Model)
    └── ADR-029 (Port Template Extraction)

# Instantiation Pipeline cluster (ADR-031–033)
ADR-031 (Checkpoint Pipeline)
    ├── ADR-004 (Port Allocation per Worker)
    ├── ADR-017 (Lab Operations via Lablet-Controller)
    ├── ADR-020 (Session Entity Model)
    ├── ADR-029 (Port Template Extraction)
    ├── ADR-030 (Resource Observation)
    ├── ADR-032 (Port Allocation on LabRecord)
    └── ADR-033 (CML Node Tag Sync)

ADR-032 (Port Allocation as LabRecord Topology)
    ├── ADR-004 (Port Allocation per Worker)
    ├── ADR-019 (LabRecord as AggregateRoot)
    ├── ADR-020 (Session Entity Model)
    ├── ADR-029 (Port Template Extraction)
    ├── ADR-031 (Checkpoint Pipeline)
    └── ADR-033 (CML Node Tag Sync)

ADR-033 (CML Node Tag Sync)
    ├── ADR-004 (Port Allocation per Worker)
    ├── ADR-017 (Lab Operations via Lablet-Controller)
    ├── ADR-029 (Port Template Extraction)
    ├── ADR-031 (Checkpoint Pipeline)
    └── ADR-032 (Port Allocation on LabRecord)

ADR-039 (SSE Race Condition Fix)
    ├── ADR-013 (SSE Protocol Improvements)
    └── ADR-001 (API-Centric State Management)

ADR-040 (LDS CloudEvent Direct Ingestion via CPA)
    ├── ADR-003 (CloudEvents)
    ├── ADR-015 (CPA No External Calls)
    ├── ADR-018 (LDS Integration)
    └── ADR-022 (CloudEvent Ingestion) ← amends (dual routing)
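
The graph above lists, for each ADR, the ADRs it relates to, and several entries point at each other (for example ADR-031, ADR-032 and ADR-033). A minimal sketch of answering "which ADRs reference this one?" follows. It hand-copies a small subset of the entries above into a dictionary; the `RELATED` mapping and both function names are illustrative assumptions, not tooling that exists in this directory.

```python
from collections import defaultdict

# Subset of the graph above: each ADR maps to the ADRs listed beneath it.
RELATED = {
    "ADR-031": ["ADR-004", "ADR-017", "ADR-020", "ADR-029", "ADR-030", "ADR-032", "ADR-033"],
    "ADR-032": ["ADR-004", "ADR-019", "ADR-020", "ADR-029", "ADR-031", "ADR-033"],
    "ADR-033": ["ADR-004", "ADR-017", "ADR-029", "ADR-031", "ADR-032"],
    "ADR-039": ["ADR-013", "ADR-001"],
    "ADR-040": ["ADR-003", "ADR-015", "ADR-018", "ADR-022"],
}


def referenced_by(graph):
    """Invert the mapping: for each ADR, list the ADRs that name it."""
    reverse = defaultdict(list)
    for source, targets in graph.items():
        for target in targets:
            reverse[target].append(source)
    return {adr: sorted(sources) for adr, sources in reverse.items()}


def mutual_links(graph):
    """Return sorted pairs of ADRs that list each other."""
    pairs = set()
    for source, targets in graph.items():
        for target in targets:
            if source in graph.get(target, []):
                pairs.add(tuple(sorted((source, target))))
    return sorted(pairs)


if __name__ == "__main__":
    print(referenced_by(RELATED)["ADR-004"])
    print(mutual_links(RELATED))
```

A reverse lookup like this is mainly useful when amending or superseding an ADR, since it shows which later records may need a matching update.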