ADR-018: Lab Delivery System (LDS) Integration
- Status: Accepted (Amended)
- Date: 2025-02-10 (Amended 2026-02-18)
- Deciders: Platform Team
- Related: ADR-017 (Lab Operations via Lablet-Controller), ADR-015 (Control Plane API No External Calls), ADR-020 (Session Entity Model), ADR-021 (Child Entities), ADR-022 (CloudEvent Ingestion)
- Amended by: ADR-020 (LabletInstance → LabletSession), ADR-022 (CloudEvent routing)
Context
The Lablet Cloud Manager (LCM) provides lab infrastructure (CML workers and labs), but users need a separate system for interacting with labs:
- User-facing Web UI for task viewing and lab interaction
- Device access provisioning with console credentials
- Assessment and grading integration
- Response collection for evaluation
The Lab Delivery System (LDS) is an existing Cisco platform that provides these capabilities through a concept called "LabSession":
- LabSession: A user-facing session providing web UI, task display, and device access
- LabSessionPart: A segment of a LabSession tied to specific content (form_qualified_name)
- Device access info: Protocol, host, port, username/password for each lab device
Problem Statement
A LabletSession (formerly LabletInstance; see ADR-020) in LCM currently represents only the CML lab infrastructure. Users cannot:
- See their tasks or instructions
- Get device access credentials
- Submit responses for grading
- Access a unified web UI for the lab experience
Requirements
- When a LabletSession enters INSTANTIATING state, LCM must provision a corresponding LabSession in LDS
- Device access information must be derived from content.xml (device definitions) + cml.yaml (node labels) + allocated ports
- LDS content must be refreshed when a LabletDefinition is versioned
- LabSession must be archived when LabletSession is TERMINATED
Decision
1. LabletSession = CML Lab + LabSession + GradingEngine
A LabletSession is a composite resource consisting of:
- CML Lab: Infrastructure running on a CML worker (managed by lablet-controller → CML API)
- LabSession: User-facing session in LDS (managed by lablet-controller → LDS API)
- GradingEngine: Assessment and scoring (managed by lablet-controller → GradingSPI)
All three components are orchestrated through the LabletSession lifecycle (see ADR-020).
2. Lablet-Controller Owns All LDS Interactions
Following the principle established in ADR-015 and ADR-017, lablet-controller is the only LCM component that interacts with LDS:
| Component | LDS Access |
|---|---|
| control-plane-api | None (no external calls) |
| resource-scheduler | None (scheduling only) |
| worker-controller | None (worker metrics only) |
| lablet-controller | Full LDS API access |
3. Synchronous REST API Pattern
LDS integration uses synchronous REST calls (not CloudEvents) because:
- The LDS API is an internal service, not an external integration
- Immediate response needed for session creation
- Simpler error handling and retry logic
- Consistent with existing controller patterns
4. LabDeliverySPI Abstraction
A Service Provider Interface (SPI) abstracts LDS API details:
from typing import Protocol


class LabDeliverySPI(Protocol):
    """Service Provider Interface for Lab Delivery System integration."""

    async def create_session_with_part(
        self,
        session_id: str,
        form_qualified_name: str,
        lablet_instance_id: str,
    ) -> LabSessionInfo:
        """Create LDS session with initial content part."""
        ...

    async def set_devices(
        self,
        session_id: str,
        devices: list[DeviceAccessInfo],
    ) -> None:
        """Provision device access information for session."""
        ...

    async def get_session_info(
        self,
        session_id: str,
    ) -> LabSessionInfo:
        """Get current session state and login URL."""
        ...

    async def get_login_url(
        self,
        session_id: str,
    ) -> str:
        """Get user login URL for the session."""
        ...

    async def archive_session(
        self,
        session_id: str,
    ) -> None:
        """Archive session (called on TERMINATED)."""
        ...

    async def refresh_content(
        self,
        form_qualified_name: str,
    ) -> None:
        """Signal LDS to refresh content from S3 bucket."""
        ...
5. Device Mapping Flow
Device access credentials are derived through a multi-step mapping:
content.xml      →  cml.yaml            →  Port Allocation    →  DeviceAccessInfo
(device_labels)     (node annotations)     (allocated ports)     (final payload)
- content.xml defines devices with device_label identifiers
- cml.yaml contains nodes with matching device_label annotations
- Port Allocation (FR-2.2.4) assigns external ports to node protocols
- DeviceAccessInfo combines all data for LDS provisioning:
from dataclasses import dataclass


@dataclass
class DeviceAccessInfo:
    name: str          # Device label from content.xml
    protocol: str      # ssh, telnet, https, etc.
    host: str          # Worker public hostname
    port: int          # Allocated external port
    uri: str | None    # Optional full URI (for web consoles)
    username: str      # Device credentials
    password: str      # Device credentials
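The mapping described in this section amounts to a join on device_label followed by a port lookup. The following is a minimal sketch under stated assumptions: the input shapes (lists of dicts with keys such as "label", "device_label", and "node_id", and a port table keyed by node and protocol) and the function name build_device_access are hypothetical, and parsing of content.xml and cml.yaml is omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeviceAccessInfo:
    name: str
    protocol: str
    host: str
    port: int
    uri: Optional[str]
    username: str
    password: str


def build_device_access(
    content_devices: list[dict],
    cml_nodes: list[dict],
    allocated_ports: dict[tuple[str, str], int],
    worker_host: str,
) -> list[DeviceAccessInfo]:
    """Join content.xml devices to cml.yaml nodes by device_label, then attach ports.

    All input shapes are hypothetical illustrations, not the real file schemas.
    """
    nodes_by_label = {node["device_label"]: node for node in cml_nodes}
    result = []
    for dev in content_devices:
        node = nodes_by_label.get(dev["label"])
        if node is None:
            raise ValueError(f"no cml.yaml node with device_label {dev['label']!r}")
        key = (node["node_id"], dev["protocol"])
        if key not in allocated_ports:
            raise ValueError(f"no port allocated for {key}")
        result.append(
            DeviceAccessInfo(
                name=dev["label"],
                protocol=dev["protocol"],
                host=worker_host,
                port=allocated_ports[key],
                uri=None,  # only set for web consoles
                username=dev["username"],
                password=dev["password"],
            )
        )
    return result
```

Failing loudly on an unmatched label or a missing port is one possible design choice; it surfaces content packaging errors during INSTANTIATING instead of delivering a session with unreachable devices.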
6. State Decoupling
LabletSession states and LDS LabSession states are decoupled:
| LabletSession State | LDS LabSession State | Trigger | Notes |
|---|---|---|---|
| PENDING | - | - | No session created yet |
| SCHEDULED | - | - | Waiting for scheduling |
| INSTANTIATING | PENDING | lablet-controller | Session created, devices being provisioned |
| READY | PENDING | lablet-controller | Infrastructure ready, awaiting user login |
| RUNNING | RUNNING | LDS CloudEvent → lablet-controller | User has logged in |
| COLLECTING | RUNNING | lablet-controller | Assessment collection in progress |
| GRADING | USER_FINISHED | lablet-controller | GradingEngine scoring in progress |
| STOPPING | ARCHIVING | lablet-controller | Session being archived |
| STOPPED | ARCHIVED | lablet-controller | Session archived |
| TERMINATED | ARCHIVED | - | Emergency termination |
READY State
The READY state explicitly tracks when infrastructure is fully provisioned but the user has not yet logged in. This enables: (1) user engagement metrics, (2) no-show detection, (3) event-driven state transitions from LDS.
Independent Lifecycles
LDS session state may transition independently due to user actions or LDS-internal events. Lablet-controller observes LDS state during reconciliation but does not force transitions.
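Because the controller observes LDS state without forcing it, reconciliation can be reduced to comparing the observed LDS state against the table above and reporting any drift. The sketch below is illustrative only; the names EXPECTED_LDS_STATE and lds_state_drift are hypothetical, and the mapping simply transcribes the table.

```python
from typing import Optional

# Expected LDS LabSession state per LabletSession state, transcribed from the table.
# None means no LDS session is expected to exist yet.
EXPECTED_LDS_STATE: dict[str, Optional[str]] = {
    "PENDING": None,
    "SCHEDULED": None,
    "INSTANTIATING": "PENDING",
    "READY": "PENDING",
    "RUNNING": "RUNNING",
    "COLLECTING": "RUNNING",
    "GRADING": "USER_FINISHED",
    "STOPPING": "ARCHIVING",
    "STOPPED": "ARCHIVED",
    "TERMINATED": "ARCHIVED",
}


def lds_state_drift(lablet_state: str, observed_lds_state: Optional[str]) -> Optional[str]:
    """Return a description of drift, or None when the observation matches the table.

    The function only reports; it never forces an LDS transition.
    """
    expected = EXPECTED_LDS_STATE[lablet_state]
    if observed_lds_state == expected:
        return None
    return (
        f"LabletSession={lablet_state}: expected LDS state {expected}, "
        f"observed {observed_lds_state}"
    )
```

A reconciler could log or emit a metric for a non-None result, for example when LDS has already moved to RUNNING while the LabletSession is still READY because the CloudEvent has not yet been processed.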
7. CloudEvent Integration (Amended by ADR-022)
Amendment
The original design routed LDS CloudEvents to control-plane-api. Per ADR-022, all inbound CloudEvents are now routed to lablet-controller via its CloudEventIngestor. lablet-controller then calls control-plane-api's internal endpoints for state mutations.
LDS and GradingEngine emit CloudEvents to notify LCM of session state changes:
| Event Type | Source | Handler Location | Effect |
|---|---|---|---|
| lds.session.started | LDS | lablet-controller | UserSession → ACTIVE, READY → RUNNING |
| lds.session.ended | LDS | lablet-controller | UserSession → ENDED, trigger COLLECTING + grading |
| grading.session.completed | Grading Engine | lablet-controller | Create ScoreReport, → STOPPING |
| grading.session.failed | Grading Engine | lablet-controller | GradingSession → FAULTED |
Why lablet-controller, not control-plane-api:
- CloudEvent handlers need to call external systems (GradingSPI, LabDeliverySPI)
- ADR-015 forbids external calls from control-plane-api
- lablet-controller already owns LabletSession lifecycle orchestration
- See ADR-022 for detailed rationale
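One way to picture the routing inside lablet-controller is a dispatch table keyed by the CloudEvent type attribute. This is a hedged sketch, not the CloudEventIngestor described in ADR-022: the handler names, the dict-based event shape, and the returned labels are assumptions made for illustration, and the handler bodies only indicate where calls to control-plane-api or GradingSPI would go.

```python
import asyncio
from typing import Awaitable, Callable, Optional

Handler = Callable[[dict], Awaitable[str]]


async def on_session_started(event: dict) -> str:
    # Would set UserSession to ACTIVE and request READY -> RUNNING.
    return "RUNNING"


async def on_session_ended(event: dict) -> str:
    # Would set UserSession to ENDED and trigger collection and grading.
    return "COLLECTING"


async def on_grading_completed(event: dict) -> str:
    # Would create the ScoreReport and request the STOPPING transition.
    return "STOPPING"


async def on_grading_failed(event: dict) -> str:
    # Would mark the GradingSession as FAULTED.
    return "FAULTED"


HANDLERS: dict[str, Handler] = {
    "lds.session.started": on_session_started,
    "lds.session.ended": on_session_ended,
    "grading.session.completed": on_grading_completed,
    "grading.session.failed": on_grading_failed,
}


async def dispatch(event: dict) -> Optional[str]:
    """Route an event by its CloudEvents `type`; unknown types are ignored."""
    handler = HANDLERS.get(event.get("type", ""))
    if handler is None:
        return None
    return await handler(event)
```

Returning None for unknown types lets the ingestion endpoint acknowledge events it does not handle, which is one reasonable policy among several.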
sequenceDiagram
participant User
participant LDS as Lab Delivery System
participant LC as Lablet-Controller
participant CPA as Control-Plane-API
participant DB as MongoDB
User->>LDS: Login via login_url
LDS->>LDS: Validate session token
LDS->>LDS: Transition: PENDING → RUNNING
LDS->>LC: POST /api/events<br/>CloudEvent: lds.session.started
LC->>CPA: PUT /api/internal/sessions/{id}/user-session/status (ACTIVE)
LC->>CPA: POST /api/internal/sessions/{id}/transition (RUNNING)
CPA->>DB: Update LabletSession state
CPA-->>LC: 200 OK
8. New LabletDefinition Attributes
To support LDS integration, LabletDefinition requires:
| Attribute | Type | Description |
|---|---|---|
| form_qualified_name | string | Globally unique content identifier (e.g., "Exam CCNP ENCOR v2.3 LAB 2.3.4a") |
| content_bucket_name | string | Slugified form_qualified_name for S3/MinIO bucket (auto-derived) |
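The ADR does not specify the slug rule used to derive content_bucket_name. As an illustration only, the sketch below assumes a simple rule (lowercase, with each run of non-alphanumeric characters collapsed to one hyphen) and checks the S3 length limit of 3 to 63 characters; the function name derive_content_bucket_name is hypothetical.

```python
import re


def derive_content_bucket_name(form_qualified_name: str) -> str:
    """Hypothetical slug rule: lowercase, non-alphanumeric runs become one hyphen."""
    slug = re.sub(r"[^a-z0-9]+", "-", form_qualified_name.lower()).strip("-")
    # S3 bucket names must be between 3 and 63 characters long.
    if not 3 <= len(slug) <= 63:
        raise ValueError(f"derived bucket name has invalid length: {slug!r}")
    return slug


print(derive_content_bucket_name("Exam CCNP ENCOR v2.3 LAB 2.3.4a"))
# prints: exam-ccnp-encor-v2-3-lab-2-3-4a
```

Under this assumed rule, distinct names such as "LAB 2.3" and "LAB 2-3" collapse to the same slug, so a real implementation would need a collision check or a different encoding.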
9. LabletSession Attributes for LDS (Amended by ADR-020, ADR-021)
Amendment
Per ADR-020, LabletInstance is renamed to LabletSession. Per ADR-021, LDS session details
are now tracked in a separate UserSession entity rather than directly on LabletSession.
| Entity | Attribute | Type | Description |
|---|---|---|---|
| LabletSession | user_session_id | string | FK to UserSession entity |
| UserSession | lds_session_id | string | LDS LabSession identifier |
| UserSession | lds_part_id | string | LDS LabSessionPart identifier |
| UserSession | login_url | string | User-facing URL to access the session |
| UserSession | devices | list[DeviceAccessInfo] | Provisioned device access info |
10. Assessment Flow (Amended by ADR-021, ADR-022)
Assessment is now orchestrated by lablet-controller via GradingSPI, triggered by LDS CloudEvents:
LDS CloudEvent (session.ended) → lablet-controller → GradingSPI → GradingEngine
                                                                      │
GradingEngine CloudEvent ←────────────────────────────────────────────┘
(grading.session.completed)
        │
        └→ lablet-controller → Create ScoreReport → STOPPING
See ADR-021 for GradingSession and ScoreReport entity details. See ADR-022 for CloudEvent ingestion architecture.
Consequences
Positive
- Complete user experience: Users get unified web UI, tasks, and device access
- Clean architecture: Single controller (lablet-controller) handles all lab lifecycle including LDS and GradingEngine
- Separation of concerns: CML for infrastructure, LDS for user interaction, GradingEngine for assessment
- User engagement tracking: READY state enables metrics on login time, no-shows
- Event-driven transitions: CloudEvents for user-initiated state changes (via lablet-controller, per ADR-022)
- Extensibility: SPI pattern allows future LDS/GradingEngine provider implementations
- Consistent patterns: Follows established ADR-015/017 controller architecture
Negative
- Added complexity: Lablet-controller now manages three external systems (CML, LDS, GradingEngine)
- Failure modes: LDS or GradingEngine outages can affect LabletSession provisioning or grading
- Latency: Additional API calls during INSTANTIATING phase
- Dependency: System requires LDS availability for full functionality
Mitigations
- Graceful degradation: If LDS fails, LabletSession can still reach READY for CML-only access
- Retry logic: Transient LDS failures handled with exponential backoff
- Health monitoring: LDS availability tracked in lablet-controller health endpoint
- Timeout configuration: LDS_API_TIMEOUT is configurable per deployment
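The retry and timeout mitigations can be combined in a small wrapper around each LDS call. The following is a minimal sketch, assuming a hypothetical TransientLDSError exception for retryable failures and a helper named call_with_backoff; the default values are placeholders, and the timeout parameter stands in for whatever LDS_API_TIMEOUT is set to in a given deployment.

```python
import asyncio
import random


class TransientLDSError(Exception):
    """Hypothetical marker for retryable LDS failures (for example timeouts or 5xx)."""


async def call_with_backoff(op, *, attempts=4, base_delay=0.5, max_delay=8.0, timeout=10.0):
    """Await op() with a per-call timeout, retrying transient failures.

    `op` is a zero-argument callable returning a fresh coroutine on each call.
    The delay doubles per attempt, is capped at max_delay, and is jittered.
    """
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(op(), timeout=timeout)
        except (TransientLDSError, asyncio.TimeoutError):
            if attempt == attempts - 1:
                raise  # retries exhausted; let the reconciler decide what to do
            delay = min(max_delay, base_delay * 2 ** attempt)
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))
```

A caller might wrap an SPI call as `await call_with_backoff(lambda: lds.set_devices(session_id, devices))`. Non-transient errors propagate immediately, so only failures explicitly classified as retryable consume the retry budget.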
Implementation Notes
Content Package Structure
Content stored in S3/MinIO bucket (named by content_bucket_name):
{content_bucket_name}/
├── content.xml       # Device definitions, task structure
├── cml.yaml          # Lab topology with device_label annotations
├── tasks/            # Task content (HTML/Markdown)
└── assets/           # Images, scripts, etc.
LDS Provisioning Sequence
sequenceDiagram
participant LC as Lablet-Controller
participant API as Control-Plane-API
participant CML as CML Worker
participant LDS as Lab Delivery System
participant S3 as S3/MinIO
Note over LC: LabletSession enters INSTANTIATING
LC->>CML: Start lab (CML API)
CML-->>LC: Lab running
LC->>S3: Fetch content.xml
S3-->>LC: Device definitions
LC->>LDS: create_session_with_part()
LDS-->>LC: session_id, login_url
LC->>LDS: set_devices(session_id, devices)
LDS-->>LC: OK
LC->>API: POST /api/internal/sessions/{id}/user-session
API-->>LC: user_session_id
LC->>API: POST /api/internal/sessions/{id}/transition (READY)
Note over LC: LabletSession transitions to READY
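The LDS-facing part of the sequence above can be sketched as a single coroutine. This is an illustration under assumptions, not the controller's actual code: FakeLDS is a hypothetical stand-in for a LabDeliverySPI implementation, the dict returned by create_session_with_part is an assumed shape, and record_user_session and transition are hypothetical callables representing the two control-plane-api internal calls. Starting the CML lab and deriving device info are taken as already done.

```python
import asyncio


class FakeLDS:
    """Hypothetical stand-in for a LabDeliverySPI implementation (illustration only)."""

    def __init__(self):
        self.devices = {}

    async def create_session_with_part(self, session_id, form_qualified_name, lablet_instance_id):
        # Assumed response shape; the real LabSessionInfo fields are not specified here.
        return {
            "session_id": session_id,
            "login_url": f"https://lds.example.invalid/login/{session_id}",
        }

    async def set_devices(self, session_id, devices):
        self.devices[session_id] = list(devices)


async def provision(lds, session_id, form_qualified_name, devices, record_user_session, transition):
    """Create the LDS session, push devices, record the UserSession, then mark READY."""
    # Passing the same id for session_id and lablet_instance_id is an assumption.
    info = await lds.create_session_with_part(session_id, form_qualified_name, session_id)
    await lds.set_devices(info["session_id"], devices)
    await record_user_session(session_id, info)  # POST .../user-session
    await transition(session_id, "READY")        # POST .../transition
    return info
```

The ordering mirrors the diagram: the READY transition is requested only after both LDS calls and the UserSession record have succeeded, so a user never receives a login URL for a session without devices.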
Content Refresh Trigger
When a LabletDefinition is versioned (new content_bucket_name or updated content):
- Platform updates content in S3 bucket
- Platform calls LCM API to trigger refresh
- Lablet-controller calls refresh_content(form_qualified_name)
- LDS refreshes content from S3 bucket
Future Extensions
The LabDeliverySPI is designed to accommodate future capabilities, such as the feedback-collection methods sketched below. Assessment and grading capabilities are now tracked via GradingSession and ScoreReport entities (ADR-021).
# Feedback Collection (future)
async def collect_user_feedback_by_session(self, session_id: str) -> Feedback: ...
async def collect_user_feedback_by_form(self, form_qualified_name: str) -> list[Feedback]: ...