Resource Scheduler Guide
Documentation In Progress
This service guide is a placeholder. Full documentation is being developed.
Overview
The Resource Scheduler is responsible for intelligent placement of LabletInstances on available CML Workers. It implements scheduling algorithms that consider resource requirements, license affinity, and time-based reservations.
Architecture
See Resource Scheduler Architecture for detailed design.
Core Responsibilities
| Responsibility | Description |
|---|---|
| Scheduling Decisions | Determine optimal worker placement for LabletInstances |
| Timeslot Management | Handle time-windowed reservations |
| Capacity Tracking | Monitor available resources per worker (via etcd) |
| HA Coordination | Use leader election so that only one scheduler instance is active at a time |
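To make the scheduling-decision responsibility concrete, here is a minimal sketch of one possible placement rule: filter workers by free capacity and license, then pick the tightest fit. The `Worker` and `InstanceRequest` types, their field names, and the best-fit rule are all assumptions made for illustration; this guide does not specify the scheduler's actual algorithm or data model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Worker:
    # Hypothetical capacity record for a CML Worker; field names are assumptions.
    name: str
    free_cpu: int
    free_memory_gb: int
    licenses: frozenset = frozenset()


@dataclass(frozen=True)
class InstanceRequest:
    # Hypothetical view of what a LabletInstance needs.
    cpu: int
    memory_gb: int
    required_license: Optional[str] = None


def fits(worker: Worker, req: InstanceRequest) -> bool:
    """Return True if the worker has enough free capacity and the needed license."""
    if worker.free_cpu < req.cpu or worker.free_memory_gb < req.memory_gb:
        return False
    return req.required_license is None or req.required_license in worker.licenses


def choose_worker(workers: Iterable[Worker], req: InstanceRequest) -> Optional[Worker]:
    """Best fit: among feasible workers, pick the one with the least leftover capacity."""
    feasible = [w for w in workers if fits(w, req)]
    if not feasible:
        return None  # nothing can host the instance right now
    return min(
        feasible,
        key=lambda w: (w.free_cpu - req.cpu, w.free_memory_gb - req.memory_gb, w.name),
    )
```

Best fit keeps large workers free for large requests. A real scheduler might instead spread load or weight license affinity differently, so treat the scoring key as a placeholder.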
Key Flows
Instance Scheduling Flow
sequenceDiagram
participant CP as Control Plane API
participant RS as Resource Scheduler
participant etcd as etcd (State Store)
CP->>etcd: Write: LabletInstance (status=PENDING)
RS->>etcd: Watch: LabletInstances
etcd-->>RS: Notify: New PENDING instance
RS->>etcd: Read: Worker capacity
RS->>RS: Calculate optimal placement
RS->>etcd: Write: Assignment (instance→worker)
RS->>CP: POST: Update instance status
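The flow above can be approximated without a running etcd cluster. The sketch below uses a plain dictionary as a stand-in for the state store and walks one scheduling pass: find PENDING instances, ask a placement function for a worker, then write the assignment. The key prefixes (`/workers/`, `/instances/`, `/assignments/`), the `free_slots` field, and the `SCHEDULED` status value are hypothetical and are not taken from the service. In the real flow the status change is reported to the Control Plane API, which this sketch replaces with a direct write.

```python
from typing import Callable, Dict, List, Optional

# In-memory stand-in for the etcd state store (assumed key layout).
Store = Dict[str, dict]
PlaceFn = Callable[[dict, Dict[str, dict]], Optional[str]]


def first_with_free_slot(instance: dict, workers: Dict[str, dict]) -> Optional[str]:
    """Trivial placement rule: first worker (by key) that still has a free slot."""
    for key in sorted(workers):
        if workers[key]["free_slots"] > 0:
            return key
    return None


def schedule_pending(store: Store, place: PlaceFn) -> List[str]:
    """Run one scheduling pass and return the keys of instances that were assigned."""
    assigned = []
    workers = {k: v for k, v in store.items() if k.startswith("/workers/")}
    for key in sorted(store):  # sorted() copies the keys, so writes below are safe
        instance = store[key]
        if not key.startswith("/instances/") or instance.get("status") != "PENDING":
            continue
        worker_key = place(instance, workers)
        if worker_key is None:
            continue  # no capacity; leave the instance PENDING for a later pass
        # Reserve capacity, then record the instance-to-worker assignment.
        workers[worker_key]["free_slots"] -= 1
        instance_id = key.rsplit("/", 1)[-1]
        store["/assignments/" + instance_id] = {"worker": worker_key}
        instance["status"] = "SCHEDULED"  # assumed status name
        assigned.append(key)
    return assigned
```

A production scheduler would react to watch events rather than scan the whole store, and would write the assignment transactionally so that two passes cannot claim the same capacity.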
API Endpoints
Internal Service
The Resource Scheduler operates primarily through etcd watches. It exposes only a limited REST API for health and status checks.
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| GET | /ready | Readiness check |
| GET | /metrics | Prometheus metrics |
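As an illustration of how the health and readiness endpoints might behave, the sketch below uses only the Python standard library. The response bodies, the 503 status for "not ready", the `ready` flag, and port 8080 are assumptions; this guide does not document the actual payloads or the service's web framework. The `/metrics` endpoint is left out because its output depends on the metrics library in use.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def route(path: str, ready: bool):
    """Map a request path to a (status code, JSON body) pair."""
    if path == "/health":
        # Liveness: the process is up and able to answer requests.
        return 200, {"status": "ok"}
    if path == "/ready":
        # Readiness: report 503 until the scheduler can do useful work.
        return (200, {"ready": True}) if ready else (503, {"ready": False})
    return 404, {"error": "not found"}


class StatusHandler(BaseHTTPRequestHandler):
    # Hypothetical flag; set to True once startup is complete
    # (for example, after the etcd watch is established).
    ready = False

    def do_GET(self):
        status, body = route(self.path, self.ready)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


# To serve (assumed port):
#   HTTPServer(("0.0.0.0", 8080), StatusHandler).serve_forever()
```

Keeping the routing in a pure function makes the status logic easy to test without opening a socket.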
Configuration
Key environment variables:
| Variable | Description | Default |
|---|---|---|
| SCHEDULER_ENABLED | Enable scheduling | true |
| SCHEDULER_LEAD_TIME_MINUTES | Advance scheduling window | 35 |
| ETCD_ENDPOINTS | etcd cluster endpoints | http://etcd:2379 |
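The following sketch shows one way a service could read these variables and apply the documented defaults. The `SchedulerConfig` type, the accepted truthy strings, and the comma-separated endpoint format are assumptions. `due_for_scheduling` also assumes a particular meaning for the advance scheduling window, namely that an instance becomes eligible once the current time is within the lead time of its reservation start; this guide does not define that behavior.

```python
import os
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Mapping, Tuple


@dataclass(frozen=True)
class SchedulerConfig:
    enabled: bool
    lead_time: timedelta
    etcd_endpoints: Tuple[str, ...]


def load_config(env: Mapping[str, str] = os.environ) -> SchedulerConfig:
    """Read the three documented variables, falling back to the documented defaults."""
    # Assumption: "1", "true", and "yes" (any case) all enable scheduling.
    enabled = env.get("SCHEDULER_ENABLED", "true").strip().lower() in ("1", "true", "yes")
    minutes = int(env.get("SCHEDULER_LEAD_TIME_MINUTES", "35"))
    # Assumption: multiple endpoints are given as a comma-separated list.
    raw = env.get("ETCD_ENDPOINTS", "http://etcd:2379")
    endpoints = tuple(e.strip() for e in raw.split(",") if e.strip())
    return SchedulerConfig(enabled, timedelta(minutes=minutes), endpoints)


def due_for_scheduling(start: datetime, now: datetime, cfg: SchedulerConfig) -> bool:
    """Assumed semantics: eligible once `now` is within `lead_time` of `start`."""
    return cfg.enabled and now >= start - cfg.lead_time
```

With the defaults, a reservation starting at 12:00 would become eligible at 11:25 under this assumed interpretation.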