Experiments¶

KRATOS should evaluate whether workload history and CUDA metrics improve GPU placement compared with resource-only scheduling.

Baselines¶

Compare against:

Kubernetes default scheduler
Volcano FIFO
Volcano priority scheduling
Volcano fair sharing
Volcano with KRATOS node-selection hints

Workload Classes¶

Initial classes:

compute-bound
memory-bound
tensor-core-bound

Future classes may include communication-bound, IO-bound, and mixed workloads.

Metrics¶

Track throughput, makespan, waiting time, completion time, GPU utilization, memory utilization, active workloads, power consumption, temperature, and profile reuse rate.

For distributed scenarios, also track bandwidth, latency, and topology effects.

Proofs of Concept¶

Nsight Compute Kubernetes PoC

Keys	Action
`?`	Open this help
`n`	Next page
`p`	Previous page
`s`	Search