Running Large-Scale GPU Workloads on Kubernetes with Slurm
Slinky, developed by SchedMD (now part of NVIDIA), integrates Slurm cluster management with Kubernetes. The integration supports large-scale GPU workloads on advanced NVIDIA architectures such as GB200 NVL72. Production deployments at NVIDIA have scaled Slinky to over 8,000 GPUs while maintaining performance parity with traditional Slurm clusters.
- Slinky enables native Slurm cluster management on Kubernetes using Custom Resource Definitions.
- It supports automated GPU management and topology-aware scheduling for advanced NVIDIA architectures.
- Production deployments at NVIDIA have demonstrated Slinky's ability to scale to over 8,000 GPUs.
Opening excerpt (first ~120 words)
Apr 09, 2026. By Anton Polyakov, Fagani Hajizada, Marlow Warnicke and Skyler Malinowski.
AI-generated summary: Slinky, developed by SchedMD (now part of NVIDIA), enables native Slurm cluster management on Kubernetes by representing all Slurm daemons as Kubernetes Custom Resource Definitions, supporting full Slurm lifecycle orchestration and high availability without relying on Slurm's native HA. Integration with the NVIDIA GPU Operator and DRA/ComputeDomains allows automated GPU management, topology-aware multinode scheduling, and per-job GPU monitoring, supporting advanced NVIDIA architectures like GB200 NVL72 with dynamic Internode Memory Exchange and topology…
Excerpt limited to ~120 words for fair-use compliance. The full article is at NVIDIA Technical Blog.