Talk

Dynamic Resource Allocation in Kubernetes: A New Paradigm for Device Allocation and Sharing Beyond GPUs

Abstract

Kubernetes has become the de facto standard for orchestrating cloud workloads, but its traditional device plugin model struggles to keep pace with the growing diversity of hardware accelerators such as GPUs, DPUs, high-speed networking devices, and emerging AI chips. Its static allocation limits flexibility, resource efficiency, and multi-tenancy. This talk introduces Dynamic Resource Allocation (DRA), a groundbreaking approach that enables fine-grained, on-demand allocation and sharing of devices across workloads, with topology-aware scheduling to optimize performance on complex hardware interconnects. We will dive into the architecture and design principles behind DRA, showcase real-world use cases, and discuss its implications for telco, HPC, and AI. Attendees will learn how DRA can unlock better utilization, scalability, and sustainability in cloud-native environments.