Tale of Tails: Anomaly Avoidance in Data Centers
Ji Xue, Robert Birke, et al.
SRDS 2016
Executing heterogeneous workloads with different priorities, resource demands, and performance objectives is one of the key ways today's data centers increase resource and energy efficiency. To meet the performance objectives of diverse workloads, schedulers rely on evictions, even though evictions waste resources because the work already done by evicted tasks is lost. It is not straightforward to design priority schedulers that capture the key aspects of workloads and systems while balancing resource efficiency against application performance. To explore the large design space of such schedulers, we propose a trace-driven cluster management framework that models a comprehensive set of system configurations and general priority-based scheduling policies. In particular, we focus on the impact of task evictions on resource inefficiency and on the task response times of multiple priority classes, driven by a Google production cluster trace. Moreover, as a use case, we propose a system design that exploits workload heterogeneity and introduces workload-awareness into system configuration and task assignment.
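The tradeoff between evictions and lost work described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's framework. It assumes a single execution slot, integer time steps, and that an evicted task restarts from scratch. The names `Task` and `simulate` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    arrival: int      # time step at which the task is submitted
    duration: int     # time steps of uninterrupted work needed
    priority: int     # larger value = higher priority (assumption)
    progress: int = 0
    finish: Optional[int] = None


def simulate(tasks):
    """Toy single-slot priority scheduler with eviction.

    Returns (wasted, response_times), where wasted is the number of
    executed time steps lost to evictions and response_times maps each
    priority to a list of (finish - arrival) values.
    """
    pending = sorted(tasks, key=lambda t: t.arrival)
    waiting, running = [], None
    wasted, now, done, nxt = 0, 0, 0, 0

    while done < len(tasks):
        # Admit every task that has arrived by the current time step.
        while nxt < len(pending) and pending[nxt].arrival <= now:
            waiting.append(pending[nxt])
            nxt += 1

        if waiting:
            # Highest priority first; ties go to the earliest arrival.
            best = max(waiting, key=lambda t: (t.priority, -t.arrival))
            if running is None:
                waiting.remove(best)
                running = best
            elif best.priority > running.priority:
                # Eviction: the running task loses all of its progress.
                wasted += running.progress
                running.progress = 0
                waiting.append(running)
                waiting.remove(best)
                running = best

        # Execute one time step of the running task, if any.
        if running is not None:
            running.progress += 1
            if running.progress == running.duration:
                running.finish = now + 1
                running = None
                done += 1
        now += 1

    response_times = defaultdict(list)
    for t in tasks:
        response_times[t.priority].append(t.finish - t.arrival)
    return wasted, dict(response_times)


wasted, response_times = simulate([
    Task("batch", arrival=0, duration=5, priority=0),
    Task("prod", arrival=2, duration=3, priority=1),
])
```

In this example the low-priority task runs for two time steps before the high-priority task arrives and evicts it, so those two steps count as wasted work. The high-priority task sees a short response time, while the low-priority task pays for both the wait and the restart.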
Robert Birke, Mathias Bjorkqvist, et al.
SIGMETRICS 2015
Robert Birke, Andrej Podzimek, et al.
IEEE TNSM
Navaneeth Rameshan, Robert Birke, et al.
DSN-W 2016