Best Industry Paper Award @ ACM/SPEC ICPE 2019
17.05.2019
Jóakim von Kistowski, Tobias Wahl (Student), and Samuel Kounev from the Chair of Computer Science II win the Best Industry Paper Award at the 10th ACM/SPEC International Conference on Performance Engineering (ICPE) 2019 in Mumbai, India.
Their work, entitled "Measuring the Energy Efficiency of Transactional Loads on GPGPU", was conducted in close collaboration with several companies under the umbrella of the SPEC Open Systems Group committee on Power and Server Efficiency Benchmarking.
Jóakim von Kistowski, Johann Pais, Tobias Wahl, Klaus-Dieter Lange, Hansfried Block, John Beckett, and Samuel Kounev. Measuring the Energy Efficiency of Transactional Loads on GPGPU. In Proceedings of the 10th ACM/SPEC International Conference on Performance Engineering (ICPE '19), Mumbai, India. ACM, New York, NY, USA, 2019. Best Industry Paper Award.
Abstract:
General Purpose Graphics Processing Units (GPGPUs) are becoming more and more common in current servers and data centers, which in turn consume a significant amount of electrical power. Measuring and benchmarking this power consumption is important as it helps with optimization and selection of these servers. However, benchmarking and comparing the energy efficiency of GPGPU workloads is challenging as standardized workloads are rare and standardized power and efficiency measurement methods and metrics do not exist. In addition, not all GPGPU systems run at maximum load all the time. Systems that are utilized in transactional, request-driven workloads, for example, can run at lower utilization levels. Existing benchmarks for GPGPU systems primarily consider performance and are intended only to run at maximum load. They do not measure performance or energy efficiency at other loads. In turn, server energy-efficiency benchmarks that consider multiple load levels do not address GPGPUs. This paper introduces a measurement methodology for servers with GPGPU accelerators that considers multiple load levels for transactional workloads. The methodology also addresses verifiability of results in order to achieve comparability of different device solutions. We analyze our methodology on three different systems with solutions from two different accelerator vendors. We investigate the efficacy of different methods of load level scaling and our methodology's reproducibility. We show that the methodology is able to produce consistent and reproducible results with a maximum coefficient of variation of 1.4% regarding power consumption.