NECSTFridayTalk – Optimized implementation of the HPCG benchmark on reconfigurable hardware
NECSTFridayTalk
Alberto Zeni
PhD student in High Performance Computing
Politecnico, Milano
The event will be streamed online via Facebook
December 17th, 2021
1.00 pm
Contacts:
Marco Santambrogio
Research Line:
System architectures
Summary
On December 17th, 2021 at 1.00 pm, a new NECSTFridayTalk titled "Optimized implementation of the HPCG benchmark on reconfigurable hardware" will be held online via Facebook by Alberto Zeni, PhD student in High Performance Computing at Politecnico di Milano.
The HPCG benchmark is a modern complement to the HPL benchmark in the performance evaluation of HPC systems, as it has been recognized as more representative of real-world applications. While typical workloads become more and more challenging, the semiconductor industry is struggling with performance scaling and power efficiency on next-generation technology nodes. As a result, the industry is turning towards more customized compute architectures to help meet the latest performance requirements. In this paper, we present the details of the first FPGA-based implementation of HPCG that takes advantage of such customized compute architectures. Our results show that our high-performance multi-FPGA implementation, using 1 and 4 Xilinx Alveo U280 cards, achieves up to 108.3 GFlops and 346.5 GFlops, respectively, representing speed-ups of 104.1x and 333.2x over the state-of-the-art multi-threaded software running on a server with an Intel Xeon processor, with no loss of accuracy. We also demonstrate that the FPGA-based solution achieves performance comparable to modern GPUs and up to a 2.7x improvement in power efficiency compared to an NVIDIA Tesla V100. Finally, a theoretical evaluation based on Berkeley's Roofline model demonstrates that our implementation is near-optimally tuned on the Xilinx Alveo U280.
The NECSTLab is a DEIB laboratory, with different research lines on advanced topics in computing systems: from architectural characteristics, to hardware-software codesign methodologies, to security and dependability issues of complex system architectures.
Every week, the “NECSTFridayTalk” invites researchers, professionals, or entrepreneurs to share their work experiences and the projects they are carrying out in the field of computing systems.
Streaming via Facebook will be available at the following LINK