Implementation and Numerical Techniques for One EFlop/s HPL-AI Benchmark on Fugaku

2020
Our performance benchmark of HPL-AI on the supercomputer Fugaku was awarded first place in the 55th TOP500 ranking. The effective performance of 1.42 EFlop/s was the world's first to exceed the exascale barrier in a floating-point arithmetic benchmark. Because HPL-AI is brand new and has no reference code for large systems, a large-scale run poses several challenges from a low-precision numerical viewpoint. It is not sufficient simply to replace FP64 operations with FP32 or FP16 ones; at minimum, careful numerical analysis of lower-precision arithmetic is required, together with optimization techniques suited to extreme-scale computing on systems such as Fugaku. This study presents technical analysis and insights on the accuracy issues, implementation, and performance improvements, and reports the exascale benchmark on Fugaku.
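The core idea behind HPL-AI is mixed-precision iterative refinement: the LU factorization is done in low precision, and full FP64 accuracy is recovered by refining the residual. The sketch below illustrates this principle with NumPy, using an FP32 solve as a stand-in for the FP16 factorization (function names, the test matrix, and tolerances are illustrative assumptions, not the Fugaku implementation).

```python
import numpy as np

def mixed_precision_solve(A, b, iters=10, tol=1e-12):
    """Solve Ax = b by low-precision factorization plus FP64 refinement.

    Illustrative sketch: here the "low precision" step is a float32
    dense solve; HPL-AI instead uses an FP16-dominated LU factorization.
    """
    A32 = A.astype(np.float32)
    # Initial solution from the low-precision factorization.
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(iters):
        r = b - A @ x  # residual computed in FP64
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        # Correction solved cheaply in low precision, accumulated in FP64.
        d = np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
        x += d
    return x

rng = np.random.default_rng(0)
n = 200
# Diagonally shifted random matrix: well conditioned, so refinement converges.
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)
x = mixed_precision_solve(A, b)
print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))  # small relative residual
```

For ill-conditioned matrices or when the low-precision factorization is too inaccurate as a preconditioner, a plain correction loop like this can stagnate; that is one reason the paper argues that naively swapping FP64 for FP16 is insufficient and that GMRES-based refinement and scaling analysis matter at scale.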