Speaker
Description
High-Performance Computing (HPC) drives discovery across science and industry and underpins the rapid advances in AI. At the heart of modern HPC platforms is the Graphics Processing Unit (GPU), which delivers the bulk of compute power but also dominates energy consumption. As GPU architectures increasingly prioritize low-precision arithmetic for AI workloads, HPC applications that depend on higher precision face new programmability challenges alongside new opportunities in mixed-precision computing.
Crucially, the energy efficiency of GPU applications depends not only on compute utilization but also on memory traffic patterns, and the fastest implementation is not always the most energy efficient. Reliable exploration of these trade-offs is further complicated by the limited accuracy and temporal resolution of current power measurement tools. Together with the vast, discontinuous design spaces inherent to GPU programming, these limitations make manual optimization infeasible.
Automatic performance tuning, or auto-tuning, offers a proven approach to this problem by searching for optimal configurations across algorithm, application, and hardware parameters. To address the emerging demands of mixed-precision computing and energy-aware execution, the field is now moving toward constrained and multi-objective optimization to enable systematic exploration of the trade-offs between performance, energy consumption, and numerical accuracy. In this talk, I will highlight key challenges, recent developments, and future directions in GPU auto-tuning.
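As a rough illustration of what constrained, multi-objective auto-tuning can look like, the following minimal Python sketch exhaustively searches a small configuration space, discards configurations that violate an accuracy constraint, and keeps the Pareto-optimal trade-offs between run time and energy. Everything in it is an assumption made for illustration: the parameter names (block_size, clock_mhz, precision), the error and speedup tables, and the evaluate function, which is a synthetic cost model standing in for compiling, running, and measuring a real GPU kernel. It is not the method of any particular tuning framework.

```python
from itertools import product

# Hypothetical tunable parameters; names and values are illustrative only.
SEARCH_SPACE = {
    "block_size": [64, 128, 256, 512],
    "clock_mhz": [900, 1200, 1500],
    "precision": ["fp16", "fp32", "fp64"],
}

# Made-up numbers standing in for a real accuracy check and real speedups.
REL_ERROR = {"fp16": 1e-3, "fp32": 1e-7, "fp64": 1e-16}
SPEEDUP = {"fp16": 4.0, "fp32": 2.0, "fp64": 1.0}


def evaluate(config):
    """Synthetic stand-in for compiling, running, and measuring a kernel."""
    penalty = 1.0 + abs(config["block_size"] - 256) / 512
    time_s = penalty * 1500 / config["clock_mhz"] / SPEEDUP[config["precision"]]
    power_w = 50 + 200 * (config["clock_mhz"] / 1500) ** 3  # toy power model
    return {
        "time_s": time_s,
        "energy_j": time_s * power_w,  # energy = average power x time
        "error": REL_ERROR[config["precision"]],
    }


def dominates(a, b, objectives):
    """True if a is no worse than b on every objective and better on one."""
    return (all(a[k] <= b[k] for k in objectives)
            and any(a[k] < b[k] for k in objectives))


def tune(max_error, objectives=("time_s", "energy_j")):
    """Brute-force search returning the Pareto front of feasible configs."""
    names = list(SEARCH_SPACE)
    results = []
    for values in product(*SEARCH_SPACE.values()):
        config = dict(zip(names, values))
        metrics = evaluate(config)
        if metrics["error"] > max_error:  # constraint: numerical accuracy
            continue
        results.append((config, metrics))
    # Keep only configurations that no other feasible result dominates.
    return [(c, m) for c, m in results
            if not any(dominates(o, m, objectives) for _, o in results)]


pareto_front = tune(max_error=1e-6)
for config, metrics in pareto_front:
    print(config, metrics)
```

In this toy model the fastest configuration on the Pareto front runs at the highest clock frequency, while the most energy-efficient one runs at the lowest, which mirrors the point that the fastest implementation is not always the most energy efficient. Real search spaces are usually far too large for exhaustive search, so a practical tuner would replace the loop over product(...) with a search strategy and replace evaluate with actual measurements.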