NGT - Openlab "Optimising Floating Point Precision" Workshop

Europe/Zurich
40/S2-A01 – Salle Anderson (CERN)

Alex Lasa Lamarca, Axel Naumann (CERN), Jacob Friedrich Finkenrath (CERN), Maria Girone (CERN), Mariana Velho (CERN), Stefan Roiser (CERN), Vasiliki Batsari
Description

 

Scientific applications in high energy physics depend in many areas on floating point operations in single, double or even higher precision.

With the upcoming runs at the LHC, both the amount of data and the precision required to process it will increase significantly, and with them the computing resource requirements. It has already been shown that the throughput of several physics applications can be significantly improved by using computing accelerators such as GPUs. Given this shift towards a heterogeneous execution environment, the use of high precision floating point operations for algorithmic data processing deserves dedicated attention, with a special focus on projections for the evolution of future GPU architectures.

This workshop provides a forum to discuss the efficient use of floating point operations on compute accelerators and will touch on topics such as:

  • The future evolution of hardware accelerators for high precision floating point operations
  • Emulation of higher precision floating point operations on compute accelerators
  • Tools and techniques to estimate and evaluate the precision of floating point operations
  • Algorithmic approaches for leveraging lower precision floating point operations

The workshop will feature selected talks from hardware vendors and developers, computer scientists, physicists and mathematicians on the above topics, and will provide ample time for discussion.

The deadline for registration is June 20th, 2025.

Zoom Meeting ID
69976268917
Host
Alex Lasa Lamarca
    • 14:00 14:10
      Welcome Session 10m 40/S2-A01 – Salle Anderson

    • 14:10 14:40
      Problem statement from experiments and SFT 30m 40/S2-A01 – Salle Anderson

    • 14:40 14:50
      Problem statement from Theory Department 10m 40/S2-A01 – Salle Anderson

      Speaker: Jacob Finkenrath
    • 14:50 15:00
      Problem statement from Beams 10m 40/S2-A01 – Salle Anderson

      Speaker: Riccardo di Maria
    • 15:00 15:30
      Coffee 30m Restaurant 1

    • 15:30 16:15
      NVIDIA - Floating point for future GPU hardware 45m 40/S2-A01 – Salle Anderson

      Speaker: Samuel Rodriguez
    • 16:15 16:45
      AMD - Floating point for future GPU hardware 30m 40/S2-A01 – Salle Anderson

    • 16:45 17:15
      Extended Precision in Convex Optimisation 30m 40/S2-A01 – Salle Anderson

      Semidefinite programming is a matrix-form generalisation of linear programming and is typically tackled using interior point methods. These methods are iterative, and each step requires a matrix inversion. For small or sparse matrices, direct methods such as sparse Cholesky factorisation are used. For larger dense matrices, such as those that arise in convex relaxations of combinatorial problems, Krylov methods such as Conjugate Gradient (CG) seem a better approach. We show how, as the dual-primal central trajectory approaches the feasible set and the tentative solution becomes rank-deficient, increasing the precision accelerates convergence (in terms of the number of CG iterations).

      Speakers: David Herrera-Marti, Eric Guthmuller, Jerome Fereyre
    • 17:15 17:45
      Discussion - Questions 30m 40/S2-A01 – Salle Anderson

  • Wednesday 2 July
    • 09:00 09:30
      Using physics knowledge to improve numerical stability 30m 40/S2-B01 - Salle Bohr

      The numerically stable evaluation of scattering matrix elements near the infrared limit of gauge theories is of great importance for the success of collider physics experiments. We present a novel algorithm that utilizes double precision arithmetic and reaches higher precision than a naive quadruple precision implementation at smaller computational cost. The method is based on physics-driven modifications to propagators, vertices and external polarizations. [https://arxiv.org/abs/2406.07671]

      Authors: E. Bothmann (speaker), J. M. Campbell, S. Höche, M. Knobbe

      Speaker: Enrico Bothmann
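
To illustrate the kind of instability this abstract refers to, here is a minimal sketch of catastrophic cancellation, assuming a generic textbook case rather than the authors' algorithm. For massless momenta the invariant p_i·p_j is proportional to 1 - cos(theta), which loses all significant digits in double precision when the opening angle theta is small. Rewriting the expression with a half-angle identity avoids the subtraction. The function names are hypothetical.

```python
import math

def one_minus_cos_naive(theta):
    # Subtracts two nearly equal numbers when theta is small.
    return 1.0 - math.cos(theta)

def one_minus_cos_stable(theta):
    # Same quantity via the identity 1 - cos(t) = 2 sin^2(t/2): no cancellation.
    s = math.sin(0.5 * theta)
    return 2.0 * s * s

def relative_error(value, reference):
    return abs(value - reference) / abs(reference)

theta = 1e-8
# Leading Taylor term; the next term is smaller by theta^2/12, far below
# double precision for this theta.
reference = 0.5 * theta * theta

naive_error = relative_error(one_minus_cos_naive(theta), reference)
stable_error = relative_error(one_minus_cos_stable(theta), reference)
print(naive_error, stable_error)
```

The reformulated expression stays in double precision throughout, which is the general pattern the talk describes: changing the formula can be cheaper than raising the precision.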
    • 09:30 10:00
      KIT - Double-double for pySecDec on GPU 30m 40/S2-B01 - Salle Bohr

      Speaker: Vitaly Magerya
    • 10:00 10:30
      An overview of mixed precision strategies for scientific computing 30m 40/S2-B01 - Salle Bohr

      The increasing support for lower precision arithmetic in hardware provides new opportunities for high performance scientific computing. However, even though low precision arithmetic can provide significant speed, communication, and energy benefits, its use in scientific computing poses the challenge of preserving the accuracy and stability of the computation. To address this issue, a variety of mixed precision algorithms that combine low and high precisions have emerged. In this talk I will give an overview of mixed precision algorithms in numerical linear algebra, with a focus on recent advances to accelerate the solution of linear systems.

      Speaker: Theo Mary
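
One classic mixed precision algorithm for linear systems is iterative refinement: solve in low precision, compute the residual in high precision, and correct. The sketch below is an illustration under stated assumptions, not material from the talk. It uses only the Python standard library, emulates binary32 by rounding each double precision result through `struct`, and re-runs the elimination for every correction (a real implementation would factorise once and reuse the factors). The names `f32`, `solve_low` and `refine` are hypothetical.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def solve_low(A, b):
    """Gaussian elimination with partial pivoting, every result rounded to binary32."""
    n = len(b)
    M = [[f32(v) for v in row] + [f32(rhs)] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = f32(M[i][k] / M[k][k])
            for j in range(k + 1, n + 1):
                M[i][j] = f32(M[i][j] - f32(m * M[k][j]))
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n]
        for j in range(i + 1, n):
            s = f32(s - f32(M[i][j] * x[j]))
        x[i] = f32(s / M[i][i])
    return x

def refine(A, b, iterations=5):
    """Low precision solves, with residual and update kept in double precision."""
    x = solve_low(A, b)
    for _ in range(iterations):
        r = [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]
        d = solve_low(A, r)  # correction computed cheaply in low precision
        x = [xi + di for xi, di in zip(x, d)]
    return x

# Small, well-conditioned test system (an arbitrary choice for illustration).
A = [[4.0, 1.0, 2.0], [1.0, 5.0, 1.0], [2.0, 1.0, 6.0]]
x_true = [1.0 / 3.0, -2.0 / 7.0, 0.1]
b = [sum(a * x for a, x in zip(row, x_true)) for row in A]

def max_error(x):
    return max(abs(xi - ti) for xi, ti in zip(x, x_true))

err_low = max_error(solve_low(A, b))
err_refined = max_error(refine(A, b))
print(err_low, err_refined)
```

For a well-conditioned matrix like this one, the refined solution reaches roughly double precision accuracy even though every solve ran in single precision. Ill-conditioned systems may converge slowly or not at all.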
    • 10:30 10:45
      Discussion - Questions 15m 40/S2-B01 - Salle Bohr

    • 10:45 11:15
      Coffee 30m Restaurant 1

    • 11:15 11:45
      University of Budapest - Tensor networks, FP64 emulation 30m 40/S2-B01 - Salle Bohr

    • 11:45 12:00
      TNL: Numerical Library for Modern Parallel Architectures 15m 40/S2-B01 - Salle Bohr

      TNL (www.tnl-project.org) is a collection of building blocks that facilitate the development of efficient numerical solvers and HPC algorithms. It is implemented in C++ using modern programming paradigms in order to provide a flexible and user-friendly interface similar to, for example, the C++ Standard Template Library. TNL provides native support for modern hardware architectures such as multicore CPUs, GPUs, and distributed systems, which can be managed via a unified interface. In our presentation, we will demonstrate the main features of the library together with the efficiency of the implemented algorithms and data structures.

      Speaker: Thomas Oberhuber
    • 12:00 12:15
      Float32 Expansions – A Possible Answer for Scientific Computing in the Era of AI-Driven GPU Development 15m 40/S2-A01 – Salle Anderson

      In recent years, the emergence of large language models has led GPU vendors to prioritize performance improvements for lower-precision arithmetic, often at the expense of continued development for Float64. Meanwhile, scientific computing has increasingly relied on GPGPU acceleration, where double precision is still essential. Multi-word expansions for single-precision floating point numbers may offer a viable alternative, providing comparable or even superior precision while achieving better performance than native double precision. In this talk, we will present results using a CUDA-enabled, templated, and ported version of the QD library within the TNL framework, applied to existing numerical algorithms.

      Speaker: František Stloukal
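
The building block behind multi-word expansions is the error-free transformation: a floating point addition whose rounding error is itself recovered as a second floating point number. The sketch below is a minimal two-word illustration, assuming emulated binary32 arithmetic in plain Python. It is not the QD or TNL code, and all function names are hypothetical. Rounding each double precision result to binary32 reproduces correctly rounded single precision for a single addition or subtraction.

```python
import struct
from fractions import Fraction

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum(a, b):
    """Return (s, e) with s = fl32(a + b) and s + e equal to a + b exactly."""
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e

def fast_two_sum(a, b):
    """Same result as two_sum, valid when |a| >= |b|."""
    s = f32(a + b)
    e = f32(b - f32(s - a))
    return s, e

def add_to_pair(hi, lo, b):
    """Add a binary32 value b to the two-word number hi + lo."""
    s, e = two_sum(hi, b)
    lo = f32(lo + e)           # collect the rounding error in the low word
    return fast_two_sum(s, lo)  # renormalise so that |lo| stays small

term = f32(0.1)
n = 10_000
plain = 0.0
hi, lo = 0.0, 0.0
for _ in range(n):
    plain = f32(plain + term)           # ordinary single precision sum
    hi, lo = add_to_pair(hi, lo, term)  # two-word single precision sum

exact = n * Fraction(term)  # exact rational reference
err_plain = abs(Fraction(plain) - exact)
err_pair = abs(Fraction(hi) + Fraction(lo) - exact)
print(float(err_plain), float(err_pair))
```

The two-word accumulator uses only single precision operations yet carries roughly twice the significand, at the cost of several operations per addition. Whether that trade-off beats native double precision depends on the hardware, which is the question the talk addresses.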
    • 12:15 12:45
      Discussion - Questions 30m 40/S2-B01 - Salle Bohr

    • 12:45 14:00
      Lunch 1h 15m Restaurant 1

    • 14:00 14:30
      LBNL - Differentiable programming / algo need for precision 30m 40/S2-B01 - Salle Bohr

      Speaker: Garima Singh
    • 14:30 15:00
      Precision auto-tuning and control of accuracy in high performance simulations 30m 40/S2-B01 - Salle Bohr

      In the context of high performance computing, new and increasingly parallel architectures offer higher floating-point computing power. Thus, the size of the problems considered (and with it, the number of operations) increases, becoming a possible cause of increased uncertainty. As such, estimating the reliability of a result at a reasonable cost is of major importance for numerical software. In this talk we present an overview of different approaches to accuracy analysis (guaranteed or probabilistic) and the related software. We also describe methods to improve the accuracy of results. We present the principles of Discrete Stochastic Arithmetic (DSA), which enables one to estimate rounding errors in simulation codes. DSA can be used to control the accuracy of programs in half, single, or double precision via the CADNA library, and also in arbitrary precision via the SAM library. Thanks to DSA, accuracy estimation and the detection of numerical instabilities can be performed in parallel codes on CPU and on GPU. Most numerical simulations are performed in double precision, and this can be costly in terms of computing time, memory transfer and energy consumption. We present tools for floating-point auto-tuning that aim at reducing the numerical formats used in simulation programs.

      Speaker: Fabienne Jézéquel (LIP6, Sorbonne Université)
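
The probabilistic approach can be sketched as follows: run the same computation a few times with randomly perturbed rounding, then estimate how many digits the samples share using a Student's t interval. The code below is a crude illustration only and does not use CADNA or SAM. Stepping each result one ulp up or down is an assumed stand-in for a random rounding mode, and every name is hypothetical.

```python
import math
import random

def perturb(x, rng):
    """Crude stand-in for a random rounding mode: step one ulp up or down."""
    direction = math.inf if rng.random() < 0.5 else -math.inf
    return math.nextafter(x, direction)

def significant_digits(samples, student_t=4.303):
    """Estimate the number of decimal digits shared by the samples.

    student_t = 4.303 is the two-sided 95% Student value for 3 samples
    (2 degrees of freedom); change it if the sample count changes.
    """
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / (n - 1)
    if variance == 0.0:
        return 15.0  # identical samples: report roughly full double precision
    if mean == 0.0:
        return 0.0
    digits = math.log10(math.sqrt(n) * abs(mean) / (math.sqrt(variance) * student_t))
    return max(0.0, min(15.0, digits))

def cancelling_sum(rng):
    """(1 + 1e-15) - 1: the small term is mostly lost to rounding."""
    big = perturb(1.0 + 1e-15, rng)
    return perturb(big - 1.0, rng)

rng = random.Random(2025)  # fixed seed so that runs are repeatable
samples = [cancelling_sum(rng) for _ in range(3)]
print(samples, significant_digits(samples))
```

With only three samples the estimate is coarse, and identical samples are reported as fully accurate, so this toy version can miss instabilities that a proper stochastic arithmetic implementation is designed to catch.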
    • 15:00 15:30
      Experiences with CADNA and the MadGraph5 Event Generator 30m 40/S2-B01 - Salle Bohr

      This talk presents a summer student project that explored the numerical stability of MadGraph5 using CADNA. It focuses on how CADNA’s warning system and its ability to quantify floating-point precision were used to assess whether MadGraph5 can operate reliably with single-precision floating-point numbers.

      Speaker: Stephan Hageboeck
    • 15:30 16:00
      Coffee 30m Restaurant 1

    • 16:00 16:30
      Emulating Matrix Multiplication Using Mixed-Precision Computation 30m 40/S2-B01 - Salle Bohr

      This talk introduces a method for emulating matrix multiplication through mixed-precision computation. As exemplified by the Matrix Engine on GPUs, low-precision arithmetic can be performed significantly faster than conventional FP32 or FP64 operations. We present Ozaki Scheme I and II, which leverage low-precision arithmetic to achieve accuracy comparable to standard FP64, and discuss their numerical performance.

      Speaker: Katsuhisa Ozaki
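
The abstract does not spell out either scheme, so the following is only a slicing sketch in the same spirit, shown for a dot product rather than a full matrix product. Each vector is split into slices of small integers that share one power-of-two scale. Products of slices are then exact in integer arithmetic, which stands in here for a low-precision matrix unit with a wide accumulator. The slice width, slice count and function names are assumptions chosen for illustration.

```python
import math
from fractions import Fraction

def split_slices(v, bits=6, n_slices=10):
    """Split v into integer slices with a shared power-of-two scale.

    v[i] is approximated by scale * sum_k slices[k][i] * 2**(-bits * (k + 1)).
    """
    scale = 2.0 ** math.frexp(max(abs(x) for x in v))[1]  # power of two above max |v[i]|
    rem = [x / scale for x in v]  # exact: division by a power of two
    slices = []
    for _ in range(n_slices):
        rem = [r * 2.0 ** bits for r in rem]     # exact scaling
        q = [round(r) for r in rem]              # small integers, |q| <= 2**bits
        rem = [r - qi for r, qi in zip(rem, q)]  # exact remainder, |rem| <= 1/2
        slices.append(q)
    return scale, slices

def sliced_dot(x, y, bits=6, n_slices=10):
    """Dot product assembled from exact small-integer slice products."""
    sx, xs = split_slices(x, bits, n_slices)
    sy, ys = split_slices(y, bits, n_slices)
    total = 0.0
    for i, xi in enumerate(xs):
        for j, yj in enumerate(ys):
            partial = sum(a * b for a, b in zip(xi, yj))  # exact integer arithmetic
            total += partial * (sx * sy * 2.0 ** (-bits * (i + j + 2)))
    return total

x = [0.1, 0.2, 0.3]
y = [1.5, -2.25, 3.125]
exact = sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))
error = abs(Fraction(sliced_dot(x, y)) - exact)
print(sliced_dot(x, y), float(error))
```

Ten slices of 6 bits cover more than the 53-bit significand of a double, so the only rounding left is in the final accumulation. Using fewer slices, or skipping slice pairs whose contribution is negligible, trades accuracy for speed.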
    • 16:30 16:50
      Discussion - Questions 20m 40/S2-B01 - Salle Bohr

    • 16:50 17:00
      Closing Session 10m 40/S2-B01 - Salle Bohr
