Recent trends in HPC systems (massive parallelism, wide vector units, asynchrony) and the increase in computational power allow larger, more complex, and higher-resolution numerical simulations. This progress, however, raises new concerns and challenges beyond system design. One such challenge is validating the numerical quality of a simulation, especially with regard to the round-off error induced by the finite representation of real numbers. Furthermore, as new workloads such as machine learning target HPC facilities, processors offer new number formats such as BF16. To harness these lower-precision formats on traditional workloads, developers need to identify lower-precision implementations that still guarantee correct and accurate results.
In this context, we propose Verificarlo, a framework for numerical verification and optimization that replaces floating-point operations with software-emulated arithmetic. For debugging and validation, we propose a methodology based on Monte Carlo arithmetic. To optimize precision, we emulate any precision that fits in the original type, and propose a heuristic-based optimization loop that minimizes the precision over the code iterations while ensuring that results remain accurate and precise with respect to a user-defined reference.
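The two ideas above can be illustrated with a small, self-contained Python sketch. It is not Verificarlo's interface or implementation: the function names (`round_to_precision`, `mca_perturb`, `summation`, `significant_digits`, `minimal_precision`) are hypothetical, only the significand width is emulated (not the exponent range), and the heuristic optimization loop is replaced here by a plain linear search over precisions. The random perturbation follows the usual Monte Carlo arithmetic form x + 2^(e_x - t) ξ, with ξ drawn uniformly from (-1/2, 1/2) and t the virtual precision.

```python
import math
import random
import statistics


def round_to_precision(x, p):
    """Round x to p significant bits, emulating a narrower significand."""
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)              # x = m * 2**e with 0.5 <= |m| < 1
    scaled = round(math.ldexp(m, p))  # keep p bits, round to nearest
    return math.ldexp(scaled, e - p)


def mca_perturb(x, t, rng):
    """Monte Carlo arithmetic style noise at virtual precision t."""
    if x == 0.0 or not math.isfinite(x):
        return x
    _, e = math.frexp(x)
    return x + math.ldexp(rng.uniform(-0.5, 0.5), e - t)


def summation(values, after_op):
    """Naive sum; after_op is applied to each input and each partial sum."""
    acc = 0.0
    for v in values:
        acc = after_op(acc + after_op(v))
    return acc


def significant_digits(values, t, samples=30, seed=0):
    """Estimate decimal significant digits from repeated perturbed runs."""
    rng = random.Random(seed)
    results = [
        summation(values, lambda x: mca_perturb(x, t, rng))
        for _ in range(samples)
    ]
    mu = statistics.mean(results)
    sigma = statistics.stdev(results)
    if sigma == 0.0 or mu == 0.0:
        return math.inf
    return -math.log10(abs(sigma / mu))


def minimal_precision(values, reference, rel_tol):
    """Smallest significand width whose result stays within rel_tol.

    Linear search used as a stand-in for a heuristic optimization loop.
    """
    for p in range(1, 54):
        result = summation(values, lambda x: round_to_precision(x, p))
        if abs(result - reference) <= rel_tol * abs(reference):
            return p
    return None


if __name__ == "__main__":
    data = [0.1 * (i + 1) for i in range(100)]
    ref = summation(data, lambda x: x)  # binary64 result as the reference
    print("estimated significant digits at t=24:", significant_digits(data, 24))
    print("minimal precision for 1e-3 relative error:",
          minimal_precision(data, ref, 1e-3))
```

In this sketch the reference is simply the binary64 result of the same kernel; in practice the reference and the acceptance threshold are chosen by the user.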