Sep 24 – 27, 2019
Europe/Zurich timezone

A journey over the memory management stack for large HPC applications on modern architectures

Sep 25, 2019, 11:30 AM
80/1-001 - Globe of Science and Innovation - 1st Floor (CERN)


Sébastien Valat (ATOS/Bull)


Memory management has always been an issue for large applications, but the growth of memory capacity and of intra-node thread-based parallelism now puts much more pressure on this complex part of the operating system stack. Although the topic has a long tradition of algorithm development, with 60 years of research behind it, there is still a lot to do.

This is even more true for large-scale applications, where the size of the code (the target was a million-line C++/MPI application) and its overall complexity severely limit how far one can apply what should theoretically be the clean way to proceed. Today we also need to optimize globally so that the whole stack interacts well, without letting one component destroy the performance gained by the layer above or below it.

After completing a PhD on memory management in HPC, centered mostly on a malloc implementation and on various studies of kernel memory management for supercomputers and NUMA architectures, I continued as a postdoc developing a memory profiling tool, MALT. During my time at CERN I added NUMAPROF, a NUMA memory profiling tool, to the list.

In this talk I will recap the nine-year road I have walked, with experience feedback that sometimes shows impressive performance gaps on large real applications. The path goes from CPU caches and NUMA layout through the OS paging system and the malloc implementation, and closes with profiling real applications. I will try to assemble the full picture and show why keeping the global view is necessary to really reach performance.
