9–13 May 2022
CERN
Europe/Zurich timezone

Neural network distributed training and optimization library (NNLO)

13 May 2022, 09:00
25m
500/1-001 - Main Auditorium (CERN)

Regular talk · Workshop

Speaker

Irena Veljanovic (CERN)

Description

With deep learning becoming increasingly popular among the LHC experiments, speeding up network training and optimization is expected to become an issue soon. To this end, we are developing a dedicated tool at CMS, the Neural Network Learning and Optimization library (NNLO). NNLO aims to support both widely used deep learning libraries, TensorFlow and PyTorch, and should help engineers and scientists scale neural network training and hyperparameter optimization more easily. Supported training configurations are a single GPU, a single node with multiple GPUs, and multiple nodes with multiple GPUs. One advantage of the NNLO library is the seamless transition between resources, enabling researchers to quickly scale up from workstations to HPC and cloud resources. Compared to manual distributed training, NNLO facilitates the transition from a single GPU to multiple GPUs without losing performance. In this contribution, we discuss the status of the project and perspectives for the future.
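A brief illustration of the idea behind the single-to-multi-GPU transition (a minimal sketch, not NNLO's actual API): in data-parallel training, each worker computes gradients on its shard of the batch, and averaging those gradients reproduces the single-device gradient on the full batch, which is why scaling out need not change the optimization result.

```python
import numpy as np

def gradient(w, X, y):
    # Gradient of the mean squared error for a linear model y ~ X @ w.
    return 2.0 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.normal(size=8)
w = rng.normal(size=3)

# Single-device gradient on the full batch.
g_full = gradient(w, X, y)

# "Distributed" gradient: two workers, each on half the batch; in a real
# multi-GPU setup the averaging would be an all-reduce across devices.
shards = [(X[:4], y[:4]), (X[4:], y[4:])]
g_avg = np.mean([gradient(w, Xs, ys) for Xs, ys in shards], axis=0)

print(np.allclose(g_full, g_avg))  # True: equal-size shards give identical gradients
```

With equal-size shards the mean of per-worker gradients equals the full-batch gradient exactly, so the same update is applied regardless of how many workers participate; handling the communication and sharding uniformly across backends is what a library like NNLO automates.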

Primary authors

Irena Veljanovic (CERN), Jean-Roch Vlimant (California Institute of Technology (US)), Maurizio Pierini (CERN), Vladimir Loncar (CERN)
