29 November 2021 to 3 December 2021
Virtual and IBS Science Culture Center, Daejeon, South Korea
Asia/Seoul timezone

Predicting ALICE Grid throughput using recurrent neural networks

contribution ID 654
29 Nov 2021, 18:20
20m
S221-A (Virtual and IBS Science Culture Center)

55 EXPO-ro, Yuseong-gu, Daejeon, South Korea; email: library@ibs.re.kr; tel: +82 42 878 8299
Oral, Track 1: Computing Technology for Physics Research

Speaker

Dr Sofia Vallecorsa (CERN)

Description

The Worldwide LHC Computing Grid (WLCG) is the infrastructure that enables the storage and processing of the large amount of data generated by the LHC experiments, the ALICE experiment among them. With the foreseen increase in the computing requirements of the future High-Luminosity LHC experiments, a data placement strategy that increases the efficiency of the WLCG computing infrastructure becomes highly relevant to the success of the LHC scientific programme. Currently, data placement at the ALICE Grid computing sites is optimised via heuristic algorithms. Optimising the data storage could yield substantial benefits in terms of efficiency and time-to-result. This has, however, proven arduous because of the complexity of the problem. In this work we propose modelling the behaviour of the system via principal component analysis, time series analysis and deep learning, starting from the detailed data collected by the MonALISA monitoring system. We show that it is possible to analyse and model the throughput of the ALICE Grid to a level that has not been possible before. In particular, we compare the performance of different deep learning architectures based on recurrent neural networks. Analysing about six weeks of ALICE Grid I/O, the trend of the throughput is successfully predicted with a mean relative error of ~4%, while the prediction of the throughput itself is achieved with an accuracy of ~5%.
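For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a mean relative error between a measured and a predicted series can be computed. It is an illustration only, not the authors' evaluation code: the function name `mean_relative_error` and the sample values are assumptions made for this example, and the abstract does not state the exact definition used in the study.

```python
def mean_relative_error(actual, predicted):
    """Mean of |predicted - actual| / |actual| over paired samples."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    if not actual:
        raise ValueError("series must not be empty")
    if any(a == 0 for a in actual):
        raise ValueError("relative error is undefined where the actual value is zero")
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical throughput samples in arbitrary units, NOT data from the study.
measured = [100.0, 120.0, 80.0, 110.0]
forecast = [104.0, 114.0, 84.0, 110.0]

mre = mean_relative_error(measured, forecast)
print(f"mean relative error: {mre:.1%}")
```

A value of 0.04 from this function corresponds to the "~4%" style of figure reported in the abstract, under the assumption that the same definition applies.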

An accurate prediction of the MonALISA system behaviour can reduce the time needed to answer client queries, since the learned in-memory model could be used instead of querying the database.

Speaker time zone: Compatible with Europe

Primary authors

Costin Grigoras (CERN), Mircea-Marian Popa (University Politehnica of Bucharest (RO)), Dr Sofia Vallecorsa (CERN)