Description
The Worldwide LHC Computing Grid (WLCG) is the infrastructure enabling the storage and processing of the large amount of data generated by the LHC experiments, and in particular by the ALICE experiment. With the foreseen increase in the computing requirements of the future High-Luminosity LHC experiments, a data placement strategy that increases the efficiency of the WLCG computing infrastructure becomes extremely relevant for the scientific success of the LHC scientific programme. Currently, data placement at the ALICE Grid computing sites is optimised via heuristic algorithms. Optimisation of the data storage could yield substantial benefits in terms of efficiency and time-to-result, but this has proven arduous due to the complexity of the problem. In this work we propose a modelling of the behaviour of the system via principal component analysis, time series analysis and deep learning, starting from the detailed data collected by the MonALISA monitoring system. We show that it is possible to analyse and model the throughput of the ALICE Grid to a level that has not been possible before. In particular, we compare the performance of different deep learning architectures based on recurrent neural networks. Analysing about six weeks of ALICE Grid I/O, the trend of the throughput is successfully predicted with a mean relative error of ~4%, while the prediction of the throughput itself is achieved with an accuracy of ~5%.
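A minimal sketch of such a pipeline, assuming Python with scikit-learn and TensorFlow/Keras, could combine PCA compression of the per-site throughput series with a small recurrent model for one-step-ahead prediction. The window size, number of components and synthetic data below are illustrative assumptions, not the configuration or data used in this work.

```python
"""Illustrative sketch only: PCA + LSTM forecasting of a multi-site
throughput series. All parameters and the synthetic data are assumptions."""
import numpy as np
from sklearn.decomposition import PCA
import tensorflow as tf

WINDOW = 24          # assumed look-back window (monitoring intervals)
N_COMPONENTS = 3     # assumed number of principal components kept

# Synthetic stand-in for per-site throughput; the real series would come
# from the MonALISA monitoring database. Shape: (time_steps, n_sites).
rng = np.random.default_rng(0)
t = np.arange(2000)
raw = np.stack([np.sin(t / 50 + p) + 0.1 * rng.normal(size=t.size)
                for p in np.linspace(0, np.pi, 8)], axis=1)

# 1) PCA compresses the correlated per-site series into a few components.
pca = PCA(n_components=N_COMPONENTS)
reduced = pca.fit_transform(raw)

# 2) Build supervised samples: a window of past values -> the next value.
X = np.stack([reduced[i:i + WINDOW] for i in range(len(reduced) - WINDOW)])
y = reduced[WINDOW:]
split = int(0.8 * len(X))
X_train, X_test, y_train, y_test = X[:split], X[split:], y[:split], y[split:]

# 3) A small recurrent model; one of several architectures one could compare.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WINDOW, N_COMPONENTS)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(N_COMPONENTS),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=5, batch_size=64, verbose=0)

# 4) Mean relative error of the one-step-ahead prediction.
pred = model.predict(X_test, verbose=0)
mre = np.mean(np.abs(pred - y_test) / (np.abs(y_test) + 1e-9))
print(f"mean relative error: {mre:.3f}")
```

Other recurrent architectures (e.g. GRU or stacked LSTM layers) could be swapped in at step 3 to reproduce the kind of comparison mentioned above.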
An accurate prediction of the MonALISA system behaviour can also reduce the time needed to answer client queries, since the in-memory learned model could be used instead of querying the database.
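As a purely illustrative sketch of that idea, a client query could be answered directly from a model held in memory rather than by a database round trip; the ThroughputModel class and its persistence forecast below are hypothetical placeholders for the trained network.

```python
"""Illustrative sketch: serving throughput predictions from an in-memory
model instead of the monitoring database. Names and values are assumptions."""
import numpy as np

class ThroughputModel:
    """Toy stand-in for a trained model kept in memory (e.g. an LSTM)."""
    def __init__(self, history):
        self.history = np.asarray(history, dtype=float)

    def predict_next(self):
        # Trivial persistence forecast as a placeholder for the learned model.
        return float(self.history[-1])

def answer_query(model, horizon=1):
    """Serve predictions from memory; no database round trip is needed."""
    return [model.predict_next() for _ in range(horizon)]

model = ThroughputModel(history=[3.1, 3.4, 3.2, 3.5])  # GB/s, illustrative
print(answer_query(model, horizon=3))
```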
Speaker time zone: Compatible with Europe