Description
CMS is transitioning to ROOT’s new RNTuple data storage format for the files it will write in the HL-LHC era. Based on initial tests, CMS expects faster I/O and smaller files than with the present TTree storage format. This contribution will show a comprehensive performance comparison between RNTuple and TTree I/O, using the CMS AOD and MiniAOD data formats as test cases, for both simulation and collision data corresponding to data-taking conditions similar to those of LHC Run 3. The quantities measured will include the resulting file size, the memory usage of the I/O components, and the rates at which events are read from and written to a file. CMS’ data processing relies heavily on reading files over local or wide area networks. File read patterns matter because network latency has been seen to influence total production job times. Therefore, the read patterns will be studied by recording traces of the offset, size, and timestamp of each read request for both RNTuple and TTree. The behavior of network reads will be mimicked by reading local files with artificial latency added to each read request, and the effect of different latency values on job times will be studied.
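The read-tracing and latency-injection approach described above can be illustrated with a minimal sketch. This is not the instrumentation used in the study: the class `TracingLatencyReader`, its method names, and the choice of a fixed delay per read request are all assumptions made for illustration, and the sketch wraps a plain Python file object rather than ROOT's I/O layer.

```python
import io
import time
from dataclasses import dataclass


@dataclass
class ReadRecord:
    offset: int       # byte offset at which the read started
    size: int         # number of bytes actually returned
    timestamp: float  # seconds since the reader was created, taken when the request is issued


class TracingLatencyReader:
    """Hypothetical wrapper: trace each read of a seekable binary file
    and delay it by a fixed artificial latency (assumed model of a network read)."""

    def __init__(self, raw, latency_s=0.0, sleep=time.sleep, clock=time.monotonic):
        self._raw = raw
        self._latency_s = latency_s
        self._sleep = sleep      # injectable so the delay can be replaced in tests
        self._clock = clock
        self._t0 = clock()
        self.trace = []          # list of ReadRecord, in request order

    def read_at(self, offset, size):
        timestamp = self._clock() - self._t0
        if self._latency_s > 0:
            self._sleep(self._latency_s)  # stand-in for a network round trip
        self._raw.seek(offset)
        data = self._raw.read(size)
        self.trace.append(ReadRecord(offset, len(data), timestamp))
        return data

    def total_injected_latency(self):
        """Total artificial delay added so far, in seconds."""
        return self._latency_s * len(self.trace)


if __name__ == "__main__":
    # In-memory stand-in for a local file; a 10 ms latency is an arbitrary example value.
    reader = TracingLatencyReader(io.BytesIO(bytes(range(256))), latency_s=0.01)
    reader.read_at(0, 16)
    reader.read_at(128, 32)
    for record in reader.trace:
        print(record)
```

Under this simple model, a job that issues N read requests pays roughly N times the latency in added wall time, so a format that needs fewer, larger reads is less sensitive to the latency value. Overlapping or asynchronous reads would reduce that cost and are not modeled here.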