Speaker
Description
With the data deluge expected from the High-Luminosity LHC and storage resources remaining limited, the need to reduce the on-disk file size of High-Energy Physics (HEP) data becomes even more pressing. Lossless compression algorithms and encodings are already used extensively across all experiments' data tiers, often leading to significant reductions of the total on-disk data volume for the collaborations. However, the future storage challenges naturally raise the question of whether more could be done. One potential next step to reduce data volumes even further is the use of lossy encoding schemes to store physics analysis data. The challenge with this approach, however, is the inherent loss of precision and the (perceived) lack of predictability of its effects. In this contribution, we explore the impact of lossy compression on HEP data stored in ROOT's new RNTuple data format, which offers fine-grained mechanisms for low-precision data storage. We do this by evaluating different lossy encodings applied to a selection of particle quantities, and mapping out their effects on an open-data-based analysis. With this evaluation, we aim to help the community make informed decisions on the use of lossy compression for their use cases.
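To make the precision trade-off concrete, the following minimal sketch shows one generic lossy encoding: truncating the mantissa of 32-bit floats. It is an illustration only and is not RNTuple's implementation or the evaluation used in this contribution; the helper names `truncate_mantissa` and `max_relative_error`, the synthetic `pt` sample, and the chosen bit widths are all assumptions made for this example.

```python
import numpy as np


def truncate_mantissa(values, keep_bits):
    """Zero the low (23 - keep_bits) mantissa bits of float32 values.

    Hypothetical helper for illustration; not a ROOT or RNTuple API.
    """
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    values = np.ascontiguousarray(values, dtype=np.float32)
    drop = 23 - keep_bits
    # Keep the sign bit, the 8 exponent bits and the top keep_bits mantissa bits.
    mask = np.uint32((0xFFFFFFFF << drop) & 0xFFFFFFFF)
    return (values.view(np.uint32) & mask).view(np.float32)


def max_relative_error(original, lossy):
    """Largest relative deviation introduced by the lossy encoding."""
    original = np.asarray(original, dtype=np.float64)
    lossy = np.asarray(lossy, dtype=np.float64)
    return float(np.max(np.abs(lossy - original) / np.abs(original)))


# Synthetic stand-in for a particle quantity (assumed, not real data).
rng = np.random.default_rng(seed=42)
pt = rng.exponential(scale=30.0, size=10_000).astype(np.float32) + np.float32(1.0)

for keep_bits in (23, 16, 10, 7):
    err = max_relative_error(pt, truncate_mantissa(pt, keep_bits))
    print(f"{keep_bits:2d} mantissa bits kept: max relative error = {err:.3e}")
```

For normal (non-subnormal) values, keeping `keep_bits` mantissa bits bounds the relative error below 2^-keep_bits, which is the kind of predictability argument that matters when deciding how much precision a given quantity can afford to lose. The zeroed low bits also make the data more compressible by a subsequent lossless stage.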