ROOT provides an extremely flexible format used throughout the HEP community. The number of use cases – from an archival data format to end-stage analysis – has required a number of tradeoffs to be exposed to the user. For example, a high “compression level” in the traditional DEFLATE algorithm will result in a smaller file (saving disk space) at the cost of slower decompression (costing CPU time when read). If not done correctly, at the scale of a LHC experiment, poor design choices can result in terabytes of wasted space.
We explore and attempt to quantify some of these tradeoffs. Specifically, we explore: the use of alternate compression algorithms to optimize for read performance; an alternate method of compression individual events to allow efficient random access; and a new approach to whole-file compression. Quantitative results are given, as well as guidance on how to make compression decisions for different use cases.
|Secondary Keyword (Optional)||Storage systems|
|Primary Keyword (Mandatory)||Data processing workflows and frameworks/pipelines|