10–14 Oct 2016
San Francisco Marriott Marquis
America/Los_Angeles timezone

Exploring Compression Techniques for ROOT IO

10 Oct 2016, 15:30
15m
Sierra A (San Francisco Marriott Marquis)

Oral Track 5: Software Development

Speaker

Zhang Zhe (University of Nebraska-Lincoln)

Description

ROOT provides an extremely flexible format used throughout the HEP community. The number of use cases – from an archival data format to end-stage analysis – has required a number of tradeoffs to be exposed to the user. For example, a high “compression level” in the traditional DEFLATE algorithm will result in a smaller file (saving disk space) at the cost of slower decompression (costing CPU time when read). At the scale of an LHC experiment, poor choices among these tradeoffs can result in terabytes of wasted space.
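
The size-versus-CPU tradeoff described above can be measured on a small scale. The following is a minimal sketch, assuming Python's standard `zlib` module (a DEFLATE implementation) as a stand-in for ROOT's own I/O layer, which it does not touch. The helper names `make_payload` and `measure` and the synthetic payload are hypothetical and chosen only for illustration; real event data will behave differently.

```python
import random
import time
import zlib


def make_payload(n_records=20000, seed=0):
    """Build a repetitive pseudo-random byte payload (a synthetic stand-in for event data)."""
    rng = random.Random(seed)
    vocab = [b"muon", b"electron", b"jet", b"photon", b"tau", b"met"]
    parts = []
    for i in range(n_records):
        parts.append(b"%s:%d:%d;" % (rng.choice(vocab), i, rng.randrange(1000)))
    return b"".join(parts)


def measure(payload, levels=(1, 6, 9)):
    """Return {level: (compressed_size, compress_seconds, decompress_seconds)}."""
    results = {}
    for level in levels:
        t0 = time.perf_counter()
        blob = zlib.compress(payload, level)
        t1 = time.perf_counter()
        restored = zlib.decompress(blob)
        t2 = time.perf_counter()
        # Sanity check: compression must be lossless.
        assert restored == payload
        results[level] = (len(blob), t1 - t0, t2 - t1)
    return results


if __name__ == "__main__":
    payload = make_payload()
    print(f"raw size: {len(payload)} bytes")
    for level, (size, c_s, d_s) in measure(payload).items():
        print(f"level {level}: {size} bytes, "
              f"compress {c_s:.4f}s, decompress {d_s:.4f}s")
```

Timings from a single run are noisy; repeating the measurement and using a payload representative of the actual file contents would give more trustworthy numbers.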

We explore and attempt to quantify some of these tradeoffs. Specifically, we explore: the use of alternate compression algorithms to optimize for read performance; an alternate method of compressing individual events to allow efficient random access; and a new approach to whole-file compression. Quantitative results are given, as well as guidance on how to make compression decisions for different use cases.
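
To illustrate why compressing events individually can help random access, here is a minimal sketch, again assuming Python's `zlib` rather than ROOT. It is not the method presented in the talk, only a generic picture of the tradeoff: each event is compressed on its own and located through an offset index, in contrast to compressing many events as one block. The function names `pack_per_event`, `read_event`, and `pack_block` are hypothetical.

```python
import zlib


def pack_per_event(events, level=6):
    """Compress each event separately.

    Returns (blob, offsets), where offsets has len(events) + 1 entries and
    event i occupies blob[offsets[i]:offsets[i + 1]].
    """
    chunks = [zlib.compress(e, level) for e in events]
    offsets = [0]
    for c in chunks:
        offsets.append(offsets[-1] + len(c))
    return b"".join(chunks), offsets


def read_event(blob, offsets, i):
    """Decompress only event i, using the offset index."""
    return zlib.decompress(blob[offsets[i]:offsets[i + 1]])


def pack_block(events, level=6):
    """Compress all events as a single stream.

    Reading any one event then requires inflating the whole block.
    """
    return zlib.compress(b"".join(events), level)


if __name__ == "__main__":
    events = [(b"event-%d " % i) * 50 for i in range(100)]
    blob, offsets = pack_per_event(events)
    block = pack_block(events)
    print("per-event total:", len(blob), "bytes; single block:", len(block), "bytes")
    print(read_event(blob, offsets, 42)[:16])
```

Per-event compression typically yields a larger total size, since each small stream cannot exploit redundancy across events, but a reader can fetch one event without inflating its neighbours.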

Primary Keyword: Data processing workflows and frameworks/pipelines
Secondary Keyword: Storage systems

Author

Brian Paul Bockelman (University of Nebraska (US))

Co-author

Zhang Zhe (University of Nebraska-Lincoln)
