19–25 Oct 2024
Europe/Zurich timezone

Reshaping Analysis for Fast Turnaround

24 Oct 2024, 17:09
18m
Large Hall B

Talk; Track 9 - Analysis facilities and interactive computing; Parallel (Track 9)

Speaker

Kevin Patrick Lannon (University of Notre Dame (US))

Description

In the data analysis pipeline for LHC experiments, a key step is the one in which small groups of researchers (typically graduate students and postdocs) reduce the smallest, common-denominator data format to a small set of specific histograms suitable for statistical interpretation. Here, we refer to this step as “analysis”, recognizing that in other contexts “analysis” might include other pieces, such as the computation required to extract a statistical interpretation from the histograms. Analysis is an important part of the pipeline because it is where individual researchers exercise their creativity, trying new ideas in the pursuit of discovery. A critical metric for the analysis step is therefore turnaround time, because it determines how rapidly researchers can explore their space of ideas. We describe our experience reshaping late-stage analysis applications on thousands of nodes with the goal of minimizing turnaround time. It is not enough merely to increase scale: changes are needed throughout the stack, including storage systems, data management, task scheduling, and application design. We demonstrate these changes as applied to CMS analysis applications built using the Coffea framework, leveraging Dask and TaskVine to scale out to distributed resources. We evaluate the performance of the applications on opportunistic campus clusters, showing effective scaling up to 7200 cores and a significant improvement in turnaround time.
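As a rough illustration of the reduction pattern the abstract describes (many independent chunks of event data reduced to partial histograms, then merged into one), the following is a minimal sketch using only the Python standard library. It is not the Coffea, Dask, or TaskVine API and not the authors' implementation: the thread pool merely stands in for a distributed scheduler, and the names `fill_histogram`, `merge`, `analyze`, and the binning constants are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import random

# Hypothetical binning, chosen only for illustration.
N_BINS = 10
LO, HI = 0.0, 100.0


def fill_histogram(values):
    """Map step: reduce one chunk of per-event values to fixed-bin counts."""
    counts = [0] * N_BINS
    width = (HI - LO) / N_BINS
    for v in values:
        if LO <= v < HI:  # values outside the range are dropped
            index = min(int((v - LO) / width), N_BINS - 1)
            counts[index] += 1
    return counts


def merge(a, b):
    """Reduce step: histograms with identical binning add bin by bin."""
    return [x + y for x, y in zip(a, b)]


def analyze(chunks, workers=4):
    """Process chunks independently, then merge the partial histograms."""
    # The thread pool is a local stand-in for a distributed task scheduler.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(fill_histogram, chunks))
    total = [0] * N_BINS
    for partial in partials:
        total = merge(total, partial)
    return total


# Synthetic input: 8 chunks of 1000 pseudo-random values each.
rng = random.Random(0)
chunks = [[rng.uniform(LO, HI) for _ in range(1000)] for _ in range(8)]
hist = analyze(chunks)
```

Because each chunk is processed independently and the merge is a plain bin-wise sum, the result does not depend on how many workers run the map step. That property is what lets this kind of workload scale out, although the abstract notes that storage, data management, and scheduling also have to change for the scaling to pay off.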

Primary authors

Austin Townsend (University of Notre Dame (US))
Barry Sly-Delgado (University of Notre Dame)
Benjamin Tovar Lopez (University of Notre Dame)
Connor Moore (University of Notre Dame (US))
Douglas Thain (University of Notre Dame)
Jin Zhou (University of Notre Dame)
Kevin Patrick Lannon (University of Notre Dame (US))
