July 28, 2020 to August 6, 2020
virtual conference
Europe/Prague timezone

Data Analysis with GPU-Accelerated Kernels

Jul 28, 2020, 5:30 PM
20m
virtual conference

Talk 14. Computing and Data Handling

Speaker

Irene Dutta (California Institute of Technology (US))

Description

In high-energy physics (HEP) experiments, processing billions of records of structured numerical data can be a bottleneck in the analysis pipeline. This step is typically more complex than current query languages allow, so custom numerical code is used instead. As highly parallel computing architectures become increasingly important in the computing ecosystem, it is useful to consider how accelerators such as GPUs can be used for data analysis. Using CMS and ATLAS Open Data, we implement a benchmark physics analysis with GPU acceleration directly in Python, based on efficient computational kernels compiled with Numba/LLVM, resulting in an order-of-magnitude throughput increase over a pure CPU-based approach. We discuss the implementation and performance benchmarks of the physics kernels on CPU and GPU targets. We demonstrate how these kernels are combined into a modern ML-intensive workflow to enable efficient data analysis on high-performance servers, and we remark on possible operational considerations.
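To illustrate the kind of computational kernel described above, here is a minimal sketch written for this page; it is not the authors' implementation. It assumes per-event collections are stored as a flat array plus an `offsets` array marking event boundaries, which is one common layout for such data. The function name `sum_pt_per_event`, the toy values, and the 30.0 threshold are hypothetical. The kernel is compiled for a multithreaded CPU with Numba when Numba is installed, and otherwise runs as plain Python.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Fallback so the sketch still runs without Numba (slowly, in pure Python).
    prange = range

    def njit(*args, **kwargs):
        # Support both the bare @njit form and the @njit(...) form.
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(parallel=True)
def sum_pt_per_event(offsets, pt, mask, out):
    # One outer iteration per event; the iterations are independent,
    # so they can be distributed across CPU threads.
    for iev in prange(len(offsets) - 1):
        total = 0.0
        # Objects of event iev occupy pt[offsets[iev]:offsets[iev + 1]].
        for ij in range(offsets[iev], offsets[iev + 1]):
            if mask[ij]:
                total += pt[ij]
        out[iev] = total


# Hypothetical toy data: three events containing 2, 0 and 3 objects.
offsets = np.array([0, 2, 2, 5], dtype=np.int64)
pt = np.array([50.0, 20.0, 35.0, 10.0, 80.0])
mask = pt > 30.0  # assumed selection threshold, for illustration only
out = np.zeros(len(offsets) - 1)

sum_pt_per_event(offsets, pt, mask, out)
print(out)  # [ 50.   0. 115.]
```

Because each event writes only to its own slot in `out`, the same loop body maps naturally onto a one-thread-per-event GPU kernel; the GPU launch code is omitted here because it requires CUDA hardware.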

Primary author

Irene Dutta (California Institute of Technology (US))

Co-authors

Nan Lu (California Institute of Technology (US))
Dr Jean-Roch Vlimant (California Institute of Technology (US))
Harvey Newman (California Institute of Technology (US))
Maria Spiropulu (California Institute of Technology (US))
Christina Reissel (ETH Zurich (CH))
Daniele Ruini (ETH Zurich (CH))
Joosep Pata (California Institute of Technology (US))
