Using Spark for Physics

Vidyo only

Gordon Watts (University of Washington (US))
IRIS-HEP topical meetings
    • 4:30 PM - 5:30 PM
      Investigating Apache Spark for Physics Analysis 1h

      Apache Spark is a very successful open-source tool for data processing. Over the last few years, Spark and the platforms built around it have seen wide adoption in industry. This talk will focus on the use of Spark and its DataFrame API in the context of high-energy physics (HEP). We will go through a few demos of simple analyses implemented in Jupyter notebooks using the Apache Spark APIs. We will also briefly review related work on using Spark DataFrames for large-scale physics data preparation and reduction. Based on these experiences, we will discuss the key features of Spark and its ecosystem that can be useful for physics analysis, and what still needs improvement compared to the current state-of-the-art analysis software.

      Speaker: Luca Canali (CERN)