Using Spark for Physics

Wednesday 4 May 2022, 16:30 → 17:30 Europe/Zurich

Vidyo only

Gordon Watts (University of Washington (US))

Description

https://github.com/LucaCanali/Miscellaneous/tree/master/Spark_Physics

Using modern Spark as an analysis tool

Videoconference

IRIS-HEP topical meetings

Zoom Meeting ID: 68133510887
Host: David Lange
Alternative hosts: Robert Currier Tuck, Shawn Mc Kee

- 16:30 → 17:30
  
  Investigating Apache Spark for Physics Analysis 1h
  
  Apache Spark is a very successful open-source tool for data processing. Over the last few years, Spark and the platforms built around it have seen wide adoption in industry. This talk will focus on the use of Spark and its DataFrame API in the context of HEP. We will go through a few demos of simple analyses implemented in Jupyter notebooks using Apache Spark APIs. We will also briefly review related work on using Spark DataFrames for large-scale physics data preparation and reduction. Based on those experiences, we will discuss the key features of Spark and its ecosystem that can be useful for physics analysis, and what still needs improvement compared with current state-of-the-art analysis software.
  
  Speaker: Luca Canali (CERN)
  
  ApacheSpark_for_Physics_May2022.pdf
  
  ApacheSpark_for_Physics_May2022.pptx
