29 November 2021 to 3 December 2021
Virtual and IBS Science Culture Center, Daejeon, South Korea
Asia/Seoul timezone

ServiceX: Making all Data Columnar

contribution ID 744
Not scheduled
20m
Walnut (Gather.Town)

Poster Track 2: Data Analysis - Algorithms and Tools Posters: Walnut

Speaker

Gordon Watts (University of Washington (US))

Description

ServiceX is a cloud-native distributed application that transforms data into columnar formats in the Python ecosystem and the ROOT framework. Along with the transformation, it applies filtering and thinning operations to reduce the data load sent to the client. Designed for easy deployment to a Kubernetes cluster, ServiceX runs near the data, scanning TBs of data to send GBs to a client or analysis facility. It can quickly read data from a variety of formats in parallel and apply selection criteria, calculations, and sorting operations. Adaptors are available for ROOT and parquet files, as well as awkward arrays and ROOT’s RDataFrame interface. This contribution will give an overview of ServiceX and its connections inside and outside of Particle Physics, and will describe the concepts behind transformation and its applicability to data preservation. Open data from ATLAS Run 1 (simple ROOT TTree files) and CMS Run 1 AOD (complex binary data files) will be used as examples to demonstrate the functionality.
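The idea of turning row-wise event data into columns while filtering and thinning can be illustrated with a minimal, library-free sketch. This is not the ServiceX API: the event layout, the field names (`run`, `jet_pt`, `jet_eta`, `n_tracks`), and the helper `to_columns` are all hypothetical and chosen only for illustration. It also assumes every list-valued field holds one entry per jet, aligned with `jet_pt`.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical row-wise events; field names and values are illustrative only.
EVENTS: List[Dict[str, Any]] = [
    {"run": 1, "jet_pt": [52.1, 31.0, 12.4], "jet_eta": [0.3, -1.2, 2.9], "n_tracks": 14},
    {"run": 1, "jet_pt": [18.7], "jet_eta": [1.1], "n_tracks": 3},
    {"run": 2, "jet_pt": [], "jet_eta": [], "n_tracks": 0},
    {"run": 2, "jet_pt": [75.5, 40.2], "jet_eta": [-0.4, 0.9], "n_tracks": 22},
]


def to_columns(
    events: Iterable[Dict[str, Any]], keep: List[str], min_jet_pt: float
) -> Dict[str, List[Any]]:
    """Convert row-wise events into columns, applying a jet pT cut.

    Assumes list-valued fields are per-jet and aligned with "jet_pt".
    """
    columns: Dict[str, List[Any]] = {name: [] for name in keep}
    for event in events:
        # Object-level selection: indices of jets above the threshold.
        passing = [i for i, pt in enumerate(event["jet_pt"]) if pt > min_jet_pt]
        # Event-level filtering: skip events with no surviving jet.
        if not passing:
            continue
        # Thinning: copy only the requested fields into the output columns.
        for name in keep:
            value = event[name]
            if isinstance(value, list):
                columns[name].append([value[i] for i in passing])
            else:
                columns[name].append(value)
    return columns


columns = to_columns(EVENTS, keep=["run", "jet_pt"], min_jet_pt=30.0)
print(columns)
```

Only two of the four events and two of the four fields reach the client side in this toy case, which is the same kind of reduction, at a much smaller scale, as scanning TBs to send GBs.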

Significance

  • For the first time, able to serve ROOT files, parquet files, and awkward arrays, or feed into RDataFrame
  • Installations at various Analysis Facilities have now occurred
  • Gained users from the Dark Matter Community (which will be discussed briefly here)
  • Can serve both modern Run 2 and very old Run 1 data to the same set of tools

Speaker time zone

Compatible with Europe

Primary authors

  • Andrew Eckart (University of Chicago)
  • Benjamin Galewsky (Univ. Illinois at Urbana Champaign (US))
  • Gordon Watts (University of Washington (US))
  • Mark Stephen Neubauer (Univ. Illinois at Urbana-Champaign)
  • Ilija Vukotic (University of Chicago (US))
  • Kyungeon Choi (University of Texas at Austin (US))
  • Mason Proffitt (University of Washington (US))
  • Mr Nick Decheine (Wisconsin)
  • Peter Onyisi (University of Texas at Austin (US))
  • Robert William Gardner Jr (University of Chicago (US))
  • Suchandra Thapa (University of Chicago)
