HSF WLCG Virtual Workshop
The second edition of the HSF-WLCG virtual workshop series brings us together again to review progress and discuss plans in the key areas of software and computing for HEP.
This workshop will be run with parallel software and computing tracks.
Please note that workshop sessions will be RECORDED and the video made public afterwards.
HSF sessions are being uploaded to YouTube, and you will also find the recording of each talk attached to its Indico entry.
The document summarising outcomes and follow-ups from the workshop is now available.
For software we plan sessions on:
- Software plenary covering a wide spectrum of HSF activity
- A focus session on event generation
- A focus session on detector simulation, co-organised with Geant4
- An open session for software R&D, where we invite your contributions
The computing sessions will focus on storage and its evolution towards the HL-LHC needs.
Organisers:
- Julia Andreeva, CERN
- Simone Campana, CERN
- Philippe Canal, Fermilab
- Ian Collier, STFC
- Gloria Corti, CERN
- Jose Flix Molina, CIEMAT/PIC
- Alessandra Forti, University of Manchester
- Michel Jouvin, IJCLab
- Teng Jian Khoo, Humboldt University / University of Innsbruck
- David Lange, Princeton University
- Maarten Litmaath, CERN
- Jonathan Madsen, LBNL
- Josh McFayden, University of Sussex
- Helge Meinhard, CERN
- Witek Pokorski, CERN
- Oxana Smirnova, Lund University
- Graeme A Stewart, CERN
- Andrea Valassi, CERN
- Mattias Wadenstein, Umeå University
- Efe Yazgan, National Taiwan University
Computing: Developments in the experiments and implications for storage evolution
- ATLAS. Speaker: Alessandro Di Girolamo (CERN)
- CMS. Speakers: Danilo Piparo (CERN), James Letts (Univ. of California San Diego (US))
- 17:15 Coffee and cake
- Belle II. Speaker: Silvio Pardi (Universita e sezione INFN di Napoli (IT))
- 18:30 Coffee and cake
- JUNO and other experiments at IHEP. Speaker: Lu Wang (Computing Center, Institute of High Energy Physics, CAS)
- Discussion
Software: Introduction
Conveners: Graeme A Stewart (CERN), Michel Jouvin (Université Paris-Saclay (FR))
- Workshop Introduction. Speaker: Michel Jouvin (Université Paris-Saclay (FR))
Software: Other Communities
Conveners: Graeme A Stewart (CERN), Michel Jouvin (Université Paris-Saclay (FR))
- Future Trends in Nuclear Physics SW and Computing
Report from the recent workshop.
Speaker: Dr Markus Diefenthaler (Jefferson Lab)
- 16:30 Discussion
Software: R&D Activities
Conveners: Graeme A Stewart (CERN), Michel Jouvin (Université Paris-Saclay (FR))
- PyHEP. Speaker: Eduardo Rodrigues (University of Liverpool (GB))
- 17:00 Discussion
- 17:10 Group Photo (please switch on your camera!)
- 17:12 Coffee and Cake
Sorry, you have to bring your own... try a jammy dodger!
Software: R&D Activities
Conveners: David Lange (Princeton University (US)), Teng Jian Khoo (Humboldt University of Berlin (DE))
- CERN R&D on Spack and the Turnkey Stack. Speaker: Valentin Volkl (University of Innsbruck (AT))
- 17:50 Discussion
Software: Training
Conveners: David Lange (Princeton University (US)), Teng Jian Khoo (Humboldt University of Berlin (DE))
- Training: Activities. Speaker: Samuel Ross Meehan (CERN)
- 18:15 Discussion
- Training: Community Building. Speaker: Kilian Lieret (Ludwig Maximilian University Munich)
- 18:40 Discussion
Computing: Site input
- Storage and data management experience at CNAF. Speakers: Lucia Morganti, Vladimir Sapunenko (INFN-CNAF (IT))
- RAL: Experience with Erasure Coding and cost projections vs Replicated Storage. Speaker: Alastair Dewhurst (Science and Technology Facilities Council STFC (GB))
- KIT: Storage Setup and Evolution at the GridKa Tier-1 Site. Speaker: Jan Erik Sundermann (Karlsruhe Institute of Technology (KIT))
- PIC: Storage studies for CMS. Speaker: Carlos Perez Dengra (PIC-CIEMAT)
- UK T2 storage evolution. Speaker: Samuel Cadellin Skipsey
- 16:15 Coffee and cake
- FR T2: Concerns about DPM operations during Run 3. Speaker: David Bouvet (IN2P3/CNRS (FR))
- Production pilot of Swiss ATLAS federated storage with DPM and ARC caches. Speaker: Gianfranco Sciacca (Universitaet Bern (CH))
- ALPAMED: Measuring data access performance within federated DPM storage. Speaker: Stephane Jezequel (LAPP-Annecy CNRS/USMB (FR))
- US CMS T2: Storage experience. Speaker: Frank Wuerthwein (Univ. of California San Diego (US))
- US ATLAS T2: Storage experience. Speaker: Mark Sosebee (University of Texas at Arlington (US))
- 18:00 Coffee and cake
- FNAL: Storage experience. Speakers: Bo Jayatilaka (Fermi National Accelerator Lab. (US)), David Alexander Mason (Fermi National Accelerator Lab. (US))
- BNL: Storage experience. Speaker: Hironori Ito (Brookhaven National Laboratory (US))
- TRIUMF: Status and experience with disk and tape. Speaker: Simon Liu (TRIUMF (CA))
Software: Event Generation
Conveners: Andrea Valassi (CERN), Efe Yazgan (National Taiwan University (TW)), Josh McFayden (University of Sussex)
- Introduction. Speaker: Josh McFayden (University of Sussex)
- 16:28 Discussion
- Neural Resampler. Speaker: Ben Nachman (Lawrence Berkeley National Lab. (US))
- 16:58 Discussion
- 17:10 Coffee and Cake
- Progress on porting MadGraph5_aMC@NLO to GPUs. Speaker: Stefan Roiser (CERN)
- 17:38 Discussion
- PDF/Vegas-Flow. Speaker: Juan M. Cruz Martínez (University of Milan)
- 18:08 Discussion
- 16:00 It's the weekend, take a break!
Computing: Site session 2
- KISTI: Tapeless archive experience. Speaker: Sang Un Ahn (Korea Institute of Science & Technology Information (KR))
- BNL: Experience with tape storage. Speaker: Shigeki Misawa (Brookhaven National Laboratory (US))
- Discussion
- 15:45 Coffee and cake
Computing: Storage technologies
- StoRM roadmap. Speaker: Andrea Ceccanti (Universita e INFN, Bologna (IT))
- DPM roadmap. Speakers: Fabrizio Furano (CERN), Petr Vokac (Czech Technical University in Prague (CZ))
- 17:20 Coffee and cake
- XRootD roadmap. Speaker: Andrew Hanushevsky (SLAC National Accelerator Laboratory (US))
- Discussion
Software: Detector Simulation I
Conveners: Gloria Corti (CERN), Jonathan Madsen, Philippe Canal (Fermi National Accelerator Lab. (US)), Witold Pokorski (CERN)
- Geant4. Speaker: Marc Verderi (Centre National de la Recherche Scientifique (FR))
- Discussion
- Requirements from HEP experiments. Speaker: Marilena Bandieramonte (University of Pittsburgh (US))
- Discussion
- Machine Learning for Detector Simulation
Full detector simulations using Geant4 are highly accurate but computationally intensive, while existing fast simulation techniques may not provide sufficient accuracy for all purposes. Machine learning offers potential paths to achieve both high speed and high accuracy. This may be especially important to address the computational challenges posed by the HL-LHC. Ongoing efforts from both inside and outside the LHC experimental collaborations will be presented. Challenges and opportunities will also be discussed.
Speaker: Kevin Pedro (Fermi National Accelerator Lab. (US))
- Discussion
- 17:30 Coffee and Cake
Sorry, you have to get this yourself - why not try a mint tea?
Software: Detector Simulation II
Conveners: Gloria Corti (CERN), Jonathan Madsen, Philippe Canal (Fermi National Accelerator Lab. (US)), Witold Pokorski (CERN)
- Discussion
- Celeritas. Speaker: Dr Seth Johnson (Oak Ridge National Laboratory)
- Discussion
- Using Geant4 and Opticks to simulate liquid argon TPCs. Speaker: Hans-Joachim Wenzel (Fermi National Accelerator Lab. (US))
- Discussion
- General Discussion
Computing: Analysis Facilities and implications for storage
- GSI Analysis Facility. Speakers: Mohammad Al-Turany (CERN), Thorsten Kollegger (GSI - Helmholtzzentrum fur Schwerionenforschung GmbH (DE))
- U.S. CMS managed Analysis Facilities. Speaker: Oksana Shadura (University of Nebraska Lincoln (US))
- Discussion
Software: Diverse R&D
Conveners: David Lange (Princeton University (US)), Graeme A Stewart (CERN), Michel Jouvin (Université Paris-Saclay (FR)), Teng Jian Khoo (Humboldt University of Berlin (DE))
- Introduction. Speakers: David Lange (Princeton University (US)), Graeme A Stewart (CERN), Michel Jouvin (Université Paris-Saclay (FR)), Teng Jian Khoo (Humboldt University of Berlin (DE))
- Phoenix Event Display
Visualising HEP experiment data is vital for physicists trying to debug their reconstruction software, to examine detector geometry or to understand physics analyses, and also for outreach and publicity purposes. Traditionally, experiments used in-house applications which required installation (often as part of a much larger experiment-specific framework). In recent years, web-based event/geometry displays have started to appear, which dramatically lower the entry barrier to use, but typically they were still per-experiment.
Phoenix was adopted as part of the HSF visualisation activity: a TypeScript-based event display framework, using the popular three.js library for rendering. It is experiment agnostic by design, with shared common tools (such as custom menus, controls and propagators) and the ability to add experiment-specific extensions. It consists of two packages: a plain TypeScript core library (phoenix-event-display) and an Angular application (a React example is also provided in the documentation). The core library can be adapted for any experiment in a few simple steps. Phoenix has been selected for Google Summer of Code for the last two years, and is ATLAS's officially supported web event display. This talk will focus on the current status, as well as recent developments such as WebXR prototypes, interface improvements and the Runge-Kutta propagator.
Speaker: Edward Moyse (University of Massachusetts (US))
- Discussion
- High-throughput data analysis with modern ROOT interfaces
With the upcoming start of LHC Run 3 and beyond, HEP data analysis is facing a large increase in average input dataset sizes. At the same time, balancing analysis software complexity with the need to extract as much performance as possible from the latest HPC hardware is still often difficult.
Recent developments in ROOT significantly lower the barrier to entry for the development of high-throughput data analysis applications. This was achieved through a unique combination of ingredients: a high-level and high-performance analysis framework; just-in-time compilation of C++ code for efficient I/O and usability enhancements; automatic generation of Python bindings; and transparent offloading of computations to distributed computation engines such as Spark.
The resulting simplified data analysis model has enabled a whole range of R&D activities that are expected to deliver further acceleration, such as context-aware caching.
This talk will provide an overview of recent developments in ROOT as an engine for high-throughput data analysis, and how it is employed in several existing real-world use cases.
Speakers: Dr Enrico Guiraud (EP-SFT, CERN), Mr Vincenzo Eduardo Padulano (Valencia Polytechnic University (ES)), Mr Stefan Wunsch (KIT - Karlsruhe Institute of Technology (DE))
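The high-level, high-performance analysis framework referred to above is ROOT's RDataFrame. As a minimal illustration of the declarative model (the tree, file and branch names below follow NanoAOD-style conventions and are illustrative, not taken from the talk):

```python
import ROOT

# Optional: let RDataFrame parallelise the event loop over all cores.
ROOT.EnableImplicitMT()

# Build a declarative computation graph over a TTree; nothing runs yet.
df = ROOT.RDataFrame("Events", "data.root")

h = (df.Filter("nMuon == 2", "exactly two muons")
       .Define("m_mumu",
               "ROOT::VecOps::InvariantMass(Muon_pt, Muon_eta, Muon_phi, Muon_mass)")
       .Histo1D(("m_mumu", ";m_{#mu#mu} [GeV];Events", 100, 0.0, 120.0), "m_mumu"))

# Requesting the result triggers a single optimised pass over the data.
canvas = ROOT.TCanvas()
h.Draw()
```

The same graph can be handed to a distributed backend, which is the transparent Spark offloading mentioned in the abstract.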
- Discussion
- bamboo: easy and efficient analysis with python and RDataFrame
The bamboo analysis framework [1] allows one to write simple declarative analysis code (it effectively implements a domain-specific language embedded in python) and runs it efficiently using RDataFrame (RDF). Viewed differently, it introduces a set of tools to efficiently generate large RDF computation graphs from a minimal amount of user code (in python), e.g. a simple way to specify selections and outputs, automatically filling a set of histograms with different systematic variations of some input variables.
It is currently being used for several analyses of the full CMS Run 2 dataset, and thus provides an example of a very analysis-description-language-like approach that is compatible with the practical needs of modern HEP data analysis (different types of corrections, machine learning inference, user-provided extensions, combining many input samples, scaling out to a batch cluster, etc.).
[1] https://cp3.irmp.ucl.ac.be/~pdavid/bamboo/
Speaker: Pieter David (Universite Catholique de Louvain (UCL) (BE))
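To give a flavour of the declarative style: a bamboo module defines selections and plots, and the framework turns them into an RDF computation graph. A minimal sketch along the lines of the dimuon example in the bamboo documentation (binning and names are illustrative, and details may differ between bamboo versions):

```python
from bamboo.analysismodules import NanoAODHistoModule
from bamboo.plots import Plot, EquidistantBinning
from bamboo import treefunctions as op

class DimuonPlots(NanoAODHistoModule):
    def definePlots(self, t, noSel, sample=None, sampleCfg=None):
        # Declarative object selection: builds RDF expressions, no event loop.
        muons = op.select(t.Muon, lambda mu: mu.pt > 20.)
        twoMuSel = noSel.refine("dimu", cut=[op.rng_len(muons) > 1])
        # Each Plot becomes a histogram filled during the single RDF pass.
        return [Plot.make1D("dimu_M",
                            op.invariant_mass(muons[0].p4, muons[1].p4),
                            twoMuSel,
                            EquidistantBinning(100, 20., 120.),
                            title="Dimuon invariant mass")]
```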
- Discussion
- Analysis Description Language for LHC-type analyses
Physicists aiming to perform an LHC-type analysis today face a number of challenges: deep computing knowledge is needed at the programming level to implement the relevant algorithm, and at the system level to interact with the ever-evolving set of analysis frameworks for interfacing with the analysis object information. Moreover, ambiguity concerning the configuration of the overall computing environment impairs the reproduction of previous results. To overcome at least some of these difficulties, we propose the use of an Analysis Description Language (ADL): a domain-specific, declarative language capable of describing the contents of an LHC analysis in a standard and unambiguous way, independent of any computing framework. Such a language decouples the computing-intensive aspects, such as data access, from the actual physics algorithm. It would therefore benefit both the experimental and phenomenological communities by facilitating the design, validation, combination, reproduction, interpretation and overall communication of analysis contents. It would also help to preserve analyses beyond the lifetimes of experiments or analysis software.
This presentation introduces the ADL concept and summarises the current efforts to make it realistically usable in LHC analyses. In particular, the ongoing work to develop the transpiler and interpreter systems adl2tnm and CutLang, to implement various example analyses, and the associated documentation and validation efforts will be presented.
Speakers: Gokhan Unel (University of California Irvine (US)), Sezen Sekmen (Kyungpook National University (KR)), Harry Prosper (Florida State University (US))
- Discussion
- podio - latest developments and new features of a flexible EDM toolkit
Creating efficient event data models (EDMs) for high energy physics (HEP) experiments is a non-trivial task. Past approaches, employing virtual inheritance and possibly featuring deep object hierarchies, have been shown to exhibit severe performance limitations. Additionally, the advent of multi-threading and heterogeneous computing poses further constraints on how to efficiently implement EDMs and the corresponding I/O layer. podio is a C++ toolkit for the creation of EDMs with a fast and efficient I/O layer, using plain-old-data (POD) structures wherever possible. Physicist users are provided with a high-level interface of lightweight handle classes. The podio code generator, which produces all the necessary C++ code from a high-level description in YAML files, has recently been completely reworked to improve maintainability and extensibility. We will briefly discuss the new implementation and present, as a first use case, how it has been used to introduce an additional I/O backend based on SIO, a simple binary I/O library that is also used in LCIO. We will further discuss our first implementation of access to metadata, i.e. data that does not fit into the EDM itself. Finally, we will show how all of these capabilities are put to use in EDM4hep, the EDM for the Key4hep project.
Speaker: Thomas Madlener (Deutsches Elektronen-Synchrotron (DESY))
- Discussion
- 17:16 Coffee and Cake
- Use of auto-differentiation within the ACTS toolkit
The use of first- and higher-order differentiation is essential for many parts of track reconstruction: as part of the transport of track parameters through the detector, in several linearisation applications, and for establishing the detector alignment. While in general these derivatives are well known, they can be complex to derive and even more difficult to validate. The latter is often done by numerical cross-checking using Ridders' algorithm or similar approaches. The vast development of machine learning applications in recent years has renewed interest in algorithmic differentiation techniques, which use compiler or runtime techniques to compute exact derivatives from function expressions, surpassing the precision achievable via standard numerical differentiation based on finite differences.
ACTS is a common track reconstruction toolkit that aims to preserve the track reconstruction software from the LHC era and at the same time provide an R&D testbed for further algorithm and technology research. We present the successful inclusion of the autodiff library into the ACTS propagation and track-based alignment modules, where it serves as a complementary way to calculate transport Jacobians and alignment derivatives: the implementation within the ACTS software is shown, and the validation and CPU time comparison with respect to the implemented analytical or numerically determined expressions are given.
Speaker: Mr Benjamin Huth (University of Regensburg)
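To make the contrast with finite differences concrete, here is a self-contained illustration of forward-mode algorithmic differentiation using dual numbers (a toy sketch of the general technique, not ACTS code and not the autodiff C++ library used in the talk):

```python
import math

class Dual:
    """Dual number a + b*eps (eps**2 = 0): the b part carries an exact derivative."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule applied exactly, at machine precision.
        return Dual(self.val * other.val, self.der * other.val + self.val * other.der)

def sin(x):
    # Chain rule: d/dt sin(x(t)) = cos(x) * x'
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)

def f(x):
    return sin(x * x)  # f(x) = sin(x^2), so f'(x) = 2x cos(x^2)

x0 = 1.3
exact = f(Dual(x0, 1.0)).der  # seed the derivative with dx/dx = 1
h = 1e-6                      # finite differences: accuracy depends on h
approx = (f(Dual(x0 + h)).val - f(Dual(x0)).val) / h
print(exact, approx, 2 * x0 * math.cos(x0 * x0))
```

The dual-number result agrees with the analytic derivative to machine precision, while the finite-difference estimate is limited by the choice of step size; this is the precision advantage the abstract refers to.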
- Discussion
- Reconstruction for Liquid Argon TPC Neutrino Detectors Using Parallel Architectures
Neutrinos are particles that interact rarely, so identifying them requires large detectors which produce lots of data. Processing this data with the available computing power is becoming more challenging as the detectors grow in size to reach their physics goals. Liquid argon time projection chamber (TPC) neutrino experiments are planned to grow by a factor of 100 in the next decade relative to currently operating experiments, and modernization of liquid argon TPC reconstruction code, including vectorization and multi-threading, will help to mitigate this challenge. The liquid argon TPC hit finding algorithm used across multiple experiments, through the LArSoft framework, has been vectorized and multi-threaded. This speeds up the algorithm by up to a factor of 200 in a standalone version on Intel architectures. The new version of the hit finder has been incorporated back into LArSoft so that it can be used by the experiments. To take full advantage of this parallelism, an experiment workflow is being developed to run LArSoft at a high performance computing center. This will be used to produce samples as part of a central processing campaign.
Speaker: Sophie Berkman (Fermi National Accelerator Laboratory)
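The speed-up comes from replacing per-tick loops over each waveform with data-parallel operations. A toy illustration of the idea in NumPy (the actual hit finder uses explicit SIMD vectorization in C++; the Gaussian pulse shape and thresholds here are illustrative assumptions):

```python
import numpy as np

def gaussian_pulse(t, amp, t0, sigma):
    # One data-parallel expression evaluates the pulse on the whole waveform,
    # instead of a python-level loop over individual time ticks.
    return amp * np.exp(-0.5 * ((t - t0) / sigma) ** 2)

rng = np.random.default_rng(0)
ticks = np.arange(6400, dtype=np.float64)            # one wire's readout window
waveform = gaussian_pulse(ticks, 12.0, 3100.0, 4.0) + rng.normal(0.0, 1.0, ticks.size)

# Vectorized threshold scan over all ticks at once.
candidates = np.flatnonzero(waveform > 5.0)
print("ticks above threshold:", candidates)  # clustered around t0 = 3100
```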
- Discussion
- GPU-accelerated machine learning inference for offline reconstruction and analysis workflows in neutrino experiments
Future neutrino experiments like DUNE are big-data experiments that will acquire petabytes of data per year. Processing this amount of data is itself a significant challenge. In recent years, however, the use of deep learning applications in the reconstruction and analysis of data acquired by LArTPC-based experiments has grown substantially. This will put even more strain on the computing requirements of these experiments, since the CPU-based systems used for offline processing are not well suited to deep learning inference. To address this problem, we adopt an "as a Service" model in which the inference task is provided as a web service. We demonstrate the feasibility of this approach by testing it on the full reconstruction chain of ProtoDUNE using fully simulated data, with the GPU-based inference server hosted on the Google Cloud Platform. We present encouraging results from our tests, including detailed studies of scaling behavior. Based on these results, the "as a Service" approach shows great promise as a solution for the growing computing needs of future neutrino experiments that are associated with deep-learning inference tasks.
Speaker: Tingjun Yang (Fermi National Accelerator Lab. (US))
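In such a setup the reconstruction job sends tensors to a remote inference server and receives the outputs over the network. A sketch of what the client side could look like with NVIDIA Triton's Python gRPC client (the talk does not specify its client stack; the server address, model name and tensor shapes below are hypothetical):

```python
import numpy as np
import tritonclient.grpc as grpcclient

# Connect to a remote, GPU-backed inference server (hypothetical address).
client = grpcclient.InferenceServerClient(url="inference.example.org:8001")

# A batch of detector image patches (shape and model are illustrative).
batch = np.random.rand(64, 3, 48, 48).astype(np.float32)
inputs = [grpcclient.InferInput("input", list(batch.shape), "FP32")]
inputs[0].set_data_from_numpy(batch)
outputs = [grpcclient.InferRequestedOutput("softmax")]

# The CPU job only blocks for the network round trip; the deep-learning
# work runs on the server's GPUs, shared across many client jobs.
result = client.infer(model_name="hit_classifier", inputs=inputs, outputs=outputs)
print(result.as_numpy("softmax").shape)
```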
- Discussion
- GPU-based tracking with Acts
At future hadron colliders such as the High-Luminosity LHC (HL-LHC), tens of thousands of particles can be produced in a single event, resulting in a very challenging tracking environment. The estimated CPU resources required for event processing at the HL-LHC could well exceed the available resources. To mitigate this problem, modern tracking software tends to gain performance by taking advantage of modern computing techniques on hardware such as multi-core CPUs or GPUs, which can process many threads in parallel.
The Acts (A Common Tracking Software) project encapsulates the current ATLAS tracking software in an experiment-independent toolkit designed for modern computing architectures. It provides a set of high-level track reconstruction tools agnostic to the details of the detector and magnetic field configuration. Particular emphasis is placed on the thread-safety of the code in order to support concurrent event processing with context-dependent detector conditions, such as detector alignments or calibrations. Acts also aims to be a research and development platform for studying innovative tracking techniques and exploiting modern hardware architectures. Multi-threaded event processing on multi-core CPUs is supported via the Intel Threading Building Blocks (TBB) library. Acts also provides plugins for heterogeneous computing, such as CUDA and SYCL/oneAPI, and contains example code that can be offloaded to a GPU, for instance the Acts seed finder.
In this talk, I will present a summary of the R&D activities exploring parallelism and acceleration of elements of track reconstruction using GPUs, such as GPU-based seed finding, geometry navigation and Kalman fitting, based on the Acts software. The strategies of the GPU implementations will be shown, and both the achieved performance and the encountered difficulties will be discussed.
Speaker: Xiaocong Ai (DESY)
- Discussion
- Investigating Portable Heterogeneous Solutions with Fast Calorimeter Simulation
Physicists at the Large Hadron Collider (LHC), near Geneva, Switzerland, are preparing their experiments for the high luminosity (HL) era of proton-proton collision data-taking. In addition to the detector hardware research and development needed for upgrades to cope with the more than two-fold increase in instantaneous luminosity, physicists are investigating potential heterogeneous computing solutions to address CPU limitations that could be detrimental to an otherwise successful physics program.
At the dawn of supercomputers employing a wide range of architectures and specifications, it is crucial that experiments' software be abstracted as much as possible from the underlying hardware implementation in order to make use of the vast array of these machines. New developments in application programming interfaces (APIs) aim to be architecture-independent, providing the ability to write single-source code that can be compiled for virtually any hardware. In this talk, we present the details of our work on cross-platform software prototyping with Kokkos, a single-source, performant parallel C++ API that provides hardware backends for a wide range of parallel architectures, including NVIDIA, AMD, Intel, OpenMP and pThreads, and with SYCL, an abstraction layer whose specification is defined by the Khronos Group and members from industry-leading entities such as Intel. Using ATLAS's new fast calorimeter simulation code, FastCaloSim, as a testbed, we evaluate Kokkos and SYCL in terms of their heterogeneity and their performance with respect to other parallel computing APIs.
Speakers: Vincent Pascuzzi (Lawrence Berkeley National Lab. (US)), Dr Charles Leggett (Lawrence Berkeley National Lab (US))
- Discussion
- Closing Remarks. Speakers: David Lange (Princeton University (US)), Graeme A Stewart (CERN), Michel Jouvin (Université Paris-Saclay (FR)), Teng Jian Khoo (Humboldt University of Berlin (DE))
Computing: HPC/cloud storage integration
- NERSC experience. Speaker: Wahid Bhimji (Lawrence Berkeley National Lab. (US))
- CMS Experience with CINECA. Speakers: Daniele Spiga (Universita e INFN, Perugia (IT)), Dr Tommaso Boccali (INFN Sezione di Pisa)
- Discussion
- 17:30 Coffee and cake
Computing: Discussion of the workshop summary draft and future activities
- Discussion of the workshop summary draft and future activities