# 25th International Conference on Computing in High Energy & Nuclear Physics

Europe/Paris

### vCHEP2021

Welcome! The CHEP conference series addresses the computing, networking and software issues for the world’s leading data‐intensive science experiments that currently analyse hundreds of petabytes of data using worldwide computing resources.

vCHEP 2021 was held as a virtual event from Monday 17 to Friday 21 May 2021.

Thank you to everyone who came and contributed to the conference.

Participants
• Aashay Arora
• Abdelilah Moussa
• Abdullah Nayaz
• Abhishek Lekshmanan
• Agnieszka Dziurda
• Aidan McComb
• Aimilios Tsouvelekakis
• Ajit Mohapatra
• Akanksha Vishwakarma
• Akram Khan
• Alaettin Serhan Mete
• Alain Bellerive
• Alan Malta Rodrigues
• Alara zeynep hoşkan
• Alastair Dewhurst
• ALBERT ROSSI
• Alberto Aimar
• Alberto Gianoli
• Alberto Pace
• Alejandra Gonzalez-Beltran
• Aleksandr Alekseev
• Aleksandr Alikhanov
• Alessandra Doria
• Alessandra Forti
• Alessandro Di Girolamo
• Alex Gekow
• Alex Kish
• Alex Rua Herrera
• Alexander Held
• Alexander Olivas
• Alexander Paramonov
• Alexander Rogachev
• Alexander Sharmazanashvili
• Alexander Undrus
• Alexander Verkooijen
• Alexandr Zaytsev
• Alexandra Ballow
• Alexandre Franck Boyer
• Alexei Klimentov
• Alexis Vallier
• Ali Hariri
• Alina Lazar
• Alison Packer
• Alvaro Fernandez Casani
• Aman Goel
• Amanda Lund
• Amaria Bonsi Navis. I
• Amber Boehnlein
• Amit Bashyal
• Andre Sailer
• Andrea Bocci
• Andrea Ceccanti
• Andrea Dell'Acqua
• Andrea Formica
• Andrea Manzi
• Andrea Sartirana
• Andrea Sciabà
• Andrea Valassi
• Andrea Valenzuela Ramirez
• Andreas Gellrich
• Andreas Joachim Peters
• Andreas Morsch
• Andreas Nowack
• Andreas Pappas
• Andreas Ralph Redelbach
• Andreas Salzburger
• Andreas Stoeve
• Andrei Dumitru
• Andrei Gaponenko
• Andrei Gheata
• Andrei Kazarov
• Andrei Kirushchanka
• Andrei Sukharev
• Andreu Pacheco Pages
• Andrew Bohdan Hanushevsky
• Andrew Gallo
• Andrew Malone Melo
• Andrew McNab
• Andrew Naylor
• Andrew Olivier
• Andrew Pickford
• Andrew Picot Conaboy
• Andrey Kirianov
• Andrey Lebedev
• Andrey Shevel
• Andrzej Novak
• Angira Rastogi
• Anil Panta
• Anirudh Goel
• Anju Bhasin
• Ankit Kumar
• Ankur Singh
• Anna Alicke
• Anna Ferrari
• Anna Manou
• Annika Stein
• Antoine Perus
• Anton Alkin
• Anton Jusko
• Anton Philippov
• Antonin Dvorak
• Antonino Formuso
• Antonio Augusto Alves Junior
• Antonio Boveia
• Antonio Paulo
• Antonio Perez-Calero Yzquierdo
• Apostolos Theodoridis
• Arantxa Ruiz Martinez
• Aravind Thachayath Sugunan
• Archil Surmava
• Aristeidis Fkiaras
• Aristofanis Chionis Koufakos
• Armen Vartapetian
• Armin Nairz
• Arnaud Chiron
• Arsenii Gavrikov
• Arthur Outhenin-Chalandre
• Artur Lobanov
• Arturo Sanchez Pineda
• Asoka De Silva
• Atsushi Manabe
• Attila Krasznahorkay
• Axel Naumann
• Azam Zabihi
• Balazs Konya
• Barbara Martelli
• Bartosz Sobol
• Bastian Schlag
• Ben Couturier
• Ben Messerly
• Benedetto Di Ruzza
• Benedikt Hegner
• Benedikt Riedel
• Benedikt Volkel
• Benedikt Ziemons
• Benjamin Fischer
• Benjamin Galewsky
• Benjamin Huth
• Benjamin Krikler
• Benjamin Mark Smith
• Benjamin Morgan
• Benjamin Moritz Veit
• Benjamin Tovar
• Benoit Delaunay
• Benoit Million
• Beom Ki Yeo
• Berare Gokturk
• Bernhard Manfred Gruber
• Bernhard Meirose
• Besik Kekelia
• Birgit Lewendel
• Birgit Sylvia Stapf
• Bjarte Kileng
• Bjorn Burkle
• Bob Jones
• Bockjoo Kim
• Bohdan Dudar
• Bora Orgen
• Borja Aparicio Cotarelo
• Borja Garrido Bear
• Brian Davies
• Brian Paul Bockelman
• Bruno Alves
• Bruno Heinrich Hoeft
• Burt Holzman
• Caio Costa
• Callum Pollock
• Carl Timmer
• Carla Sophie Rieger
• Carlacio De Vecchi
• Carlos Perez Dengra
• Carlos Vázquez Sierra
• Catalin Condurache
• Caterina Doglioni
• Caterina Marcon
• Catherine Biscarat
• Cedric Caffy
• Cedric Serfon
• Cemal Azim Orneksoy
• Cenk Tuysuz
• Ceyhun Uzunoglu
• Chang-Seong Moon
• Charles Leggett
• Charline ROUGIER
• Chiara Rovelli
• Choji Saji
• Chris Backhouse
• Chris Burr
• Chris Lee
• Chris Pinkenburg
• Christian Tacke
• Christian Voss
• Christian Walter Bauer
• Christoph Anton Mitterer
• Christoph Merscher
• Christoph Wissing
• Christophe Haen
• Christopher Hollowell
• Christopher Jones
• Chun-Yu Lin
• Clara Gaspar
• Clara Nellist
• Claudio Grandi
• Claus Heinz Kleinwort
• Clemens Lange
• Concezio Bozzi
• Corentin Allaire
• Costa Bushnaq
• Cristian Contescu
• Cristian Schuszter
• Dag Gillberg
• Dalila Salamani
• Dan van der Ster
• Daniel Britzger
• Daniel Hundhausen
• Daniel Peter Traynor
• Daniel Prelipcean
• Daniel Scheirich
• Daniel Thomas Murnane
• Daniela Bauer
• Danila Oleynik
• Danish Farooq Meer
• Dario Barberis
• Dario Mapelli
• Darren Moore
• Darya Dyachkova
• Dave Dykstra
• David Bouvet
• David Britton
• David Cameron
• David Colling
• David Crooks
• David DeMuth
• David Emschermann
• David Groep
• David Hohn
• David Kelsey
• David Lange
• David Lawrence
• David Michael South
• David Rebatto
• David Rohr
• David Rousseau
• David Schultz
• David Schwartz
• David Smith
• David Southwick
• Davide Costanzo
• Davide Di Croce
• Davide Valsecchi
• Debajyoti Sengupta
• Dejan Golubovic
• Denis Pugnere
• Denis-Patrick Odagiu
• Dennis Van Dok
• Derek Weitzel
• Deya Chatterjee
• Di Qing
• Diego Ciangottini
• Diego Davila Foyo
• Dietrich Liko
• Dimitri Bourilkov
• Diogo Castro
• Dirk Duellmann
• Dirk Hufnagel
• Dirk Krucker
• Divya Srinivasan
• Dmitri Smirnov
• Dmitriy Maximov
• Dmitry Borisenko
• Dmitry Litvintsev
• Dmytro Kresan
• Domenico Giordano
• Dominik Smith
• Donald Petravick
• Doris Ressmann
• Doris Yangsoo Kim
• Dorothea vom Bruch
• Doug Benjamin
• Edgar Fernando Carrera Jarrin
• Edoardo Martelli
• Eduardo Rodrigues
• Edward Karavakis
• Edward Moyse
• Efe Yazgan
• Eileen Kuehn
• Einar Alfred Elen
• Eli Dart
• Elias Leutgeb
• Elisabetta Maria Pennacchio
• Elizabeth Gallas
• Elizabeth Sexton-Kennedy
• Elliott Kauffman
• Emanuele Simili
• Emanuele Usai
• Emilio Meschi
• Emmanouil Kargiantoulakis
• Emmanouil Vamvakopoulos
• Engin Eren
• Enol Fernández
• Enric Tejedor Saavedra
• Enrico Bocchi
• Enrico Guiraud
• Enzo Capone
• Eoin Clerkin
• Eric Cano
• Eric Christian Lancon
• Eric Fede
• Eric Grancher
• Eric Pouyoul
• Eric Vaandering
• Eric Wulff
• Eric Yen
• Erica Brondolin
• Erik Buhmann
• Ernst Hellbär
• Esteban Fullana Torregrosa
• Everson Rodrigues
• Fabian Lambert
• Fabio Catalano
• Fabio Hernandez
• Fang-Ying Tsai
• Farida Fassi
• Fatima Zahra Lahbabi
• Fazhi Qi
• Federica Agostini
• Federica Fanzago
• Federica Legger
• Federico Stagni
• Felice Pantaleo
• Felix HAMAN
• Fernando Harald Barreiro Megino
• Florentia Protopsalti
• Florian Fischer
• Florian Rehm
• Florian Uhlig
• Francesco Prelz
• Francis Pham
• Frank Berghaus
• Frank Filthaut
• Frank Gaede
• Frank Winklmeier
• Franz Rhee
• Françoise BOUVET
• Frederique Chollet
• Gabriele Benelli
• Gage DeZoort
• Gagik Gavalian
• Gang Chen
• Gavin McCance
• Gene Van Buren
• Geoff Savage
• George Patargias
• George Ryall
• Georgiana Mania
• Gerardo Ganis
• German Cancio
• Ghita Rahal
• Giacomo Boldrini
• Giacomo Govi
• Giampaolo Carlino
• Gian Michele Innocenti
• Gian Piero Siroli
• Gianantonio Pezzullo
• Gianfranco Sciacca
• Gianluca Bianco
• Gianmaria Del Monte
• Gino Marchetti
• Giordon Holtsberg Stark
• Giovanni Bassi
• Giovanni Guerrieri
• Giovanni Punzi
• Giulio Eulisse
• Giulio Usai
• Giuseppe Andronico
• Giuseppe Avolio
• Giuseppe Lo Presti
• Gloria Corti
• Gokhan Unel
• Gonzalo Merino Arevalo
• Gordon Stewart
• Gordon Watts
• Graeme A Stewart
• Graham Heyes
• Greg Corbett
• Gregory Mezera
• Grigori Rybkin
• Guenter Duckeck
• Guorui Chen
• Gustavo Uribe
• Gustavo Valdiviesso
• Guy Barrand
• Guy Tel-Zur
• Haakon Andre Reme-Ness
• Haavard Helstrup
• Haesung Park
• Haidar Mas'Ud Alfanda
• Haiwang Yu
• Hakob Jilavyan
• Hannes Sakulin
• Hannsjorg Weber
• Hans Von Der Schmitt
• Hanyul Ryu
• Hao Hu
• Haolai Tian
• Harald Minde Hansen
• Harald Schieler
• Harvey Newman
• Hasib Ahmed
• Haykuhi Musheghyan
• Heather Gray
• Heather Kelly
• Heidi Marie Schellman
• Helge Meinhard
• Helmut Wolters
• Hendrik Schwanekamp
• Herve Rousseau
• Hideki Miyake
• Hongwei Ke
• Hosein Hashemi
• Hubert Odziemczyk
• Huiling Li
• Humaira Abdul Salam
• Hyeonja Jhang
• I Ueda
• Ian Bird
• Ian Collier
• Ian Fisk
• Ian Johnson
• Ian Neilson
• Ianna Osborne
• Ignacio Peluaga
• Ignacio Reguero
• Igor Soloviev
• Igor Vasilyevich Mandrichenko
• Ines Pinto Pereira Da Cruz
• Ingo Ebel
• Ingo Müller
• Ingvild Brevik Hoegstoeyl
• Ioan-Mihail Stan
• Irakli Chakaberia
• Irfan Haider
• Ishank Arora
• Ivan Glushkov
• Ivan Kisel
• Ivana Hrivnacova
• Jacob Linacre
• Jakob Blomer
• Jakub Kvapil
• James Amundson
• James Catmore
• James Frost
• James Robert Letts
• James Walder
• Jamie Heredge
• Jamie Shiers
• Jan de Cuveland
• Jan Erik Sundermann
• Jan Eysermans
• Jan Fiete Grosse-Oetringhaus
• Jan Iven
• Jan Kieseler
• Jan Stark
• Jana Schaarschmidt
• Janusz Martyniak
• Janusz Oleniacz
• Jaroslav Guenther
• Jaroslav Zalesak
• Jaroslava Schovancova
• Jason Smith
• Jason Webb
• Javier Lopez Gomez
• Javier Mauricio Duarte
• Jaydeep Datta
• Jaydip Singh
• Jean-Roch Vlimant
• Jeff Landgraf
• Jeff Porter
• Jeff Templon
• Jem Aizen Mendiola Guhit
• Jeremy Edmund Hewes
• Jeroen Hegeman
• Jerome Henri Fulachier
• Jerome LAURET
• Jerome Odier
• Jerome Pansanel
• Jianhui Zhu
• Jim Pivarski
• Jim Shank
• Jiri Chudoba
• Joachim Josef Mnich
• Joana Niermann
• Joanna Waczyńska
• Joao Antonio Tomasio Pina
• Joe Osborn
• Joel Closier
• Johan BREGEON
• Johann Cohen-Tanugi
• Johannes Elmsheuser
• Johannes Michael Wagner
• John Apostolakis
• John Blue
• John Derek Chapman
• John Gordon
• John Haggerty
• John Steven De Stefano Jr
• Jonas Eschle
• Jonas Hahnfeld
• Joosep Pata
• Jorge Camarero Vera
• Jose Augusto Chinellato
• Jose Benito Gonzalez Lopez
• Jose Caballero Bejar
• Jose Flix Molina
• Jose Hernandez
• Jose Salt
• Joseph Boudreau
• Joseph Wang
• Joshua Falco Beirer
• João Fernandes
• João Lopes
• Juan M. Cruz Martínez
• Judita Mamuzic
• Juerg Beringer
• Julia Andreeva
• Julien Leduc
• Julien Rische
• Julio Lozano Bahilo
• Julius Hrivnac
• Junichi Tanaka
• Junpei Maeda
• Junyeong Lee
• Juraj Smiesko
• Jurry de la Mar
• Justin Freedman
• Ka Hei Martin Kwok
• Kai Leffhalm
• Kai Lukas Unger
• Kaito Sugizaki
• Kamil Rafal Deja
• karl amrhein
• Karol Hennessy
• Karolos Potamianos
• Katarzyna Maria Dziedziniewicz-Wojcik
• Katharina Ceesay-Seitz
• Katy Ellis
• Kaushik De
• Kejun Zhu
• Kenneth Bloom
• Kenneth Herner
• Kevin Casella
• Kevin Franz Stehle
• Kevin Pedro
• Kihyeon Cho
• Kira Isabel Duwe
• Kishansingh Rajput
• Koji Terashi
• Kolja Kauder
• Komninos-John Plows
• Konstantin Gertsenberger
• Konstantinos Samaras-Tsakiris
• Krzysztof Genser
• Krzysztof Michal Mastyna
• Kyeongjun Kim
• Kyle Knoepfel
• Kyungeon Choi
• Latchezar Betev
• Laura Cappelli
• Laura Sargsyan
• Laurence Field
• Laurent Aphecetche
• Laurent Duflot
• Lauri Antti Olavi Laatu
• Lea Morschel
• Lene Kristian Bryngemark
• Lennart Rustige
• Leonardo Cristella
• Leonhard Reichenbach
• Leslie Groer
• Levente Hajdu
• Lia Lavezzi
• Linda Ann Cornwall
• Lindsey Gray
• Linghui Wu
• Liza Mijovic
• Lorena Lobato Pardavila
• Lorenzo Moneta
• Lorenzo Rinaldi
• Lorne Levinson
• Louis-Guillaume Gagnon
• Luca Atzori
• Luca Canali
• Luca Clissa
• Luca Giommi
• Luca Mascetti
• Lucas Nunes Lopes
• Luis Fernandez Alvarez
• Luisa Arrabito
• Luka Todua
• Lukas Alexander Heinrich
• Lukasz Sawicki
• Luke Kreczko
• Lynn Garren
• Maarten Litmaath
• Maarten van Ingen
• Maciej Pawel Szymanski
• Mackenzie Devilbiss
• Maite Barroso Lopez
• Maksim Melnik Storetvedt
• Manfred Peter Fackeldey
• Manuel Giffels
• Manuel Morales
• Manuel Reis
• Marc Dünser
• Marcello Armand Pilon
• Marcelo Vilaça Pinheiro Soares
• Marcelo Vogel
• Marcin Nowak
• Marco Cattaneo
• Marco Clemencic
• Marco Mambelli
• Marco Mascheroni
• Marco Rossi
• Marco Rovere
• Marcus Ebert
• Mareike Meyer
• Maria Acosta Flechas
• Maria Arsuaga Rios
• Maria Dimou
• Maria Girone
• Maria Grigoryeva
• Maria Pokrandt
• Marian Babik
• Marianette Wospakrik
• Marilena Bandieramonte
• Marina Sahakyan
• Mario Cromaz
• Mario Lassnig
• Mark Hodgkinson
• Mark Ito
• Mark Neubauer
• Markus Elsing
• Markus Frank
• Markus Schulz
• Marten Ole Schmidt
• Martin Barisits
• Martin Gasthuber
• Martin Sevior
• Martin Zemko
• Martina Javurkova
• Mary Georgiou
• Marzena Lapka
• Masahiko Saito
• Masanori Ogino
• Mason Proffitt
• Masood Zaran
• Massimo Sgaravatto
• Mathias Ajami
• Matias Alejandro Bonaventura
• Matteo Concas
• Matteo Paltenghi
• Matthew Feickert
• Matthew Heath
• Matthias Jochen Schnepf
• Matthias Kasemann
• Matthias Kleiner
• Matthias Richter
• Matthias Steinke
• Matthieu Carrère
• Matti Kortelainen
• Max Fischer
• Max Neukum
• Maxim Potekhin
• Maximilian Emanuel Goblirsch-Kolb
• Maximilian Reininghaus
• Maxwell Benjamin Orok
• Mehdi Goli
• Mehmet Demirci
• Meifeng Lin
• Meirin Oan Evans
• Melissa Gaillard
• Miaoyuan Liu
• Micah Groh
• Michael Boehler
• Michael David Sokoloff
• Michael Davis
• michael goodrich
• Michael Grippo
• Michael Kirby
• Michael Schuh
• Michal Kamil Simon
• Michal Kolodziejski
• Michal Svatos
• Michel Hernandez Villanueva
• Michel Jouvin
• Michele Michelotto
• Michelle Kuchera
• Miguel Fontes Medeiros
• Miguel Villaplana
• Mihaela Gheata
• Mihai Patrascoiu
• Mikael Myllymaki
• Mikhail Bogolyubsky
• Milos Lokajicek
• Miriam Calvo Gomez
• Miroslav Potocky
• Miroslav Saur
• Mischa Sallé
• Misha Borodin
• Mizuki Karasawa
• Mohammed Boukidi
• Mohan Krishnamoorthy
• Monika Joanna Jakubowska
• Moonhyun Kim
• Morgan Robinson
• Mykola Khandoga
• Mykyta Shchedrolosiev
• Nagaraj Panyam
• Nathan Brei
• Nazar Bartosik
• Nelly Sagidova
• Nianqi Hou
• Nicholas Styles
• Nick Fritzsche
• Nick Smith
• Nicola Mori
• Nicole Michelle Hartman
• Nicole Schulte
• Niko Neufeld
• Niko Tsutskiridze
• Nikolai Hartmann
• Nikolaos Karastathis
• Nikolay Tsvetkov
• Nilay Bostan
• Nils Erik Krumnack
• Nils Heinonen
• Nils Høimyr
• Noemi Calace
• Norm Buchanan
• Norman Anthony Graf
• Nurcan Ozturk
• Nuria Valls Canudas
• Ofer Rind
• Oisin Creaner
• Oleg Solovyanov
• Olga Chuchuk
• Oliver Freyermuth
• Oliver Gutsche
• Oliver Keeble
• Olivier Devroede
• Olivier Mattelaer
• Olivier Rousselle
• Olver Dawes
• Omar Andres Zapata Mesa
• Onno Zweers
• Oxana Smirnova
• Panagiotis Lantavos-Stratigakis
• Panos Paparrigopoulos
• Panos Stamoulis
• Paolo Calafiura
• Paolo Martinengo
• Paris Gianneios
• Patricia Mendez Lorenzo
• Patricia Rebello Teles
• Patrick Asenov
• Patrick Fuhrmann
• Patrick Gartung
• Patrycja Ewa Gorniak
• Patryk Lason
• Paul Gessinger
• Paul Jackson
• Paul James Laycock
• Paul Kyberd
• Paul Millar
• Paul Musset
• Paul Nilsson
• Pavel Kisel
• Pedro Alonso
• Percy Alexander Cáceres Tintaya
• Pere Mato
• Perisetti Sai Ram Mohan Rao
• Peter Chatain
• Peter Clarke
• Peter Hobson
• Peter Hristov
• Peter Love
• Peter McKeown
• Peter Onyisi
• Peter Van Gemmeren
• Peter Wienemann
• Petr Sestak
• Philip Grace
• Philippe Canal
• Pierre Etienne Macchi
• Pierre-Alain Loizeau
• Pierre-André Amaudruz
• Pieter David
• Placido Fernandez Declara
• Pratik Kafle
• Predrag Buncic
• Purvaja Karthikeyan
• Qingbao Hu
• Rafal Dominik Krawczyk
• Raffaella De Vita
• Rahmat Rahmat
• Rahul Balasubramanian
• Rainer Schwemmer
• Ralf Florian Von Cube
• Ralf Ulrich
• Ramona Hohl
• Ramya Srinivasan
• Ran Du
• Randall Sobie
• Raul Jimenez Estupinan
• Ravinder Dhayal
• Raymond Oonk
• Rebeca Gonzalez Suarez
• Reda Tafirout
• Reiner Hauser
• Renato Cardoso
• Rene Brun
• Rene Caspart
• Ricardo Rocha
• Riccardo Di Maria
• Riccardo Maganza
• Riccardo Maria Bianchi
• Richard Dubois
• Richard Mount
• Richard Teuscher
• Rizart Dona
• Rob Appleyard
• Robert Andrew Currie
• Robert Fay
• Robert Frank
• Robert Johannes Langenberg
• Robert Kutschke
• Robert Vasek
• Roberto Valverde Cameselle
• Robin Middleton
• Rocky Bala Garg
• Rod Burns
• Rodney Walker
• Rodrigo Sierra
• Roger Jones
• Rogerio Iope
• Romain Rougny
• Roman Lietava
• Ron Trompert
• Rose Cooper
• Rosen Matev
• Ross Hunter
• Rosy Nikolaidou
• Ruben Shahoyan
• Rui Zhang
• Ruslan Mashinistov
• Rustem Ospanov
• Ryan Taylor
• Sabah Salih
• Sabine Crepe-Renaudin
• Sakshi Shukla
• Samuel Alfageme Sainz
• Samuel Bernardo
• Sandro Christian Wenzel
• Sandro Fonseca De Souza
• Sang Un Ahn
• Sanghyun Ko
• Santiago Gonzalez De La Hoz
• Saqib Haleem
• Saroj Kandasamy
• Sascha Daniel Diefenbacher
• Sascha Mehlhase
• Saswati Nandan
• Savannah Thais
• Scott Snyder
• Sean Murray
• Sebastian Lopienski
• Sebastian Macaluso
• Sebastian Skambraks
• Sebastien Ponce
• Semen Lebedev
• Seo-Young Noh
• Sergei Gleyzer
• Sergey Chelsky
• Sergey Gorbunov
• Sergey Levonian
• Sertac Ozturk
• Seth Johnson
• Sezen Sekmen
• Shah Rukh Qasim
• Shan Zeng
• Shaojun Sun
• Shawn Mc Kee
• Shigeki Misawa
• Shivali Malhotra
• Shivam Raj
• Shiyuan Fu
• Shota Kobakhidze
• Shravan Sunil Chaudhary
• Silvio Pardi
• Simon Akar
• Simon Blyth
• Simon Fayer
• Simon George
• Simon Liu
• Simone Campana
• Simone Pagan Griso
• Simone Pigazzini
• Simone Stracka
• Sitong An
• Slava Krutelyov
• Sneha Sinha
• Sofia Vallecorsa
• Sonal Dhingra
• Soon Yung Jun
• Sophie Servan
• SRISHTI NAGU
• Srujan Patel
• Stefan Roiser
• Stefan Stonjek
• Stefano Bagnasco
• Stefano Piano
• Stefano Spataro
• Stefano Tognini
• Steffen Baehr
• Stephan Hageboeck
• Stephan Wiesand
• STEPHANE GERARD
• Stephane Jezequel
• Stephen Nicholas Swatman
• Steve Barnet
• Steve Mrenna
• Steven Calvez
• Steven Goldfarb
• Stewart Martin-Haugh
• Stiven Metaj
• Stuart Fuess
• Su Yeon Chang
• Subbulakshmi Sriram
• Sudhir Malik
• Sumit Kundu
• Sunanda Banerjee
• Suvankar Roy Chowdhury
• Sven Bollweg
• Sven Pankonin
• Svenja Meyer
• Swagato Banerjee
• Sylvain CAILLOU
• Takanori Hara
• Takashi Yamanaka
• Tal van Daalen
• Tao Lin
• Tatiana Korchuganova
• Taylor Childers
• Tejinde Virdee
• Teng Jian Khoo
• Teng LI
• Thomas Ackernley
• Thomas Baron
• Thomas Birkett
• Thomas Britton
• Thomas Calvet
• Thomas George Hartland
• Thomas Hartmann
• Thomas Koffas
• Thomas Kress
• Thomas Kuhr
• Thomas Strebler
• Thomas Throwe
• Thorsten Kollegger
• Tiansu Yu
• Tigran Mkrtchyan
• Tim Bell
• Tim Folkes
• Tim Smith
• Timothy Noble
• Tobias Golling
• Tobias Loesche
• Tobias Stockmanns
• Tobias Triffterer
• Todor Trendafilov Ivanov
• Tom Cheng
• Tom Dack
• Tomas Lindén
• Tomas Sykora
• Tommaso Boccali
• Tommaso Chiarusi
• Tommaso Diotalevi
• Tomoaki Nakamura
• Tomoe Kishimoto
• Tomohiro Yamazaki
• Tong Pan
• Tony Cass
• Tony Johnson
• Tony Wong
• Torben Ferber
• Torre Wenaus
• Torri Jeske
• Tristan Scott Sullivan
• Troels Petersen
• Tulika Bose
• Vakho Tsulaia
• Valentin Kuznetsov
• Valentin Volkl
• Valerio Formato
• Vardan Gyurjyan
• Varsha Senthilkumar
• Vasil Georgiev Vasilev
• Vasileios Belis
• Vasyl Hafych
• Vesal Razavimaleki
• Victor Goicoechea Casanueva
• Vikas Singhal
• Viktor Khristenko
• Vincent Garonne
• Vincent R. Pascuzzi
• Vincenzo Innocente
• Vineet Reddy Rajula
• Vinicius Massami Mikuni
• Vipul Davda
• Vishu Saini
• Vit Kucera
• Vitaly Pronskikh
• Volker Friese
• Wahid Redjeb
• Walter Lampl
• Waseem Bhat
• Waseem Kamleh
• Wayne Betts
• wei wang
• Wei Yang
• Weidong Li
• Wen Guan
• Wenlong Yuan
• Wenxing Fang
• Werner Wiedenmann
• William Korcari
• Wolfgang Waltenberger
• Wonho Jang
• Xavier Espinal
• Xavier Vilasis Cardona
• Xiangyang Ju
• Xiaocong Ai
• Xiaofeng Guo
• Xiaohu Sun
• Xiaomei Zhang
• Xiaowei Jiang
• Xin Zhao
• Xingtao Huang
• XinRan Liu
• Xun Chen
• Yang Li
• Yao Yao
• Yaodong Cheng
• Yaosong Cheng
• Yassine El Ouhabi
• Yasuyuki Okumura
• Yee-Ting Li
• Yingrui Hou
• Yo Sato
• Yogesh Verma
• Yongbin Feng
• Younes Belmoussa
• Yuji Kato
• Yujiang Bi
• Yuyi Guo
• Zach Marshall
• Zach Schillaci
• Zacharias Zacharodimos
• Zahoor Islam
• Zhibin Liu
• Zhihua Dong
• Ziyan Deng
• Éric Aquaronne
• 玲 李
• Monday, 17 May
• Opening Session
Conveners: James Catmore (University of Oslo (NO)), Simone Campana (CERN)
• 1
Welcome
Speaker: Joachim Josef Mnich (CERN)
• 2
Introduction
Speaker: Dr Graeme A Stewart (CERN)
• 3:15 PM
Conference Photo

The group photo of the conference participants will be composed of small but recognizable pictures of people connected to the Zoom meeting with their cameras enabled. The names will be blurred. The final group photo will afterwards be published on the conference website, and possibly in other publications.

• If you want to appear in the group photo, please enable your camera when the photo is taken (technically, screenshots of the Zoom gallery view).
• If you prefer not to be included in the group photo, please just keep your camera off.

The screenshots for the group photo will be taken during the dedicated sessions on Monday afternoon (15:15) and on Tuesday morning (10:30). If you participate in one of them, there is no need to attend the other.

• Opening Session
Conveners: James Catmore (University of Oslo (NO)), Simone Campana (CERN)
• 3
Keynote Talk: Computing Perspectives
Speaker: Ian Bird (CNRS)
• 4
Keynote Talk: Software Perspectives
Speaker: Heather Gray (UC Berkeley/LBNL)
• 4:20 PM
Break
• Monday PM plenaries: Plenaries
Conveners: Catherine Biscarat (L2I Toulouse, IN2P3/CNRS (FR)), Stefan Roiser (CERN)
• 5
Preparing distributed computing operations for the HL-LHC era with Operational Intelligence

The Operational Intelligence (OpInt) project is a joint effort from various WLCG communities aimed at increasing the level of automation in computing operations and reducing human interventions. The currently deployed systems have proven to be mature and capable of meeting the experiments' goals by allowing timely delivery of scientific results. However, a substantial number of interventions from software developers, shifters and operational teams is needed to manage such heterogeneous infrastructures efficiently. Under the scope of the OpInt project, experts from most of the relevant areas have gathered to propose and work on “smart” solutions. Machine learning, data mining, log analysis, and anomaly detection are only some of the tools we have evaluated for our use cases. Discussions have led to a number of ideas on how to achieve our goals and the development of solutions has started. In this contribution, we will report on the development of a suite of OpInt services to cover various use cases in workload management, data management, and site operations.
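
As an illustration of the kind of anomaly detection mentioned in this abstract, the sketch below flags outliers in a series of hourly failure counts using a modified z-score based on the median absolute deviation. It is not taken from the OpInt services: the function name `flag_anomalies`, the sample data and the threshold of 3.5 are all assumptions chosen for the example.

```python
from statistics import median


def flag_anomalies(counts, threshold=3.5):
    """Return the indices whose modified z-score exceeds the threshold.

    Hypothetical helper for illustration; not part of any OpInt service.
    """
    med = median(counts)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        # At least half the points equal the median: flag anything that differs.
        return [i for i, c in enumerate(counts) if c != med]
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i
        for i, c in enumerate(counts)
        if 0.6745 * abs(c - med) / mad > threshold
    ]


# Made-up hourly counts of failed transfers at one site.
hourly_failures = [12, 15, 11, 14, 13, 97, 12, 16]
print(flag_anomalies(hourly_failures))  # prints [5]
```

A median-based score is used here rather than a mean and standard deviation because a single large spike would otherwise inflate the spread and partly hide itself.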

Speaker: Panos Paparrigopoulos (CERN)
• 6
Implementation of ACTS into sPHENIX Track Reconstruction

sPHENIX is a high energy nuclear physics experiment under construction at the Relativistic Heavy Ion Collider at Brookhaven National Laboratory. The primary physics goals of sPHENIX are to measure jets, their substructure, and the upsilon resonances in $p$+$p$, $p$+Au, and Au+Au collisions. sPHENIX will collect approximately 200 PB of data over three run periods utilizing a finite-sized computing center; thus, performing track reconstruction in a timely manner is a challenge due to the high occupancy of heavy ion collisions. To achieve the goal of reconstructing tracks with high efficiency and within a computational budget of 5 seconds per event, the sPHENIX experiment has recently implemented the A Common Tracking Software (ACTS) track reconstruction toolkit. This paper reports the performance status of ACTS as the default track fitting tool within sPHENIX, including discussion of the first implementation of a TPC geometry within ACTS.

Speaker: Joe Osborn (Oak Ridge National Laboratory)
• 5:40 PM
Break
• 7
The new (and improved!) CERN Single-Sign-On

The new CERN Single-Sign-On (SSO), built around an open source stack, has been in production for over a year and many CERN users are already familiar with its approach to authentication, either as a developer or as an end user. What is visible upon logging in, however, is only the tip of the iceberg. Behind the scenes there has been a significant amount of work taking place to migrate accounts management and to decouple Kerberos [1] authentication from legacy Microsoft components. Along the way the team has been engaging with the community through multiple fora, to make sure that a solution is provided that not only replaces functionality but also improves the user experience for all CERN members. This paper will summarise key evolutions and clarify what is to come in the future.

Speaker: Mary Georgiou (CERN)
• 8
Porting HEP Parameterized Calorimeter Simulation Code to GPUs

The High Energy Physics (HEP) experiments, such as those at the Large Hadron Collider (LHC), traditionally consume large amounts of CPU cycles for detector simulations and data analysis, but rarely use compute accelerators such as GPUs. As the LHC is upgraded to allow for higher luminosity, resulting in much higher data rates, purely relying on CPUs may not provide enough computing power to support the simulation and data analysis needs. As a proof of concept, we investigate the feasibility of porting a HEP parameterized calorimeter simulation code to GPUs. We have chosen to use FastCaloSim, the ATLAS fast parametrized calorimeter simulation. While FastCaloSim is sufficiently fast such that it does not impose a bottleneck in detector simulations overall, significant speed-ups in the processing of large samples can be achieved from GPU parallelization at both the particle (intra-event) and event levels; this is especially beneficial in conditions expected at the high-luminosity LHC, where an immense number of per-event particle multiplicities will result from the many simultaneous proton-proton collisions. We report our experience with porting FastCaloSim to NVIDIA GPUs using CUDA. A preliminary Kokkos implementation of FastCaloSim for portability to other parallel architectures is also described.

Speaker: Dr Charles Leggett (Lawrence Berkeley National Lab (US))
• Tuesday, 18 May
• Tues AM Plenaries: Plenaries
Conveners: Caterina Doglioni (Lund University (SE)), Maria Girone (CERN)
• 9
Towards a realistic track reconstruction algorithm based on graph neural networks for the HL-LHC

The physics reach of the HL-LHC will be limited by how efficiently the experiments can use the available computing resources, i.e. affordable software and computing are essential. The development of novel methods for charged particle reconstruction at the HL-LHC incorporating machine learning techniques or based entirely on machine learning is a vibrant area of research. In the past two years, algorithms for track pattern recognition based on graph neural networks (GNNs) have emerged as a particularly promising approach. Previous work mainly aimed at establishing proof of principle. In the present document we describe new algorithms that can handle complex realistic detectors. The new algorithms are implemented in ACTS, a common framework for tracking software. This work aims at implementing a realistic GNN-based algorithm that can be deployed in an HL-LHC experiment.

Speaker: Charline Rougier (Laboratoire des 2 Infinis - Toulouse, CNRS / Univ. Paul Sabatier (FR))
• 10
ALICE Central Trigger System for LHC Run 3

A major upgrade of the ALICE experiment is ongoing, aiming at high-rate data taking during LHC Run 3 (2022-2024).
The LHC interaction rate at Point 2 will be increased to $50\ \mathrm{kHz}$ in Pb-Pb collisions and $1\ \mathrm{MHz}$ in pp collisions. The ALICE experiment will be able to read out the full interaction rate, leading to an increase of the collected luminosity by up to a factor of about 100 with respect to LHC Runs 1 and 2. To satisfy these requirements, a new readout system has been developed for most of the ALICE detectors, allowing the full readout of the data at the required interaction rates without the need for a hardware trigger selection. A novel trigger and timing distribution system will be implemented based on Passive Optical Network (PON) and GigaBit Transceiver (GBT) technology. To ensure backward compatibility, a triggered mode based on RD12 TTC technology, as used in the previous LHC runs, will be maintained and re-implemented under the new Central Trigger System (CTS). A new universal ALICE Trigger Board (ATB) based on the Xilinx Kintex Ultrascale FPGA has been designed to function as a Central Trigger Processor (CTP), Local Trigger Unit (LTU), and monitoring interfaces.

In this paper, this hybrid multilevel system with continuous readout will be described, together with the triggering mechanism and algorithms. An overview of the CTS, the design of the ATB and the different communication protocols will be presented.

Speaker: Jakub Kvapil (University of Birmingham (GB))
• 11
Public Engagement in a Global Pandemic

UKRI/STFC’s Scientific Computing Department (SCD) has a long and rich history of delivering face-to-face public engagement and outreach, both on site and in public places, as part of the wider STFC programme. Due to the global COVID-19 pandemic, SCD was forced to abandon an extensive planned programme of public engagement, alongside altering the day-to-day working methods of the majority of its staff. SCD had to respond rapidly to create a new, remote-only programme for the summer and for the foreseeable future. This was initially an exercise in improvisation, identifying existing activities that could be delivered remotely with minimal changes. As the pandemic went on, SCD also created new resources specifically for a remote audience and adapted existing activities where appropriate, using our evaluation framework to ensure these activities continued to meet the aims of the in-person programme. This paper presents the process through which this was achieved, some of the benefits and challenges of remote engagement and the plans for 2021 and beyond.

Speaker: Mr Greg Corbett (STFC)
• 10:30 AM
Conference Photo
• 10:35 AM
Break
• Algorithms: Tue AM
Conveners: David Rohr (CERN) , John Derek Chapman (University of Cambridge (GB))
• 12
A C++ Cherenkov photons simulation in CORSIKA 8

CORSIKA is a standard software package for simulating air showers induced by cosmic rays. It has been developed continuously in Fortran 77 over the last thirty years, which makes it very difficult to add new physics features to CORSIKA 7. CORSIKA 8 aims to be the future of the CORSIKA project. It is a C++17 framework that uses modern object-oriented programming concepts for efficient modularity and flexibility. The CORSIKA 8 project aims to obtain high performance by exploiting techniques such as vectorization, GPU/CPU parallelization, extended use of static polymorphism and the most precise physical models available.
In this paper we focus on the Cherenkov photon propagation module of CORSIKA, which is of particular interest for gamma-ray experiments, like the Cherenkov Telescope Array. First, we present the optimizations that we have applied to the Cherenkov module thanks to the results of detailed profiling using performance counters.
Then, we report our preliminary work to develop the Cherenkov Module in the CORSIKA 8 framework. Finally, we will demonstrate the first performance comparison with the current CORSIKA software as well as the validation of physics results.

Speaker: Mr Matthieu Carrère (CNRS)
• 13
Studies of GEANT4 performance for different ATLAS detector geometries and code compilation methods

Full detector simulation is known to consume a large proportion of computing resources available to the LHC experiments, and reducing time consumed by simulation will allow for more profound physics studies. There are many avenues to exploit, and in this work we investigate those that do not require changes in the GEANT4 simulation suite. In this study, several factors affecting the full GEANT4 simulation execution time are investigated. A broad range of configurations has been tested to ensure consistency of physical results. The effect of a single dynamic library GEANT4 build type has been investigated and the impact of different primary particles at different energies has been evaluated using GDML and GeoModel geometries. Some configurations have an impact on the physics results and are therefore excluded from further analysis. Usage of the single dynamic library is shown to increase execution time and does not represent a viable option for optimizations. Lastly, the static build type is confirmed as the most effective method to reduce the simulation execution time.

Speaker: Mrs Caterina Marcon (Lund University (SE))
• 14
CMS Full Simulation for Run 3

We report the status of the CMS full simulation for Run 3. During the long shutdown of the LHC, a significant update has been introduced to the CMS simulation code. The CMS geometry description has been reviewed, and several important modifications were needed. The CMS detector description software has been migrated to DD4hep, a community-developed tool. We will report on the experience gained during this migration. Geant4 10.7 is the CMS choice for Run 3 simulation productions. We will discuss the arguments for this choice and the strategy for adopting a new Geant4 version, and will report on the physics performance of the CMS simulation. A special Geant4 Physics List configuration, FTFP_BERT_EMM, will be described, which provides a compromise between simulation accuracy and CPU performance. A significant fraction of the time for simulating CMS events is spent on tracking charged particles in the magnetic field. In the CMS simulation, a dynamic choice of Geant4 parameters for tracking in the field is implemented. A new method is introduced for simulating the electromagnetic components of hadronic showers in the CMS electromagnetic calorimeter: for low-energy electrons and positrons, a parametrization of GFlash type is applied. Results of tests of this method will be discussed. In summary, we expect about a 25% speedup of CMS simulation production for Run 3 compared to the Run 2 simulations.

• 15
Fast simulation of Time-of-Flight detectors at the LHC

The modelling of Cherenkov-based detectors is traditionally done using the Geant4 toolkit. In this work, we present another method, based on the Python programming language and the Numba high-performance compiler, to speed up the simulation. As an example we take one of the Forward Proton Detectors at the CERN LHC, the ATLAS Forward Proton (AFP) Time-of-Flight detector, which is used to reduce the background from multiple proton-proton collisions in soft and hard diffractive events. We describe the technical details of the fast Cherenkov model of photon generation and transportation through the optical part of the ToF detector. The fast simulation proves to be about 200 times faster than the corresponding Geant4 simulation, and provides similar results for the length and time distributions of photons. The study is meant as the first step in the construction of a building kit that allows the creation of a fast simulation of an arbitrarily shaped optical part of a detector.

Speaker: Olivier Rousselle (Laboratoire Kastler Brossel (FR))
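
Photon generation in any Cherenkov model starts from the standard emission condition $\cos\theta = 1/(n\beta)$. The following is a minimal plain-Python sketch of that relation only; it is not the authors' Numba code, and the function name and the example refractive index are assumptions made for illustration.

```python
import math
from typing import Optional


def cherenkov_angle(beta: float, n: float) -> Optional[float]:
    """Return the Cherenkov emission angle in radians.

    beta is the particle speed in units of c, n the refractive index.
    Returns None when the particle is at or below the emission threshold.
    """
    if beta * n <= 1.0:
        return None  # no Cherenkov light: the particle is too slow for this medium
    return math.acos(1.0 / (beta * n))


# Illustrative values only: n = 1.47 is an assumed index, not taken from the abstract.
angle = cherenkov_angle(beta=1.0, n=1.47)
```

In a compiled fast simulation a function of this kind would typically be evaluated once per particle, with photons then emitted on the cone of half-angle `angle` around the track direction.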
• 16
Monte Carlo matching in the Belle II software

The Belle II experiment is an upgrade to the Belle experiment, and is located at the SuperKEKB facility in KEK, Tsukuba, Japan. The Belle II software is completely new and is used for everything from triggering data, generation of Monte Carlo events, tracking, clustering, to high-level analysis. One important feature is the matching between the combinations of reconstructed objects which form particle candidates and the underlying simulated particles from the event generators. This is used to study detector effects, analysis backgrounds, and efficiencies. This document describes the algorithm that is used by Belle II.

Speaker: Yo Sato (Tohoku University)
• Artificial Intelligence: Tue AM
Zoom Meeting ID
67263583281
Host
vCHEP 01
Conveners: Eduardo Rodrigues (University of Liverpool (GB)) , Simone Pigazzini (ETH Zurich (CH))
• 17
C++ Code Generation for Fast Inference of Deep Learning Models in ROOT/TMVA

We report the latest development in ROOT/TMVA: a new system that takes trained ONNX deep learning models and emits C++ code that can be easily included and invoked for fast inference of the model, with minimal dependencies. We present an overview of the current solutions for conducting inference in a C++ production environment, discuss the technical details and examples of the generated code, and demonstrate its development status with a preliminary benchmark against popular tools.

Speaker: Sitong An (CERN, Carnegie Mellon University (US))
• 18
Deep learning based low-dose synchrotron radiation CT reconstruction

Synchrotron radiation sources are widely used in various fields, and computed tomography (CT) is one of the most important applications. The amount of effort expended by the operator varies depending on the subject. If the number of projection angles can be greatly reduced while preserving similar image quality, the working time and workload of the experimentalists will be greatly reduced. However, reducing the number of sampling angles can produce serious artifacts and blur details. We use deep learning to build high-quality reconstructions from sparsely sampled angular data and propose ResAttUnet, a roughly symmetrical U-shaped network that incorporates mechanisms similar to ResNet and attention. In addition, a mixed-precision training technique is adopted to reduce the model's demand for GPU memory.

Speaker: Ling Li (Institute of High Energy Physics, CAS;University of Chinese Academy of Sciences)
• 19
Intelligent compression for synchrotron radiation source image

Synchrotron radiation sources (SRS) produce a huge amount of image data. This scientific data, which needs to be stored and transferred losslessly, puts great pressure on storage and bandwidth. SRS images are characterised by high frame rates and high resolution, and traditional lossless image compression methods can only save up to 30% in size. To address this problem, we propose a lossless compression method for SRS images based on deep learning. First, we use a difference algorithm to reduce the linear correlation within the image sequence. Second, we propose a reversible truncated mapping method to reduce the range of the pixel value distribution. Third, we train a deep learning model to learn the nonlinear relationship within the image sequence. Finally, we use the probability distribution predicted by the deep learning model, combined with arithmetic coding, to achieve lossless compression. Test results on SRS images show that our method can reduce the data size by a further 20% compared to PNG, JPEG2000 and FLIF.

Speaker: Shiyuan Fu
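
The first two steps of the abstract above (frame differencing followed by a reversible value mapping) can be illustrated with a small sketch. The abstract does not specify the authors' "reversible truncated mapping", so a plain zigzag mapping of signed residuals onto non-negative integers is swapped in here as a stand-in; all function names are hypothetical and the learned model and arithmetic coder are not shown.

```python
def zigzag(v):
    """Map a signed residual to a non-negative integer (0, -1, 1, -2 -> 0, 1, 2, 3)."""
    return 2 * v if v >= 0 else -2 * v - 1


def unzigzag(u):
    """Exact inverse of zigzag."""
    return u // 2 if u % 2 == 0 else -(u + 1) // 2


def encode(frames):
    """Keep the first frame; store later frames as mapped pixel-wise differences."""
    if not frames:
        return []
    out = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        out.append([zigzag(c - p) for p, c in zip(prev, cur)])
    return out


def decode(encoded):
    """Rebuild the original frames exactly from the output of encode."""
    if not encoded:
        return []
    frames = [list(encoded[0])]
    for residual in encoded[1:]:
        prev = frames[-1]
        frames.append([p + unzigzag(r) for p, r in zip(prev, residual)])
    return frames


# Three tiny flattened "frames" with made-up pixel values.
frames = [[10, 200, 30], [11, 198, 30], [9, 199, 31]]
restored = decode(encode(frames))
```

Because consecutive frames are strongly correlated, the residuals cluster near zero, which is what makes a subsequent entropy coder effective; the round trip through `decode` confirms that no information is lost.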
• 20
Event Classification with Multi-step Machine Learning

The usefulness of multi-step ML, where a task is organized into connected sub-tasks with known intermediate inference goals, as opposed to a single large model learned end-to-end without intermediate sub-tasks, is presented. Pre-optimized ML models are connected, and better performance is obtained by re-optimizing the connected model. The selection of an ML model from several small candidate models for each sub-task is performed using an approach based on neural architecture search (NAS). In this paper, DARTS and SPOS-NAS are tested, with the construction of the loss functions improved so that all ML models keep learning smoothly. Using DARTS and SPOS-NAS to optimize, select and connect models in multi-step machine learning systems, we find that (1) such a system can quickly and successfully select highly performant model combinations, and (2) the selected models are consistent with baseline algorithms such as grid search, and their outputs are well controlled.

Speaker: Masahiko Saito (University of Tokyo (JP))
• 21
The use of Boosted Decision Trees for Energy Reconstruction in JUNO experiment

The Jiangmen Underground Neutrino Observatory (JUNO) is a neutrino experiment with a broad physical program. The main goals of JUNO are the determination of the neutrino mass ordering and high precision investigation of neutrino oscillation properties. The precise reconstruction of the event energy is crucial for the success of the experiment.
JUNO is equipped with 17 612 + 25 600 PMT channels of two kinds, which provide both charge and hit time information. In this work we present a fast Boosted Decision Trees model using a small set of aggregated features. The model predicts the event energy deposition. We describe the motivation and the details of our feature engineering and feature selection procedures. We demonstrate that the proposed aggregated approach can achieve a reconstruction quality that is competitive with that of much more complex models like Convolutional Neural Networks (ResNet, VGG and GNN).

Speaker: Mr Arsenii Gavrikov (HSE University)
• Online: Tue AM
Zoom Meeting ID
68573827329
Host
vCHEP 13
Conveners: Dmytro Kresan (GSI - Helmholtzzentrum fur Schwerionenforschung GmbH (DE)) , Stewart Martin-Haugh (Science and Technology Facilities Council STFC (GB))
• 22
The Controls and Configuration Software of the ATLAS Data Acquisition System: evolution towards LHC Run 3

The ATLAS experiment at the Large Hadron Collider (LHC) operated very successfully in the years 2008 to 2018, in two periods identified as Run 1 and Run 2. ATLAS achieved an overall data-taking efficiency of 94%, largely constrained by the irreducible dead-time introduced to accommodate the limitations of the detector read-out electronics. Out of the 6% dead-time only about 15% could be attributed to the central trigger and DAQ system, and out of this, a negligible fraction was due to the Control and Configuration subsystem. Despite these achievements, and in order to further improve the already excellent efficiency of the whole DAQ system in the coming Run 3, a new campaign of software updates was launched for the second long LHC shutdown (LS2). This paper presents, using a few selected examples, how the work was approached and which new technologies were introduced into the ATLAS Control and Configuration software. Although these are specific to this system, many solutions can be considered and adapted to different distributed DAQ systems.

Speaker: Andrei Kazarov (NRC Kurchatov Institute PNPI (RU))
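
The dead-time figures quoted in the abstract above combine as follows. This is only a back-of-the-envelope reading of the stated percentages (the 15% share is given as approximate), with variable names chosen for illustration.

```python
# Figures taken from the abstract; names are illustrative.
data_taking_efficiency = 0.94
dead_time_fraction = 1.0 - data_taking_efficiency      # 6% of total time
tdaq_share_of_dead_time = 0.15                          # "only about 15%"

# Fraction of total data-taking time lost to the central trigger and DAQ system.
tdaq_dead_time_fraction = dead_time_fraction * tdaq_share_of_dead_time

print(f"TDAQ-attributed dead-time: {tdaq_dead_time_fraction:.1%} of total time")
```

In other words, the central trigger and DAQ system accounts for a bit under 1% of total data-taking time, with the Control and Configuration software contributing only a negligible part of that.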
• 23
Development of the Safety System for the Inner Tracking System of the ALICE Experiment

During the LHC Long Shutdown 2, the ALICE experiment has undergone numerous upgrades to cope with the large amount of data expected in Run 3. Among the new elements integrated into ALICE, the experiment features a new Inner Tracking System (ITS), equipped with innovative pixel sensors that will substantially improve the performance of the system. The new detector is equipped with a complex Low Voltage (LV) distribution, increasing the power dissipated by the detector and requiring the installation of a large number of temperature measurement points. In 2020, a new safety system was developed to distribute the ITS LV interlock system and to monitor the new temperature values. The safety system is based on a Siemens S7-1500 PLC device. The control application governing the PLC has been configured through the UNICOS-CPC infrastructure developed at CERN for the standardisation of industrial applications. UNICOS-CPC enables both the automation of control tasks governing the PLC and the interface to the WinCC OA based SCADA system. This paper provides a complete description of the setup of this safety system.

Speaker: Patricia Mendez Lorenzo (CERN)
• 24
Understanding ATLAS infrastructure behaviour with an Expert System

The ATLAS detector requires a huge infrastructure consisting of numerous interconnected systems forming a complex mesh which requires constant maintenance and upgrades. The ATLAS Technical Coordination Expert System provides, by means of a user interface, a quick and deep understanding of the infrastructure, which helps to plan interventions by foreseeing unexpected consequences, and to understand complex events when time is crucial in the ATLAS control room.
It is an object-oriented expert system based on the knowledge composed of inference rules and information from diverse domains such as detector control and safety systems, gas, water, cooling, ventilation, cryogenics, and electricity distribution.

This paper discusses the latest developments in the inference engine and the implementation of the most probable cause algorithm based on them. One example from the annual maintenance of the 15$^{\circ}$C water circuit chillers is discussed.

Speaker: Ignacio Asensi Tortajada (Univ. of Valencia and CSIC (ES))
• 25
Integration and Commissioning of the Software-based Readout System for ATLAS Level-1 Endcap Muon Trigger in Run 3

The Large Hadron Collider and the ATLAS experiment at CERN will explore new frontiers in physics in Run 3 starting in 2022. In the Run 3 ATLAS Level-1 endcap muon trigger, new detectors called New Small Wheel and additional Resistive Plate Chambers will be installed to improve momentum resolution and to enhance the rejection of fake muons. The Level-1 endcap muon trigger algorithm will be processed by new trigger processor boards with modern FPGAs and high-speed optical serial links. For validation and performance evaluation, the inputs and outputs of their trigger logic will be read out using a newly developed software-based readout system. We have successfully integrated this readout system in the ATLAS online software framework, enabling commissioning in the actual Run 3 environment. Stable trigger readout has been realized for input rates up to 100 kHz with a developed event-building application. We have verified that its performance is sufficient for Run 3 operation in terms of event data size and trigger rate. The paper will present the details of the integration and commissioning of the software-based readout system for ATLAS Level-1 endcap muon trigger in Run 3.

Speaker: Kaito Sugizaki (University of Tokyo (JP))
• 26
A real-time FPGA-based cluster finding algorithm for LHCb silicon pixel detector

Starting from the next LHC run, the upgraded LHCb High Level Trigger will process events at the full LHC collision rate (averaging 30 MHz). This challenging goal, tackled using a large and heterogeneous computing farm, can be eased by addressing the lowest-level, more repetitive tasks at the earliest stages of the data acquisition chain. FPGA devices are very well suited to performing, with a high degree of parallelism and efficiency, certain computations that would be significantly demanding on general-purpose architectures. A particularly time-demanding task is the cluster-finding process, due to the 2D pixel geometry of the new LHCb pixel detector. We describe here a custom, highly parallel FPGA-based clustering algorithm and its firmware implementation. The algorithm implementation has shown excellent reconstruction quality during qualification tests, while requiring a modest amount of hardware resources. It can therefore run in the LHCb FPGA readout cards in real time, during data taking at 30 MHz, representing a promising alternative to more common CPU-based algorithms.

Speaker: Giovanni Bassi (SNS & INFN Pisa (IT))
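
To make the cluster-finding task concrete, the sketch below groups hit pixels on a 2D grid into clusters and computes their centroids. It is a simple sequential software reference using 8-connected flood fill, not the parallel FPGA algorithm described in the abstract; the connectivity choice, function names and hit coordinates are all assumptions for illustration.

```python
from collections import deque


def find_clusters(hits):
    """Group hit pixels, given as (row, col) tuples, into 8-connected clusters."""
    remaining = set(hits)
    clusters = []
    while remaining:
        seed = remaining.pop()
        queue = deque([seed])
        cluster = [seed]
        while queue:
            r, c = queue.popleft()
            # Visit the 3x3 neighbourhood; the centre pixel was already removed.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nb = (r + dr, c + dc)
                    if nb in remaining:
                        remaining.remove(nb)
                        queue.append(nb)
                        cluster.append(nb)
        clusters.append(sorted(cluster))
    return sorted(clusters)


def centroid(cluster):
    """Unweighted centre of a cluster as (mean row, mean col)."""
    n = len(cluster)
    return (sum(r for r, _ in cluster) / n, sum(c for _, c in cluster) / n)


# Made-up hits: one three-pixel cluster and one isolated pixel.
hits = [(0, 0), (0, 1), (1, 1), (5, 5)]
clusters = find_clusters(hits)
centres = [centroid(cl) for cl in clusters]
```

The data-dependent loop structure of this sequential version is part of what makes the task costly on general-purpose processors and motivates a fixed-latency, highly parallel formulation in firmware.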
• Software: Tue AM
Zoom Meeting ID
68415529596
Host
vCHEP 06
Conveners: Benjamin Krikler (University of Bristol (GB)) , David Bouvet (IN2P3/CNRS (FR))
• 27
Daisy: Data analysis integrated software system for X-ray experiments

Daisy (Data Analysis Integrated Software System) has been designed for the analysis and visualization of X-ray experiments. To address the wide range of requirements of the Chinese radiation facilities community, from purely algorithmic problems to scientific computing infrastructure, Daisy sets up a cloud-native platform to support on-site data analysis services with fast feedback and interaction. Its plug-in based architecture makes it convenient to process in parallel the high-throughput data flow expected at next-generation facilities such as the High Energy Photon Source (HEPS). The objectives, functionality and architecture of Daisy are described in this article.

Speaker: Haolai Tian (Institute of High Energy Physics)
• 28
Readable and efficient HEP data analysis with bamboo

With the LHC continuing to collect more data and experimental analyses becoming increasingly complex, tools to efficiently develop and execute these analyses are essential. The bamboo framework defines a domain-specific language, embedded in Python, that allows the analysis logic to be expressed concisely in a functional style. The implementation, based on ROOT’s RDataFrame and the cling C++ JIT compiler, approaches the performance of dedicated native code. Bamboo is currently being used for several CMS Run 2 analyses that rely on the NanoAOD data format, which will become more common in Run 3 and beyond, and for which many reusable components are included. It also provides many possibilities for customisation, which allow for straightforward adaptation to other formats and workflows.

Speaker: Pieter David (Universite Catholique de Louvain (UCL) (BE))
• 29

Speaker: Gokhan Unel (University of California Irvine (US))
• 30
ALICE Run 3 Analysis Framework

In LHC Run 3 the ALICE Collaboration will have to cope with an increase of lead-lead collision data of two orders of magnitude compared to the Run 1 and 2 data-taking periods. The Online-Offline (O$^2$) software framework has been developed to allow for distributed and efficient processing of this unprecedented amount of data. Its design, which is based on a message-passing back end, required the development of a dedicated Analysis Framework that uses the columnar data format provided by Apache Arrow. The O$^2$ Analysis Framework provides a user-friendly high-level interface and hides the complexity of the underlying distributed framework. It allows the users to access and manipulate the data in the new format both in the traditional "event loop" and in a declarative approach using bulk processing operations based on Arrow’s Gandiva sub-project. Building on the well-tested system of analysis trains developed by ALICE in Run 1 and 2, the AliHyperloop infrastructure is being developed. It provides a fast and intuitive user interface for running demanding analysis workflows in the GRID environment and on the dedicated Analysis Facility. In this document, we report on the current state and ongoing developments of the Analysis Framework and of AliHyperloop, highlighting the design choices and the benefits of the new system.

Speaker: Anton Alkin (CERN)
• 31
Analysis of heavy-flavour particles in ALICE with the O2 analysis framework

Precise measurements of heavy-flavour hadrons down to very low $p_\mathrm{T}$ represent the core of the physics program of the upgraded ALICE experiment in Run 3.
These physics probes are characterised by a very small signal-to-background ratio requiring very large statistics of minimum-bias events.
In Run 3, ALICE is expected to collect up to 13 nb$^{-1}$ of lead–lead collisions, corresponding to about $10^{11}$ minimum-bias events.
In order to analyse this unprecedented amount of data, which is about 100 times larger than the statistics collected in Run 1 and Run 2, the ALICE collaboration is developing a complex analysis framework that aims at maximising the processing speed and data volume reduction.
In this paper, the strategy of reconstruction, selection, skimming, and analysis of heavy-flavour events for Run 3 will be presented.
Some preliminary results on the reconstruction of charm mesons and baryons will be shown and the prospects for future developments and optimisation discussed.

Speaker: Vit Kucera (CERN)
• 32

The traditional approach in HEP analysis software is to loop over every event and every object via the ROOT framework. This method follows an imperative paradigm, in which the code is tied to the storage format and steps of execution. A more desirable strategy would be to implement a declarative language, such that the storage medium and execution are not included in the abstraction model. This will become increasingly important to managing the large dataset collected by the LHC and the HL-LHC. A new analysis description language (ADL) inspired by functional programming, FuncADL, was developed using Python as a host language. The expressiveness of this language was tested by implementing example analysis tasks designed to benchmark the functionality of ADLs. Many simple selections are expressible in a declarative way with FuncADL, which can be used as an interface to retrieve filtered data. Some limitations were identified, but the design of the language allows for future extensions to add missing features. FuncADL is part of a suite of analysis software tools being developed by the Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP). These tools will be available to develop highly scalable physics analyses for the LHC.

Speaker: Mason Proffitt (University of Washington (US))
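
The contrast between the imperative event loop and a declarative query drawn in the abstract above can be illustrated in plain Python. The `Query` class below is a hypothetical toy written for this example; it is not the FuncADL API, and the event layout and the cut values are made up.

```python
class Query:
    """Toy lazy query: records the steps, executes only when run() is called."""

    def __init__(self, source, steps=()):
        self._source = source
        self._steps = tuple(steps)

    def where(self, predicate):
        return Query(self._source, self._steps + (("where", predicate),))

    def select(self, transform):
        return Query(self._source, self._steps + (("select", transform),))

    def run(self):
        items = iter(self._source)
        for kind, fn in self._steps:
            items = filter(fn, items) if kind == "where" else map(fn, items)
        return list(items)


# Made-up events: each holds a list of jet transverse momenta in GeV.
events = [{"jet_pt": [45.0, 12.0]}, {"jet_pt": [8.0]}, {"jet_pt": [60.0, 35.0, 31.0]}]

# Imperative style: explicit loops tied to how the data is stored and traversed.
imperative = []
for ev in events:
    good = [pt for pt in ev["jet_pt"] if pt > 30.0]
    if len(good) >= 2:
        imperative.append(good)

# Declarative style: describe what is wanted; execution is deferred to run().
declarative = (
    Query(events)
    .select(lambda ev: [pt for pt in ev["jet_pt"] if pt > 30.0])
    .where(lambda jets: len(jets) >= 2)
    .run()
)
```

Because the query only records its steps until `run()` is called, a backend is free to decide where and how the data is read, which is the separation of storage and execution that the abstract argues for.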
• Storage: Tue AM
Zoom Meeting ID
67249300031
Host
vCHEP 03
Conveners: Patrick Fuhrmann (Deutsches Elektronen-Synchrotron (DE)) , Peter Clarke (The University of Edinburgh (GB))
• 33
Evaluation of a high-performance storage buffer with 3D XPoint devices for the DUNE data acquisition system

The DUNE detector is a neutrino physics experiment that is expected to take data starting from 2028. The data acquisition (DAQ) system of the experiment is designed to sustain several TB/s of incoming data which will be temporarily buffered while being processed by a software based data selection system.

In DUNE, some rare physics processes (e.g. supernova burst events) require storing the full complement of data produced over a 1-2 minute window. These are recognised by the data selection system, which fires a specific trigger decision. Upon reception of this decision, data are moved from the temporary buffers to local, high-performance, persistent storage devices. In this paper we characterize the performance of novel 3D XPoint SSD devices under different workloads suitable for high-performance storage applications. We then illustrate how such devices may be applied to the DUNE use case: to store, upon a specific signal, 100 seconds of incoming data at 1.5 TB/s distributed among 150 identical units, each operating at approximately 10 GB/s.

Speaker: Adam Abed Abud (University of Liverpool (GB) and CERN)
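
The sizing quoted for the DUNE use case follows from simple arithmetic, worked through below. Decimal units (1 TB = 1000 GB) are assumed, and the variable names are illustrative.

```python
# Figures from the abstract; decimal units assumed (1 TB = 1000 GB).
total_rate_gb_s = 1500      # 1.5 TB/s of incoming data
units = 150                 # identical storage units
window_s = 100              # seconds of data to persist

per_unit_rate_gb_s = total_rate_gb_s / units                  # sustained write rate per unit
total_volume_tb = total_rate_gb_s * window_s / 1000           # data stored per trigger
per_unit_volume_tb = per_unit_rate_gb_s * window_s / 1000     # share of each unit

print(per_unit_rate_gb_s, total_volume_tb, per_unit_volume_tb)
```

Each unit therefore has to sustain about 10 GB/s of writes and absorb roughly 1 TB per supernova-burst trigger, for about 150 TB in total.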
• 34
Design of a Resilient, High-Throughput, Persistent Storage System for the ATLAS Phase-II DAQ System

The ATLAS experiment will undergo a major upgrade to take advantage of the new conditions provided by the upgraded High-Luminosity LHC. The Trigger and Data Acquisition system (TDAQ) will record data at unprecedented rates: the detectors will be read out at 1 MHz, generating around 5 TB/s of data. The Dataflow system (DF), a component of TDAQ, introduces a novel design: readout data are buffered on persistent storage while the event filtering system analyses them to select 10000 events per second, for a total recorded throughput of around 60 GB/s. This approach allows the detector activity to be decoupled from the event selection process. New challenges then arise for DF: designing and implementing a distributed, reliable, persistent storage system supporting several TB/s of aggregated throughput while providing tens of PB of capacity. In this paper we first describe some of the challenges that DF is facing: data safety with persistent storage limitations, indexing of data at high granularity in a highly distributed system, and high-performance management of storage capacity. Then the ongoing R&D to address each of them is presented, and the performance achieved with a working prototype is shown.

Speaker: Matias Alejandro Bonaventura (CERN)
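
The rates quoted in the abstract above imply rough per-event sizes and reduction factors, estimated below. The inputs are the abstract's approximate figures ("around"), so the results are order-of-magnitude estimates only; decimal units and the variable names are assumptions.

```python
# Approximate figures from the abstract, in decimal units.
readout_rate_hz = 1_000_000                 # 1 MHz detector readout
readout_throughput_b_s = 5 * 10**12         # around 5 TB/s
selected_rate_hz = 10_000                   # events kept per second
recorded_throughput_b_s = 60 * 10**9        # around 60 GB/s

mean_readout_event_mb = readout_throughput_b_s / readout_rate_hz / 1e6
mean_recorded_event_mb = recorded_throughput_b_s / selected_rate_hz / 1e6
event_rejection_factor = readout_rate_hz / selected_rate_hz
throughput_reduction = readout_throughput_b_s / recorded_throughput_b_s

print(mean_readout_event_mb, mean_recorded_event_mb, event_rejection_factor)
```

These numbers correspond to events of roughly 5 to 6 MB, with the event filter keeping about one event in a hundred while the storage buffer absorbs the full readout stream.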
• 35
Enabling interoperable data and application services in a federated ScienceMesh

In recent years, cloud sync & share storage services, provided by academic and research institutions, have become a daily workplace environment for many local user groups in the High Energy Physics (HEP) community. These, however, are primarily disconnected and deployed in isolation from one another, even though new technologies have been developed and integrated to further increase the value of data. The EU-funded CS3MESH4EOSC project is connecting locally and individually provided sync and share services, and scaling them up to the European level and beyond. It aims to deliver the ScienceMesh service, an interoperable platform to easily sync and share data across institutions and extend functionalities by connecting to other research services using streamlined sets of interoperable protocols, APIs and deployment methodologies. This supports multiple distributed application workflows: data science environments, collaborative editing and data transfer services.

In this paper, we present the architecture of ScienceMesh and the technical design of its reference implementation, a platform that allows organizations to join the federated service infrastructure easily and to access application services out of the box. We discuss the challenges faced during the process, which include the diversity of sync & share platforms (Nextcloud, ownCloud, Seafile and others), the absence of global user identities and user discovery, the lack of interoperable protocols and APIs, and access control and protection of data endpoints. We present the rationale for the design decisions adopted to tackle these challenges and describe our deployment architecture based on Kubernetes, which enabled us to utilize monitoring and tracing functionalities. We conclude by reporting on the early user experience with ScienceMesh.

Speaker: Ishank Arora (CERN)
• 36
Porting the EOS from X86 (Intel) to aarch64 (ARM) architecture

With the advancement of many large HEP experiments, the amount of data that needs to be processed and stored has increased significantly, so we must upgrade computing resources and improve the performance of storage software. This article discusses porting the EOS software from the x86_64 architecture to the aarch64 architecture, with the aim of finding a more cost-effective storage solution. The biggest challenge in porting is that many dependency packages do not have an aarch64 version and have to be compiled from source, and the assembly parts of the software code also need to be adjusted accordingly. Despite these challenges, we have successfully ported the EOS code to aarch64. This article discusses the current status and plans for the software port as well as performance testing after porting.

Speaker: Yaosong Cheng (IHEP)
• 37
The first disk-based custodial storage for the ALICE experiment

We proposed a disk-based custodial storage as an alternative to tape for the ALICE experiment at CERN to preserve its raw data.
The proposed storage system relies on RAIN layout -- the implementation of erasure coding in the EOS storage suite, which is developed by CERN -- for data protection and takes full advantage of high-density JBOD enclosures to maximize storage capacity as well as to achieve cost-effectiveness comparable to tape.
The system we present provides 18 PB of total raw capacity from 18 sets of high-density JBOD enclosures attached to 9 EOS front-end servers.
In order to balance between usable space and data protection, the system will stripe a file into 16 chunks on the 4-parity enabled RAIN layout configured on top of 18 containerized EOS FSTs.
Although the reduction rate of available space increases up to $33.3\%$ with this layout, the estimated annual data loss rate drops down to $8.6 \times 10^{-5}\%$.
In this paper, we discuss the system architecture of the disk-based custodial storage, 4-parity RAIN layout, deployment automation, and the integration to the ALICE experiment in detail.
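
The capacity trade-off quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes that the 16 chunks per file consist of 12 data chunks plus 4 parity chunks, and that the 33.3% figure is the parity overhead relative to the data chunks; both readings are assumptions rather than statements from the abstract, and the function name is hypothetical.

```python
# Back-of-the-envelope capacity check for an erasure-coded (RAIN-like) layout.
# Assumption: 16 chunks per file = 12 data + 4 parity (not stated explicitly above).

def rain_capacity(raw_pb, total_chunks, parity_chunks):
    """Return usable capacity and overhead figures for a striped layout."""
    data_chunks = total_chunks - parity_chunks
    return {
        # Only the data chunks hold user payload.
        "usable_pb": raw_pb * data_chunks / total_chunks,
        # Extra space needed per unit of user data.
        "overhead_vs_data": parity_chunks / data_chunks,
        # Share of the raw capacity consumed by parity.
        "fraction_of_raw": parity_chunks / total_chunks,
        # A file survives the loss of up to this many chunks.
        "tolerated_chunk_losses": parity_chunks,
    }

layout = rain_capacity(raw_pb=18.0, total_chunks=16, parity_chunks=4)
print(f"usable capacity : {layout['usable_pb']:.1f} PB")
print(f"parity overhead : {layout['overhead_vs_data']:.1%} of the data volume")
print(f"parity share    : {layout['fraction_of_raw']:.1%} of the raw capacity")
```

Under these assumptions, 18 PB of raw capacity leaves 13.5 PB usable, and each file tolerates the loss of any 4 of its 16 chunks.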

Speaker: Sang Un Ahn (Korea Institute of Science & Technology Information (KR))
• Accelerators: Tue PM
Zoom Meeting ID
63711203344
Host
vCHEP 08
Alternative hosts
Katarzyna Maria Dziedziniewicz-Wojcik, Chiara Ilaria Rovelli, Adeel Ahmad, Catharine Noble, Benedikt Hegner, Viktor Khristenko, Eric Wulff, Edoardo Martelli, Julia Andreeva, Maria Girone, Stefan Roiser, Zoom Recording Operations 3, Peter Hristov, Zoom Recording Operations 2, Eric Grancher, Simone Campana, Catherine Biscarat, Thomas Baron, Latchezar Betev, Markus Elsing, Xavier Espinal, Maarten Litmaath, Graeme A Stewart, Melissa Gaillard, Sebastian Lopienski, Anirudh Goel, David Southwick, Hannah Short, Helge Meinhard
Conveners: Felice Pantaleo (CERN) , Simon George (Royal Holloway, University of London)
• 38
A Portable Implementation of RANLUX++

High energy physics has a constant demand for random number generators (RNGs) with high statistical quality. In this paper, we present ROOT's implementation of the RANLUX++ generator. We discuss the choice of relying only on standard C++ for portability reasons. Building on an initial implementation, we describe a set of optimizations to increase generator speed. This allows us to reach performance very close to that of the original assembler version. We test our implementation on an Apple M1 and Nvidia GPUs to demonstrate the advantages of portable code.

Speaker: Jonas Hahnfeld (CERN)
• 39
A Computing and Detector Simulation Framework for the HIBEAM/NNBAR Experimental Program at the ESS

The HIBEAM/NNBAR program is a proposed two-stage experiment for the European Spallation Source focusing on searches for baryon number violation via processes in which neutrons convert to anti-neutrons. This paper outlines the computing and detector simulation framework for the HIBEAM/NNBAR program. The simulation is based on predictions of neutron flux and neutronics together with signal and background generation. A range of diverse simulation packages are incorporated, including Monte Carlo transport codes, neutron ray-trace simulation packages, and detector simulation software. The common simulation package in which these elements are interfaced together is discussed. Data management plans and triggers are also described.

Speaker: Bernhard Meirose (Stockholms Universitet)
• 40
Performance of CUDA Unified Memory in CMS Heterogeneous Pixel Reconstruction

The management of separate memory spaces of CPUs and GPUs brings an additional burden to the development of software for GPUs. To help with this, CUDA unified memory provides a single address space that can be accessed from both CPU and GPU. The automatic data transfer mechanism is based on page faults generated by the memory accesses. This mechanism has a performance cost, which can be reduced with explicit memory prefetch requests. Various hints on the intended usage of the memory regions can also be given to further improve the performance. The overall effect of unified memory compared to explicit memory management can depend heavily on the application. In this paper we evaluate the performance impact of CUDA unified memory using the heterogeneous pixel reconstruction code from the CMS experiment as a realistic use case of GPU-targeting HEP reconstruction software. We also compare the programming model using CUDA unified memory to the explicit management of separate CPU and GPU memory spaces.

Speaker: Ka Hei Martin Kwok (Fermi National Accelerator Lab. (US))
• 41
Porting CMS Heterogeneous Pixel Reconstruction to Kokkos

Programming for a diverse set of compute accelerators in addition to the CPU is a challenge. Maintaining separate source code for each architecture would require lots of effort, and development of new algorithms would be daunting if it had to be repeated many times. Fortunately there are several portability technologies on the market such as Alpaka, Kokkos, and SYCL. These technologies aim to improve developer productivity by making it possible to use the same source code for many different architectures. In this paper we use heterogeneous pixel reconstruction code from the CMS experiment at the CERN LHC as a realistic use case of GPU-targeting HEP reconstruction software, and report experience from prototyping a portable version of it using Kokkos. The development was done in a standalone program that attempts to model many of the complexities of a HEP data processing framework such as CMSSW. We also compare the achieved event processing throughput to the original CUDA code and a CPU version of it.

Speaker: Matti Kortelainen (Fermi National Accelerator Lab. (US))
• 42
Heterogeneous techniques for rescaling energy deposits in the CMS Phase-2 endcap calorimeter

We present the porting to heterogeneous architectures of the algorithm used for applying linear transformations of raw energy deposits in the CMS High Granularity Calorimeter (HGCAL). This is the first heterogeneous algorithm to be fully integrated with HGCAL’s reconstruction chain. After introducing the latter and giving a brief description of the structural components of HGCAL relevant for this work, the role of the linear transformations in the calibration is reviewed. We discuss how this work facilitates the porting of other algorithms in the existing reconstruction process, as well as integrating algorithms previously ported (but not yet integrated). The many ways in which parallelization is achieved are described, and the successful validation of the heterogeneous algorithm is covered. Detailed performance measurements are presented, showing the wall time of both CPU and GPU algorithms, and therefore establishing the corresponding speedup.

Speaker: Bruno Alves (ADI Agencia de Inovacao (PT))
• 43
Usage of GPUs in ALICE Online and Offline processing during LHC Run 3

ALICE will significantly increase its Pb–Pb data taking rate from the 1 kHz of triggered readout in Run 2 to 50 kHz of continuous readout for LHC Run 3.
Updated tracking detectors are installed for Run 3 and a new two-phase computing strategy is employed.
In the first synchronous phase during the data taking, the raw data is compressed for storage to an on-site disk buffer and the required data for the detector calibration is collected.
In the second asynchronous phase the compressed raw data is reprocessed using the final calibration to produce the final reconstruction output.
Traditional CPUs are unable to cope with the huge data rate and processing demands of the synchronous phase, therefore ALICE employs GPUs to speed up the processing.
Since the online computing farm performs a part of the asynchronous processing when there is no beam in the LHC, ALICE plans to use the GPUs also for this second phase.
This paper gives an overview of the GPU processing in the synchronous phase, the full system test to validate the reference GPU architecture, and the prospects for the GPU usage in the asynchronous phase.

Speaker: David Rohr (CERN)
• Algorithms: Tue PM
Zoom Meeting ID
61638831506
Host
vCHEP 02
Conveners: Dorothea Vom Bruch (Aix Marseille Univ, CNRS/IN2P3, CPPM, Marseille, France) , Gordon Watts (University of Washington (US))
• 44
Optimization of Geant4 for the Belle II software library

The SuperKEKB/Belle II experiment expects to collect 50 $\mathrm{ab}^{-1}$ of collision data during the next decade. Study of this data requires monumental computing resources to process and to generate the required simulation events necessary for physics analysis. At the core of the Belle II simulation library is the Geant4 toolkit. To use the available computing resources more efficiently, the physics list for Geant4 has been optimized for the Belle II environment, and various other strategies were applied to improve the performance of the Geant4 toolkit in the Belle II software library. Following the inclusion of this newly optimized physics list in an updated version of Geant4 toolkit, we obtain much better CPU usage during event simulation and reduce the computing resource usage by $\sim$ 44 %.

Speaker: Swagato Banerjee (University of Louisville (US))
• 45
Validation of Physics Models of Geant4 Versions 10.4.p03, 10.6.p02 and 10.7.p01 using Data from the CMS Experiment

CMS tuned its simulation program and chose a specific physics model of Geant4 by comparing the simulation results with dedicated test beam experiments. Test beam data provide measurements of energy response of the calorimeter as well as resolution for well identified charged hadrons over a large energy region. CMS continues to validate the physics models using the test beam data as well as collision data from the Large Hadron Collider. Isolated charged particles are measured simultaneously in the tracker as well as in the calorimeters. These events are selected using dedicated triggers and are used to measure the response in the calorimeter. Different versions of Geant4 (10.2.p02, 10.4.p03, 10.6.p02) have been used by CMS for its Monte Carlo production and a new version (10.7) is now chosen for future productions. A suitable physics list (collection of physics models) is chosen by optimizing performance against accuracy. A detailed comparison between data and Geant4 predictions is presented in this paper.

Speaker: Sunanda Banerjee (Fermi National Accelerator Lab. (US))
• 46
The Fast Simulation Chain in the ATLAS experiment

The ATLAS experiment relies heavily on simulated data, requiring on the order of billions of Monte Carlo-based proton-proton collisions to be produced every run period. As such, the simulation of collisions (events) is the single biggest CPU resource consumer. ATLAS's finite computing resources are at odds with the expected conditions during the High Luminosity LHC era, where the increase in proton-proton centre-of-mass energy and instantaneous luminosity will result in higher particle multiplicities and roughly five times more interactions per bunch-crossing with respect to LHC Run-2. Therefore, significant effort within the collaboration is being focused on increasing the rate at which MC events can be produced by designing and developing fast alternatives to the algorithms used in the standard Monte Carlo production chain.

Speaker: Martina Javurkova (University of Massachusetts (US))
• 47
An automated tool to facilitate consistent test-driven development of trigger selections for LHCb’s Run 3

Upon its restart in 2022, the LHCb experiment at the LHC will run at higher instantaneous luminosity and utilize an unprecedented full-software trigger, promising greater physics reach and efficiency. On the flip side, conforming to offline data storage constraints becomes far more challenging. Both of these considerations necessitate a set of highly optimised trigger selections. We therefore present HltEfficiencyChecker: an automated extension to the LHCb trigger application, facilitating trigger development before data-taking driven by trigger rates and efficiencies. Since the default in 2022 will be to persist only the event's signal candidate to disk, discarding the rest of the event, we also compute efficiencies where the decision was due to the true MC signal, evaluated by matching it to the trigger candidate hit-by-hit. This matching procedure – which we validate here – demonstrates that the distinction between a “trigger” and a “trigger-on-signal” is crucial in characterising the performance of a trigger selection.
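
The difference between a plain trigger efficiency and a trigger-on-signal efficiency can be illustrated with a toy calculation. Everything in this sketch is hypothetical: the event records, the 70% hit-overlap threshold and the function names are illustrative assumptions and are not taken from HltEfficiencyChecker.

```python
# Toy illustration only; this is not the HltEfficiencyChecker API.
MATCH_FRACTION = 0.7  # assumed threshold for hit-by-hit matching

def is_matched(candidate_hits, true_signal_hits, threshold=MATCH_FRACTION):
    """True if enough of the trigger candidate's hits belong to the true MC signal."""
    if not candidate_hits:
        return False
    shared = len(set(candidate_hits) & set(true_signal_hits))
    return shared / len(candidate_hits) >= threshold

def efficiencies(events):
    """Return (trigger efficiency, trigger-on-signal efficiency) for simulated signal events."""
    n_total = len(events)
    # The trigger "fired" if it produced at least one candidate.
    n_fired = sum(1 for ev in events if ev["candidates"])
    # Trigger-on-signal: at least one candidate is matched to the true signal.
    n_tos = sum(
        1 for ev in events
        if any(is_matched(c, ev["signal_hits"]) for c in ev["candidates"])
    )
    return n_fired / n_total, n_tos / n_total

events = [
    {"signal_hits": [1, 2, 3, 4], "candidates": [[1, 2, 3, 4]]},         # fired on the signal
    {"signal_hits": [5, 6, 7, 8], "candidates": [[20, 21, 22, 5]]},      # fired on the rest of the event
    {"signal_hits": [9, 10, 11], "candidates": []},                      # did not fire
    {"signal_hits": [12, 13, 14, 15], "candidates": [[12, 13, 14, 99]]}, # 3 of 4 hits shared
]

eff, tos_eff = efficiencies(events)
print(f"trigger efficiency:           {eff:.2f}")
print(f"trigger-on-signal efficiency: {tos_eff:.2f}")
```

In this toy sample the second event passes the trigger but would lose its signal if only the trigger candidate were persisted, which is why the two efficiencies differ.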

Speaker: Ross John Hunter (University of Warwick (GB))
• 48
Determination of inter-system timing for Mini-CBM in 2020

Future operation of the CBM detector requires ultra-fast analysis of the continuous stream of data from all subdetector systems. Determining the inter-system time shifts among individual detector systems in the existing prototype experiment Mini-CBM is an essential step for data processing and in particular for stable data taking. Based on the input of raw measurements from all detector systems, the corresponding time correlations can be obtained at digital level by evaluating the differences in time stamps. If the relevant systems are stable during data taking and sufficient digital measurements are available, the distribution of time differences should display a clear peak. Up to now, the processed time differences have been stored in histograms and the maximum peak has been determined only after evaluating all timeslices of a run, leading to significant run times. The results presented here demonstrate the stability of the synchronicity of Mini-CBM systems. Furthermore, it is shown that relatively small amounts of raw measurements are sufficient to evaluate the corresponding time correlations among individual Mini-CBM detectors, thus enabling fast online monitoring of them in future online data processing.
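
The peak-finding idea described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the Mini-CBM code: the time stamps, the window, the bin width and the function name are all made up, and the brute-force pairing is only suitable for small samples.

```python
from collections import Counter

def time_offset(ref_times, det_times, window, bin_width):
    """Histogram pairwise time-stamp differences and return the centre of the highest bin."""
    histogram = Counter()
    for t_ref in ref_times:
        for t_det in det_times:
            dt = t_det - t_ref
            if abs(dt) <= window:  # only correlate hits that are close in time
                histogram[int(dt // bin_width)] += 1
    if not histogram:
        return None
    peak_bin, _ = max(histogram.items(), key=lambda item: item[1])
    return (peak_bin + 0.5) * bin_width

# Toy time stamps in ns: the second system lags the reference by 37 ns,
# and one uncorrelated noise hit is added at 555 ns.
ref_times = [100, 250, 400, 610, 900]
det_times = [t + 37 for t in ref_times] + [555]

offset_full = time_offset(ref_times, det_times, window=100, bin_width=10)
# A small subset of the measurements already yields the same peak position.
offset_small = time_offset(ref_times[:3], det_times[:3], window=100, bin_width=10)
print(offset_full, offset_small)
```

The returned value is the centre of the 10 ns bin containing the true 37 ns shift; a finer bin width or a fit around the peak would refine it.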

Speaker: Dr Andreas Ralph Redelbach (Goethe University Frankfurt (DE))
• 49
Apprentice for Event Generator Tuning

Apprentice is a tool developed for event generator tuning. It contains a range of conceptual improvements and extensions over the tuning tool Professor. Its core functionality remains the construction of a multivariate analytic surrogate model for computationally expensive Monte Carlo event generator predictions. The surrogate model is used for numerical optimization in chi-square minimization and likelihood evaluation. Apprentice also introduces algorithms to automate the selection of observable weights to minimize the effect of mismodeling in the event generators. We illustrate our improvements for the task of MC-generator tuning and limit setting.
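
The surrogate-based tuning workflow can be illustrated with a deliberately small example. This is a sketch under strong simplifying assumptions and does not use the Apprentice API: it has a single tuning parameter, a quadratic polynomial per observable bin as the surrogate, a made-up stand-in for the generator, and a brute-force scan in place of a numerical optimizer.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    b = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(3):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * w for v, w in zip(a[row], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]

def evaluate(coeffs, p):
    return coeffs[0] + coeffs[1] * p + coeffs[2] * p * p

def expensive_generator(p):
    """Hypothetical stand-in for a Monte Carlo run: predictions for three bins."""
    return [2.0 + 1.5 * p, 1.0 + 0.5 * p * p, 4.0 - p + 0.25 * p * p]

# 1. Run the "generator" at a few sampled parameter points.
sample_points = [0.0, 0.5, 1.0, 1.5, 2.0]
runs = [expensive_generator(p) for p in sample_points]

# 2. Build one cheap surrogate per observable bin.
surrogates = [fit_quadratic(sample_points, [run[b] for run in runs]) for b in range(3)]

# 3. Minimise chi-square between the surrogates and the (toy) reference data.
data = expensive_generator(1.2)   # pretend measurement, true parameter 1.2
errors = [0.1, 0.1, 0.1]

def chi2(p):
    return sum(((evaluate(c, p) - d) / e) ** 2 for c, d, e in zip(surrogates, data, errors))

grid = [i / 1000 for i in range(2001)]  # scan p in [0, 2]
best_p = min(grid, key=chi2)
print(f"best-fit parameter: {best_p:.3f}")
```

The point of the design is that the expensive generator is called only at the sampled points; every chi-square evaluation afterwards uses the cheap surrogate.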

Speaker: Mohan Krishnamoorthy (Argonne National Laboratory)
• Artificial Intelligence: Tue PM
Zoom Meeting ID
67263583281
Host
vCHEP 01
Conveners: Patrick Fuhrmann (Deutsches Elektronen-Synchrotron (DE)) , Sofia Vallecorsa (CERN)
• 50
A Deep Learning approach to LHCb Calorimeter reconstruction using a Cellular Automaton

The optimization of reconstruction algorithms has become a key aspect in LHCb as it is currently undergoing a major upgrade that will considerably increase the data processing rate. Aiming to accelerate the second most time consuming reconstruction process of the trigger, we propose an alternative reconstruction algorithm for the Electromagnetic Calorimeter of LHCb. Together with the use of deep learning techniques and the understanding of the current algorithm, our proposal decomposes the reconstruction process into small parts that benefit the generalized learning of small neural network architectures and simplifies the training dataset. This approach takes as input the full simulation data of the calorimeter and outputs a list of reconstructed clusters in a nearly constant time without any dependency on the event complexity.

Speaker: Nuria Valls Canudas (La Salle, Ramon Llull University (ES))
• 51
Fast simulation of the electromagnetic calorimeter response using Self-Attention Generative Adversarial Networks

Simulation is one of the key components in high energy physics. Historically it has relied on Monte Carlo methods, which require a tremendous amount of computational resources. These methods may struggle to meet the expected needs of the High Luminosity Large Hadron Collider, so the experiment is in urgent need of new fast simulation techniques. The application of Generative Adversarial Networks is a promising solution to speed up the simulation while providing the necessary physics performance. In this paper we propose the Self-Attention Generative Adversarial Network as a possible improvement of the network architecture. The application is demonstrated on generating responses of an LHCb-type electromagnetic calorimeter.

Speaker: Alexander Rogachev (Yandex School of Data Analysis (RU))
• 52
Graph Variational Autoencoder for Detector Reconstruction and Fast Simulation in High-Energy Physics

Accurate and fast simulation of particle physics processes is crucial for the high-energy physics community. Simulating particle interactions with the detector is both time consuming and computationally expensive. With its proton-proton collision energy of 13 TeV, the Large Hadron Collider is uniquely positioned to detect and measure the rare phenomena that can shape our knowledge of new interactions. The High-Luminosity Large Hadron Collider (HL-LHC) upgrade will put a significant strain on the computing infrastructure and budget due to increased event rate and levels of pile-up. Simulation of high-energy physics collisions needs to be significantly faster without sacrificing the physics accuracy. Machine learning approaches can offer faster solutions, while maintaining a high level of fidelity. We introduce a graph generative model that provides effective reconstruction of LHC events on the level of calorimeter deposits and tracks, paving the way for full detector level fast simulation.

Speaker: Ali Hariri (American University of Beirut (LB))
• 53
Particle identification with an electromagnetic calorimeter using a Convolutional Neural Network

Based on the fact that showers in calorimeters depend on the type of particle, this note attempts to build a particle classifier for electromagnetic and hadronic particles in an electromagnetic calorimeter, based on the energy deposits in individual cells. Using data from a Geant4 simulation of a proposed Crystal Fiber Calorimeter (SPACAL), foreseen for a future upgrade of the LHCb detector, a classifier is built using Convolutional Neural Networks. The results demonstrate that the higher resolution of this ECAL makes it possible to attain over 95% precision in some classifications, such as photons against neutrons.

Speaker: Mr Alex Rua Herrera (DS4DS, La Salle, Universitat Ramon Llull)
• 54
Conditional Wasserstein Generative Adversarial Networks for Fast Detector Simulation

Detector simulation in high energy physics experiments is a key yet computationally expensive step in the event simulation process. There has been much recent interest in using deep generative models as a faster alternative to the full Monte Carlo simulation process in situations in which the utmost accuracy is not necessary. In this work we investigate the use of conditional Wasserstein Generative Adversarial Networks to simulate both hadronization and the detector response to jets. Our model takes the $4$-momenta of jets formed from partons post-showering and pre-hadronization as inputs and predicts the $4$-momenta of the corresponding reconstructed jet. Our model is trained on fully simulated $t\overline{t}$ events using the publicly available GEANT-based simulation of the CMS Collaboration. We demonstrate that the model produces accurate conditional reconstructed jet transverse momentum ($p_T$) distributions over a wide range of $p_T$ for the input parton jet. Our model takes only a fraction of the time necessary for conventional detector simulation methods, running on a CPU in less than a millisecond per event.

Speaker: John Blue (Davidson College)
• Facilities and Networks: Tue PM
Zoom Meeting ID
62681653254
Host
vCHEP 07
Conveners: David Bouvet (IN2P3/CNRS (FR)) , Dr Shawn McKee (University of Michigan (US))
• 55
Ethernet evaluation in data distribution traffic for the LHCb filtering farm at CERN

This paper evaluates the real-time distribution of data over Ethernet for the upgraded LHCb data acquisition cluster at CERN. The total estimated throughput of the system is 32 Terabits per second. After the events are assembled, they must be distributed for further data selection to the filtering farm of the online trigger. High-throughput and very low overhead transmissions will be an essential feature of such a system. In this work the RoCE high-throughput Ethernet protocol and Ethernet flow control algorithms have been used to implement lossless event distribution. To generate LHCb-like traffic, a custom benchmark has been implemented. It was used to stress-test the selected Ethernet networks and to check resilience to uneven workload distribution. Performance tests were made with selected evaluation clusters, using 100 Gb/s and 25 Gb/s links. Performance results and overall evaluation of this Ethernet-based approach are discussed.

Speaker: Rafal Dominik Krawczyk (CERN)
• 56
Systematic benchmarking of HTTPS third party copy on 100Gbps links using XRootD

The High Luminosity Large Hadron Collider provides a data challenge. The amount of data recorded from the experiments and transported to hundreds of sites will see a thirty-fold increase in annual data volume. This calls for a systematic approach to comparing the performance of different Third Party Copy (TPC) transfer protocols. Two contenders, XRootD-HTTPS and GridFTP, are evaluated on their performance in transferring files from one server to another over 100Gbps interfaces. The benchmarking is done by scheduling pods on the Pacific Research Platform Kubernetes cluster to ensure reproducible and repeatable results. This opens a future pathway for network testing of any TPC transfer protocol.

Speaker: Aashay Arora (University of California San Diego)
• 57
NOTED: a framework to optimise network traffic via the analysis of data from File Transfer Services

Network traffic optimisation is difficult as the load is by nature dynamic and random. However, the increased usage of file transfer services may help the detection of future loads and the prediction of their expected duration. The NOTED project seeks to do exactly this and to dynamically adapt network topology to deliver improved bandwidth for users of such services. This article introduces, and explains the features of, the two main components of NOTED, the Transfer Broker and the Network Intelligence component.
The Transfer Broker analyses all queued and on-going FTS transfers, producing a traffic report which can be used by network controllers. Based on this report and its knowledge of the network topology and routing, the Network Intelligence (NI) component makes decisions as to when a network reconfiguration could be beneficial. Any Software Defined Network controller can then apply these decisions to the network, thereby optimising transfer execution time and reducing operating costs.
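
The division of labour between the two components can be sketched in a few lines. This is an illustration under stated assumptions, not NOTED code: the transfer record fields, site names, link capacities and the 300-second threshold are hypothetical, and the real FTS data model is not used.

```python
from collections import defaultdict

def traffic_report(transfers):
    """Transfer-Broker-like step: sum the pending bytes per (source, destination) pair."""
    report = defaultdict(int)
    for transfer in transfers:
        if transfer["state"] in ("queued", "active"):
            report[(transfer["src"], transfer["dst"])] += transfer["bytes_left"]
    return dict(report)

def reconfiguration_candidates(report, link_gbps, min_duration_s):
    """Network-Intelligence-like step: flag pairs whose backlog keeps the link busy for long."""
    flagged = {}
    for pair, pending_bytes in report.items():
        estimated_s = pending_bytes * 8 / (link_gbps[pair] * 1e9)
        if estimated_s >= min_duration_s:
            flagged[pair] = estimated_s
    return flagged

transfers = [
    {"src": "SITE_A", "dst": "SITE_B", "state": "queued",   "bytes_left": 3_000_000_000_000},
    {"src": "SITE_A", "dst": "SITE_B", "state": "active",   "bytes_left": 2_000_000_000_000},
    {"src": "SITE_A", "dst": "SITE_B", "state": "finished", "bytes_left": 0},
    {"src": "SITE_A", "dst": "SITE_C", "state": "queued",   "bytes_left": 100_000_000_000},
]
link_gbps = {("SITE_A", "SITE_B"): 100, ("SITE_A", "SITE_C"): 100}

report = traffic_report(transfers)
flagged = reconfiguration_candidates(report, link_gbps, min_duration_s=300)
for (src, dst), seconds in flagged.items():
    print(f"{src} -> {dst}: about {seconds:.0f} s of pending traffic")
```

Only the backlog that is both large and long-lived is reported, since a reconfiguration is worthwhile only if the load outlasts the time needed to apply it.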

Speaker: Edoardo Martelli (CERN)
• 58
Benchmarking NetBASILISK: a Network Security Project for Science

Infrastructures supporting distributed scientific collaborations must address competing goals in both providing high-performance access to resources while simultaneously securing the infrastructure against security threats. The NetBASILISK project is attempting to improve the security of such infrastructures while not adversely impacting their performance. This paper will present our work to create a benchmark and monitoring infrastructure that allows us to test for any degradation in transferring data into a NetBASILISK protected site.

Speaker: Jem Aizen Mendiola Guhit (University of Michigan (US))
• 59
Proximeter CERN's detecting device for personnel

The SARS-CoV-2 virus, the cause of the better known COVID-19 disease, has greatly altered our personal and professional lives. Many people are now expected to work from home but this is not always possible and, in such cases, it is the responsibility of the employer to implement protective measures. One simple such measure is to require that people maintain a distance of 2 metres, but this places responsibility on employees and leads to two problems: firstly, safety distances are likely not to be maintained, and secondly, someone who becomes infected may not remember with whom they have been in contact. To address both problems, CERN has developed the “proximeter”, a device that, when worn by employees, detects when they are in close proximity to others. Information about any such close contacts is sent securely over a Low Power Wide Area Network (LPWAN) and stored in a manner that respects confidentiality and privacy requirements. In the event that an employee becomes infected with COVID-19, CERN can thus identify all the possible contacts and so help prevent the spread of the virus. We describe here the details of the proximeter device, the LPWAN infrastructure deployed at CERN, the communication mechanisms and the protocols used to respect the confidentiality of personal data.

Speaker: Christoph Merscher (CERN)
• Software: Tue PM
Zoom Meeting ID
68415529596
Host
vCHEP 06
Conveners: Enrico Guiraud (EP-SFT, CERN) , Teng Jian Khoo (Humboldt University of Berlin (DE))
• 60
The GeoModel tool suite for detector description

The GeoModel class library for detector description has recently been released as an open-source package and extended with a set of tools to allow much of the detector modeling to be carried out in a lightweight development environment, outside of large and complex software frameworks. These tools include the mechanisms for creating persistent representation of the geometry, an interactive 3D visualization tool, various command-line tools, a plugin system, and XML and JSON parsers. The overall goal of the tool suite is a fast geometry development cycle with quick visual feedback. The tool suite can be built on both Linux and Macintosh systems with minimal external dependencies. It includes useful command-line utilities: gmclash which runs clash detection, gmgeantino which generates geantino maps, and fullSimLight which runs GEANT4 simulation on geometry imported from GeoModel description. The GeoModel tool suite is presently in use in both the ATLAS and FASER experiments. In ATLAS it will be the basis of the LHC Run 4 geometry description.

Speaker: Vakho Tsulaia (Lawrence Berkeley National Lab. (US))
• 61
Counter-based pseudorandom number generators for CORSIKA 8: A multi-thread friendly approach

This document describes advances in the generation of high-quality random numbers for CORSIKA 8, which is being developed in modern C++17 and is designed to run on modern multi-thread processors and accelerators. CORSIKA 8 is a Monte Carlo simulation framework to model ultra-high energy secondary particle cascades in astroparticle physics. The aspects associated with the generation of high-quality random numbers on massively parallel platforms, like multi-core CPUs and GPUs, are reviewed, and the deployment of counter-based engines using an innovative and multi-thread friendly API is described. The API is based on iterators, providing a very well known access mechanism in C++, and also supports lazy evaluation. Moreover, an upgraded version of the Squares algorithm with highly efficient internal 128-bit as well as 256-bit counters is presented in this context. Performance measurements and comparisons with conventional designs are provided. Finally, the integration into CORSIKA 8 is discussed.
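
The idea of a counter-based generator behind an iterator interface can be sketched as follows. The block implements the original four-round Squares function with a 64-bit counter as published by Widynski, not the upgraded 128-bit or 256-bit variants presented in the paper, and the `CounterStream` class, its method names and the key value are illustrative assumptions rather than the CORSIKA 8 API.

```python
from itertools import islice

MASK64 = (1 << 64) - 1

def _swap_halves(x):
    """Exchange the upper and lower 32 bits of a 64-bit word."""
    return (x >> 32) | ((x << 32) & MASK64)

def squares32(counter, key):
    """Four-round Squares function: maps (counter, key) to a 32-bit integer."""
    x = y = (counter * key) & MASK64
    z = (y + key) & MASK64
    x = _swap_halves((x * x + y) & MASK64)   # round 1
    x = _swap_halves((x * x + z) & MASK64)   # round 2
    x = _swap_halves((x * x + y) & MASK64)   # round 3
    return ((x * x + z) & MASK64) >> 32      # round 4

class CounterStream:
    """Iterator over a counter-based stream; each value is computed lazily on access."""

    def __init__(self, key, start=0):
        self.key = key
        self.counter = start

    def __iter__(self):
        return self

    def __next__(self):
        value = squares32(self.counter, self.key)
        self.counter += 1
        return value

    def advance(self, n):
        """Jump ahead by n positions in constant time; no state has to be replayed."""
        self.counter += n
        return self

KEY = 0x548C9DECBCE65297  # illustrative key; real keys should be chosen as the algorithm's author recommends

first = list(islice(CounterStream(KEY), 5))
jumped = next(CounterStream(KEY).advance(1000))
# Disjoint counter ranges give independent sub-streams, e.g. one per thread.
per_thread = [list(islice(CounterStream(KEY, start=t * 1_000_000), 3)) for t in range(4)]
print(first, jumped)
```

Because the output depends only on the counter and the key, threads need no shared mutable state: each one simply starts its iterator at a different counter value.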

Speaker: Dr Antonio Augusto Alves Junior (Institute for Astroparticle Physics of Karlsruhe Institute of Technology)
• 62
CAD support and new developments in DD4hep

Speaker: Markus Frank (CERN)
• 63
Key4hep: Status and Plans

Detector optimisation and physics performance studies are an integral part of the development of future collider experiments. The Key4hep project aims to design a common set of software tools for future, or even present, High Energy Physics projects. These proceedings describe the main components that are developed as part of Key4hep: the event data model EDM4hep, simulation interfaces to Delphes and Geant4, the k4MarlinWrapper to integrate iLCSoft components, and build and validation tools to ensure functionality and compatibility among the components. They also include the different adaptation processes by the CEPC, CLIC, FCC, and ILC communities towards this project, which show that Key4hep is a viable long term solution as baseline software for high energy experiments.

Speaker: Andre Sailer (CERN)
• 64
Preservation through modernisation: The software of the H1 experiment at HERA

The lepton–proton collisions produced at the HERA collider represent a unique high energy physics data set. A number of years after the end of collisions, the data collected by the H1 experiment, as well as the simulated events and all software needed for reconstruction, simulation and data analysis, were migrated into a preserved operational mode at DESY. A recent modernisation of the H1 software architecture has been performed, which will not only facilitate ongoing and future data analysis efforts with the new inclusion of modern analysis tools, but also ensure the long-term availability of the H1 data and associated software. The present status of the H1 software stack, the data, simulations and the currently supported computing platforms for data analysis activities are discussed.

Speaker: Daniel Britzger (Max-Planck-Institut für Physik München)
• Storage: Tue PM
Zoom Meeting ID
67249300031
Host
vCHEP 03
Conveners: Cedric Serfon (Brookhaven National Laboratory (US)) , Peter Clarke (The University of Edinburgh (GB))
• 65
An intelligent Data Delivery Service for and beyond the ATLAS experiment

The intelligent Data Delivery Service (iDDS) has been developed to cope with the huge increase of computing and storage resource usage in the coming LHC data taking. iDDS has been designed to intelligently orchestrate workflow and data management systems, decoupling data pre-processing, delivery, and main processing in various workflows. It is an experiment-agnostic service around a workflow-oriented structure to work with existing and emerging use cases in ATLAS and other experiments. Here we will present the motivation for iDDS, its design schema and architecture, use cases and current status, and plans for the future.

Speaker: Wen Guan (University of Wisconsin (US))
• 67
dCache: Inter-disciplinary storage system

The dCache project provides open-source software deployed internationally to satisfy ever more demanding storage requirements. Its multifaceted approach provides an integrated way of supporting different use cases with the same storage, from high-throughput data ingest and data sharing over wide area networks to efficient access from HPC clusters and long-term data persistence on tertiary storage. Though it was originally developed for HEP experiments, today it is used by various scientific communities, including astrophysics, biomedicine and the life sciences, each with its own specific requirements. In this paper we describe some of the new requirements and demonstrate how dCache developers are addressing them.

Speaker: Mr Tigran Mkrtchyan (DESY)
• 68
The GridKa tape storage: latest improvements and current production setup

Tape storage remains the most cost-effective system for safe long-term storage of petabytes of data and reliably accessing it on demand. It has long been widely used by Tier-1 centers in WLCG. GridKa uses tape storage systems for LHC and non-LHC HEP experiments. The performance requirements on the tape storage systems are increasing every year, creating an increasing number of challenges in providing a scalable and reliable system. Therefore, providing high-performance, scalable and reliable tape storage systems is a top priority for Tier-1 centers in WLCG.

At GridKa, various performance tests were recently done to investigate the existence of bottlenecks in the tape storage setup. As a result, several bottlenecks were identified and resolved, leading to a significant improvement in the overall tape storage performance. These results were achieved in a test environment, and introducing them into the production environment required great effort; among many other things, new software had to be developed to interact with the tape management software.

This contribution provides detailed information on the latest improvements and changes on the GridKa tape storage setup.

Speaker: Haykuhi Musheghyan (Georg August Universitaet Goettingen (DE))
• 69
Improving Performance of Tape Restore Request Scheduling in the Storage System dCache

Given the anticipated increase in the amount of scientific data, it is widely accepted that primarily disk based storage will become prohibitively expensive. Tape based storage, on the other hand, provides a viable and affordable solution for the ever increasing demand for storage space. Coupled with a disk caching layer that temporarily holds a small fraction of the total data volume to allow for low latency access, it turns tape based systems into active archival storage (write once, read many) that imposes additional demands on data flow optimization compared to traditional backup setups (write once, read never). In order to preserve the lifetime of tapes and minimize the inherently higher access latency, different tape usage strategies are being evaluated. As an important disk storage system for scientific data that transparently handles tape access, dCache is making efforts to evaluate its recall optimization potential and is introducing a proof-of-concept, high-level stage request scheduling component within its SRM implementation.

Speaker: Lea Morschel (Deutsches Elektronen-Synchrotron DESY)
• 70
dCache: from Resilience to Quality of Service

A major goal of future dCache development will be to allow users to define file Quality of Service (QoS) in a more flexible way than currently available. This will mean implementing what might be called a QoS rule engine responsible for registering and managing time-bound QoS transitions for files or storage units. In anticipation of this extension to existing dCache capabilities, the Resilience service, which maintains on-disk replica state, needs to undergo both structural modification and generalization. This paper describes ongoing work to transform Resilience into the new architecture which will eventually support a more broadly defined file QoS.

Speaker: ALBERT ROSSI (Fermi National Accelerator Laboratory)
• 4:20 PM
Break
• Tues PM Plenaries: Plenaries
Conveners: Elizabeth Sexton-Kennedy (Fermi National Accelerator Lab. (US)) , Richard Philip Mount (SLAC National Accelerator Laboratory (US))
• 71
Deep Learning strategies for ProtoDUNE raw data denoising

In this work we investigate different machine-learning-based strategies for denoising raw simulation data from the ProtoDUNE experiment. The ProtoDUNE detector is hosted by CERN and aims to test and calibrate the technologies for DUNE, a forthcoming experiment in neutrino physics. Our models leverage deep learning algorithms to make the first step in the reconstruction workchain, which consists of converting digital detector signals into high-level physical quantities. We benchmark this approach against traditional algorithms implemented by the DUNE collaboration. We test the capabilities of graph neural networks, while exploiting multi-GPU setups to accelerate training and inference.

Speaker: Marco Rossi (CERN)
• 72
Artificial Neural Networks on FPGAs for Real-Time Energy Reconstruction of the ATLAS LAr Calorimeters

Within the Phase-II upgrade of the LHC, the readout electronics of the ATLAS LAr Calorimeters is prepared for high luminosity operation expecting a pile-up of up to 200 simultaneous pp interactions. Moreover, the calorimeter signals of up to 25 subsequent collisions are overlapping, which increases the difficulty of energy reconstruction. Real-time processing of digitized pulses sampled at 40 MHz is thus performed using FPGAs.

To cope with the signal pile-up, new machine learning approaches are explored: convolutional and recurrent neural networks outperform the optimal signal filter currently used, both in assignment of the reconstructed energy to the correct bunch crossing and in energy resolution.

Very good agreement between neural network implementations in FPGA and software based calculations is observed. The FPGA resource usage, the latency and the operation frequency are analysed. Latest performance results and experience with prototype implementations will be reported.

Speaker: Thomas Calvet (CPPM, Aix-Marseille Université, CNRS/IN2P3 (FR))
• 5:40 PM
Break
• 73
Quantum Support Vector Machines for Continuum Suppression in B Meson Decays

Quantum computers have the potential for significant speed-ups of certain computational tasks. A possibility this opens up within the field of machine learning is the use of quantum features that would be inefficient to calculate classically. Machine learning algorithms are ubiquitous in particle physics and as advances are made in quantum machine learning technology, there may be a similar adoption of these quantum techniques.
In this work a quantum support vector machine (QSVM) is implemented for signal-background classification. We investigate the effect of different quantum encoding circuits, the process that transforms classical data into a quantum state, on the final classification performance. We show an encoding approach that achieves an Area Under Receiver Operating Characteristic Curve (AUC) of 0.877 determined using quantum circuit simulations. For this same dataset the best classical method, a classical Support Vector Machine (SVM) using the Radial Basis Function (RBF) Kernel achieved an AUC of 0.865. Using a reduced dataset we then ran the algorithm on the IBM Quantum ibmq_casablanca device achieving an average AUC of 0.703. As further improvements to the error rates and availability of quantum computers materialise, they could form a new approach for data analysis in high energy physics.
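The AUC figures quoted above can be read as the probability that a randomly chosen signal event receives a higher classifier score than a randomly chosen background event. The following is a minimal sketch of that rank-based definition, not the evaluation code used in the contribution; the function name `auc` and the toy labels and scores are illustrative assumptions.

```python
def auc(labels, scores):
    """Area under the ROC curve from the pairwise (Mann-Whitney) definition.

    labels: 1 for signal, 0 for background (hypothetical toy convention).
    scores: classifier outputs, higher meaning more signal-like.
    """
    signal = [s for l, s in zip(labels, scores) if l == 1]
    background = [s for l, s in zip(labels, scores) if l == 0]
    if not signal or not background:
        raise ValueError("need at least one signal and one background event")
    wins = 0.0
    for s in signal:
        for b in background:
            if s > b:
                wins += 1.0      # signal ranked above background
            elif s == b:
                wins += 0.5      # ties count as half
    return wins / (len(signal) * len(background))


# Toy example: three of the four signal/background pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
```

The double loop is quadratic in the number of events, which is acceptable for illustration; sorting-based implementations are preferable for large samples.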

Speaker: Jamie Heredge (The University of Melbourne)
• 74
EDM4hep and podio - The event data model of the Key4hep project and its implementation

The EDM4hep project aims to design the common event data model for the Key4hep project and is generated via the podio toolkit. We present the first version of EDM4hep and discuss some of its use cases in the Key4hep project. Additionally, we discuss recent developments in podio, like the updates of the automatic code generation and also the addition of a second I/O backend based on SIO. We compare the available backends using benchmarks based on physics use cases, before we conclude with a discussion of currently ongoing work and future developments.

Speaker: Thomas Madlener (Deutsches Elektronen-Synchrotron (DESY))
• Wednesday, 19 May
• Weds AM Plenaries: Plenaries
Conveners: Catherine Biscarat (L2I Toulouse, IN2P3/CNRS (FR)) , Tommaso Boccali (INFN Sezione di Pisa, Universita' e Scuola Normale Superiore, P)
• 75
Full detector simulation with unprecedented background occupancy at a Muon Collider

In recent years a Muon Collider has attracted a lot of interest in the High-Energy Physics community thanks to its ability to achieve clean interaction signatures at multi-TeV collision energies in the most cost-effective way. Estimation of the physics potential of such an experiment must take into account the impact of beam-induced background on the detector performance, which has to be carefully evaluated using full detector simulation. Tracing all the background particles entering the detector region in a single bunch crossing is out of reach for any realistic computing facility due to the unprecedented number of such particles. To make it feasible, a number of optimisations have been applied to the detector simulation workflow.

This contribution presents an overview of the main characteristics of the beam-induced background at a Muon Collider, the detector technologies considered for the experiment and how they are taken into account to strongly reduce the number of irrelevant computations performed during the detector simulation. Special attention is dedicated to the optimisation of track reconstruction with the Conformal Tracking algorithm in this high-occupancy environment, which is the most computationally demanding part of event reconstruction.

Speaker: Nazar Bartosik (Universita e INFN Torino (IT))
• 76
HEPiX benchmarking solution for WLCG computing resources

The HEPiX Benchmarking Working Group has been developing a benchmark based on actual software workloads of the High Energy Physics community. This approach, based on container technologies, is designed to provide a benchmark that is better correlated with the actual throughput of the experiment production workloads. It also offers the possibility to separately explore and describe the independent architectural features of different computing resource types. This is very important in view of the growing heterogeneity of the HEP computing landscape, where the role of non-traditional computing resources such as HPCs and GPUs is expected to increase significantly.

Speaker: Miguel Fontes Medeiros (CERN)
• 77
Integration of Rucio in Belle II

Dirac and Rucio are two standard pieces of software widely used in the HEP domain. Dirac provides Workload and Data Management functionalities, among other things, while Rucio is a dedicated, advanced Distributed Data Management system. Many communities that already use Dirac express their interest in using Dirac for workload management in combination with Rucio for the Data management part. In this paper, we describe the integration of the Rucio File Catalog into Dirac that was initially developed for the Belle II collaboration.

Speaker: Cedric Serfon (Brookhaven National Laboratory (US))
• 10:30 AM
Break
• Algorithms: Wed AM
Conveners: David Rohr (CERN) , Felice Pantaleo (CERN)
• 78
Application of the missing mass method in the fixed-target program of the STAR experiment

As part of the FAIR Phase-0 program, the algorithms of the fast FLES (First-Level Event Selection) package developed for the CBM experiment (FAIR/GSI, Germany) have been adapted for online and offline processing in the STAR experiment (BNL, USA). Using the same algorithms creates a bridge between online and offline modes. This allows combining online and offline resources for data processing.

Thus, an express data production chain was created based on the STAR HLT farm, which extends the real-time functionality of HLT all the way down to physics analysis. The same express data production chain can be used on the RCF farm, which is used for fast offline production with the same tasks as the extended HLT. The express analysis chain does not interfere with the standard analysis chain.

An important advantage of express analysis is that it allows you to start calibration, production, and analysis of the data as soon as it is available. Therefore, the use of express analysis can be useful for BES-II data production and help accelerate scientific discovery by helping to get results within a year after data collection is complete.

Here we describe and discuss in detail the missing mass method that has been implemented as part of the KF Particle Finder package for searching and analyzing short-lived particles. Features of the application of the method within the framework of express real-time data processing are given, as well as the results of real-time reconstruction of short-lived particle decays in the BES-II environment.
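The kinematic idea behind a missing mass calculation can be sketched as follows: when a charged mother particle decays into a reconstructed charged daughter and an unobserved neutral daughter, four-momentum conservation gives the neutral particle's mass as $m_{\text{miss}}^2 = (E_{\text{mother}} - E_{\text{daughter}})^2 - |\vec{p}_{\text{mother}} - \vec{p}_{\text{daughter}}|^2$. The snippet below is only this textbook relation, not the KF Particle Finder implementation; the function names and the approximate particle masses used in the example are assumptions made for illustration.

```python
import math


def four_vector(mass, px, py, pz):
    """Return (E, px, py, pz) in GeV for a particle of the given mass."""
    energy = math.sqrt(mass * mass + px * px + py * py + pz * pz)
    return (energy, px, py, pz)


def missing_mass(mother, daughter):
    """Mass of the unobserved particle from four-momentum conservation."""
    e = mother[0] - daughter[0]
    px = mother[1] - daughter[1]
    py = mother[2] - daughter[2]
    pz = mother[3] - daughter[3]
    m2 = e * e - (px * px + py * py + pz * pz)
    # Resolution effects can push m2 slightly below zero; clamp for the sketch.
    return math.sqrt(max(m2, 0.0))


def two_body_momentum(m_mother, m1, m2):
    """Daughter momentum in the mother rest frame for a two-body decay."""
    num = (m_mother**2 - (m1 + m2) ** 2) * (m_mother**2 - (m1 - m2) ** 2)
    return math.sqrt(num) / (2.0 * m_mother)


# Illustrative decay at rest: Sigma- -> n pi- (approximate masses in GeV).
M_SIGMA, M_NEUTRON, M_PION = 1.197449, 0.939565, 0.139570
p_star = two_body_momentum(M_SIGMA, M_NEUTRON, M_PION)
sigma = four_vector(M_SIGMA, 0.0, 0.0, 0.0)
pion = four_vector(M_PION, 0.0, 0.0, p_star)
neutral_mass = missing_mass(sigma, pion)  # should recover the neutron mass
```

In real data the mother and daughter momenta come from fitted tracks with uncertainties, so the missing mass forms a peak of finite width rather than a single value.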

Speaker: Mr Pavel Kisel (Uni-Frankfurt, JINR)
• 79
Track Finding for the PANDA Detector Based on Hough Transformations

The PANDA experiment at FAIR (Facility for Antiproton and Ion Research) in Darmstadt is currently under construction. In order to reduce the amount of data collected during operation, it is essential to find all true tracks and to be able to distinguish them from false tracks. Part of the preparation for the experiment is therefore the development of a fast online track finder. This work presents an online track finding algorithm based on Hough transformations, which is comparable in quality and performance to the currently best offline track finder in PANDA. In contrast to most track finders the algorithm can handle the challenge of extended hits delivered by PANDA’s central Straw Tube Tracker and thus benefit from its precise spatial resolution. Furthermore, optimization methods are presented that improved the ghost ratio as well as the speed of the algorithm by 70 %. Due to further development potential in terms of displaced vertex finding and speed optimization on GPUs, this algorithm promises to exceed the quality and speed of other track finders developed for PANDA.
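To illustrate the voting principle of a Hough transformation, here is a deliberately simplified sketch for straight lines through point-like 2D hits. It is not the PANDA algorithm, which has to deal with curved tracks and extended straw-tube hits; the function names, binning and toy hits are assumptions chosen for the example.

```python
import numpy as np


def hough_accumulator(hits, n_theta=180, n_rho=200, rho_max=10.0):
    """Fill a (theta, rho) vote array for 2D point hits.

    Each hit (x, y) votes for every line rho = x*cos(theta) + y*sin(theta)
    that passes through it.
    """
    hits = np.asarray(hits, dtype=float)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    # rhos has shape (n_hits, n_theta): one curve in Hough space per hit.
    rhos = hits[:, 0:1] * np.cos(thetas) + hits[:, 1:2] * np.sin(thetas)
    rho_bins = np.floor((rhos + rho_max) / (2.0 * rho_max) * n_rho).astype(int)
    theta_bins = np.broadcast_to(np.arange(n_theta), rho_bins.shape)
    inside = (rho_bins >= 0) & (rho_bins < n_rho)
    votes = np.zeros((n_theta, n_rho), dtype=int)
    np.add.at(votes, (theta_bins[inside], rho_bins[inside]), 1)
    return votes, thetas


def best_line(hits, n_theta=180, n_rho=200, rho_max=10.0):
    """Return (theta, rho bin centre, vote count) of the most voted line."""
    votes, thetas = hough_accumulator(hits, n_theta, n_rho, rho_max)
    t, r = np.unravel_index(np.argmax(votes), votes.shape)
    rho_centre = (r + 0.5) * (2.0 * rho_max / n_rho) - rho_max
    return thetas[t], rho_centre, int(votes[t, r])


# Five toy hits on the line y = x + 1, plus two unrelated hits.
toy_hits = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (2, 0), (0, 4)]
theta, rho, n_votes = best_line(toy_hits)
```

Hits that lie on a common line pile up in one accumulator cell, so the collinear toy hits produce a clear maximum while the two unrelated hits do not. The bin sizes set the trade-off between resolution and robustness against measurement noise.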

Speaker: Anna Alicke (Forschungszentrum Jülich)
• 80
A novel reconstruction framework for an imaging calorimeter for HL-LHC

To sustain the harsher conditions of the high-luminosity LHC, the CMS collaboration is designing a novel endcap calorimeter system. The new calorimeter will predominantly use silicon sensors to achieve sufficient radiation tolerance and will maintain highly-granular information in the readout to help mitigate the effects of pileup. In regions characterised by lower radiation levels, small scintillator tiles with individual on-tile SiPM readout are employed.
A unique reconstruction framework (TICL: The Iterative CLustering) is being developed to fully exploit the granularity and other significant detector features, such as particle identification and precision timing, with a view to mitigating pileup in the very dense environment of HL-LHC. The inputs to the framework are clusters of energy deposited in individual calorimeter layers. Clusters are formed by a density-based algorithm. Recent developments and tunes of the clustering algorithm will be presented. To help reduce the expected pressure on the computing resources in the HL-LHC era, the algorithms and their data structures are designed to be executed on GPUs. Preliminary results will be presented on decreases in clustering time when using GPUs versus CPUs.
Ideas for machine-learning techniques to further improve the speed and accuracy of reconstruction algorithms will be presented.

Speaker: Dr Leonardo Cristella (CERN)
• 81
Simultaneous Global and Local Alignment of the Belle II Tracking Detectors

The alignment of the Belle II tracking system, composed of pixel and strip vertex detectors and a central drift chamber, is described by approximately 60,000 parameters. These include internal local alignment (positions, orientations and surface deformations of silicon sensors, and positions of drift chamber wires) as well as global alignment (relative positions of the sub-detectors and larger structures).

In the next data reprocessing, scheduled since Spring 2021, we aim to determine all parameters in a simultaneous fit by Millepede II, where recent developments allow a direct solution of the full problem to be obtained in about one hour, making it practically feasible for regular detector alignment.

The tracking detectors and the alignment technique are described and the alignment strategy is discussed in the context of studies on simulations and experience obtained from recorded data. Preliminary results and further refinements based on studies of real Belle II data are presented.

• 82
Improvements to ATLAS Inner Detector Track reconstruction for LHC Run-3

This talk summarises the main changes to the ATLAS experiment’s Inner Detector Track reconstruction software chain in preparation for LHC Run 3 (2022-2024). The work was carried out to ensure that the expected high-activity collisions, with on average 50 simultaneous proton-proton interactions per bunch crossing (pile-up), can be reconstructed promptly using the available computing resources. Performance figures in terms of CPU consumption for the key components of the reconstruction algorithm chain and their dependence on the pile-up are shown. For the design pile-up value of 60 the updated track reconstruction is a factor of 2 faster than the previous version.

Speaker: Zachary Michael Schillaci (Brandeis University (US))
• 83
Basket Classifier: Fast and Optimal Restructuring of the Classifier for Differing Train and Target Samples

The common approach for constructing a classifier for particle selection assumes reasonable consistency between the training data samples and the target data sample used for the particular analysis. However, training and target data may have very different properties, such as the energy spectra of the signal and background contributions. We suggest using an ensemble of pre-trained classifiers, each of which is trained on an exclusive subset of the total dataset, called a data basket. Adjusting the separation threshold separately for every basket classifier makes it possible to dynamically adjust the combined classifier and make optimal predictions for data with differing properties without re-training the classifier. The approach is illustrated with a toy example. The dependence of the quality on the number of data baskets used is also presented.

Speaker: Mr Anton Philippov (HSE)
• Artificial Intelligence: Wed AM
Conveners: Agnieszka Dziurda (Polish Academy of Sciences (PL)) , Joosep Pata (National Institute of Chemical Physics and Biophysics (EE))
• 84
Pixel Detector Background Generation using Generative Adversarial Networks at Belle II

The pixel vertex detector (PXD) is an essential part of the Belle II detector recording particle positions. Data from the PXD and other sensors allow us to reconstruct particle tracks and decay vertices. The effect of background hits on track reconstruction is simulated by adding measured or simulated background hit patterns to the hits produced by simulated signal particles. This model requires a large set of statistically independent PXD background noise samples to avoid a systematic bias of reconstructed tracks. However, data from the fine-grained PXD requires a substantial amount of storage. As an efficient way of producing background noise, we explore the idea of an on-demand PXD background generator using conditional Generative Adversarial Networks (GANs), adapted by the number of PXD sensors in order to both increase the image fidelity and produce sensor-dependent PXD hitmaps.

Speaker: Mr Hosein Hashemi (LMU)
• 85
Machine learning for surface prediction in ACTS

We present an ongoing R&D activity for machine-learning-assisted navigation through detectors to be used for track reconstruction. We investigate different approaches of training neural networks for surface prediction and compare their results. This work is carried out in the context of the ACTS tracking toolkit.

Speaker: Mr Benjamin Huth (Universität Regensburg)
• 86
Deep neural network techniques in the calibration of space-charge distortion fluctuations for the ALICE TPC

The Time Projection Chamber (TPC) of the ALICE experiment at the CERN LHC was upgraded for Run 3 and Run 4. Readout chambers based on Gas Electron Multiplier (GEM) technology and a new readout scheme allow continuous data taking at the highest interaction rates expected in Pb-Pb collisions. Due to the absence of a gating grid system, a significant amount of ions created in the multiplication region is expected to enter the TPC drift volume and distort the uniform electric field that guides the electrons to the readout pads. Analytical calculations were considered to correct for space-charge distortion fluctuations but they proved to be too slow for the calibration and reconstruction workflow in Run 3. In this paper, we discuss a novel strategy developed by the ALICE Collaboration to perform distortion-fluctuation corrections with machine learning and convolutional neural network techniques. The results of preliminary studies are shown and the prospects for further development and optimization are also discussed.

Speaker: Ernst Hellbar (GSI - Helmholtzzentrum fur Schwerionenforschung GmbH (DE))
• 87
Accelerating End-to-End Deep Learning for Particle Reconstruction using CMS open data

Machine learning algorithms are gaining ground in high energy physics for applications in particle and event identification, physics analysis, detector reconstruction, simulation and trigger. Currently, most data-analysis tasks at LHC experiments benefit from the use of machine learning. Incorporating these computational tools in the experimental framework presents new challenges.
This paper reports on the implementation of the end-to-end deep learning with the CMS software framework and the scaling of the end-to-end deep learning with multiple GPUs.
The end-to-end deep learning technique combines deep learning algorithms and low-level detector representation for particle and event identification. We demonstrate the end-to-end implementation on a top quark benchmark and perform studies with various hardware architectures including single and multiple GPUs and Google TPU.

Speaker: Davide Di Croce (University of Alabama (US))
• 88
Development of FPGA-based neural network regression models for the ATLAS Phase-II barrel muon trigger upgrade

Effective selection of muon candidates is the cornerstone of the LHC physics programme. The ATLAS experiment uses a two-level trigger system for real-time selection of interesting events. The first-level hardware trigger system uses the Resistive Plate Chamber (RPC) detector for selecting muon candidates in the central (barrel) region of the detector. With the planned upgrades, an entirely new FPGA-based muon trigger system will be installed in 2025-2026. In this paper, neural network regression models are studied for potential applications in the new RPC trigger system. A simple simulation model of the current detector is developed for training and testing neural network regression models. Effects from additional cluster hits and noise hits are evaluated. The efficiency of selecting muon candidates is estimated as a function of the transverse muon momentum. Several models are evaluated and their performance is compared to that of the current detector, showing promising potential to improve on current algorithms for the ATLAS Phase-II barrel muon trigger upgrade.

Speaker: Rustem Ospanov (University of Science and Technology of China)
• Facilities and Networks: Wed AM
Conveners: Daniela Bauer (Imperial College (GB)) , David Bouvet (IN2P3/CNRS (FR))
• 89
Deploying a new realtime XRootD-v5 based monitoring framework for GridPP

To optimise the performance of distributed compute, smaller lightweight storage caches are needed which integrate with existing grid computing workflows. A good solution to provide lightweight storage caches is to use an XRootD-proxy cache. To support distributed lightweight XRootD proxy services across GridPP we have developed a centralised monitoring framework.

With the v5 release of XRootD it is possible to build a monitoring framework which collects distributed caching metadata broadcast from multiple sites. To provide the best support for these distributed caches we have built a centralised monitoring service for XRootD storage instances within GridPP. This monitoring solution is built upon experiences presented by CMS in setting up a similar service as part of their AAA system. This new framework is designed to provide remote monitoring of the behaviour, performance, and reliability of distributed XRootD services across the UK. Effort has been made to simplify deployment by remote site administrators.

The result of this work is an interactive dashboard system which enables administrators to access real-time metrics on the performance of their lightweight storage systems. This monitoring framework is intended to supplement existing functionality and availability testing metrics by providing detailed information and logging from a site perspective.

Speaker: Dr Robert Andrew Currie (The University of Edinburgh (GB))
• 90
Towards Real-World Applications of ServiceX, an Analysis Data Transformation System

One of the biggest challenges in the High-Luminosity LHC (HL-LHC) era will be the significantly increased data size to be recorded and analyzed from the collisions at the ATLAS and CMS experiments. ServiceX is a software R&D project in the Data Organization, Management and Access area of IRIS-HEP that investigates new computational models for the HL-LHC era. ServiceX is an experiment-agnostic service to enable on-demand data delivery specifically tailored for nearly-interactive vectorized analyses. It is capable of retrieving data from grid sites, transforming data on the fly, and delivering user-selected data in a variety of different formats. New features will be presented that make the service ready for public use. An ongoing effort to integrate ServiceX with a popular statistical analysis framework in ATLAS will be described, with an emphasis on a practical implementation of ServiceX in the physics analysis pipeline.

Speaker: Kyungeon Choi (University of Texas at Austin (US))
• 91
Anomaly detection in the CERN cloud infrastructure

Anomaly detection in the CERN OpenStack cloud is a challenging task due to the large scale of the computing infrastructure and, consequently, the large volume of monitoring data to analyse. The current solution to spot anomalous servers in the cloud infrastructure relies on a threshold-based alarming system carefully set by the system managers on the performance metrics of each infrastructure component. This contribution explores fully automated, unsupervised machine learning solutions in the anomaly detection field for time series metrics, by adapting both traditional and deep learning approaches. The paper describes a novel end-to-end data analytics pipeline implemented to digest the large amount of monitoring data and to expose anomalies to the system managers. The pipeline relies solely on open-source tools and frameworks, such as Spark, Apache Airflow, Kubernetes, Grafana and Elasticsearch. In addition, an approach to build annotated datasets from the CERN cloud monitoring data is reported. Finally, the performance of a number of anomaly detection algorithms is given a preliminary evaluation using these annotated datasets.
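The difference between a manually set threshold and a data-driven criterion can be sketched with a toy metric series. The example below contrasts a fixed limit with a simple robust z-score based on the median and the median absolute deviation; it is an illustration only and is not one of the algorithms evaluated in the contribution. The function names, the cut value of 3.5 and the toy CPU-load numbers are all assumptions.

```python
import statistics


def threshold_alarms(values, limit):
    """Indices where a fixed, manually chosen limit is exceeded."""
    return [i for i, v in enumerate(values) if v > limit]


def robust_zscore_alarms(values, cut=3.5):
    """Indices whose modified z-score (median/MAD based) exceeds `cut`.

    No per-metric limit is needed: the scale is estimated from the data.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []  # degenerate case: no spread to compare against
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > cut
    ]


# Toy CPU load (percent) for one server; index 7 is an unusual spike
# that still stays below a hypothetical fixed alarm limit of 90 %.
cpu_load = [20, 22, 21, 19, 23, 20, 22, 75, 21, 20]
fixed = threshold_alarms(cpu_load, limit=90)
adaptive = robust_zscore_alarms(cpu_load)
```

Here the fixed limit raises no alarm while the data-driven score flags the spike, which is the kind of behaviour that motivates moving away from hand-tuned thresholds.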

Speaker: Stiven Metaj (Politecnico di Milano (IT))
• 92
Reaching new peaks for the future of the CMS HTCondor Global Pool

The CMS experiment at CERN employs a distributed computing infrastructure to satisfy its data processing and simulation needs. The CMS Submission Infrastructure team manages a dynamic HTCondor pool, aggregating mainly Grid clusters worldwide, but also HPC, Cloud and opportunistic resources. This CMS Global Pool, which currently involves over 70 computing sites worldwide and peaks at 300k CPU cores, is capable of successfully handling the simultaneous execution of up to 150k tasks. While the present infrastructure is sufficient to harness the current computing power scales, the latest CMS estimates predict that at least a four-fold increase in the total amount of CPU will be required in order to cope with the massive data increase of the High-Luminosity LHC (HL-LHC) era, planned to start in 2027. This contribution presents the latest results of the CMS Submission Infrastructure team in exploring the scalability reach of our Global Pool, in order to preventively detect and overcome any barriers in relation to the HL-LHC goals, while maintaining high efficiency in our workload scheduling and resource utilization.

Speaker: Antonio Perez-Calero Yzquierdo (Centro de Investigaciones Energéticas Medioambientales y Tecno)
• 93
Research and Evaluation of RoCE in IHEP Data Center

As more and more large-scale scientific facilities are built, HPC requirements at IHEP keep growing. RDMA is a technology that allows servers in a network to exchange data in main memory without involving the processor, cache or operating system of either server, which provides high bandwidth and low latency. Two RDMA technologies are available: InfiniBand (IB) and a relative newcomer, RoCE (RDMA over Converged Ethernet). This paper introduces the RoCE technology, compares the performance of IB and RoCE in the IHEP data center, and evaluates the application scenarios of RoCE to support our future technology selection for HEPS. Finally, we present our future plans.

Speaker: Dr Shan Zeng (IHEP)
• Software: Wed AM
Conveners: Enrico Guiraud (EP-SFT, CERN) , Stefan Roiser (CERN)
• 94
BAT.jl -- A Julia-based tool for Bayesian inference

We present BAT.jl 2.0, the next generation of the Bayesian Analysis Toolkit. BAT.jl is a highly efficient and easy-to-use software package for Bayesian inference. Its predecessor, BAT 1.0 in C++, has been very successful over the years with a large number of citations. Our new incarnation of BAT was rewritten from scratch in Julia and we recently released the long-term stable version 2.0.

Solving inference problems in the natural sciences, in particular High Energy Physics, often requires flexibility in using multiple programming languages, differentiable programming, and parallel execution on both CPU and GPU architectures. BAT.jl enables this by drawing on the unique capabilities of the Julia programming language. It provides efficient Metropolis-Hastings sampling, Hamiltonian Monte Carlo with automatic differentiation, and nested sampling. We also provide algorithms to estimate the evidence (the integral of the posterior), necessary to compute Bayes factors, from posterior samples. BAT.jl uses a minimal set of dependencies, and new algorithms can be easily added due to the toolbox structure of the package.

BAT.jl continues to evolve; one of its new experimental features is a sampling algorithm with space partitioning. This algorithm can efficiently utilize distributed computing resources and sample posteriors with reduced burn-in overhead while dealing with multi-modal densities. We also provide the user with a set of plotting recipes to quickly visualize results.
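As a reminder of what the first of the samplers named in this abstract does, here is a minimal random-walk Metropolis-Hastings sketch in Python. It is not BAT.jl code and does not reflect the BAT.jl API; the target density, the step size and all function names are illustrative assumptions.

```python
import math
import random


def log_posterior(x):
    """Unnormalised log-density of a standard normal (illustrative target only)."""
    return -0.5 * x * x


def metropolis_hastings(log_density, n_samples, x0=0.0, step=1.0, seed=1):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x, logp = x0, log_density(x0)
    samples = []
    accepted = 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < logp_new - logp:
            x, logp = proposal, logp_new
            accepted += 1
        # A rejected proposal repeats the current point in the chain.
        samples.append(x)
    return samples, accepted / n_samples


samples, acceptance = metropolis_hastings(log_posterior, 20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

For this target the sample mean and variance should come out close to 0 and 1. Only the unnormalised density is needed, which is why the evidence has to be estimated separately, as the abstract notes.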

Speaker: Vasyl Hafych (Max-Planck-Institut fur Physik (DE))
• 95

Processing and scientific analysis of the data taken by the ATLAS experiment requires reliable information describing the event data recorded by the detector or generated in software. ATLAS event processing applications store such descriptive metadata information in the output data files along with the event information.

To better leverage the available computing resources during LHC Run 3, the ATLAS experiment has migrated its data processing and analysis software to a multi-threaded framework: AthenaMT. Therefore, in-file metadata must support concurrent event processing, especially around input file boundaries. The in-file metadata handling software was originally designed for serial event processing. It grew into a rather complex system over the many years of ATLAS operation. To migrate this system to the multi-threaded environment it was necessary to adopt several pragmatic solutions, mainly because of the shortage of available person-power to work on this project in the early phases of the AthenaMT development.

In order to simplify the migration, first the redundant parts of the code were cleaned up wherever possible. Next the infrastructure was improved by removing reliance on constructs that are problematic during multi-threaded processing. Finally, the remaining software infrastructure was redesigned for thread safety.

Speaker: Frank Berghaus (Argonne National Laboratory (US))
• 96
Software framework for the Super Charm-Tau factory detector project

The Super Charm-Tau (SCT) factory, a high-luminosity electron-positron collider for studying charmed hadrons and the tau lepton, is proposed by Budker INP. The project foresees a single collision point equipped with a universal particle detector. The Aurora software framework has been developed for the SCT detector. It is based on trusted software packages that are widely used in high energy physics, such as Gaudi, Geant4, and ROOT. At the same time, new ideas and developments are employed; in particular, the Aurora project benefits a lot from the turnkey software for future colliders (Key4HEP) initiative. This paper describes the first release of the Aurora framework and summarizes its core technologies, structure and roadmap for the near future.

• 97
Exploring the virtues of XRootD5: Declarative API

Over the years, as the backbone of numerous data management solutions used within the WLCG collaboration, the XRootD framework and protocol have become one of the most important building blocks for storage solutions in the High Energy Physics (HEP) community. The latest big milestone for the project, release 5, introduced a multitude of architectural improvements and functional enhancements, including the new client-side declarative API, which is the main focus of this study. In this contribution we give an overview of the new client API and discuss its motivation and its positive impact on overall software quality (coupling, cohesion), readability and composability.

Speaker: Michal Kamil Simon (CERN)
• 98
Building and steering binned template fits with cabinetry

The cabinetry library provides a Python-based solution for building and steering binned template fits. It tightly integrates with the pythonic High Energy Physics ecosystem, and in particular with pyhf for statistical inference. cabinetry uses a declarative approach for building statistical models, with a JSON schema describing possible configuration choices. Model building instructions can additionally be provided via custom code, which is automatically executed when applicable at key steps of the workflow. The library implements interfaces for performing maximum likelihood fitting, upper parameter limit determination, and discovery significance calculation. cabinetry also provides a range of utilities to study and disseminate fit results. These include visualizations of the fit model and data, visualizations of template histograms and fit results, ranking of nuisance parameters by their impact, a goodness-of-fit calculation, and likelihood scans. The library takes a modular approach, allowing users to include some or all of its functionality in their workflow.
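To make the term "binned template fit" concrete, the sketch below performs a maximum likelihood fit of a single signal-strength parameter to binned data using only the Python standard library. It does not use cabinetry or pyhf and is not their API; the templates, the observed counts, the parameter range and the function names are illustrative assumptions, and no nuisance parameters are included.

```python
import math


def nll(mu, signal, background, data):
    """Poisson negative log-likelihood over bins (constant terms dropped)."""
    total = 0.0
    for s, b, n in zip(signal, background, data):
        expected = mu * s + b
        total += expected - n * math.log(expected)
    return total


def fit_signal_strength(signal, background, data, lo=0.0, hi=5.0, tol=1e-6):
    """Minimise the NLL in mu with a golden-section search on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if nll(c, signal, background, data) < nll(d, signal, background, data):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return 0.5 * (a + b)


# Hypothetical templates; the data are set to signal + background in each bin.
signal = [2.0, 10.0, 4.0]
background = [50.0, 40.0, 30.0]
data = [52, 50, 34]
mu_hat = fit_signal_strength(signal, background, data)
```

Because the observed counts equal the sum of the two templates, the best-fit signal strength is close to 1 by construction. A golden-section search suffices here because the likelihood is convex in the single parameter.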

Speaker: Alexander Held (New York University (US))
• 99
CORSIKA 8 -- A novel high-performance computing tool for particle cascade Monte Carlo simulations

The CORSIKA 8 project is an international collaboration of scientists working together to deliver the most modern, flexible, robust and efficient framework for the simulation of ultra-high energy secondary particle cascades in matter. The main application is cosmic ray air shower simulation, but the framework is not limited to it. Besides a comprehensive collection of physics models and algorithms relevant for the field, interfaces to hardware acceleration (e.g. GPU) and parallelization (vectorization, multi-threading, multi-core) will also be provided. We present the status and roadmap of this project. This code will soon be available for novel explorative studies and phenomenological research, and at the same time for massive production runs for experiments.

Speaker: Ralf Ulrich (KIT - Karlsruhe Institute of Technology (DE))
• Storage: Wed AM
Conveners: Cedric Serfon (Brookhaven National Laboratory (US)) , Edoardo Martelli (CERN)
• 100
Prototype of the Russian Scientific Data Lake

The High Luminosity phase of the LHC, which aims for a ten-fold increase in the luminosity of proton-proton collisions, is expected to start operation in eight years. An unprecedented scientific data volume at the multi-exabyte scale will be delivered to particle physics experiments at CERN. This amount of data has to be stored, and the corresponding technology must ensure fast and reliable data delivery for processing by the scientific community all over the world. The present LHC computing model will not be able to provide the required infrastructure growth, even taking into account the expected hardware evolution. To address this challenge, the Data Lake R&D project was launched by the DOMA community in the fall of 2019. State-of-the-art data handling technologies are under active development, and their current status for the Russian Scientific Data Lake prototype is presented here.

Speaker: Mr Andrey Kirianov (NRC Kurchatov Institute PNPI (RU))
• 101
ESCAPE Data Lake: Next-generation management of cross-discipline Exabyte-scale scientific data

The European-funded ESCAPE project (Horizon 2020) aims to address computing challenges in the context of the European Open Science Cloud. The project targets Particle Physics and Astronomy facilities and research infrastructures, focusing on the development of solutions to handle Exabyte-scale datasets. The science projects in ESCAPE are in different phases of evolution and count a variety of specific use cases and challenges to be addressed. This contribution describes the shared-ecosystem architecture of services, the Data Lake, fulfilling the needs in terms of data organisation, management, and access of the ESCAPE community. The Pilot Data Lake consists of several storage services operated by the partner institutes and connected through reliable networks, and it adopts Rucio to orchestrate data management and organisation. The results of a 24-hour Full Dress Rehearsal are also presented, highlighting the achievements of the Data Lake model and of the ESCAPE sciences.

Speaker: Dr Riccardo Di Maria (CERN)
• 102
LHC Data Storage: Preparing for the Challenges of Run-3

The CERN IT Storage Group ensures the symbiotic development and operations of storage and data transfer services for all CERN physics data, in particular the data generated by the four LHC experiments (ALICE, ATLAS, CMS and LHCb).

In order to accomplish the objectives of the next run of the LHC (Run-3), the Storage Group has undertaken a thorough analysis of the experiments’ requirements, matched them to the appropriate storage and data transfer solutions, and carried out a rigorous programme of testing to identify and solve any issues before the start of Run-3.

In this paper, we present the main challenges posed by each of the four LHC experiments. We describe their workflows, in particular how they communicate with and use the key components provided by the Storage Group: the EOS disk storage system; its archival back-end, the CERN Tape Archive (CTA); and the File Transfer Service (FTS). We also describe the validation and commissioning tests that have been undertaken and the challenges overcome: the ATLAS stress tests to push their DAQ system to its limits; the CMS migration from PhEDEx to Rucio, followed by large-scale tests between EOS and CTA with the new FTS “archive monitoring” feature; the LHCb Tier-0 to Tier-1 staging tests and XRootD Third Party Copy (TPC) validation; and the erasure coding performance in ALICE.

Speaker: Dr Maria Arsuaga Rios (CERN)
• 103
CERN Tape Archive: a distributed, reliable and scalable scheduling system

The CERN Tape Archive (CTA) provides a tape backend to disk systems and, in conjunction with EOS, is managing the data of the LHC experiments at CERN.

Magnetic tape storage offers the lowest cost per unit volume today, followed by hard disks and flash. In addition, current tape drives deliver a solid bandwidth (typically 360 MB/s per device), but at the cost of high latencies, both for mounting a tape in the drive and for positioning when accessing non-adjacent files. As a consequence, the transfer scheduler should queue transfer requests until the queued volume warrants a tape mount. In spite of these transfer latencies, user-interactive operations should have a low latency.
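The queueing policy described in this paragraph can be sketched as follows. This is not the CTA scheduler: the class name, the two thresholds (a minimum queued volume and a maximum waiting time) and the tape labels are hypothetical, chosen only to illustrate the idea of delaying a mount until enough work has accumulated.

```python
from collections import defaultdict


class MountScheduler:
    """Toy policy: mount a tape once enough data is queued or a request is old."""

    def __init__(self, min_bytes, max_age_s):
        self.min_bytes = min_bytes    # queued volume that justifies a mount
        self.max_age_s = max_age_s    # upper bound on how long a request waits
        self.queues = defaultdict(list)  # tape id -> [(arrival time, size)]

    def submit(self, tape, size_bytes, now):
        self.queues[tape].append((now, size_bytes))

    def tapes_to_mount(self, now):
        ready = []
        for tape, requests in self.queues.items():
            if not requests:
                continue
            queued = sum(size for _, size in requests)
            oldest = min(arrival for arrival, _ in requests)
            if queued >= self.min_bytes or now - oldest >= self.max_age_s:
                ready.append(tape)
        return sorted(ready)

    def mount(self, tape):
        """Hand the whole queue for this tape to a drive and clear it."""
        return self.queues.pop(tape, [])


scheduler = MountScheduler(min_bytes=100 * 10**9, max_age_s=3600)
scheduler.submit("T1", 60 * 10**9, now=0)
scheduler.submit("T1", 50 * 10**9, now=10)
scheduler.submit("T2", 1 * 10**9, now=20)
ready_now = scheduler.tapes_to_mount(now=30)      # only T1 has enough volume
ready_later = scheduler.tapes_to_mount(now=3700)  # T2 qualifies by age
```

At the 360 MB/s quoted above, a 100 GB batch streams in roughly 280 s, so a volume threshold of that order keeps the drive busy for much longer than a mount and positioning cycle is likely to take. The age limit bounds the wait for requests on rarely used tapes.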

The scheduling system for CTA was built from the experience gained with CASTOR. Its implementation ensures reliability and predictable performance, while simplifying development and deployment. As CTA is expected to be used for a long time, lock-in to vendors or technologies was minimized.

Finally, quality assurance systems were put in place to validate reliability and performance while allowing fast and safe development turnaround.

Speaker: Eric Cano (CERN)
• 104
Preparing for HL-LHC: Increasing the LHCb software publication rate to CVMFS by an order of magnitude

In the HEP community, software plays a central role in the operation of experiments’ facilities and for reconstruction jobs, with CVMFS being the service enabling the distribution of software at scale. In view of the High Luminosity LHC, CVMFS developers investigated how to improve the publication workflow to support the most demanding use cases. This paper reports on recent CVMFS developments and infrastructural updates that enable faster publication into existing repositories. A new CVMFS component, the CVMFS Gateway, allows for concurrent transactions and the use of multiple publishers, increasing the overall publication rate on a single repository. Also, the repository data has been migrated to Ceph-based S3 object storage, which brings a significant performance improvement over the previously used Cinder volumes. We demonstrate how recent improvements allow for faster publication of software releases in CVMFS repositories by focusing on the LHCb nightly builds use case, which is currently by far the most demanding one for the CVMFS infrastructure at CERN. The publication of nightly builds is characterized by a high churn rate, needs regular garbage collection, and requires the ability to ingest a huge number of software files over a limited period of time.

Speaker: Enrico Bocchi (CERN)
• 105
Addressing a billion-entries multi-petabyte distributed filesystem backup problem with cback: from files to objects

CERNBox is the cloud collaboration hub at CERN. The service has more than 37,000 user accounts. The backup of user and project data is critical for the service. The underlying storage system hosts over a billion files which amount to 12PB of storage distributed over several hundred disks with a two-replica RAIN layout. Performing a backup operation over this vast amount of data is a non-trivial task.

The original CERNBox backup system (an in-house event-driven file-level system) has been reconsidered and replaced by a new distributed and scalable backup infrastructure based on the open source tool restic. The new system, codenamed cback, provides features needed in the HEP community to guarantee data safety and smooth operation for the system administrators. Daily snapshot-based backups of all our user and project areas, along with automatic verification and restores, are possible with this new development.

The backup data is also de-duplicated in blocks and stored as objects in a disk-based S3 cluster in another geographical location on the CERN campus, reducing storage costs and protecting critical data from major catastrophic events. We report on the design and operational experience of running the system and future improvement possibilities.
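The step "from files to objects" with block-level de-duplication can be illustrated with a small content-addressed store. This is a sketch under simplifying assumptions, not cback or restic code: blocks have a fixed, unrealistically small size, the object store is an in-memory dictionary standing in for an S3 bucket, and all names are hypothetical.

```python
import hashlib


class BlockStore:
    """Content-addressed store: each distinct block is kept exactly once."""

    def __init__(self):
        self.objects = {}  # SHA-256 hex digest -> block bytes

    def put(self, block):
        key = hashlib.sha256(block).hexdigest()
        # A block that is already present is not stored a second time.
        self.objects.setdefault(key, block)
        return key


def backup(data, store, block_size=4):
    """Split data into fixed-size blocks and return the ordered block keys."""
    return [store.put(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def restore(keys, store):
    """Rebuild the original bytes from a snapshot's list of block keys."""
    return b"".join(store.objects[key] for key in keys)


store = BlockStore()
snapshot_1 = backup(b"AAAABBBBCCCC", store)
snapshot_2 = backup(b"AAAABBBBDDDD", store)  # only the last block is new
```

The two snapshots reference six blocks in total, but only four distinct objects are stored, and either snapshot can be restored on its own. A production tool may choose block boundaries differently from the fixed-size split assumed here.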

Speaker: Roberto Valverde Cameselle (CERN)
• Weds PM Plenaries: Plenaries
Conveners: James Catmore (University of Oslo (NO)) , Oxana Smirnova (Lund University (SE))
• 106
An Error Analysis Toolkit for Binned Counting Experiments

We introduce the MINERvA Analysis Toolkit (MAT), a utility for centralizing the handling of systematic uncertainties in HEP analyses. The fundamental utilities of the toolkit are the MnvHnD, a powerful histogram container class, and the systematic Universe classes, which provide a modular implementation of the many-universe error analysis approach. These products can be used stand-alone or as part of a complete error analysis prescription. They support the propagation of systematic uncertainty through all stages of analysis, and provide flexibility for an arbitrary level of user customization. This extensible solution to error analysis enables the standardization of systematic uncertainty definitions across an experiment and a transparent user interface to lower the barrier to entry for new analyzers.
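In a many-universe analysis, the histogram is refilled once per systematic variation ("universe") and the spread of the results gives the uncertainty. The sketch below computes a bin-to-bin covariance matrix from such universes. It is not MAT code: the function name is hypothetical, and taking the spread about the central value and dividing by the number of universes is an assumed convention that a given toolkit may define differently.

```python
import math


def covariance_from_universes(central, universes):
    """Covariance of bin contents, measured about the central-value histogram."""
    n_bins = len(central)
    cov = [[0.0] * n_bins for _ in range(n_bins)]
    for hist in universes:
        for i in range(n_bins):
            for j in range(n_bins):
                cov[i][j] += (hist[i] - central[i]) * (hist[j] - central[j])
    # Assumed normalisation: divide by the number of universes.
    return [[value / len(universes) for value in row] for row in cov]


# Two-bin histogram with two universes, e.g. a parameter shifted up and down.
central = [100.0, 80.0]
universes = [[110.0, 84.0], [90.0, 76.0]]
cov = covariance_from_universes(central, universes)
errors = [math.sqrt(cov[i][i]) for i in range(len(central))]
```

Here the diagonal gives per-bin uncertainties of 10 and 4 events, and the off-diagonal element of 40 shows that the two bins move together under this variation.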

Speaker: Dr Ben Messerly (University of Minnesota)
• 107
Convolutional LSTM models to estimate network traffic

Network utilisation efficiency can, at least in principle, often be improved by dynamically re-configuring routing policies to better distribute on-going large data transfers. Unfortunately, the information necessary to decide on an appropriate reconfiguration (details of on-going and upcoming data transfers such as their source and destination and, most importantly, their volume and duration) is usually lacking. Fortunately, the increased use of scheduled transfer services, such as FTS, makes it possible to collect the necessary information. However, the mere detection and characterisation of larger transfers is not sufficient to predict with confidence the likelihood that a network link will become overloaded. In this paper we present the use of LSTM-based models (CNN-LSTM and Conv-LSTM) to effectively estimate future network traffic and so provide a solid basis for formulating a sensible network configuration plan.

Speaker: Joanna Waczynska (Wroclaw University of Science and Technology (PL))
• 4:00 PM
Break
• 108
Design and engineering of a simplified workflow execution for the MG5aMC event generator on GPUs and vector CPUs

Physics event generators are essential components of the data analysis software chain of high energy physics experiments, and important consumers of their CPU resources. Improving the software performance of these packages on modern hardware architectures, such as those deployed at HPC centers, is essential in view of the upcoming HL-LHC physics programme. In this contribution, we describe an ongoing activity to reengineer the Madgraph5_aMC@NLO physics event generator, primarily to port it and allow its efficient execution on GPUs, but also to modernize it and optimize its performance on traditional CPUs. In our presentation at the conference, we will describe the motivation, engineering process and software architecture design of our developments, as well as some of the challenges and future directions for this project. We also plan to present the status and results of our developments at the time of the presentation, including detailed software performance metrics.

Speaker: Andrea Valassi (CERN)
• 109
Accelerating IceCube's Photon Propagation Code with CUDA

The IceCube Neutrino Observatory is a cubic-kilometer neutrino detector located at the geographic South Pole, designed to detect high-energy astrophysical neutrinos. To thoroughly understand the detected neutrinos and their properties, the detector response to signal and background has to be modeled using Monte Carlo techniques. An integral part of these studies is the optical properties of the ice the observatory is built into. The simulated propagation of individual photons from particles produced by neutrino interactions in the ice can be greatly accelerated using graphics processing units (GPUs). In this paper, we (a collaboration between NVIDIA and IceCube) reduced the propagation time per photon by a factor of 3. We achieved this by porting the OpenCL parts of the program to CUDA and optimizing the performance. This involved careful analysis and multiple changes to the algorithm. We also ported the code to NVIDIA OptiX to handle the collision detection. The hand-tuned CUDA algorithm turned out to be faster than OptiX. It exploits the detector geometry and the fact that only a small fraction of photons ever travel close to one of the detectors.

Speaker: Benedikt Riedel (University of Wisconsin-Madison)
• 5:20 PM
Break
• Accelerators: Wed PM
Conveners: Dorothea Vom Bruch (Aix Marseille Univ, CNRS/IN2P3, CPPM, Marseille, France) , Stewart Martin-Haugh (Science and Technology Facilities Council STFC (GB))
• 110
Integration of JUNO simulation framework with Opticks: GPU accelerated optical propagation via NVIDIA OptiX

Opticks is an open source project that accelerates optical photon simulation by integrating NVIDIA GPU ray tracing, accessed via NVIDIA OptiX, with Geant4 toolkit based simulations. A single NVIDIA Turing architecture GPU has been measured to provide optical photon simulation speedup factors exceeding 1500 times single-threaded Geant4 with a full JUNO analytic GPU geometry automatically translated from the Geant4 geometry. Optical physics processes of scattering, absorption, scintillator reemission and boundary processes are implemented within CUDA OptiX programs based on the Geant4 implementations. Wavelength-dependent material and surface properties as well as inverse cumulative distribution functions for reemission are interleaved into GPU textures, providing fast interpolated property lookup or wavelength generation. Major recent developments are the integration of Opticks with the JUNO simulation framework using the minimal G4Opticks interface class, and the implementation of collection efficiency hit culling on GPU, which enables only collected hits to be copied to CPU, substantially reducing both the CPU memory needed for photon hits and copying overheads. Progress with the migration of Opticks to the all-new NVIDIA OptiX 7 API is also described.
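The idea of generating reemission wavelengths from a tabulated inverse cumulative distribution function can be sketched on the CPU as follows. This is not Opticks code: the spectrum, the table size and the function names are illustrative assumptions, the spectrum is treated as a histogram with strictly positive bin weights, and the linear interpolation in `lookup` only mimics what a hardware texture fetch would do on the GPU.

```python
import bisect
import random


def build_inverse_cdf(edges, weights, n_entries=256):
    """Tabulate wavelength as a function of cumulative probability u in [0, 1]."""
    cdf = [0.0]
    for k, w in enumerate(weights):
        cdf.append(cdf[-1] + w * (edges[k + 1] - edges[k]))
    cdf = [c / cdf[-1] for c in cdf]
    table = []
    for n in range(n_entries):
        u = n / (n_entries - 1)
        # Find the histogram bin whose CDF range contains u, then invert linearly.
        k = min(max(bisect.bisect_right(cdf, u), 1), len(cdf) - 1)
        frac = (u - cdf[k - 1]) / (cdf[k] - cdf[k - 1])
        table.append(edges[k - 1] + frac * (edges[k] - edges[k - 1]))
    return table


def lookup(table, u):
    """Linearly interpolated table lookup for u in [0, 1]."""
    pos = u * (len(table) - 1)
    i = min(int(pos), len(table) - 2)
    return table[i] + (pos - i) * (table[i + 1] - table[i])


# Hypothetical emission spectrum: bin edges in nm and relative intensity per nm.
edges = [400.0, 420.0, 440.0, 460.0]
weights = [1.0, 2.0, 1.0]
table = build_inverse_cdf(edges, weights)

rng = random.Random(7)
samples = [lookup(table, rng.random()) for _ in range(10000)]
middle_fraction = sum(420.0 <= s < 440.0 for s in samples) / len(samples)
```

Each sample costs one uniform random number and one interpolated lookup, with no search at generation time. In this example about half of the sampled wavelengths fall in the middle bin, as its weight implies.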

Speaker: Simon Blyth (IHEP, CAS)
• 111
GPU simulation with Opticks: The future of optical simulations for LZ

The LZ collaboration aims to directly detect dark matter by using a liquid xenon Time Projection Chamber (TPC). In order to probe the dark matter signal, observed signals are compared with simulations that model the detector response. The most computationally expensive aspect of these simulations is the propagation of photons in the detector’s sensitive volume. For this reason, we propose to offload photon propagation modelling to the Graphics Processing Unit (GPU), by integrating Opticks into the LZ simulations workflow. Opticks is a system which maps Geant4 geometry and photon generation steps to NVIDIA's OptiX GPU raytracing framework. This paradigm shift could simultaneously achieve a massive speedup and an increase in accuracy for LZ simulations. By using the technique of containerization through Shifter, we will produce a portable system to harness the NERSC supercomputing facilities, including the forthcoming Perlmutter supercomputer, and enable the GPU processing to handle different detector configurations. Prior experience with using Opticks to simulate JUNO indicates the potential for speedup factors over 1000× for LZ, and by extension other experiments requiring photon propagation simulations.

Speaker: Oisin Creaner (Lawrence Berkeley National Laboratory)
• 112
MadFlow: towards the automation of Monte Carlo simulation on GPU for particle physics processes

In these proceedings we present MadFlow, a new framework for the automation of Monte Carlo (MC) simulation on graphics processing units (GPU) for particle physics processes. In order to automate MC simulation for a generic number of processes, we design a program which allows the user to simulate custom processes through the MG5_aMC@NLO framework. The pipeline includes a first stage where the analytic expressions for matrix elements and phase space are generated and exported in a GPU-like format. The simulation is then performed using the VegasFlow and PDFFlow libraries, which automatically deploy the full simulation on systems with different hardware acceleration capabilities, such as multi-threaded CPU, single-GPU and multi-GPU setups. We show some preliminary results for leading-order simulations on different hardware configurations.

Speaker: Dr Juan M. Cruz Martínez (University of Milan)
• 113
Novel features and GPU performance analysis for EM particle transport in the Celeritas code

Celeritas is a new computational transport code designed for high-performance simulation of high-energy physics detectors. This work describes some of its current capabilities and the design choices that enable the rapid development of efficient on-device physics. The abstractions that underpin the code design facilitate low-level performance tweaks that require no changes to the higher-level physics code. We evaluate a set of independent changes that together yield an almost 40% speedup over the original GPU code for a net performance increase of 220× for a single GPU over a single CPU running 8.4M tracks on a small demonstration physics app.

Speaker: Seth Johnson (Oak Ridge National Laboratory)
• 114
Towards a cross-platform performance portability math kernel library in SYCL

The increasing number of high-performance computing centers around the globe is providing physicists and other researchers access to heterogeneous systems -- comprising multiple central processing units and graphics processing units per node -- with various platforms. However, it is more often than not the case that domain scientists have limited resources such that writing multiple implementations of their codes to target the different platforms is unfeasible. To help address this, a number of portability layers are being developed that aim to allow programmers to achieve performant, portable codes; for example, Intel(R) oneAPI, which is based on the SYCL programming model. Nevertheless, portable application programming interfaces often lack some features and tools that are manifest in a platform-specific API. High-energy physicists in particular rely heavily on large sets of random numbers in nearly their entire workflow, from event generation to analysis. In this paper, we detail the implementation of a cuRAND backend into Intel's oneMKL, permitting random number generation within oneAPI applications on NVIDIA hardware using libraries optimised for these devices. By utilizing existing optimisations, we demonstrate the ability to achieve nearly native performance in cross-platform applications.

Speaker: Vincent Pascuzzi (Lawrence Berkeley National Lab. (US))
• 115
PandAna: A Python Analysis Framework for Scalable High Performance Computing in High Energy Physics

Modern experiments in high energy physics analyze millions of events recorded in particle detectors to select the events of interest and make measurements of physics parameters. These data can often be stored as tabular data in files with detector information and reconstructed quantities. Current techniques for event selection in these files lack the scalability needed for high performance computing environments. We describe our work to develop a high energy physics analysis framework suitable for high performance computing. This new framework utilizes modern tools for reading files and implicit data parallelism. Framework users analyze tabular data using standard, easy-to-use data analysis techniques in Python while the framework handles the file manipulations and parallelism without the user needing advanced experience in parallel programming. In future versions, we hope to provide a framework that can be utilized on a personal computer or a high performance computing cluster with little change to the user code.

Speaker: Micah Groh (Fermi National Accelerator Laboratory)
• Artificial Intelligence: Wed PM
Conveners: Agnieszka Dziurda (Polish Academy of Sciences (PL)) , Joosep Pata (National Institute of Chemical Physics and Biophysics (EE))
• 116
Progress in developing a hybrid deep learning algorithm for identifying and locating primary vertices

The locations of proton-proton collision points in LHC experiments are called primary vertices (PVs). Preliminary results of a hybrid deep learning algorithm for identifying and locating these, targeting the Run 3 incarnation of LHCb, have been described at conferences in 2019 and 2020. In the past year we have made significant progress in a variety of related areas. Using two newer Kernel Density Estimators (KDEs) as input feature sets improves the fidelity of the models, as does using full LHCb simulation rather than the “toy Monte Carlo” originally (and still) used to develop models. We have also built a deep learning model to calculate the KDEs from track information. Connecting a tracks-to-KDE model to a KDE-to-hists model used to find PVs provides a proof-of-concept that a single deep learning model can use track information to find PVs with high efficiency and high fidelity. We have studied a variety of models systematically to understand how variations in their architectures affect performance. While the studies reported here are specific to the LHCb geometry and operating conditions, the results suggest that the same approach could be used by the ATLAS and CMS experiments.
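The two stages named in this abstract, tracks to KDE and KDE to vertex candidates, can be pictured with a deliberately simple stand-in. The sketch below is not the authors' method: it uses a plain one-dimensional Gaussian kernel density along the beam axis instead of their KDEs, and a local-maximum search instead of a neural network. The track positions, bandwidth, binning, threshold and units are all hypothetical.

```python
import math


def kde_histogram(track_z, z_min, z_max, n_bins, bandwidth):
    """Evaluate a sum of Gaussian kernels at the centre of each z bin."""
    width = (z_max - z_min) / n_bins
    centres = [z_min + (i + 0.5) * width for i in range(n_bins)]
    density = [
        sum(math.exp(-0.5 * ((c - z) / bandwidth) ** 2) for z in track_z)
        for c in centres
    ]
    return centres, density


def find_peaks(centres, density, threshold):
    """Return bin centres that are local maxima above the threshold."""
    peaks = []
    for i in range(1, len(density) - 1):
        if (density[i] >= threshold
                and density[i] > density[i - 1]
                and density[i] >= density[i + 1]):
            peaks.append(centres[i])
    return peaks


# Hypothetical z positions of tracks at closest approach to the beam line,
# clustered around two collision points.
track_z = [-20.1, -19.9, -20.0, -20.2, 35.0, 35.1, 34.9, 35.2, 35.3]
centres, density = kde_histogram(track_z, -100.0, 100.0, 400, 0.5)
vertices = find_peaks(centres, density, threshold=2.0)
```

With these inputs the search returns two candidates, one near z = -20 and one near z = 35. In the approach described above, a learned model takes the place of the hand-written peak search, and in the newest work of the kernel density calculation as well.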

Speaker: Simon Akar (University of Cincinnati (US))
• 117
Graph Neural Network for Object Reconstruction in Liquid Argon Time Projection Chambers

This paper presents a graph neural network (GNN) technique for low-level reconstruction of neutrino interactions in a Liquid Argon Time Projection Chamber (LArTPC). GNNs are still a relatively novel technique, and have shown great promise for similar reconstruction tasks at the LHC. In this paper, a multihead attention message passing network is used to classify the relationship between detector hits by labelling graph edges, determining whether hits were produced by the same underlying particle, and if so, the particle type. The trained model is 84% accurate overall, and performs best on the EM shower and muon track classes. The model’s strengths and weaknesses are discussed, and plans for developing this technique further are summarised.

Speaker: Jeremy Edmund Hewes (University of Cincinnati (US))
• 118
Event vertex reconstruction with deep neural networks for the DarkSide-20k experiment

While deep learning techniques are becoming increasingly popular in high-energy and, more recently, neutrino experiments, they are less confidently used in direct dark matter searches based on dual-phase noble gas TPCs optimized for low-energy signals from particle interactions.
In the present study, the application of modern deep learning methods to event vertex reconstruction is demonstrated with the example of the 50-tonne liquid argon DarkSide-20k TPC with almost 10 thousand photosensors.
The developed methods successfully reconstruct the event position to within sub-cm precision and are applicable to any dual-phase argon or xenon TPC of arbitrary size with any sensor shape and array pattern.

Speaker: Victor Goicoechea Casanueva (University of Hawai'i at Manoa (US))
• 119
Evolutionary Algorithms for Tracking Algorithm Parameter Optimization

The reconstruction of charged particle trajectories, known as tracking, is one of the most complex and CPU-consuming parts of event processing in high energy particle physics experiments. The most widely used and best performing tracking algorithms require significant geometry-specific tuning of the algorithm parameters to achieve best results. In this paper, we demonstrate the usage of machine learning techniques, particularly evolutionary algorithms, to find high performing configurations for the first step of tracking, called track seeding. We use a track seeding algorithm from the software framework A Common Tracking Software (ACTS). ACTS aims to provide an experiment-independent and framework-independent tracking software designed for modern computing architectures. We show that our optimization algorithms find highly performing configurations in ACTS without hand-tuning. These techniques can be applied to other reconstruction tasks, improving performance and reducing the need for laborious hand-tuning of parameters.
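
The selection-and-mutation loop behind an evolutionary parameter search can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' optimizer and not ACTS code: the two parameters, their bounds, and `toy_score` are hypothetical stand-ins for a real seeding figure of merit.

```python
import random


def evolve(score, bounds, pop_size=20, n_parents=5, generations=40, seed=0):
    """Maximise score(params) with a simple elitist (mu + lambda) strategy."""
    rng = random.Random(seed)
    # Start from configurations drawn uniformly inside the allowed ranges.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the best-scoring configurations as parents.
        parents = sorted(population, key=score, reverse=True)[:n_parents]
        children = []
        while len(children) < pop_size - n_parents:
            parent = rng.choice(parents)
            # Mutation: Gaussian jitter (5% of each range), clipped to the bounds.
            child = [min(hi, max(lo, x + rng.gauss(0.0, 0.05 * (hi - lo))))
                     for x, (lo, hi) in zip(parent, bounds)]
            children.append(child)
        # Parents survive unchanged, so the best score never decreases.
        population = parents + children
    return max(population, key=score)


def toy_score(params):
    # Hypothetical figure of merit, peaked at (1.0, 50.0); a real study would
    # run the seeding algorithm and score its efficiency and fake rate.
    param_a, param_b = params
    return -((param_a - 1.0) ** 2) - ((param_b - 50.0) / 50.0) ** 2


BOUNDS = [(0.0, 5.0), (0.0, 100.0)]
best = evolve(toy_score, BOUNDS)
print("best configuration:", best, "score:", toy_score(best))
```

In practice each call to the score function is expensive, so population size and generation count are the main cost knobs.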

Speaker: Peter Chatain (Stanford)
• 120
AI Enabled Data Quality Monitoring with Hydra

Data quality monitoring is critical to all experiments, as it affects the quality of any physics results. Traditionally, this is done through an alarm system, which detects low-level faults, leaving higher-level monitoring to human crews. Artificial Intelligence is beginning to find its way into scientific applications, but comes with difficulties, as it relies on new skill sets in data science, obtained either through education or acquisition. This paper will discuss the development and deployment of the Hydra monitoring system in production at GlueX. It will show how "off-the-shelf" technologies can be rapidly developed, as well as discuss what sociological hurdles must be overcome to successfully deploy such a system. Early results from production running of Hydra will also be shared as well as a future outlook for development of Hydra.

Speaker: Thomas Britton (JLab)
• Facilities and Networks: Wed PM
Conveners: Alessandra Forti (University of Manchester (GB)) , Dr David Crooks (UKRI STFC)
• 121
Updates on usage of the Czech national HPC center

The distributed computing of the ATLAS experiment at the LHC has been using computing resources of the Czech national HPC center IT4Innovations for several years. The submission system is based on ARC-CEs installed at the Czech LHC Tier2 site (praguelcg2). Recent improvements of this system will be discussed here. First, the ARC-CE was migrated from version 5 to 6, which improves reliability and scalability. The sshfs connection between praguelcg2 and IT4Innovations was a bottleneck of this system, but this improved with a new version and settings. Containerisation using Singularity allows customisation of the environment without the need to request exceptions from HPC management, and reduces the amount of data on the shared storage. The system will need further modifications to improve CPU efficiency when running on worker nodes with a very high number of cores. The IT4Innovations HPCs provide a significant contribution to the computing done in the Czech Republic for the ATLAS experiment.

Speaker: Michal Svatos (Czech Academy of Sciences (CZ))
• 122
Exploitation of the MareNostrum 4 HPC using ARC-CE

HPC resources will help meet the future challenges of HL-LHC in terms of CPU requirements. The Spanish HPC centers have been used recently by implementing all the necessary edge services to integrate the resources into the LHC experiments' workflow management system. Since it is not always possible to install the edge services on HPC premises, we opted to set up a dedicated ARC-CE and interact with the HPC login and transfer nodes using ssh commands. In the ATLAS experiment, the repository that includes a partial copy of the experiment software in CVMFS is packaged into a Singularity container image to overcome network isolation for HPC worker nodes and reduce software requirements. This article shows the Spanish contribution to the simulation of experiments after the agreement between the Spanish Ministry of Science and the Barcelona Supercomputing Center (BSC), the center that operates MareNostrum 4. Finally, we discuss some challenges in taking advantage of the next generation of HPC machines, with heterogeneous architectures combining CPU and GPU.

Speaker: Andreu Pacheco Pages (Institut de Física d'Altes Energies - Barcelona (ES))
• 123
Exploitation of HPC Resources for data intensive sciences

The Large Hadron Collider (LHC) will enter a new phase beginning in 2027 with the upgrade to the High Luminosity LHC (HL-LHC). The increase in the number of simultaneous collisions coupled with a more complex structure of a single event will result in each LHC experiment collecting, storing, and processing exabytes of data per year. The amount of generated and/or collected data greatly outweighs the expected available computing resources. In this paper, we discuss efficient usage of HPC resources as a prerequisite for data-intensive science at exascale. We discuss the work performed within the contexts of three EU-funded projects, DEEP-EST, EGI-ACE and CoE RAISE, with primary focus on three topics that emphasize the areas of work required to run production LHC workloads at the scale of HPC facilities. First, we discuss the experience of porting CMS Hadron and Electromagnetic calorimeters to utilize Nvidia GPUs; second, we look at the tools and their adoption in order to perform benchmarking of a variety of resources available at HPC centers. Finally, we touch on one of the most important aspects of the future of HEP - how to handle the flow of PBs of data to and from computing facilities, be it clouds or HPCs, for exascale data processing in a flexible, scalable and performant manner. These investigations are a key contribution to technical work within the HPC collaboration among CERN, SKA, GEANT and PRACE.

Speaker: David Southwick (CERN)
• 124
Finalizing Construction of a New Data Center at BNL

Computational science, data management and analysis have been key factors in the success of Brookhaven National Laboratory's scientific programs at the Relativistic Heavy Ion Collider (RHIC), the National Synchrotron Light Source (NSLS-II), the Center for Functional Nanomaterials (CFN), and in biological, atmospheric, and energy systems science, Lattice Quantum Chromodynamics (LQCD) and Materials Science, as well as our participation in international research collaborations, such as the ATLAS Experiment at Europe's Large Hadron Collider (LHC) at CERN (Switzerland) and the Belle II Experiment at KEK (Japan). The construction of a new data center is an acknowledgement of the increasing demand for computing and storage services at BNL in the near term and will enable the Lab to address the needs of the future experiments at the High-Luminosity LHC at CERN and the Electron-Ion Collider (EIC) at BNL in the long term. The Computing Facility Revitalization (CFR) project is aimed at repurposing the former National Synchrotron Light Source (NSLS-I) building as the new data center for BNL. The new data center is to become available in early 2021 for ATLAS compute, disk storage and tape storage equipment, and later that year for all other collaborations supported by the Scientific Data and Computing Center (SDCC), including the STAR, PHENIX and sPHENIX experiments at the RHIC collider at BNL, the Belle II Experiment at KEK (Japan), and the Computational Science Initiative at BNL. Migration of the majority of IT load and services from the existing data center to the new data center is expected to begin with the central networking systems and the first BNL ATLAS Tier-1 Site tape robot in 2021Q3, and it is expected to continue throughout FY2021-2024. This presentation will highlight the key mechanical, electrical, and plumbing (MEP) components of the new data center.
Also, we will describe plans to migrate a subset of IT equipment between the old and the new data centers in CY2021, the period of operations with both data centers starting from 2021Q3, plans to perform the gradual IT equipment replacement in CY2021-2024, and show the expected state of occupancy and infrastructure utilization for both data centers up to FY2026.

Speaker: Mr Alexandr Zaytsev (Brookhaven National Laboratory (US))
• 125
Designing the RAL Tier-1 Network for HL-LHC and Future data lakes

The Rutherford Appleton Laboratory (RAL) runs the UK Tier-1 which supports all four LHC experiments, as well as a growing number of others in HEP, Astronomy and Space Science. In September 2020, RAL was provided with funds to upgrade its network. The Tier-1 not only wants to meet the demands of LHC Run 3, it also wants to ensure that it can take an active role in data lake development and the network data challenges in the preparation for HL-LHC. It was therefore decided to completely rebuild the Tier-1 network with a Spine / Leaf architecture. This paper describes the network requirements and design decisions that went into building the new Tier-1 network. It also includes a cost analysis, to understand whether the ever-increasing network requirements are deliverable in a continued flat cash environment and what limitations or opportunities this may place on future data lakes.

Speaker: Alastair Dewhurst (Science and Technology Facilities Council STFC (GB))
• Software: Wed PM
Conveners: Luisa Arrabito (LUPM IN2P3/CNRS) , Teng Jian Khoo (Humboldt University of Berlin (DE))
• 126
Grid-based minimization at scale: Feldman-Cousins corrections for light sterile neutrino search

High Energy Physics (HEP) experiments generally employ sophisticated statistical methods to present results in searches for new physics. In the problem of searching for sterile neutrinos, likelihood ratio tests are applied to short-baseline neutrino oscillation experiments to construct confidence intervals for the parameters of interest. A test statistic of the form $\Delta \chi^2$ is often used to form the confidence intervals; however, this approach can lead to statistical inaccuracies due to the small signal rate in the region of interest. In this paper, we present a computational model for the computationally expensive Feldman-Cousins corrections to construct a statistically accurate confidence interval for neutrino oscillation analysis. The program performs a grid-based minimization over oscillation parameters and is written in C++. Our algorithms make use of vectorization through Eigen3, yielding a single-core speed-up of 350 compared to the original implementation, and achieve MPI data parallelism by employing DIY. We demonstrate the strong scaling of the application at High-Performance Computing (HPC) sites. We utilize HDF5 along with HighFive to write the results of the calculation to file.
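
The structure of a Feldman-Cousins correction with grid-based minimization can be sketched in a few lines. This is a toy under stated assumptions, not the authors' C++ program: the energy bins, baseline, event rates and grid values are hypothetical, a simple Pearson $\chi^2$ stands in for the experiment's likelihood, and no vectorization or MPI is shown. Only the two-flavour appearance probability $\sin^2 2\theta \, \sin^2(1.27\, \Delta m^2 L / E)$ is the standard formula.

```python
import math
import random

ENERGIES = [0.3, 0.5, 0.7, 0.9, 1.1]  # GeV, hypothetical bin centres
BASELINE = 0.5                        # km, hypothetical
BACKGROUND = 20.0                     # hypothetical background events per bin
SIGNAL_SCALE = 100.0                  # hypothetical events per bin at P = 1


def prediction(sin2_2theta, dm2):
    """Expected counts per bin for a two-flavour appearance signal."""
    return [BACKGROUND + SIGNAL_SCALE * sin2_2theta
            * math.sin(1.27 * dm2 * BASELINE / e) ** 2 for e in ENERGIES]


# Grid over (sin^2 2theta, delta m^2 in eV^2); predictions are cached once.
GRID = [(s, m) for s in (0.0, 0.01, 0.03, 0.1, 0.3)
        for m in (0.1, 0.3, 1.0, 3.0, 10.0)]
PREDICTIONS = {point: prediction(*point) for point in GRID}


def chi2(data, pred):
    return sum((d - p) ** 2 / p for d, p in zip(data, pred))


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def delta_chi2(data, point):
    """chi2 at `point` minus the minimum chi2 found anywhere on the grid."""
    best = min(chi2(data, pred) for pred in PREDICTIONS.values())
    return chi2(data, PREDICTIONS[point]) - best


def critical_value(point, n_toys=500, cl=0.90, seed=1):
    """Critical delta-chi2 at one grid point, from pseudo-experiments."""
    rng = random.Random(seed)
    values = sorted(
        delta_chi2([poisson(rng, mu) for mu in PREDICTIONS[point]], point)
        for _ in range(n_toys))
    return values[min(n_toys - 1, int(cl * n_toys))]


print("critical delta-chi2 at (0.03, 1.0):", critical_value((0.03, 1.0), n_toys=200))
```

A grid point is then accepted at the chosen confidence level when the observed $\Delta \chi^2$ at that point is below its own critical value. The cost is one grid minimization per pseudo-experiment per grid point, which is why the calculation parallelizes naturally over grid points.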

Speaker: Marianette Wospakrik (Fermi National Accelerator Laboratory)
• 127
Laurelin: Java-native ROOT I/O for Apache Spark

Apache Spark is one of the predominant frameworks in the big data space, providing a fully-functional query processing engine, vendor support for hardware accelerators, and performant integrations with scientific computing libraries. One difficulty in adapting conventional big data frameworks to HEP workflows is the lack of support for the ROOT file format in these frameworks. Laurelin implements ROOT I/O with a pure Java library, with no bindings to the C++ ROOT implementation, and is readily installable via standard Java packaging tools. It provides a performant interface enabling Spark to read (and soon write) ROOT TTrees, enabling users to process these data without a pre-processing phase converting to an intermediate format.

Speaker: Andrew Malone Melo (Vanderbilt University (US))
• 128
Fine-grained data caching approaches to speedup a distributed RDataFrame analysis

Thanks to its RDataFrame interface, ROOT now supports the execution of the same physics analysis code both on a single machine and on a cluster of distributed resources. In the latter scenario, it is common to read the input ROOT datasets over the network from remote storage systems, which often increases the time it takes for physicists to obtain their results. Storing the remote files much closer to where the computations will run can bring latency and execution time down. Such a solution can be improved further by caching only the actual portion of the dataset that will be processed on each machine in the cluster, reusing it in subsequent executions on the same input data. This paper shows the benefits of applying different means of caching input data in a distributed ROOT RDataFrame analysis. Two such mechanisms will be applied to this kind of workflow with different configurations, namely caching on the same nodes that process data or caching on a separate server.

Speaker: Mr Vincenzo Eduardo Padulano (Valencia Polytechnic University (ES))
• 129
Columnar data analysis with ATLAS analysis formats

Future analysis of ATLAS data will involve new small-sized analysis
formats to cope with the increased storage needs. The smallest of
these, named DAOD_PHYSLITE, has calibrations already applied
to allow fast downstream analysis and avoid the need for further
analysis-specific intermediate formats. This allows for application
of the "columnar analysis" paradigm where operations are applied
on a per-array instead of a per-event basis. We will present methods
to read the data into memory, using Uproot, and also discuss I/O
aspects of columnar data and alternatives to the ROOT data format.
Furthermore, we will show a representation of the event data model
using the Awkward Array package and present proof of concept for a
simple analysis application.
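
The per-array style can be illustrated with a jagged column stored as a flat content list plus per-event offsets. This is a minimal pure-Python sketch with hypothetical jet values; it is not Uproot or Awkward Array code, and array libraries would run the same operations in compiled loops rather than Python ones.

```python
# Jagged "jet pT" column for three events with 2, 0 and 3 jets.
# Event i owns content[offsets[i]:offsets[i + 1]].
jet_pt_content = [52.1, 31.4, 88.0, 45.5, 27.3]  # GeV, hypothetical values
jet_offsets = [0, 2, 2, 5]


def select(content, offsets, keep):
    """Apply a per-element mask and rebuild the per-event offsets."""
    new_content, new_offsets = [], [0]
    for start, stop in zip(offsets, offsets[1:]):
        new_content.extend(
            x for x, k in zip(content[start:stop], keep[start:stop]) if k)
        new_offsets.append(len(new_content))
    return new_content, new_offsets


def counts(offsets):
    """Number of elements per event."""
    return [stop - start for start, stop in zip(offsets, offsets[1:])]


# The cut is expressed once, on the whole column, not inside an event loop.
mask = [pt > 40.0 for pt in jet_pt_content]
selected_pt, selected_offsets = select(jet_pt_content, jet_offsets, mask)
n_selected = counts(selected_offsets)
print(selected_pt, selected_offsets, n_selected)
```

Because the event structure lives only in the offsets, a cut on one column never touches the others, which is what makes calibrated, flat formats attractive for this style of analysis.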

Speaker: Nikolai Hartmann (Ludwig Maximilians Universitat (DE))
• 130
AwkwardForth: accelerating Uproot with an internal DSL

File formats for generic data structures, such as ROOT, Avro, and Parquet, pose a problem for deserialization: it must be fast, but its code depends on the type of the data structure, not known at compile-time. Just-in-time compilation can satisfy both constraints, but we propose a more portable solution: specialized virtual machines. AwkwardForth is a Forth-driven virtual machine for deserializing data into Awkward Arrays. As a language, it is not intended for humans to write, but it loosens the coupling between Uproot and Awkward Array. AwkwardForth programs for deserializing record-oriented formats (ROOT and Avro) are about as fast as C++ ROOT and 10‒80× faster than fastavro. Columnar formats (simple TTrees, RNTuple, and Parquet) only require specialization to interpret metadata and are therefore faster with precompiled code.
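
The idea of a specialized virtual machine can be sketched as follows: the record type, known only at run time, is turned into a small instruction list, and one generic interpreter executes it over the bytes. This is a toy with a hypothetical instruction set and schema notation; it is neither AwkwardForth nor Forth syntax, and it makes no claim about performance.

```python
import struct


def compile_schema(schema):
    """Turn a run-time record description into a flat list of instructions."""
    program = []
    for name, kind in schema:
        if kind == "int32":
            program.append(("read", "<i", name))
        elif kind == "float64":
            program.append(("read", "<d", name))
        elif kind == "list[float32]":
            program.append(("read_count", "<I", name))  # uint32 length prefix
            program.append(("read_many", "<f", name))
        else:
            raise ValueError("unsupported type: " + kind)
    return program


def run(program, buffer, n_records):
    """Interpret the program over the buffer, filling columnar outputs."""
    outputs, stack, pos = {}, [], 0
    for _ in range(n_records):
        for op, fmt, name in program:
            size = struct.calcsize(fmt)
            if op == "read":
                value = struct.unpack_from(fmt, buffer, pos)[0]
                outputs.setdefault(name, []).append(value)
                pos += size
            elif op == "read_count":
                count = struct.unpack_from(fmt, buffer, pos)[0]
                pos += size
                stack.append(count)  # consumed by the next instruction
                offsets = outputs.setdefault(name + "/offsets", [0])
                offsets.append(offsets[-1] + count)
            elif op == "read_many":
                count = stack.pop()
                values = struct.unpack_from("<%d%s" % (count, fmt[1]), buffer, pos)
                outputs.setdefault(name + "/content", []).extend(values)
                pos += size * count
    return outputs


# Two records of a hypothetical type: {run: int32, hits: list[float32], energy: float64}.
schema = [("run", "int32"), ("hits", "list[float32]"), ("energy", "float64")]
data = (struct.pack("<iI2fd", 7, 2, 1.5, 2.5, 10.0)
        + struct.pack("<iId", 8, 0, 20.0))
columns = run(compile_schema(schema), data, n_records=2)
print(columns)
```

The interpreter itself is compiled once and never changes; only the instruction list depends on the data type, which is what removes the need for just-in-time compilation.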

Speaker: Jim Pivarski (Princeton University)
• 131
hep_tables: Heterogeneous Array Programming for HEP

Array operations are one of the most concise ways of expressing the common filtering and simple aggregation operations that are the hallmark of the first step of a particle physics analysis: selection, filtering, basic vector operations, and filling histograms. The High Luminosity run of the Large Hadron Collider (HL-LHC), scheduled to start in 2026, will require physicists to regularly skim datasets that are over a PB in size, and repeatedly run over datasets that are hundreds of TB in size, too big to fit in memory. Declarative programming techniques are a way of separating the intent of the physicist from the mechanics of finding the data, processing the data, and using distributed computing to process it efficiently, all of which is required to extract the desired plot or data in a timely fashion. This paper describes a prototype library that provides a framework for different sub-systems to cooperate in producing this data, using an array-programming declarative interface. This prototype has a ServiceX data-delivery sub-system and an Awkward Array sub-system cooperating to generate requested data. The ServiceX system runs against ATLAS xAOD data.

Speaker: Gordon Watts (University of Washington (US))
• Storage: Wed PM
Conveners: Christophe Haen (CERN) , Xavier Espinal (CERN)
• 132
CERN AFS phaseout: status & plans

In 2016, CERN decided to phase out the legacy OpenAFS storage service due to concerns about the upstream project's longevity, and the potential impact of a disorderly service stop on CERN's computing services. In early 2019, the risks of the OpenAFS project collapsing were reassessed and several early concerns were allayed. In this paper we recap the work done so far, highlight some of the issues encountered, and present the current state and planning.

Speaker: Jan Iven (CERN)
• 133
CernVM-FS powered container hub

Containers have become the de-facto standard to package and distribute modern applications and their dependencies. The HEP community demonstrates an increasing interest in such technology, with scientists encapsulating their analysis workflow and code inside a container image. The analysis is first validated on a small dataset and minimal hardware resources to then run at scale on the massive computing capacity provided by the grid. The typical approach for distributing containers consists of pulling their image from a remote registry and extracting it on the node where the container runtime (e.g., Docker, Singularity) runs. This approach, however, does not easily scale to large images and thousands of nodes. CVMFS has long been used for the efficient distribution of software directory trees at global scale. In order to extend its optimized caching and network utilization to the distribution of containers, CVMFS recently implemented a dedicated container image ingestion service together with container runtime integrations. CVMFS ingestion is based on per-file deduplication, instead of the per-layer deduplication adopted by traditional container registries. On the client-side, CVMFS implements on-demand fetching of the chunks required for the execution of the container instead of the whole image.
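
The difference between per-layer and per-file deduplication can be shown with a small content-addressed store. This is an illustrative sketch only: the store, the two single-layer images and their file contents are hypothetical, and it does not reproduce the CVMFS ingestion service or its chunking.

```python
import hashlib


class ContentStore:
    """Content-addressed store: identical payloads are kept only once."""

    def __init__(self):
        self.blobs = {}

    def put(self, payload):
        digest = hashlib.sha256(payload).hexdigest()
        self.blobs.setdefault(digest, payload)
        return digest

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


def ingest_per_layer(store, image):
    """Store each layer as one opaque blob (files concatenated in name order)."""
    return [store.put(b"".join(name.encode() + b"\0" + data
                               for name, data in sorted(layer.items())))
            for layer in image]


def ingest_per_file(store, image):
    """Store every file separately, so shared files are shared across images."""
    return [{name: store.put(data) for name, data in layer.items()}
            for layer in image]


# Two hypothetical images that differ only in one small file.
image_a = [{"lib/libphys.so": b"A" * 1000, "bin/run": b"v1"}]
image_b = [{"lib/libphys.so": b"A" * 1000, "bin/run": b"v2"}]

per_layer, per_file = ContentStore(), ContentStore()
for image in (image_a, image_b):
    ingest_per_layer(per_layer, image)
    ingest_per_file(per_file, image)

print("per-layer bytes:", per_layer.stored_bytes())
print("per-file bytes:", per_file.stored_bytes())
```

With per-layer storage a one-byte change produces a whole new layer blob, while per-file storage keeps the large shared library once and adds only the changed file.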

Speaker: Enrico Bocchi (CERN)
• 134

This paper presents the experience in providing CERN users with
dows. In production for about 15 months, a High-Available Samba cluster is
regularly used by a significant fraction of the CERN user base, following the
migration of their central home folders from Microsoft DFS in the context of
CERN’s strategy to move to open source solutions.
We describe the configuration of the cluster, which is based on standard
components: the EOS-backed CERNBox storage is mounted via FUSE, and an
additional mount provided by CephFS is used to share the cluster’s state. Further, we
describe some typical shortcomings of such a setup and how they were tackled.
Finally, we show how such an additional access method fits in the bigger
picture, where the storage is seamlessly accessed by user jobs, sync clients,
FUSE/Samba mounts as well as the web UI, whilst aiming at a consistent view
and user experience.

Speaker: Giuseppe Lo Presti (CERN)
• 135
MetaCat - metadata catalog for data management systems

Metadata management is one of three major areas of functionality of scientific data management, along with replica management and workflow management. Metadata is the information describing the data stored in a data item, a file or an object. It includes the data item provenance, recording conditions, format and other attributes. MetaCat is a metadata management database designed and developed for High Energy Physics experiments. As a component of a data management system, its main objectives are to provide efficient metadata storage and management and fast data item selection functionality. MetaCat is intended to work at the scale of 100 million files (or objects) and beyond. The article will discuss the functionality of MetaCat and technological solutions used to implement the product.

Speaker: Igor Mandrichenko (Fermi National Accelerator Lab. (US))
• 136
ARCHIVER - Data archiving and preservation for research environments

Over the last decades, several data preservation efforts have been undertaken by the HEP community, as experiments are not repeatable and consequently their data are considered unique. ARCHIVER is a European Commission (EC) co-funded Horizon 2020 pre-commercial procurement project procuring R&D combining multiple ICT technologies including data-intensive scalability, network, service interoperability and business models, in a hybrid cloud environment. The results will provide the European Open Science Cloud (EOSC) with archival and preservation services covering the full research lifecycle. The services are co-designed in partnership with four research organisations (CERN, DESY, EMBL-EBI and PIC/IFAE) deploying use cases from Astrophysics, HEP, Life Sciences and Photon-Neutron Sciences creating an innovation ecosystem for specialist data archiving and preservation companies willing to introduce new services capable of supporting the expanding needs of research. The HEP use cases being deployed include the CERN Opendata portal, preserving a second copy of the completed BaBar experiment and the CERN Digital Memory digitising CERN’s multimedia archive of the 20th century. In parallel, ARCHIVER has established an Early Adopter programme whereby additional use cases can be incorporated at each of the project phases thereby expanding services to multiple research domains and countries.

• 137
Exploring Object Stores for High-Energy Physics Data Storage

Over the last two decades, ROOT TTree has been used for storing over one exabyte of High-Energy Physics (HEP) events. The TTree columnar on-disk layout has been proved to be ideal for analyses of HEP data that typically require access to many events, but only a subset of the information stored for each of them. Future accelerators, and particularly HL-LHC, will bring an increase of at least one order of magnitude in the volume of generated data. To this end, RNTuple has been designed to overcome TTree's limitations, providing improved efficiency and taking advantage of modern storage systems, e.g. low-latency high-bandwidth NVMe devices and object stores. In this paper, we extend RNTuple with a backend that leverages Intel DAOS as the underlying storage, proving that RNTuple's architecture can accommodate such changes. From the RNTuple user's perspective, this data can be accessed with minimal changes to the user code, i.e. replacing a filesystem path by a DAOS URI. Our performance evaluation shows that the contributed backend can be used for realistic analyses, while outperforming the compatibility solution provided by the DAOS project.

Speaker: Javier Lopez Gomez (CERN)
• Streaming: Wed PM
Conveners: Simon George (Royal Holloway, University of London) , Vardan Gyurjyan (Jefferson Lab)
• 138
HOSS!

The Hall-D Online Skim System (HOSS) was developed to simultaneously solve two issues for the high intensity GlueX experiment. One was to parallelize the writing of raw data files to disk in order to improve bandwidth. The other was to distribute the raw data across multiple compute nodes in order to produce calibration skims of the data online. The highly configurable system employs RDMA, RAM disks, and zeroMQ driven by Python to simultaneously store and process the full high intensity GlueX data stream.

Speaker: David Lawrence (Jefferson Lab)
• 139
Streaming Readout of the CLAS12 Forward Tagger Using TriDAS and JANA2

An effort is underway to develop a streaming readout data acquisition system for the CLAS12 detector in Jefferson Lab's experimental Hall-B. Successful beam tests were performed in the spring and summer of 2020 using a 10 GeV electron beam from Jefferson Lab's CEBAF accelerator. The prototype system combined elements of the TriDAS and CODA data acquisition systems with the JANA2 analysis/reconstruction framework. This successfully merged components that included an FPGA stream source, a distributed hit processing system, and software plugins that allowed offline analysis written in C++ to be used for online event filtering. Details of the system design and performance are presented.

Speaker: Tommaso Chiarusi (INFN - Sezione di Bologna)
• 140
Simple and Scalable Streaming: The GRETA Data Pipeline

The Gamma Ray Energy Tracking Array (GRETA) is a state of the art gamma-ray spectrometer being built at Lawrence Berkeley National Laboratory to be first sited at the Facility for Rare Isotope Beams (FRIB) at Michigan State University. A key design requirement for the spectrometer is to perform gamma-ray tracking in near real time. To meet this requirement we have used an inline, streaming approach to signal processing in the GRETA data acquisition system, using a GPU-equipped computing cluster. The data stream will reach 480 thousand events per second at an aggregate data rate of 4 gigabytes per second at full design capacity. We have been able to simplify the architecture of the streaming system greatly by interfacing the FPGA-based detector electronics with the computing cluster using standard network technology. A set of high-performance software components to implement queuing, flow control, event processing and event building have been developed, all in a streaming environment which matches detector performance. Prototypes of all high-performance components have been completed and meet design specifications.
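
A rough average event size follows from the two design figures above, assuming decimal gigabytes and that both rates describe the same stream at full design capacity (the average is an inference from the quoted rates, not a number stated by the authors):

$$\frac{4 \times 10^{9}\ \mathrm{B/s}}{4.8 \times 10^{5}\ \mathrm{events/s}} \approx 8.3 \times 10^{3}\ \mathrm{B/event}$$

That is, roughly 8 kB per event on average.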

Speaker: Mario Cromaz (Lawrence Berkeley National Laboratory)
• 141
Free-running data acquisition system for the AMBER experiment

Triggered data acquisition systems provide only limited possibilities of triggering methods. In our paper, we propose a novel approach that completely removes the hardware trigger and its logic. It introduces an innovative free-running mode instead, which provides unprecedented possibilities to physics experiments. We present such a system, which is being developed for the AMBER experiment at CERN. It is based on an intelligent data acquisition framework including FPGA modules and advanced software processing. The system provides a triggerless mode that allows more time for data filtration and the implementation of more complex algorithms. Moreover, it utilises a custom data protocol optimized for the needs of the free-running system. The filtration procedure takes place in a server farm playing the role of the high-level trigger. For this purpose, we introduce a high-performance filtration framework providing optimized algorithms and load balancing to cope with excessive data rates. Furthermore, this paper also describes the filtration pipeline as well as the simulation chain that is being used for production of artificial data, for testing, and validation.

Speaker: Martin Zemko (Czech Technical University in Prague (CZ))
• 142
FELIX: the Detector Interface for the ATLAS Experiment at CERN

The Front-End Link eXchange (FELIX) system is an interface between the trigger and detector electronics and commodity switched networks for the ATLAS experiment at CERN. In preparation for the LHC Run 3, to start in 2022, the system is being installed to read out the new electromagnetic calorimeter, calorimeter trigger, and muon components being installed as part of the ongoing ATLAS upgrade programme. The detector and trigger electronic systems are largely custom and fully synchronous with respect to the 40.08 MHz clock of the Large Hadron Collider (LHC). The FELIX system uses FPGAs on server-hosted PCIe boards to pass data between custom data links connected to the detector and trigger electronics and host system memory over a PCIe interface, and then routes data to network clients, such as the Software Readout Drivers (SW ROD), via a dedicated software platform running on these machines. The SW RODs build event fragments, buffer data, perform detector-specific processing and provide data for the ATLAS High Level Trigger. The FELIX approach takes advantage of modern FPGAs and commodity computing to reduce the system complexity and effort needed to support data acquisition systems in comparison to previous designs. Future upgrades of the experiment will introduce FELIX to read out all other detector components.

Speaker: Alexander Paramonov (Argonne National Laboratory (US))
• Thursday, 20 May
• Thurs AM Plenaries: Plenaries
Conveners: Benedikt Hegner (CERN) , Patrick Fuhrmann (Deutsches Elektronen-Synchrotron (DE))
• 143
Coffea-casa: an analysis facility prototype

Data analysis in HEP has often relied on batch systems and event loops; users are given a non-interactive interface to computing resources and consider data event-by-event. The "Coffea-casa" prototype analysis facility is an effort to provide users with alternate mechanisms to access computing resources and enable new programming paradigms. Instead of the command-line interface and asynchronous batch access, a notebook-based web interface and interactive computing is provided. Instead of writing event loops, the column-based Coffea library is used.

In this paper, we describe the architectural components of the facility, the services offered to end users, and how it integrates into a larger ecosystem for data access and authentication.
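The contrast between event loops and column-based analysis can be illustrated with a minimal sketch. It does not use the Coffea API: plain Python lists stand in for the array columns that a columnar library would provide, and the field names, values and the `PT_CUT` threshold are hypothetical.

```python
# Toy data: hypothetical candidates, one entry per event.
events = [
    {"pt": 45.0, "mass": 91.2},
    {"pt": 12.5, "mass": 3.1},
    {"pt": 60.3, "mass": 90.8},
    {"pt": 28.0, "mass": 9.5},
]

PT_CUT = 30.0  # hypothetical selection threshold (GeV)


def select_event_loop(rows, pt_cut):
    """Event-loop style: inspect one event at a time."""
    selected = []
    for event in rows:
        if event["pt"] > pt_cut:
            selected.append(event["mass"])
    return selected


def to_columns(rows):
    """Transpose row-wise events into one list per field."""
    return {key: [row[key] for row in rows] for key in rows[0]}


def select_columnar(columns, pt_cut):
    """Columnar style: build a mask over a whole column, then apply it."""
    mask = [pt > pt_cut for pt in columns["pt"]]
    return [mass for mass, keep in zip(columns["mass"], mask) if keep]


columns = to_columns(events)
loop_result = select_event_loop(events, PT_CUT)
columnar_result = select_columnar(columns, PT_CUT)
```

Both functions return the same selection. The columnar form expresses the cut as an operation on whole columns, which is the pattern that array libraries can vectorise.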

• 144
Evaluating CephFS Performance vs. Cost on High-Density Commodity Disk Servers

CephFS is a network filesystem built upon the Reliable Autonomic Distributed Object Store (RADOS). At CERN we have demonstrated its reliability and elasticity while operating several 100-to-1000 TB clusters which provide NFS-like storage to infrastructure applications and services. At the same time, our lab developed EOS to offer high-performance 100 PB-scale storage for the LHC at extremely low cost, while also supporting the complete set of security and functional APIs required by the particle-physics user community. This work seeks to evaluate the performance of CephFS on this cost-optimized hardware when it is combined with EOS to supply the missing functionality. To this end, we have set up a proof-of-concept Ceph Octopus cluster on high-density JBOD servers (840 TB each) with 100Gig-E networking. The system uses EOS to provide an overlaid namespace and protocol gateways for HTTP(S) and XROOTD, and uses CephFS as an erasure-coded object storage backend. The solution also enables operators to aggregate several CephFS instances and adds features such as third-party-copy, SciTokens, and high-level user and quota management. Using simple benchmarks, we measure the cost/performance tradeoffs of different erasure-coding layouts, as well as the network overheads of these coding schemes. We demonstrate some relevant limitations of the CephFS metadata server and offer improved tunings which can be generally applicable. To conclude, we reflect on the advantages and drawbacks of this architecture, such as RADOS-level free space requirements and double-network penalties, and offer ideas for future improvements.

Speaker: Dan van der Ster (CERN)
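The storage-cost side of the erasure-coding tradeoff mentioned in the CephFS abstract above can be sketched with simple arithmetic. A `k+m` layout splits each object into `k` data chunks plus `m` coding chunks, so it stores `(k+m)/k` raw bytes per usable byte and tolerates the loss of `m` chunks. The layouts listed below are hypothetical examples, not the ones benchmarked in the contribution; only the 840 TB server size comes from the abstract.

```python
def raw_per_usable(k, m):
    """Raw bytes stored per usable byte for a k+m erasure-coding layout."""
    if k < 1 or m < 0:
        raise ValueError("need k >= 1 data chunks and m >= 0 coding chunks")
    return (k + m) / k


def usable_capacity_tb(raw_tb, k, m):
    """Usable capacity (TB) obtained from raw_tb of raw disk space."""
    return raw_tb / raw_per_usable(k, m)


SERVER_RAW_TB = 840.0  # per-server raw capacity quoted in the abstract

# Hypothetical layouts for illustration: (data chunks k, coding chunks m).
layouts = [(4, 2), (8, 3), (16, 4)]

summary = {
    f"{k}+{m}": {
        "overhead": raw_per_usable(k, m),
        "usable_tb": usable_capacity_tb(SERVER_RAW_TB, k, m),
        "tolerated_losses": m,
    }
    for k, m in layouts
}

# Three-way replication behaves like one data chunk plus two full copies.
replication_overhead = raw_per_usable(1, 2)
```

Wider layouts lower the storage overhead, but each object then touches more disks and servers. That is one reason the network and performance costs have to be measured rather than derived from this ratio alone.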
• 145
Fast and Accurate Electromagnetic and Hadronic Showers from Generative Models

Generative machine learning models offer a promising way to efficiently amplify classical Monte Carlo generators' statistics for event simulation and generation in particle physics. Given the already high computational cost of simulation and the expected increase in data in the high-precision era of the LHC and at future colliders, such fast surrogate simulators are urgently needed.

This contribution presents a status update on simulating particle showers in high granularity calorimeters for future colliders. Building on prior work using Generative Adversarial Networks (GANs), Wasserstein-GANs, and the information-theoretically motivated Bounded Information Bottleneck Autoencoder (BIB-AE), we further improve the fidelity of generated photon showers. The key to this improvement is a detailed understanding and optimisation of the latent space. The richer structure of hadronic showers compared to electromagnetic ones makes their precise modelling an important yet challenging problem. We present initial progress towards accurately simulating the core of hadronic showers in a highly granular scintillator calorimeter.

Speaker: Sascha Daniel Diefenbacher (Hamburg University (DE))
• 10:30 AM
Break
• Artificial Intelligence: Thu AM
Conveners: Gian Michele Innocenti (CERN), Jason Webb (Brookhaven National Lab)
• 146
Decoding Photons: Physics in the Latent Space of a BIB-AE Generative Network

Given the increasing data collection capabilities and limited computing resources of future collider experiments, interest in using generative neural networks for the fast simulation of collider events is growing. In our previous study, the Bounded Information Bottleneck Autoencoder (BIB-AE) architecture for generating photon showers in a high-granularity calorimeter modeled various global differential shower distributions with high accuracy. In this work, we investigate how the BIB-AE encodes this physics information in its latent space. Our understanding of this encoding allows us to propose methods to further optimize the generation performance, for example by altering the latent space sampling or by suggesting specific changes to hyperparameters. In particular, we improve the modeling of the shower shape along the particle incident axis.

Speaker: Erik Buhmann (Hamburg University (DE))
• 147
Distributed training and scalability for the particle clustering method UCluster

In recent years, machine learning methods have become increasingly important for the experiments of the Large Hadron Collider (LHC). They are used in everything from trigger systems to reconstruction to data analysis. The recent UCluster method is a general model providing unsupervised clustering of particle physics data that can be easily modified for a variety of different tasks. In the current paper, we improve on the UCluster method by adding the option of training the model in a scalable and distributed fashion, which extends its usefulness even further. UCluster combines the graph-based neural network ABCnet with a clustering step, using a combined loss function for training. It was written in TensorFlow v1.14 and has previously been trained on a single GPU. It shows a clustering accuracy of 81% when applied to the problem of multiclass classification of simulated jet events. Our implementation adds distributed training by using the Horovod distributed training framework, which necessitated a migration of the code to TensorFlow v2. Together with the use of parquet files to split the data between different nodes, distributed training makes the model scalable to any amount of input data, something that will be essential for use with real LHC datasets. We find that the model is well suited for distributed training, with the training time decreasing in direct relation to the number of GPUs used.

Speaker: Olga Sunneborn Gudnadottir (Uppsala University (SE))
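Splitting input files between workers, as described in the UCluster abstract above, can be sketched as follows. This is an illustrative assumption about how such sharding might look, not the authors' implementation: the file names are hypothetical, and in a Horovod job the `rank` and `size` values would typically come from `hvd.rank()` and `hvd.size()`.

```python
def shard_files(files, rank, size):
    """Return the files assigned to worker `rank` out of `size` workers.

    Files are sorted first so that every worker derives the same global
    order, then dealt out round-robin so the shards do not overlap.
    """
    if size < 1 or not 0 <= rank < size:
        raise ValueError("need size >= 1 and 0 <= rank < size")
    return sorted(files)[rank::size]


# Hypothetical input files; a real job would list them from storage.
files = [f"jets_{i:03d}.parquet" for i in range(10)]
size = 4  # number of workers, e.g. one per GPU

shards = [shard_files(files, rank, size) for rank in range(size)]
```

Each worker reads only its own shard, so the data volume per worker shrinks as workers are added.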
• 148
Training and Serving ML workloads with Kubeflow at CERN

Machine Learning (ML) has been growing in popularity in multiple areas and groups at CERN, covering fast simulation, tracking and anomaly detection, among many others. We describe a new service available at CERN, based on Kubeflow, that manages the full ML lifecycle: data preparation and interactive analysis, large-scale distributed model training, and model serving. We cover specific features available for hyper-parameter tuning and model metadata management, as well as infrastructure details to integrate accelerators and external resources. We also present results and a cost evaluation from scaling out a popular ML use case using public cloud resources, achieving close to linear scaling when using a large number of GPUs.

Speaker: Dejan Golubovic (CERN)
• 149
Accelerating GAN training using highly parallel hardware on public cloud

With the increasing number of Machine and Deep Learning applications in High Energy Physics, easy access to dedicated infrastructure is a requirement for fast and efficient R&D. This work explores different types of cloud services to train a Generative Adversarial Network (GAN) in a parallel environment, using the TensorFlow data-parallel strategy. More specifically, we parallelize the training process on multiple GPUs and Google Tensor Processing Units (TPUs) and we compare two algorithms: the TensorFlow built-in logic and a custom loop, optimised to give finer control over the elements assigned to each GPU worker or TPU core. The quality of the generated data is compared to Monte Carlo simulation. Linear speed-up of the training process is obtained, while retaining most of the performance in terms of physics results. Additionally, we benchmark the aforementioned approaches, at scale, over multiple GPU nodes, deploying the training process on different public cloud providers and seeking overall efficiency and cost-effectiveness. The combination of data science, cloud deployment options and associated economics makes it possible to burst out heterogeneously, exploring the full potential of cloud-based services.

Speaker: Renato Paulo Da Costa Cardoso (Universidade de Lisboa (PT))
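The data-parallel idea behind the GAN training abstract above can be shown with a toy example. Each replica computes a gradient on its own shard of the global batch, and the gradients are then averaged. The sketch uses a one-parameter linear model rather than a GAN and does not call TensorFlow; all names and numbers are hypothetical.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)**2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def split_batch(values, n_replicas):
    """Split a global batch into equal contiguous per-replica shards."""
    if len(values) % n_replicas != 0:
        raise ValueError("global batch must divide evenly across replicas")
    step = len(values) // n_replicas
    return [values[i * step:(i + 1) * step] for i in range(n_replicas)]


def data_parallel_grad(w, xs, ys, n_replicas):
    """Average the per-replica gradients, as an all-reduce mean would."""
    x_shards = split_batch(xs, n_replicas)
    y_shards = split_batch(ys, n_replicas)
    local = [grad_mse(w, xr, yr) for xr, yr in zip(x_shards, y_shards)]
    return sum(local) / n_replicas


# Hypothetical global batch of eight samples and a current weight.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
w = 1.5

full_batch_grad = grad_mse(w, xs, ys)
parallel_grad = data_parallel_grad(w, xs, ys, n_replicas=4)
```

With equally sized shards, the averaged gradient matches the full-batch gradient up to floating-point rounding. This is why adding replicas can shorten training without changing the update rule.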
• 150
Multi-particle reconstruction in the High Granularity Calorimeter using object condensation and graph neural networks

The high-luminosity upgrade of the LHC will come with unprecedented physics and computing challenges. One of these challenges is the accurate reconstruction of particles in events with up to 200 simultaneous proton-proton interactions. The planned CMS High Granularity Calorimeter offers fine spatial resolution for this purpose, with more than 6 million channels, but also poses unique challenges to algorithms that aim to reconstruct individual particle showers. In this contribution, we propose an end-to-end machine-learning method that performs clustering, classification, and energy and position regression in one step while staying within memory and computational constraints. We employ GravNet, a graph neural network, and an object condensation loss function to achieve this task. Additionally, we propose a method to relate truth showers to reconstructed showers by maximising the energy-weighted intersection over union using maximal weight matching. Our results show the efficiency of our method and highlight a promising research direction to be investigated further.

Speaker: Shah Rukh Qasim (Manchester Metropolitan University (GB))
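The truth-to-reconstruction matching mentioned in the abstract above can be sketched under stated assumptions. Here a shower is a set of hit indices, and the energy-weighted intersection over union is taken as the energy of the shared hits divided by the energy of all hits in either shower; the exact definition used in the contribution may differ. The brute-force search is a stand-in for a proper maximal weight matching algorithm and is only practical for a handful of showers. All hit energies are hypothetical.

```python
from itertools import permutations


def energy_weighted_iou(truth_hits, reco_hits, hit_energy):
    """Energy of shared hits divided by energy of the union of hits."""
    inter = sum(hit_energy[h] for h in truth_hits & reco_hits)
    union = sum(hit_energy[h] for h in truth_hits | reco_hits)
    return inter / union if union > 0 else 0.0


def best_matching(truth, reco, hit_energy):
    """Brute-force one-to-one matching with maximal total score."""
    if len(truth) > len(reco):
        raise ValueError("sketch assumes no more truth than reco showers")
    score = [
        [energy_weighted_iou(t, r, hit_energy) for r in reco] for t in truth
    ]
    best_total, best_pairs = -1.0, []
    # Try every assignment of distinct reco showers to the truth showers.
    for perm in permutations(range(len(reco)), len(truth)):
        total = sum(score[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(perm))
    return best_pairs, best_total


# Hypothetical hit energies (GeV) keyed by hit index.
hit_energy = {0: 5.0, 1: 3.0, 2: 2.0, 3: 4.0, 4: 1.0}
truth = [{0, 1}, {2, 3}]
reco = [{2, 3, 4}, {0, 1, 2}]

pairs, total = best_matching(truth, reco, hit_energy)
```

In this toy case, truth shower 0 is paired with reconstructed shower 1 and truth shower 1 with reconstructed shower 0.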
• Education, Training, Outreach: Thu AM
Conveners: Clara Nellist (Radboud University Nijmegen and NIKHEF (NL)), Marzena Lapka (CERN)
• 151
EsbRootView

EsbRootView is an event display for the ESSnuSB detectors that runs natively on the devices in common use today: desktops and laptops, but also smartphones and tablets.

Speaker: Guy Barrand (Université Paris-Saclay (FR))
• 152
Browser-based visualization framework Tracer for Outreach & Education

Education & outreach is an important part of HEP experiments. Through outreach & education, experiments can have an impact on the public, students and their teachers, as well as policymakers and the media. Visualization tools and methods make it possible to represent the detector facilities and to explain their purpose, functionality, development history and participating institutes. In addition, they make it possible to visualize different physics events together with important parameters and plots for physics analyses. 3D visualization and advanced VR (Virtual Reality), AR (Augmented Reality) and MR (Mixed Reality) extensions are key to successful outreach & education. This paper describes requirements and methods for the creation of browser-based visualization applications for outreach & education. The visualization framework TRACER is considered as a case study.

Speaker: Alexander Sharmazanashvili (Georgian Technical University (GE))
• 153
The Phoenix event display framework

Visualising HEP experiment event data and geometry is vital for physicists trying to debug their reconstruction software, their detector geometry or their physics analysis, and also for outreach and publicity purposes. Traditionally, experiments used in-house applications that required installation (often as part of a much larger experiment-specific framework). In recent years, web-based event and geometry displays have started to appear, dramatically lowering the barrier to entry, but these are typically still specific to a single experiment. The Phoenix framework is an extensible, experiment-agnostic framework for event and geometry visualisation.

Speaker: Edward Moyse (University of Massachusetts (US))
• 154
The fight against COVID-19: Running Folding@Home simulations on ATLAS resources

Following the outbreak of the COVID-19 pandemic, the ATLAS experiment considered how it could most efficiently contribute using its distributed computing resources. After considering many suggestions, examining several potential projects and following the advice of the CERN COVID Task Force, it was decided to engage in the Folding@Home initiative, which provides payloads that perform protein folding simulations. This paper describes how ATLAS made a significant contribution to this project over the summer of 2020.

Speaker: David Michael South (Deutsches Elektronen-Synchrotron (DE))
• Monitoring: Thu AM