The Deep Underground Neutrino Experiment (DUNE) is an international effort to build the next-generation neutrino observatory to answer fundamental questions about the nature of elementary particles and their role in the universe. Integral to DUNE is the process of reconstruction, where the raw data from Liquid Argon Time Projection Chambers (LArTPC) are transformed into products that can be...
Data acquisition systems (DAQ) for high energy physics experiments utilize complex FPGAs to handle unprecedentedly high data rates. This is especially true in the first stages of the processing chain. Developing and commissioning these systems becomes more complex as additional processing intelligence is placed closer to the detector, in a distributed way directly on the ATCA blades, in the...
E-mail is considered a critical collaboration service. We will share our experience of the technical and organizational challenges of migrating 40 000 mailboxes from Microsoft Exchange to a free and open-source software solution, Kopano.
The INFN Tier-1 located at CNAF in Bologna (Italy) is a major center of the WLCG e-Infrastructure, supporting the 4 major LHC collaborations and more than 30 other INFN-related experiments.
After multiple tests towards elastic expansion of CNAF compute power via Cloud resources (provided by Azure, Aruba and in the framework of the HNSciCloud project), but also building on the experience...
The IHEP local cluster is a medium-sized HEP data center consisting of 20’000 CPU slots, hundreds of data servers, 20 PB of disk storage and 10 PB of tape storage. Once the JUNO and LHAASO experiments are taking data, the data volume processed at this center will approach 10 PB per year. At the current cluster scale, anomaly detection is a non-trivial task in daily maintenance. Traditional...
We will describe a component of the Intelligent Data Delivery Service being developed in collaboration with IRIS-HEP and the LHC experiments. ServiceX is an experiment-agnostic service to enable on-demand data delivery specifically tailored for nearly-interactive vectorized analysis. This work is motivated by the data engineering challenges posed by HL-LHC data volumes and the increasing...
Statistical modelling is a key element for High-Energy Physics (HEP) analysis. Currently, most of this modelling is performed with the ROOT/RooFit toolkit which is written in C++ and provides Python bindings which are only loosely integrated into the scientific Python ecosystem. We present zfit, a new alternative to RooFit, written in pure Python. Built on top of TensorFlow (a modern, high...
A Grid computing site consists of various services, including Grid middleware such as the Computing Element and the Storage Element. Ensuring safe and stable operation of these services is a key responsibility of site administrators. Logs produced by the services provide useful information for understanding the status of the site. However, it is a time-consuming task for site administrators to monitor...
As of March 2019, CERN is no longer eligible for academic licences of Microsoft products. For this reason, CERN IT started a series of task forces to respond to the evolving requirements of the user community with the goal of reducing as much as possible the need for Microsoft licensed software. This exercise was an opportunity to understand better the user requirements for all office...
The NSF-funded Scalable CyberInfrastructure for Artificial Intelligence and Likelihood Free Inference (SCAILFIN) project aims to develop and deploy artificial intelligence (AI) and likelihood-free inference (LFI) techniques and software using scalable cyberinfrastructure (CI) built on top of existing CI elements. Specifically, the project has extended the CERN-based REANA framework, a...
The diversity of the scientific goals across HEP experiments necessitates unique bodies of software tailored for achieving particular physics results. The challenge, however, is to identify the software that must be unique, and the code that is unnecessarily duplicated, which results in wasted effort and inhibits code maintainability.
Fermilab has a history of supporting and developing...
ROOT provides, through TMVA, machine learning tools for data analysis at HEP experiments and beyond. In this talk, we present recently included features in TMVA and the strategy for future developments in the diversified machine learning landscape. Focus is put on fast machine learning inference, which enables analysts to deploy their machine learning models rapidly on large scale datasets....
We will report on the status of the OSiRIS project (NSF Award #1541335, UM, IU, MSU and WSU) after its fourth year. OSiRIS is delivering a distributed Ceph storage infrastructure coupled together with software-defined networking to support multiple science domains across Michigan’s three largest research universities. The project’s goal is to provide a single scalable, distributed storage...
The ALFA framework is a joint development between ALICE Online-Offline and FairRoot teams. ALFA has a distributed architecture, i.e. a collection of highly maintainable, testable, loosely coupled, independently deployable processes.
ALFA allows the developer to focus on building single-function modules with well-defined interfaces and operations. The communication between the independent...
The Belle II experiment started taking physics data in March 2019, with an estimated dataset of order 60 petabytes expected by the end of operations in the mid-2020s. Originally designed as a fully integrated component of the BelleDIRAC production system, the Belle II distributed data management (DDM) software needs to manage data across 70 storage elements worldwide for a collaboration of...
A number of technology R&D activities have recently been launched in Europe to close the gap with traditional HPC providers like the USA and Japan, and with more recently emerging ones like China.
The EU HPC strategy, funded through the EuroHPC initiative, rests on two pillars: the first one targets the procurement and the hosting of two or three commercial pre-Exascale systems, in order...
This talk presents the approach chosen to monitor, first, a world-wide video conference server infrastructure and, second, the wide variety of audio-visual devices that make up the audio-visual conference room ecosystem at CERN.
The CERN video conference system is a complex ecosystem used by most HEP institutes, as well as by Swiss universities through SWITCH. As a...
We study the use of interaction networks to perform tasks related to jet reconstruction. In particular, we consider jet tagging for generic boosted-jet topologies, tagging of large-momentum H$\to$bb decays, and anomalous-jet detection. The achieved performance is compared to state-of-the-art deep learning approaches, based on Convolutional or Recurrent architectures. Unlike these approaches,...
The Deep Underground Neutrino Experiment (DUNE) will be the world’s foremost neutrino detector when it begins taking data in the mid-2020s. Two prototype detectors, collectively known as ProtoDUNE, have begun taking data at CERN and have accumulated over 3 PB of raw and reconstructed data since September 2018. Particle interactions within liquid argon time projection chambers are challenging to...
For the High Luminosity LHC, the CMS collaboration made the ambitious choice of a high granularity design to replace the existing endcap calorimeters. The thousands of particles coming from the multiple interactions create showers in the calorimeters, depositing energy simultaneously in adjacent cells. The data are analogous to a 3D gray-scale image that must be properly reconstructed.
In this...
Indico, CERN’s popular open-source tool for event management, is in widespread use among facilities that make up the HEP community. It is extensible through a robust plugin architecture that provides features such as search and video conferencing integration. In 2018, Indico version 2 was released with many notable improvements, but without a full-featured search functionality that could be...
Efforts in distributed computing of the CMS experiment at the LHC at CERN are now focusing on the functionality required to fulfill the projected needs for the HL-LHC era. Cloud and HPC resources are expected to be dominant relative to resources provided by traditional Grid sites, being also much more diverse and heterogeneous. Handling their special capabilities or limitations and maintaining...
After the current LHC shutdown (2019-2021), the ATLAS experiment will be required to operate in an increasingly harsh collision environment. To maintain physics performance, the ATLAS experiment will undergo a series of upgrades during the shutdown. A key goal of this upgrade is to improve the capacity and flexibility of the detector readout system. To this end, the Front-End Link eXchange...
A new bookkeeping system called Jiskefet is being developed for A Large Ion Collider Experiment (ALICE) during Long Shutdown 2, to be in production until the end of LHC Run 4 (2029).
Jiskefet unifies two functionalities. The first is gathering, storing and presenting metadata associated with the operations of the ALICE experiment. The second is tracking the asynchronous processing of the...
The Faster Analysis Software Taskforce (FAST) is a small, European group of HEP researchers that have been investigating and developing modern software approaches to improve HEP analyses. We present here an overview of the key product of this effort: a set of packages that allows a complete implementation of an analysis using almost exclusively YAML files. Serving as an analysis description...
The Dutch science funding organization NWO is in the process of drafting requirements for the procurement of a future high-performance compute facility. To investigate the requirements for this facility to potentially support high-throughput workloads in addition to traditional high-performance workloads, a broad range of HEP workloads are being functionally tested on the current facility. The...
The OpenMP standard is the primary mechanism used at high performance computing facilities to allow intra-process parallelization. In contrast, many HEP-specific software packages (such as CMSSW, GaudiHive, and ROOT) make use of Intel's Threading Building Blocks (TBB) library to accomplish the same goal. In this talk we will discuss our work to compare TBB and OpenMP when used for scheduling algorithms...
The high-level trigger (HLT) of LHCb in Run 3 will have to process 5 TB/s of data, about two orders of magnitude more than in Run 2. The second stage of the HLT runs asynchronously to the LHC, aiming for a throughput of about 1 MHz. It selects analysis-ready physics signals using O(1000) dedicated selections totaling O(10000) algorithms to achieve maximum efficiency. This poses two...
We present recent work in supporting deep learning for particle physics and cosmology at NERSC, the US Dept. of Energy mission HPC center. We describe infrastructure and software to support both large-scale distributed training across (CPU and GPU) HPC resources and productive interfaces via Jupyter notebooks. We also detail plans for accelerated hardware for deep learning in the future...
(On behalf of the JUNO collaboration)
The JUNO (Jiangmen Underground Neutrino Observatory) experiment is designed to determine the neutrino mass hierarchy and precisely measure oscillation parameters with an unprecedented energy resolution of 3% at 1 MeV. It is composed of a 20 kton liquid scintillator central detector equipped with 18000 20” PMTs and 25000 3” PMTs, a water pool...
CERNBox is the CERN cloud storage hub for more than 16000 users at CERN. It allows synchronising and sharing files on all major desktop and mobile platforms (Linux, Windows, MacOSX, Android, iOS), providing universal, ubiquitous, online and offline access to any data stored in the CERN EOS infrastructure. CERNBox also provides integration with other CERN services for big science: visualisation...
Micro-Pattern Gas Detectors (MPGDs) are the new frontier among gas tracking systems. Among them, the triple Gas Electron Multiplier (triple-GEM) detectors are widely used. In particular, cylindrical triple-GEM (CGEM) detectors can be used as inner tracking devices in high energy physics experiments. In this contribution, a new offline software called GRAAL (Gem Reconstruction And...
In physics we often encounter high-dimensional data, in the form of multivariate measurements or of models with multiple free parameters. The information encoded is increasingly explored using machine learning, but is not typically explored visually. The barrier tends to be visualising beyond 3D, but systematic approaches for this exist in the statistics literature. I will use examples from...
LHCb is one of the 4 experiments at the LHC accelerator at CERN. During the upgrade phase of the experiment, several new electronic boards and Front End chips that perform the data acquisition for the experiment will be added by the different sub-detectors. These new devices will be controlled and monitored via a system composed of GigaBit Transceiver (GBT) chips that manage the bi-directional...
As part of the LHCb detector upgrade in 2021, the hardware-level trigger will be removed, coinciding with an increase in luminosity. As a consequence, about 40 Tbit/s of data will be processed in a full-software trigger, a challenge that has prompted the exploration of alternative hardware technologies. Allen is a framework that permits concurrent many-event execution targeting many-core...
The ATLAS model for remote access to database-resident information relies upon a limited set of dedicated and distributed Oracle database repositories, complemented by the deployment of the Frontier system infrastructure on the WLCG. ATLAS clients with network access can get the database information they need dynamically by submitting requests to a squid server in the Frontier network which...
RooFit is the statistical modeling and fitting package used in many experiments to extract physical parameters from reduced particle collision data. RooFit aims to separate particle physics model building and fitting (the users' goals) from their technical implementation and optimization in the back-end. In this talk, we outline our efforts to further optimize the back-end by automatically...
Collaborative services are essential for any experiment. They help to integrate global virtual communities by allowing members to share and exchange relevant information. Typical examples are public and internal web pages, wikis, mailing list services, issue tracking systems, and services for meeting organization and documents.
After reviewing their collaborative services with...
The innovative Barrel DIRC (Detection of Internally Reflected Cherenkov light) counter will provide hadronic particle identification (PID) in the central region of the PANDA experiment at the new Facility for Antiproton and Ion Research (FAIR), Darmstadt, Germany. This detector is designed to separate charged pions and kaons with at least 3 standard deviations for momenta up to 3.5 GeV/c...
Most of the challenges set by modern physics endeavours are related to the management, processing and analysis of massive amounts of data. As stated in a recent Nature editorial (The thing about data, Nature Physics volume 13, page 717, 2017), "the rise of big data represents an opportunity for physicists. To take full advantage, however, they need a subtle but important shift in mindset"....
ATLAS event processing requires access to centralized database systems where information about calibrations, detector status and data-taking conditions is stored. This processing is done at more than 150 computing sites on a world-wide computing grid, which are able to access the database using the squid-Frontier system. Some processing workflows have been found which overload the Frontier...
In preparation for Run 3 of the LHC, the ATLAS experiment is modifying its offline software to be fully multithreaded. An important part of this is data structures that can be efficiently and safely concurrently accessed from many threads. A standard way of achieving this is through mutual exclusion; however, the overhead from this can sometimes be excessive. Fully lockless implementations are...
The LHCb detector at the LHC is a single forward arm spectrometer dedicated to the study of $b$- and $c$-hadron states. During Run 1 and 2, the LHCb experiment collected a total of 9 fb$^{-1}$ of data, corresponding to the largest charmed hadron dataset in the world and providing unparalleled datasets for studies of CP violation in the $B$ system, hadron spectroscopy and rare decays, not...
There is a general trend in WLCG towards the federation of resources, aiming for increased simplicity, efficiency, flexibility, and availability. Although general, VO-agnostic federation of resources between two independent and autonomous resource centres may prove arduous, a considerable amount of flexibility in resource sharing can be achieved, in the context of a single WLCG VO, with a...
In this talk I will present an investigation into sizeable interference effects between a heavy charged Higgs boson signal produced via $gg\to t\bar b H^-$ (+ c.c.) followed by the decay $H^-\to b\bar t$ (+ c.c.) and the irreducible background given by $gg\to t\bar t b \bar b$ topologies at the Large Hadron Collider (LHC). I will show how such effects could spoil current $H^\pm$...
The EGI Cloud Compute service offers a multi-cloud IaaS federation that brings together research clouds as a scalable computing platform for research, accessible with OpenID Connect Federated Identity. The federation is not limited to single sign-on; it also introduces features to facilitate the portability of applications across providers: i) a common VM image catalogue with VM image replication to...
The cloudscheduler VM provisioning service has been running production jobs for ATLAS and Belle II for many years using commercial and private clouds in Europe, North America and Australia. Initially released in 2009, version 1 is a single Python 2 module implementing multiple threads to poll resources and jobs, and to create and destroy virtual machines. The code is difficult to scale,...
High Energy Physics (HEP) experiments will enter a new era with the start of the HL-LHC program, in which computing needs will surpass current capacities by large factors. Looking forward to this scenario, funding agencies from participating countries are encouraging the HEP collaborations to consider the rapidly developing High Performance Computing (HPC) international...
The Data Acquisition (DAQ) system of the Compact Muon Solenoid (CMS) experiment at LHC is a complex system responsible for the data readout, event building and recording of accepted events. Its proper functioning plays a critical role in the data-taking efficiency of the CMS experiment. In order to ensure high availability and recover promptly in the event of hardware or software failure of...
The ATLAS physics program relies on very large samples of GEANT4 simulated events, which provide a highly detailed and accurate simulation of the ATLAS detector. However, this accuracy comes with a high price in CPU, and the sensitivity of many physics analyses is already limited by the available Monte Carlo statistics and will be even more so in the future. Therefore, sophisticated fast...
GAMBIT is a modular and flexible framework for performing global fits to a wide range of theories for new physics. It includes theory and analysis calculations for direct production of new particles at the LHC, flavour physics, dark matter experiments, cosmology and precision tests, as well as an extensive library of advanced parameter-sampling algorithms. I will present the GAMBIT software...
With the increasing data volume from nuclear physics experiments, requirements for data storage and access are changing. To keep up with large data sets, new data formats are needed for efficient processing and analysis of the data. Frequently, in the experiments, data goes through stages from data acquisition to reconstruction and data analysis, and data is converted from one format to another...
The Queen Mary University of London WLCG Tier-2 Grid site has been providing GPU resources on the Grid since 2016. GPUs are an important modern tool to assist in data analysis. They have historically been used to accelerate computationally expensive but parallelisable workloads using frameworks such as OpenCL and CUDA. However, more recently their power in accelerating machine learning,...
The number of women in technical computing roles in the HEP community hovers at around 15%. At the same time there is a growing body of research to suggest that diversity, in all its forms, brings positive impact on productivity and wellbeing. These aspects are directly in line with many organisations’ values and missions, including CERN. Although proactive efforts to recruit more women in our...
In a HEP computing center, at least one batch system is used. As an example, at IHEP we have used three batch systems: PBS, HTCondor and Slurm. After running PBS as the local batch system for 10 years, we replaced it with HTCondor (for HTC) and Slurm (for HPC). During that period, problems came up on both the user and admin sides.
On the user side, the new batch systems bring a set of new commands, which...
The Information Service (IS) is an integral part of the Trigger and Data Acquisition (TDAQ) system of the ATLAS experiment at the Large Hadron Collider (LHC) at CERN. The IS allows online publication of operational monitoring data, and it is used by all sub-systems and sub-detectors of the experiment to constantly monitor their hardware and software components including more than 25000...
The advent of computing resources with co-processors, for example Graphics Processing Units (GPU) or Field-Programmable Gate Arrays (FPGA), for use cases like the CMS High-Level Trigger (HLT) or data processing at leadership-class supercomputers imposes challenges for the current data processing frameworks. These challenges include developing a model for algorithms to offload their...
In recent years, proficiency in data science and machine learning (ML) has become one of the most requested skills for jobs in both industry and academia. Machine learning algorithms typically require large sets of data to train the models and extensive usage of computing resources both for training and inference. Especially for deep learning algorithms, training performance can be dramatically...
Cloud Services for Synchronization and Sharing (CS3) have become increasingly popular in the European Education and Research landscape in recent years. Services such as CERNBox, SWITCHdrive, CloudStor and many more have become indispensable in everyday work for scientists, engineers and in administration.
CS3 services represent an important part of the EFSS market segment (Enterprise File...
The High-Luminosity LHC will provide an unprecedented data volume of complex collision events. The desire to keep as many of the "interesting" events for investigation by analysts implies a major increase in the scale of compute, storage and networking infrastructure required for HL-LHC experiments. An updated computing model is required to facilitate the timely publication of accurate physics...
Despite the overwhelming cosmological evidence for the existence of dark matter, and the considerable effort of the scientific community over decades, there is no evidence for dark matter in terrestrial experiments.
The GPS.DM observatory uses the existing GPS constellation as a 50,000 km-aperture sensor array, analysing the satellite and terrestrial atomic clock data for exotic physics...
For almost 10 years now, XRootD has been very successful at facilitating data management for the LHC experiments. Being the foundation and main component of numerous solutions employed within the WLCG collaboration (like EOS and DPM), XRootD grew into one of the most important storage technologies in the High Energy Physics (HEP) community. With the latest major release (5.0.0) the XRootD framework...
The Belle II experiment features a substantial upgrade of the Belle detector and will operate at the SuperKEKB energy-asymmetric $e^+ e^-$ collider at KEK in Tsukuba, Japan. The accelerator successfully completed the first phase of commissioning in 2016 and the Belle II detector saw its first electron-positron collisions in April 2018. Belle II features a newly designed silicon vertex detector...
The Jiangmen Underground Neutrino Observatory (JUNO) is a multipurpose neutrino experiment, which plans to take about 2 PB of raw data each year starting from 2021. The experiment data are planned to be stored at IHEP, with another copy in Europe (CNAF, IN2P3, JINR data centers). MC simulation tasks are expected to be arranged and operated through a distributed computing system to share efforts among...
In this talk, we discuss the new-physics implications of the Two-Higgs-Doublet Model (2HDM) under various experimental constraints. As part of the work of the GAMBIT group, we use the global-fit method to constrain the parameter space, look for hints of new physics, and make predictions for further studies.
In our global fit, we include the constraints from LEP, LHC (SM-like...
The File Transfer Service, developed at CERN and in production since 2014, has become a fundamental component of the LHC experiments' workflows.
Starting from the beginning of 2018, with the participation in the EU project Extreme Data Cloud (XDC) [1] and the activities carried out in the context of the DOMA TPC [2] and QoS [3] working groups, a series of new developments and improvements has been...
The future need for simulated events for the LHC experiments and their High Luminosity upgrades is expected to increase dramatically. As a consequence, research on new fast simulation solutions, based on Deep Generative Models, is very active and initial results look promising.
We have previously reported on a prototype that we have developed, based on 3-dimensional convolutional Generative...
iTHEPHY is an ERASMUS+ project which aims at developing innovative student-centered Deeper Learning Approaches (DPA) and Project-Based teaching and learning methodologies for HE students, contributing to increasing the internationalization of physics master courses. In this talk we'll introduce the iTHEPHY project status and main goals attained, with a focus on the web-based virtual environment...
In the last couple of years, we have been actively developing the Dynamic On-Demand Analysis Service (DODAS) as an enabling technology to deploy container-based clusters over any Cloud infrastructure with almost zero effort. The DODAS engine is driven by high-level templates written in the TOSCA language, which allow the complexity of many configuration details to be abstracted. DODAS is...
Predictions of the LHC computing requirements for Run 3 and Run 4 (HL-LHC) over the course of the next 10 years show a considerable gap between required and available resources, assuming budgets will globally remain flat at best. This will require some radical changes to the computing models for the data processing of the LHC experiments. Concentrating computational resources in fewer...
LHCb is one of the major experiments operating at the Large Hadron Collider at CERN. The richness of the physics program and the increasing precision of the measurements in LHCb lead to the need for ever larger simulated samples. This need will increase further when the upgraded LHCb detector starts collecting data in LHC Run 3. Given the computing resources pledged for the production...
With GPUs and other kinds of accelerators becoming ever more accessible, and High Performance Computing centres all around the world using them ever more, ATLAS has to find the best way of making use of such accelerators in much of its computing.
Tests with GPUs -- mainly with CUDA -- have been performed in the past in the experiment. At that time the conclusion was that it was not advantageous...
The International Particle Physics Outreach Group (IPPOG) is a network of scientists, science educators and communication specialists working across the globe in informal science education and outreach for particle physics. IPPOG’s flagship activity is the International Particle Physics Masterclass programme, which provides secondary students with access to particle physics data using...
The LHCb high level trigger (HLT) is split in two stages. HLT1 is synchronous with collisions delivered by the LHC and writes its output to a local disk buffer, which is asynchronously processed by HLT2. Efficient monitoring of the data being processed by the application is crucial to promptly diagnose detector or software problems. HLT2 consists of approximately 50000 processes and 4000...
Low latency, high throughput data processing in distributed environments is a key requirement of today's experiments. Storage events facilitate synchronisation with external services where the widely adopted request-response pattern does not scale because of polling as a long-running activity. We discuss the use of an event broker and stream processing platform (Apache Kafka) for storage...
The XRootD software framework is essential for data access at WLCG sites. The WLCG community is exploring and expanding XRootD functionality. This presents a particular challenge at the RAL Tier-1 as the Echo storage service is a Ceph based Erasure Coded object store. External access to Echo uses gateway machines which run GridFTP and XRootD servers. This paper will describe how third party...
Belle II uses a Geant4-based simulation to determine the detector response to the generated decays of interest. A realistic detector simulation requires the inclusion of noise from beam-induced backgrounds. This is accomplished by overlaying random trigger data on the simulated signal. To have statistically independent Monte-Carlo events, a high number of random trigger events is desirable....
The future upgraded High Luminosity LHC (HL-LHC) is expected to deliver about 5 times higher instantaneous luminosity than the present LHC, producing pile-up up to 200 interactions per bunch crossing. As a part of its phase-II upgrade program, the CMS collaboration is developing a new end-cap calorimeter system, the High Granularity Calorimeter (HGCAL), featuring highly-segmented hexagonal...
The ALICE Experiment at the CERN LHC (Large Hadron Collider) is undertaking a major upgrade during LHC Long Shutdown 2 in 2019-2020, which includes a new computing system called O² (Online-Offline). The raw data input from the ALICE detectors will then increase a hundredfold, up to 3.4 TB/s. In order to cope with such a large amount of data, a new online-offline computing system, called O², will...
When the LHC started data taking in 2009, the data rates were unprecedented for the time and forced the WLCG community to develop a range of tools for managing their data across many different sites. A decade later, other science communities are finding their data requirements have grown far beyond what they can easily manage and are looking for help. The RAL Tier-1’s primary mission has always...
Searches for beyond-Standard Model physics at the LHC have thus far not uncovered any evidence of new particles, and this is often used to state that new particles with low mass are now excluded. Using the example of the supersymmetric partners of the electroweak sector of the Standard Model, I will present recent results from the GAMBIT collaboration that show that there is plenty of room for...
The ALICE experiment was originally designed as a relatively low-rate experiment, in particular given the limitations of the Time Projection Chamber (TPC) readout system using MWPCs. This will no longer be the case in LHC Run 3, scheduled to start in 2021.
After the LS2 upgrades, including a new silicon tracker and a GEM-based readout for the TPC, ALICE will operate at a peak Pb-Pb...
The Geant4 electromagnetic (EM) physics sub-packages are an important component of LHC experiment simulations. During long shutdown 2 of the LHC these packages are under intensive development, and in this work we report on progress for the new Geant4 version 10.6. These developments include modifications that speed up computations for EM physics, improve EM models, extend the set of models, and...
GNA is a high-performance fitter designed to handle large-scale models with a large number of parameters. Following the data flow paradigm, the model in GNA is built as a directed acyclic graph. Each node (transformation) of the graph represents a function that operates on vectorized data. A library of transformations, implementing various functions, is precompiled. The graph itself is assembled...
The unprecedented computing resource needs of the ATLAS experiment have motivated the Collaboration to become a leader in exploiting High Performance Computers (HPCs). To meet the requirements of HPCs, the PanDA system has been equipped with two new components, Pilot 2 and Harvester, that were designed with HPCs in mind. While Harvester is a resource-facing service which provides resource...
The “Third Party Copy” (TPC) Working Group in the WLCG’s “Data Organization, Management, and Access” (DOMA) activity was proposed during a CHEP 2018 Birds of a Feather session in order to help organize the work toward developing alternatives to the GridFTP protocol. Alternate protocols enable the community to diversify; explore new approaches such as alternate authorization mechanisms; and...
MPI-learn and MPI-opt are libraries to perform large-scale training and hyper-parameter optimization for deep neural networks. The two libraries, based on the Message Passing Interface, allow these tasks to be performed on GPU clusters through different kinds of parallelism. The main characteristic of these libraries is their flexibility: the user has complete freedom in building her own model,...
The traditional HEP analysis model uses successive processing steps to reduce the initial dataset to a size that permits real-time analysis. This iterative approach requires significant CPU time and storage of large intermediate datasets and may take weeks or months to complete. Low-latency, query-based analysis strategies are being developed to enable real-time analysis of primary datasets by...
The International Particle Physics Outreach Group (IPPOG) is a network of scientists, science educators and communication specialists working across the globe in informal science education and outreach for particle physics. The primary methodology adopted by IPPOG requires the direct involvement of scientists active in current research with education and communication specialists, in order to...
The increase in luminosity by a factor of 100 for the HL-LHC with respect to Run 1 poses a big challenge from the data analysis point of view. It demands a comparable improvement in software and processing infrastructure. The use of GPU-enhanced supercomputers will increase the available computing power, and analysis languages will have to be adapted to integrate them. The particle physics...
I describe a novel interactive virtual reality visualization of the Belle II detector at KEK and the animation therein of GEANT4-simulated event histories. Belle2VR runs on Oculus and Vive headsets (as well as in a web browser and on 2D computer screens, in the absence of a headset). A user with some particle-physics knowledge manipulates a gamepad or hand controller(s) to interact with and...
We report on performance measurements and optimizations of the event-builder software for the CMS experiment at the CERN Large Hadron Collider (LHC). The CMS event builder collects event fragments from several hundred sources. It assembles them into complete events that are then handed to the High-Level Trigger (HLT) processes running on O(1000) computers. We use a test system with 16...
The STAR Heavy Flavor Tracker (HFT) has enabled a rich physics program, providing important insights into heavy quark behavior in heavy ion collisions. Acquiring data during the 2014 through 2016 runs at the Relativistic Heavy Ion Collider (RHIC), the HFT consisted of four layers of precision silicon sensors. Used in concert with the Time Projection Chamber (TPC), the HFT enables the...
With an increased dataset obtained during CERN LHC Run-2, the even larger forthcoming Run-3 data and more than an order of magnitude expected increase for HL-LHC, the ATLAS experiment is reaching the limits of the current data production model in terms of disk storage resources. The anticipated availability of an improved fast simulation will enable ATLAS to produce significantly larger Monte...
For the past several years, IceCube has embraced a central, global overlay grid of HTCondor glideins to run jobs. With guaranteed network connectivity, the jobs themselves transferred data files, software, logs, and status messages. Then we were given access to a supercomputer, with no worker node internet access. As the push towards HPC increased, we had access to several of these...
The CERN IT department has been maintaining different HPC facilities over the past five years, one on Windows and the other on Linux, as the bulk of computing facilities at CERN run under Linux. The Windows cluster has been dedicated to engineering simulations and analysis problems. This cluster is a High Performance Computing (HPC) cluster thanks to powerful hardware and low-latency...
Modern hardware is trending towards increasingly parallel and heterogeneous architectures. Contemporary machine processors are spread across multiple sockets, where each socket can access some system memory faster than the rest, creating non-uniform memory access (NUMA). Efficiently utilizing these NUMA machines is becoming increasingly important. This paper examines the latest Intel Skylake and...
Since its earliest days, the Worldwide LHC Computing Grid (WLCG) has relied on GridFTP to transfer data between sites. The announcement that Globus is dropping support of its open source Globus Toolkit (GT), which forms the basis for several FTP clients and servers, has created an opportunity to reevaluate the use of FTP. HTTP-TPC, an extension to HTTP compatible with WebDAV, has arisen...
VecGeom is a geometry modeller library with hit-detection features as needed by particle detector simulation at the LHC and beyond. It was incubated by a Geant R&D initiative, motivated by the goal of combining the code of Geant4 and ROOT/TGeo into a single, more maintainable piece of software within the EU-AIDA program.
So far, VecGeom is mainly used by LHC experiments as a geometry primitive...
Over the last 5 years, Accelogic has pioneered and perfected a radically new theory of numerical computing codenamed “Compressive Computing”, which has an extremely profound impact on real-world computer science. At the core of this new theory is the discovery of one of its fundamental theorems which states that, under very general conditions, the vast majority (typically between 70% and 80%) of the...
The upcoming generation of exascale HPC machines will all have most of their computing power provided by GPGPU accelerators. In order to be able to take advantage of this class of machines for HEP Monte Carlo simulations, we started to develop a Geant pilot application as a collaboration between HEP and the Exascale Computing Project. We will use this pilot to study and characterize how the...
The hardware landscape used in HEP and NP is changing from homogeneous multi-core systems towards heterogeneous systems with many different computing units, each with their own characteristics. To achieve maximum data processing performance, the main challenge is to place the right computing on the right hardware.
In this paper we discuss CLAS12 charged particle tracking workload partitioning...
The Compressed Baryonic Matter (CBM) experiment is currently under construction at the GSI/FAIR accelerator facility in Darmstadt, Germany. In CBM, all event selection is performed in a large online processing system, the “First-level Event Selector” (FLES). The data are received from the self-triggered detectors at an input-stage computer farm designed for a data rate of 1 TByte/s. The...
The SKA will enable the production of full polarisation spectral line cubes at a very high spatial and spectral resolution. A back-of-the-envelope estimate yields around 75-100 million tasks to run in parallel to perform a state-of-the-art faceting algorithm (assuming that it would spawn off just one task per facet, which is not the case). This simple...
We present an interactive game for up to seven players that demonstrates the challenges of on-line event selection at the Compact Muon Solenoid (CMS) experiment to the public. The game - in the shape of a popular classic pinball machine - was conceived and prototyped by an interdisciplinary team of graphic designers, physicists and engineers at the CMS Create hackathon in 2016. Having won the...
Third Party Copy (TPC) has existed in the pure XRootD storage environment for many years. However, using XRootD TPC in the WLCG environment presents additional challenges due to the diversity of the storage systems involved, such as EOS, dCache, DPM and ECHO, requiring that we carefully navigate the unique constraints imposed by these storage systems and their site-specific environments...
Covariance matrices are used for a wide range of applications in particle physics, including Kalman filters for tracking purposes, as well as for Principal Component Analysis and other dimensionality reduction techniques. The covariance matrix contains covariance and variance measures between all permutations of data dimensions, leading to high computational cost.
By using a novel decomposition...
The rapid economic growth is building new trends in careers. Almost every domain, including high-energy physics, needs people with strong capabilities in programming. In this evolving environment, it is highly desirable that young people are equipped with computational thinking (CT) skills, such as problem-solving and logical thinking, as well as the ability to develop software applications...
The HL-LHC and the corresponding detector upgrades for the CMS experiment will present extreme challenges for the full simulation. In particular, increased precision in models of physics processes may be required for accurate reproduction of particle shower measurements from the upcoming High Granularity Calorimeter. The CPU performance impacts of several proposed physics models will be...
ATLAS Computing Management has identified the migration of all resources to Harvester, PanDA’s new workload submission engine, as a critical milestone for Run 3 and 4. This contribution will focus on the Grid migration to Harvester.
We have built a redundant architecture based on CERN IT’s common offerings (e.g. Openstack Virtual Machines and Database on Demand) to run the necessary Harvester...
The anticipated increase in storage requirements for the forthcoming HL-LHC data rates is not matched by a corresponding increase in budget. This results in a shortfall in available resources if the computing models remain unchanged. Therefore, effort is being invested in looking for new and innovative ways to optimise the current infrastructure, so minimising the impact of this...
Detailed simulation is one of the most expensive tasks, in terms of time and computing resources, for High Energy Physics experiments. The need for simulated events will dramatically increase for the next-generation experiments, like the ones that will run at the High Luminosity LHC. The computing model must evolve and, in this context, alternative fast simulation solutions are being studied....
Optimization of computing resources, in particular storage, the costliest one, is a tremendous challenge for the High Luminosity LHC (HL-LHC) program. Several avenues are being investigated to address the storage issues foreseen for HL-LHC. Our expectation is that savings can be achieved in two primary areas: optimization of the use of various storage types and reduction of the required...
ALICE (A Large Ion Collider Experiment), one of the large LHC experiments, is currently undergoing a significant upgrade. The increase in data rates planned for LHC Run 3, together with triggerless continuous readout operation, requires a new type of networking and data processing infrastructure.
The new ALICE O2 (online-offline) computing facility consists of two types of nodes: First Level...
Fluidic Data is a floor-to-ceiling installation spanning the four levels of the CERN Data Centre stairwell. It utilizes the interplay of water and light to visualize the magnitude and flow of information coming from the four major LHC experiments. The installation consists of an array of transparent hoses that house colored fluid, symbolizing the data of each experiment, surrounded by a...
The WLCG is today comprised of a range of different types of resources such as cloud centers, large and small HPC centers, volunteer computing as well as the traditional grid resources. The Nordic Tier 1 (NT1) is a WLCG computing infrastructure distributed over the Nordic countries. The NT1 deploys the Nordugrid ARC CE, which is non-intrusive and lightweight, originally developed to cater for...
The ATLAS experiment has successfully integrated High-Performance Computing (HPC) resources in its production system. Unlike the current generation of HPC systems, and the LHC computing grid, the next generation of supercomputers is expected to be extremely heterogeneous in nature: different systems will have radically different architectures, and most of them will provide partitions optimized...
The JUNO (Jiangmen Underground Neutrino Observatory) experiment is a multi-purpose neutrino experiment designed to determine the neutrino mass hierarchy and precisely measure oscillation parameters. It is composed of a 20 kton liquid scintillator central detector equipped with 18000 20’’ PMTs and 25000 3’’ PMTs, a water pool with 2000 20’’ PMTs, and a top tracker. Monte-Carlo simulation is a...
Scikit-HEP is a community-driven and community-oriented project with the goal of providing an ecosystem for particle physics data analysis in Python. Scikit-HEP is a toolset of approximately twenty packages and a few “affiliated” packages. It expands the typical Python data analysis tools for particle physicists. Each package focuses on a particular topic, and interacts with other packages in...
One of the most costly factors in providing a global computing infrastructure such as the WLCG is the human effort in deployment, integration, and operation of the distributed services supporting collaborative computing, data sharing and delivery, and analysis of extreme scale datasets. Furthermore, the time required to roll out global software updates, introduce new service components, or...
The COFFEA Framework provides a new approach to HEP analysis, via columnar operations, that improves time-to-insight, scalability, portability, and reproducibility of analysis. It is implemented with the Python programming language and commodity big data technologies such as Apache Spark and NoSQL databases. To achieve this suite of improvements across many use cases, COFFEA takes a factorized...
High Energy Physics experiments face unique challenges when running their computation on High Performance Computing (HPC) resources. The LZ dark matter detection experiment has two data centers, one each in the US and UK, to perform computations. Its US data center uses the HPC resources at NERSC.
In this talk, I will describe the current computational workflow of the LZ experiment, detailing...
The LHCb experiment will be upgraded in 2021 and a new trigger-less readout system will be implemented. In the upgraded system, both event building (EB) and event selection will be performed in software for every collision produced in every bunch-crossing of the LHC. In order to transport the full data rate of 32 Tb/s, we will use state-of-the-art off-the-shelf network technologies, e.g....
Public Engagement (PE) with science should be more than “fun” for the staff involved. PE should be a strategic aim of any publicly funded science organisation to ensure the public develops an understanding and appreciation of their work, its benefits to everyday life and to ensure the next generation is enthused to take up STEM careers. Most scientific organisations do have aims to do this,...
Many of the challenges faced by the LHC experiments (aggregation of distributed computing resources, management of data across multiple storage facilities, integration of experiment-specific workflow management tools across multiple grid services) are similarly experienced by "midscale" high energy physics and astrophysics experiments, particularly as their data set volumes are increasing at...
The ALICE experiment at the CERN LHC will feature several upgrades for Run 3, one of which is a new inner tracking system (ITS). The ITS upgrade is currently under development and commissioning. The new ITS will be installed during the ongoing long shutdown 2.
The specification for the ITS upgrade calls for event rates of up to 100 kHz for Pb-Pb, and 400 kHz for pp, which is two orders of...
HL-LHC will confront the WLCG community with enormous data storage, management and access challenges. These are as much technical as economic. In the WLCG-DOMA Access working group, members of the experiments and site managers have explored different models for data access and storage strategies to reduce cost and complexity, taking into account the boundary conditions given by our...
Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle...
GUM is a new feature of the GAMBIT global fitting software framework, which provides a direct interface between Lagrangian level tools and GAMBIT. GUM automatically writes GAMBIT routines to compute observables and likelihoods for physics beyond the Standard Model. I will describe the structure of GUM, the tools (within GAMBIT) it is able to create interfaces to, and the observables it is able...
In this paper we present the latest CMS open data release published on the CERN Open Data portal. The samples of raw datasets, collision and simulated datasets were released together with the detailed information about the data provenance. The data production chain covers the necessary compute environments, the configuration files and the computational procedures used in each data production...
In the near future, large scientific collaborations will face unprecedented computing challenges. Processing and storing exabyte datasets requires a federated infrastructure of distributed computing resources. The current systems have proven to be mature and capable of meeting the experiment goals, by allowing timely delivery of scientific results. However, a substantial number of interventions...
The envisaged Storage and Compute needs for the HL-LHC will be a factor up to 10 above what can be achieved by the evolution of current technology within a flat budget. The WLCG community is studying possible technical solutions to evolve the current computing in order to cope with the requirements; one of the main focuses is resource optimization, with the ultimate objective of improving...
The WLCG Web Proxy Auto Discovery (WPAD) service provides a convenient mechanism for jobs running anywhere on the WLCG to dynamically discover web proxy cache servers that are nearby. The web proxy caches are general purpose for a number of different http applications, but different applications have different usage characteristics and not all proxy caches are engineered to work with the...
Within the FAIR Phase-0 program the fast algorithms of the FLES (First-Level Event Selection) package developed for the CBM experiment (FAIR/GSI, Germany) are adapted for online and offline processing in the STAR experiment (BNL, USA). Using the same algorithms creates a bridge between online and offline. This makes it possible to combine online and offline resources for data...
The CMS computing infrastructure is composed of several subsystems that accomplish complex tasks such as workload and data management, transfers, submission of user and centrally managed production requests. Until recently, most subsystems were monitored through custom tools and web applications, and logging information was scattered in several sources and typically accessible only by experts....
Computing needs projections for the HL-LHC era (2026+), following the current computing models, indicate that much larger resource increases would be required than those that technology evolution at a constant budget could bring. Since the worldwide budget for computing is not expected to increase, many research activities have emerged to improve the performance of the LHC processing software...
Within the ATLAS detector, the Trigger and Data Acquisition system is responsible for the online processing of data streamed from the detector during collisions at the Large Hadron Collider (LHC) at CERN. The online farm is composed of ~4000 servers processing the data read out from ~100 million detector channels through multiple trigger levels. The capability to monitor the ongoing data...
Software improvements in the ATLAS Geant4-based simulation are critical to keep up with the evolving hardware and increasing luminosity. Geant4 simulation currently accounts for about 50% of CPU consumption in ATLAS and it is expected to remain the leading CPU load during Run 4 (HL-LHC upgrade) with an approximately 25% share in the most optimistic computing model. The ATLAS experiment...
In big physics experiments, as simulation, reconstruction and analysis become more sophisticated, scientific reproducibility is not a trivial task. Software is one of the biggest challenges. Modularity is a common practice in software engineering that facilitates quality and reusability of code. However, it often introduces nested dependencies that are not obvious for physicists to work with. Package...
The CMS collaboration at the CERN LHC has made more than one petabyte of open data available to the public, including large parts of the data which formed the basis for the discovery of the Higgs boson in 2012. Apart from their scientific value, these data can be used not only for education and outreach, but also for open benchmarks of analysis software. However, in their original format, the...
Background field methods offer an approach through which fundamental non-perturbative hadronic properties can be studied. Lattice QCD is the only ab initio method with which Quantum Chromodynamics can be studied at low energies; it involves numerically calculating expectation values in the path integral formalism. This requires substantial investment in high performance supercomputing...
HEP experiments simulate the detector response by accessing all needed data and services within their own software frameworks. However, decoupling the simulation process from the experiment infrastructure can be useful for a number of tasks, amongst them the debugging of new features, or the validation of multithreaded vs sequential simulation code and the optimization of algorithms for HPCs....
For the last 10 years, the ATLAS Distributed Computing project has based its monitoring infrastructure on a set of custom designed dashboards provided by CERN-IT. This system functioned very well for LHC Runs 1 and 2, but its maintenance has progressively become more difficult and the conditions for Run 3, starting in 2021, will be even more demanding; hence a more standard code base and more...
The University of California system has excellent networking between all of its campuses as well as a number of other Universities in CA, including Caltech, most of them being connected at 100 Gbps. UCSD and Caltech have thus joined their disk systems into a single logical xcache system, with worker nodes from both sites accessing data from disks at either site. This setup has been in place...
Open Data Science Mesh (CS3MESH4EOSC) is a newly funded project to create a new generation, interoperable federation of data and higher-level services to enable friction-free collaboration between European researchers.
This new EU-funded project brings together 12 partners from the CS3 community (Cloud Synchronization and Sharing Services). The consortium partners include CERN, Danish...
In 2021 the LHCb experiment will be upgraded, and the DAQ system will be based on full reconstruction of events, at the full LHC crossing rate. This requires an entirely new system, capable of reading out, building and reconstructing events at an average rate of 30 MHz. In facing this challenge, the system could take advantage of a fast pre-processing of data on dedicated FPGAs. We present the...
Development of scientific software has always presented challenges to its practitioners, among other things due to its inherently collaborative nature. Software systems often consist of up to several dozen closely-related packages developed within a particular experiment or related ecosystem, with up to a couple of hundred externally-sourced dependencies. Making improvements to one such...
With the unprecedented high luminosity delivered by the LHC, detector readout and data storage limitations severely limit searches for processes with high-rate backgrounds. One example is the search for mediators of the interactions between the Standard Model and dark matter, decaying to hadronic jets. Traditional signatures and data taking techniques limit these searches to masses...
The ATLAS Collaboration is releasing a new set of recorded and simulated data samples at a centre-of-mass energy of 13 TeV. This new dataset was designed after an in-depth review of the usage of the previous release of samples at 8 TeV. That review showed that capacity-building is one of the most important and abundant uses of public ATLAS samples. To fulfil the requirements of the community...
The central Monte-Carlo production of the CMS experiment utilizes the WLCG infrastructure and manages thousands of tasks daily, each comprising up to thousands of jobs. The distributed computing system is bound to sustain a certain rate of failures of various types, which are currently handled by computing operators a posteriori. Within the context of computing operations, and operation intelligence, we...
A general problem faced by computing on the grid for opportunistic users is that delivering opportunistic cycles is simpler than delivering opportunistic storage. In this project we show how we integrated XRootD caches placed on the internet backbone to simulate a content delivery network for general science workflows. We will show that for some workflows on LIGO, DUNE, and...
There exists a long-standing discrepancy of around 3.5 sigma between experimental measurements and Standard Model calculations of the magnetic moment of the muon. Current experiments aim to reduce the experimental uncertainty by a factor of 4, and Standard Model calculations must also be improved by a similar order. The largest uncertainty in the Standard Model calculation comes from the QCD...
The pattern recognition of the trajectories of charged particles is at the core of the computing challenge for the HL-LHC and is currently a very active area of research. There has also been rapid progress in the development of quantum computers, including the D-Wave quantum annealer. In this talk we will discuss results from our project investigating the use of annealing...
The Heavy Photon Search (HPS) is an experiment at the Thomas Jefferson National Accelerator Facility designed to search for a hidden sector photon (A’) in fixed-target electro-production. It uses a silicon micro-strip tracking and vertexing detector inside a dipole magnet to measure charged particle trajectories and a fast lead-tungstate crystal calorimeter just downstream of the magnet to...
Perform data analysis and visualisation on your own computer? Yes, you can! Commodity computers are now very powerful in comparison to only a few years ago. On top of that, the performance of today's software and data development techniques facilitates complex computation with fewer resources. Cloud computing is not always the solution, and reliability or even privacy is regularly a concern....
The ATLAS Spanish Tier-1 and Tier-2s have more than 15 years of experience in the deployment and development of LHC computing components and their successful operations. The sites are already actively participating in, and even coordinating, emerging R&D computing activities developing the new computing models needed in the LHC Run3 and HL-LHC periods.
In this contribution, we present details...
Computing the gluon component of momentum in the nucleon is a difficult and computationally expensive problem, as the matrix element involves a quark-line-disconnected gluon operator which suffers from ultra-violet fluctuations. A successful determination also requires the non-perturbative renormalisation of this operator. We investigate this renormalisation here by direct...
With the increase of storage needs at the HL-LHC horizon, data management and access will be very challenging for this critical service. The evaluation of possible solutions within the DOMA, DOMA-FR (IN2P3 project contribution to DOMA) and ESCAPE initiatives is a major activity to select the best ones from the experiment and site points of view. The LAPP and LPSC teams have put...
The large volume of data expected to be produced by the Belle II experiment presents the opportunity for studies of rare, previously inaccessible processes. Investigating such rare processes in a high data volume environment necessitates a correspondingly high volume of Monte Carlo simulations to prepare analyses and gain a deep understanding of the contributing physics processes to each...
At the HL-LHC, ATLAS and CMS will see proton bunch collisions with track multiplicities of up to 10,000 charged tracks per event. Algorithms need to be developed to harness the increased combinatorial complexity. To engage the Computer Science community to contribute new ideas, we have organized a Tracking Machine Learning challenge (TrackML). Participants are provided events with 100k 3D...
Data movement between sites, replication and storage are very expensive operations, in terms of time and resources, for the LHC collaborations, and are expected to be even more so in the future. In this work we derived usage patterns based on traces and logs from the data and workflow management systems of CMS and ATLAS, and simulated the impact of different caching and data lifecycle...
DESY is one of the largest accelerator laboratories in Europe, developing and operating state-of-the-art accelerators, used to perform fundamental science in the areas of high-energy physics, photon science and accelerator development.
While for decades high energy physics has been the most prominent user of the DESY compute, storage and network infrastructure, various scientific...
We describe the dataset of very rare events recorded by the OPERA experiment. Those events represent tracks of particles associated with tau neutrinos that emerged from a pure muon neutrino beam, due to neutrino oscillations. The OPERA detector, located in the underground Gran Sasso Laboratory, consisted of an emulsion/lead target with an average mass of about 1.2 kt, complemented by the electronic...
Development of the second generation JANA2 multi-threaded event processing framework is ongoing through an LDRD initiative grant at Jefferson Lab. The framework is designed to take full advantage of all cores on modern many-core compute nodes. JANA2 efficiently handles both traditional hardware triggered event data and streaming data in online triggerless environments. Development is being...
In this work we review existing monitoring outputs and recommend some novel alternative approaches to improve the comprehension of large volumes of operations data that are produced in distributed computing. Current monitoring output is dominated by the pervasive use of time-series histograms showing the evolution of various metrics. These can quickly overwhelm or confuse the viewer due to the...
Estimates of the CPU resources that will be needed to produce simulated data for the future runs of the ATLAS experiment at the LHC indicate a compelling need to speed up the process to reduce the computational time required. While different fast simulation projects are ongoing (FastCaloSim, FastChain, etc.), full Geant4 based simulation will still be heavily used and is expected to consume...
Based on work in the ROOTLINQ project, we’ve re-written a functional declarative analysis language in Python. With a declarative language, the physicist specifies what they want to do with the data, rather than how they want to do it. Then the system translates the intent into actions. Using declarative languages would have numerous benefits for the LHC community, ranging from analysis...
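The contrast between specifying what is wanted and how to compute it can be illustrated with a small sketch. This is not the language described in the abstract: the `Query` class, its LINQ-style method names (`select_many`, `where`, `select`) and the toy event layout are all hypothetical, chosen only to show how a recorded description of intent can be handed to a backend that decides how to execute it.

```python
from itertools import chain


def jet_pts_imperative(events):
    """Imperative style: the analyst spells out the loops (the 'how')."""
    out = []
    for ev in events:
        for jet in ev["jets"]:
            if jet["pt"] > 30.0 and abs(jet["eta"]) < 2.4:
                out.append(jet["pt"])
    return out


def _apply(kind, func, stream):
    """Translate one recorded step into an action on a lazy stream."""
    if kind == "select_many":
        return chain.from_iterable(map(func, stream))
    if kind == "where":
        return filter(func, stream)
    if kind == "select":
        return map(func, stream)
    raise ValueError(f"unknown step: {kind}")


class Query:
    """Hypothetical declarative query: it only records intent.

    Nothing is executed until run() is called, so a different backend
    could translate the same recorded steps into other actions.
    """

    def __init__(self, steps=()):
        self._steps = tuple(steps)

    def select_many(self, func):
        return Query(self._steps + (("select_many", func),))

    def where(self, pred):
        return Query(self._steps + (("where", pred),))

    def select(self, func):
        return Query(self._steps + (("select", func),))

    def run(self, source):
        stream = iter(source)
        for kind, func in self._steps:
            stream = _apply(kind, func, stream)
        return list(stream)


# Declarative style: describe the result (the 'what'); execution is deferred.
good_jet_pts = (
    Query()
    .select_many(lambda ev: ev["jets"])
    .where(lambda jet: jet["pt"] > 30.0 and abs(jet["eta"]) < 2.4)
    .select(lambda jet: jet["pt"])
)
```

Calling `good_jet_pts.run(events)` on a list of event dictionaries yields the same values as `jet_pts_imperative(events)`; the difference is that the query object can be inspected or translated before anything runs.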
A present-day detection system for charged tracks in particle physics experiments is typically composed of two or more types of detectors, so global track finding across these sub-detectors is an important topic. This contribution describes a global track finding algorithm based on the Hough Transform for a detection system consisting of a Cylindrical-Gas-Electron-Multiplier (CGEM) and a Drift...
The ARM platform extends from the mobile phone area to development board computers and servers. It could be that in the future the importance of the ARM platform will increase if new more powerful (server) boards are released. For this reason CMSSW has previously been ported to ARM in earlier work.
The CMS software is deployed using CVMFS and the jobs are run inside Singularity containers....
We will present techniques developed in collaboration with the OSiRIS project (NSF Award #1541335, UM, IU, MSU and WSU) and SLATE (NSF Award #1724821) for orchestrating software defined network slices with a goal of building reproducible and reliable computer networks for large data collaborations. With this project we have explored methods of utilizing passive and active measurements to...
BAT.jl, the Julia version of the Bayesian Analysis Toolkit, is a software package which is designed to help solve statistical problems encountered in Bayesian inference. Typical examples are the extraction of the values of the free parameters of a model, the comparison of different models in the light of a given data set, and the test of the validity of a model to represent the data set at...
Many physics analyses using the Compact Muon Solenoid (CMS) detector at the LHC require accurate, high resolution electron and photon energy measurements. Excellent energy resolution is crucial for studies of Higgs boson decays with electromagnetic particles in the final state, as well as searches for very high mass resonances decaying to energetic photons or electrons. The CMS electromagnetic...
During the last few years, the EOS distributed storage system at CERN has seen a steady increase in use, both in terms of traffic volume as well as sheer amount of stored data.
This has brought the unwelcome side effect of stretching the EOS software stack to its design constraints, resulting in frequent user-facing issues and occasional downtime of critical services.
In this paper, we...
The LHCb detector will be upgraded in 2021, when the hardware-level trigger will be replaced by a High Level Trigger 1 software trigger that needs to process the full 30 MHz data-collision rate. As part of the efforts to create a GPU High Level Trigger 1, tracking algorithms need to be optimized for SIMD architectures in order to achieve high throughput. We present an SPMD (Single Program,...
Conditions databases are an important class of database applications where the database is used to record the state of a set of quantities as a function of observation time. Conditions databases are used in High Energy Physics to record the state of the detector apparatus during data taking, and then to use the data during the event reconstruction and analysis phases.
At FNAL, we...
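Recording the state of a quantity as a function of observation time, and retrieving the state valid at a given time, can be sketched as an interval-of-validity lookup. The following is a minimal in-memory illustration under stated assumptions, not the FNAL implementation: the class name `ConditionsFolder`, its methods and the integer timestamps are hypothetical.

```python
import bisect


class ConditionsFolder:
    """Hypothetical store for one conditions quantity.

    Each payload is valid from its own start time until the start
    time of the next stored payload.
    """

    def __init__(self):
        self._starts = []    # sorted validity start times
        self._payloads = []  # payloads, parallel to self._starts

    def store(self, valid_from, payload):
        """Record that `payload` describes the state from `valid_from` on."""
        i = bisect.bisect_left(self._starts, valid_from)
        if i < len(self._starts) and self._starts[i] == valid_from:
            # Same start time already present: replace the payload.
            self._payloads[i] = payload
        else:
            self._starts.insert(i, valid_from)
            self._payloads.insert(i, payload)

    def lookup(self, t):
        """Return the payload whose validity interval contains time `t`."""
        # Index of the last start time that is <= t.
        i = bisect.bisect_right(self._starts, t) - 1
        if i < 0:
            raise KeyError(f"no conditions recorded at or before t={t}")
        return self._payloads[i]
```

During reconstruction, an event timestamp would be passed to `lookup` to obtain the detector state in force when the event was taken; a binary search keeps the lookup logarithmic in the number of stored intervals.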
China Spallation Neutron Source (CSNS) is a large science facility that is publicly available to researchers from all over the world. The CSNS data platform aims to provide diverse data and computing support; its design philosophy is data safety, big-data sharing, and user convenience.
In order to manage scientific data, a metadata catalogue based on ICAT is built to manage full...
ATLAS Metadata Interface (AMI) is a generic ecosystem for metadata aggregation, transformation and cataloging benefiting from about 20 years of feedback in the LHC context. This poster describes the design principles of the Metadata Querying Language (MQL) implemented in AMI, a metadata-oriented domain-specific language that allows databases to be queried without knowing the relation between tables....
During the third long shutdown of the CERN Large Hadron Collider, the CMS Detector will undergo a major upgrade to prepare for Phase-2 of the CMS physics program, starting around 2026. Upgrade projects will replace or improve detector systems to provide the necessary physics performance under the challenging conditions of high luminosity at the HL-LHC. Among other upgrades, the new CMS...
With the evolution of the WLCG towards opportunistic resource usage and cross-site data access, new challenges for data analysis have emerged in recent years. To enable performant data access without relying on static data locality, distributed caching aims at providing data locality dynamically. Recent work successfully employs various approaches for effective and coherent caching, from...
With the explosion of the number of distributed applications, a new dynamic server environment emerged grouping servers into clusters, utilization of which depends on the current demand for the application. To provide reliable and smooth services it is crucial to detect and fix possible erratic behavior of individual servers in these clusters. Use of standard techniques for this purpose...
Large experiments in high energy physics require efficient and scalable monitoring solutions to digest data of the detector control system. Plotting multiple graphs in the slow control system and extracting historical data for long time periods are resource intensive tasks. The proposed solution leverages the new virtualization, data analytics and visualization technologies such as InfluxDB...
The DUNE Collaboration has successfully implemented and currently operates
an experimental program based at CERN which includes a beam test and an extended
cosmic ray run of two large-scale prototypes of the DUNE Far Detector. The volume of data already collected by the protoDUNE-SP (the single-phase Liquid Argon TPC prototype) amounts to approximately 3PB and the sustained rate of data sent...
The goal to obtain more precise physics results in current collider experiments drives the plans to significantly increase the instantaneous luminosity collected by the experiments. The increasing complexity of the events due to the resulting increased pileup requires new approaches to triggering, reconstruction, analysis,
and event simulation. The last task leads to a critical problem:...
The second-generation Belle II experiment at the SuperKEKB colliding-beam accelerator in Japan searches for new-physics signatures and studies the behaviour of heavy quarks and leptons produced in electron-positron collisions. The KLM (K-long and Muon) subsystem of Belle II identifies long-lived neutral kaons via hadronic-shower byproducts and muons via their undeflected penetration through...
Triple-GEM detectors are gaseous devices used in high energy physics to measure the path of the particles which cross them. The characterisation of triple GEM detectors and the estimation of the performance for real data experiments require a complete comprehension of the mechanisms which transform the passage of one particle in the detector into electric signals, and dedicated MonteCarlo...
NOvA is a long-baseline neutrino experiment aiming to study the neutrino oscillation phenomenon in the muon neutrino beam from the NuMI complex at Fermilab (USA). Two identical detectors have been built to measure the initial neutrino flux spectra at the near site and the oscillated one at an 810 km distance, which significantly reduces many systematic uncertainties. To improve electron neutrino and...
This paper presents the network architecture of the TIER 1 data center at JINR using the modern multichannel data transfer protocol TRILL. The experimental data obtained follow from our continuing study of the nature of traffic distribution in redundant topologies. Several questions arise. How is packet data distributed across four (or more) equivalent routes? What happens when the...
ATLAS Metadata Interface (AMI) is a generic ecosystem for metadata aggregation, transformation and cataloging. Benefiting from about 20 years of feedback in the LHC context, the second major version was released in 2018. This poster describes how to install and administer AMI version 2. A particular focus is given to the registration of existing databases in AMI, the adding of additional...
The Jiangmen Underground Neutrino Observatory (JUNO) is designed to primarily measure the neutrino mass hierarchy. The JUNO central detector (CD) would be the world's largest liquid scintillator (LS) detector with an unprecedented energy resolution of $3\%/\sqrt{E(\mathrm{MeV})}$ and a superior energy nonlinearity better than 1%. A calibration complex, including Cable Loop System (CLS), Guide Tube...
The JUNO (Jiangmen Underground Neutrino Observatory) is designed to determine the neutrino mass hierarchy and precisely measure oscillation parameters. The JUNO central detector is a 20 kt spherical volume of liquid scintillator (LS) with 35m diameter instrumented with 18,000 20-inch photomultiplier tubes (PMTs). Neutrinos are captured by protons of the target via the inverse beta decay...
In modern physics experiments, data analysis needs considerable computing capacity. Computing resources of a single site are often limited and distributed computing is often inexpensive and flexible. While several large-scale grid solutions exist, for example DiRAC (Distributed Infrastructure with Remote Agent Control), there are few schemes devoted to solving the problem at small scale. For the...
This work addresses key technological challenges in the preparation of data pipelines for machine learning and deep learning at scale of interest for HEP. A novel prototype to improve the event filtering system at LHC experiments, based on a classifier trained using deep neural networks has recently been proposed by T. Nguyen et al. https://arxiv.org/abs/1807.00083. This presentation covers...
The CMS Collaboration has recently commissioned a new compact data format, named NANOAOD, reducing the per-event compressed size to about 1-2 kB. This is achieved by retaining only high level information on physics objects, and aims at supporting a considerable fraction of CMS physics analyses with a ~20x reduction in disk storage needs. NANOAOD also facilitates the dissemination of analysis...
The Czech Tier-2 center hosted and operated by the Institute of Physics of the Czech Academy of Sciences significantly upgraded its external network connection in 2019. The older edge router Cisco 6509 provided several 10 Gbps connections via a 10 Gigabit Ethernet Fiber Module, from which 2 ports were used for external LHCONE connection, 1 port for generic internet traffic and 1 port to reach other...
Virtual Monte Carlo (VMC) provides a unified interface to different detector simulation transport engines such as GEANT3 and Geant4. Since recently, all VMC packages (the VMC core library, also included in ROOT, and the Geant3 and Geant4 VMC) are distributed via the VMC Project GitHub organization. In addition to these VMC related packages, the VMC project also includes the Virtual Geometry Model...
The Production Operations Management System (POMS) is a set of software tools which allows production teams and analysis groups across multiple Fermilab experiments to launch, modify and monitor large scale campaigns of related Monte Carlo or data processing jobs.
POMS provides a web service interface that enables automated jobs submission on distributed resources according to customers’...
The HistFactory p.d.f. template [CERN-OPEN-2012-016] is per-se independent of its implementation in ROOT and it is useful to be able to run statistical analysis outside of the ROOT, RooFit, RooStats framework. pyhf is a pure-python implementation of that statistical model for multi-bin histogram-based analysis and its interval estimation is...
As an important detector for the Nuclotron-based Ion Collider fAcility (NICA) accelerator complex at JINR, the MultiPurpose Detector (MPD) is proposed to investigate the hot and dense baryonic matter in heavy-ion collisions over a wide range of atomic masses, from Au+Au collisions at a centre-of-mass energy of $\sqrt{s_{NN}} = 11$ GeV (for Au$^{79+}$) to proton-proton collisions with...
The SAGE2 project is a collaboration between industry, data centres and research institutes demonstrating an exascale-ready system based on layered hierarchical storage and a novel object storage technology. The development of this system is based on a significant co-design exercise between all partners, with the research institutes having well established needs for exascale computing...
The Caltech team, in collaboration with network, computer science, and HEP partners at the DOE laboratories and universities, is building smart network services ("The Software-defined network for End-to-end Networked Science at Exascale (SENSE) research project") to accelerate scientific discovery.
The overarching goal of SENSE is to enable National Labs and universities to request and...
Current and future end-user analyses and workflows in High Energy Physics demand the processing of growing amounts of data. This plays a major role when looking at the demands in the context of the High-Luminosity-LHC. In order to keep the processing time and turn-around cycles as low as possible analysis clusters optimized with respect to these demands can be used. Since hyperconverged...
Lattice quantum chromodynamics (QCD) has provided great insight into the nature of empty space, but quantum chromodynamics alone does not describe the vacuum in its entirety. Recent developments have introduced Quantum Electrodynamic (QED) effects directly into the generation of lattice gauge field configurations. Using lattice ensembles incorporating fully dynamical QCD and QED effects we are...
This talk describes the deployment of ATLAS offline software in containers for use in production workflows such as simulation and reconstruction. For this purpose we are using Docker and Singularity, which are both lightweight virtualization technologies that can encapsulate software packages inside complete file systems. The deployment of offline releases via containers removes the...
Detector Control Systems (DCS) for modern High-Energy Physics (HEP) experiments are based on complex distributed (and often redundant) hardware and software implementing real-time operational procedures meant to ensure that the detector is always in a "safe" state, while at the same time maximizing the live time of the detector during beam collisions. Display, archival and often analysis of...
In the High Luminosity LHC, planned to start with Run4 in 2026, the ATLAS experiment will be equipped with the Hardware Track Trigger (HTT) system, a dedicated hardware system able to reconstruct tracks in the silicon detectors with short latency. This HTT will be composed of about 700 ATCA boards, based on new technologies available on the market, like high speed links and powerful FPGAs, as...
Data growth over several years within HEP experiments requires a wider use of storage systems for WLCG Tiered Centers. It also increases the complexity of storage systems, which includes the expansion of hardware components and thereby further complicates existing software products. Coping with such systems is a non-trivial task and requires highly qualified specialists.
Storing petabytes of...
The Electromagnetic Calorimeter (ECAL) is one of the sub-detectors of the Compact Muon Solenoid (CMS), a general-purpose particle detector at the CERN Large Hadron Collider (LHC). The CMS ECAL Detector Control System (DCS) and the CMS ECAL Safety System (ESS) have supported the detector operations and ensured the detector's integrity since the CMS commissioning phase, more than 10 years ago....
The Virtual Geometry Model (VGM) is a geometry conversion tool, currently providing conversion between Geant4 and ROOT TGeo geometry models. Its design allows the inclusion of another geometry model by implementing a single sub-module instead of writing bilateral converters for all already supported models.
The VGM was last presented at CHEP in 2008 and since then it has been under continuous...
The Tile Calorimeter (TileCal) is a crucial part of the ATLAS detector which jointly with other calorimeters reconstructs hadrons, jets, tau-particles, missing transverse energy and assists in muon identification. It is constructed of alternating iron absorber layers and active scintillating tiles and covers the region $|\eta| < 1.7$. The TileCal is regularly monitored by several different systems,...
Support for token-based authentication and authorization has emerged in recent years as a key requirement for storage elements powering WLCG data centers. Authorization tokens represent a flexible and viable alternative to other credential delegation schemes (e.g. proxy certificates) and authorization mechanisms (VOMS) historically used in WLCG, as documented in more detail in other submitted...
Designing new experiments, as well as upgrading ongoing experiments, is a continuous process in experimental high energy physics. Frontier R&Ds are used to squeeze the maximum physics performance using cutting edge detector technologies.
Evaluating the physics performance of a particular configuration includes sketching this configuration in Geant, simulating typical signals and...
The gluon field configurations that form the foundation of every lattice QCD calculation contain a rich diversity of emergent nonperturbative phenomena. Visualisation of these phenomena creates an intuitive understanding of their structure and dynamics. This presentation will illustrate recent advances in observing the chromo-electromagnetic vector fields, their energy and topological charge...
Detector description is an essential component in simulation, reconstruction and analysis of data resulting from particle collisions in high energy physics experiments and for the detector development studies for future experiments. Current detector description implementations of running experiments are mostly specific implementations. DD4hep is an open source toolkit created in 2012 to serve...
The CMS experiment supports and contributes to the development of the next-generation Event Visualization Environment (EVE) of the ROOT framework with the intention of superseding Fireworks, the physics analysis oriented event display of CMS that was developed ten years ago and has been used for Run 1 and Run 2, with a new server-web client implementation. This paper presents progress in development...
The ALICE experiment at the Large Hadron Collider (LHC) at CERN will deploy a combined online-offline facility for detector readout and reconstruction, as well as data compression. This system is designed to allow the inspection of all collisions at rates of 50 kHz in the case of Pb-Pb and 400 kHz for pp collisions in order to give access to rare physics signals. The input data rate of up to...
The detection of long-lived particles (LLPs) in high energy experiments is key both for the study of the Standard Model (SM) of particle physics and for the search for new physics beyond it.
Many interesting decay modes involve strange particles with large lifetimes such as Ks or L0s. Exotic LLPs are also predicted in many new theoretical models. The selection and reconstruction of LLPs produced...
The ATLAS Experiment is storing detector and simulation data in raw and derived data formats across more than 150 Grid sites world-wide: currently, in total about 200 PB of disk storage and 250 PB of tape storage are used.
Data have different access characteristics due to various computational workflows. Raw data is only processed about once per year, whereas derived data are accessed...
The Scalable Systems Laboratory (SSL), part of the IRIS-HEP Software Institute, provides Institute participants and HEP software developers generally with a means to transition their R&D from conceptual toys to testbeds to production-scale prototypes. The SSL enables tooling, infrastructure, and services supporting innovation of novel analysis and data architectures, development of software...
In recent years containerization has revolutionized cloud environments, providing a secure, lightweight, standardized way to package and execute software. Solutions such as Kubernetes enable orchestration of containers in a cluster, including for the purpose of job scheduling. Kubernetes is becoming a de facto standard, available at all major cloud computing providers, and is gaining increased...
Recent searches for supersymmetric particles at the Large Hadron Collider have been unsuccessful in detecting any BSM physics. This is partially because the exact masses of supersymmetric particles are not known, and as such, searching for them is very difficult. The method broadly used in searching for new physics requires one to optimise on the signal being searched for, potentially...
The WLCG Authorisation Working Group formed in July 2017 with the objective to understand and meet the needs of a future-looking Authentication and Authorisation Infrastructure (AAI) for WLCG experiments. Much has changed since the early 2000s when X.509 certificates presented the most suitable choice for authorisation within the grid; progress in token based authorisation and identity...
DD4hep is an open-source software toolkit that provides comprehensive and complete generic detector descriptions for high energy physics (HEP) detectors. The Compact Muon Solenoid collaboration (CMS) has recently evaluated and adopted DD4hep to replace its custom detector description software. CMS has demanding software requirements as a very large, long-running experiment that must support...
We will describe the deployment of containers on the ATLAS infrastructure. There are several ways to run containers: as part of the batch system infrastructure, as part of the pilot, or called directly. ATLAS is exploiting them depending on which facility its jobs are sent to. Containers have been a vital part of the HPC infrastructure for the past year, and using fat images - images...
The information security threats currently faced by WLCG sites are both sophisticated and highly profitable for the actors involved. Evidence suggests that targeted organisations take on average more than six months to detect a cyber attack, with more sophisticated attacks being more likely to pass undetected.
An important way to mount an appropriate response is through the use of a...
Data acquisition (DAQ) systems are a key component for successful data taking in any experiment. The DAQ is a complex distributed computing system and coordinates all operations, from the data selection stage of interesting events to storage elements.
For the High Luminosity upgrade of the Large Hadron Collider (HL-LHC), the experiments at CERN need to meet challenging requirements to record...
Measurements involving rare B meson decays by the LHCb and Belle Collaborations have revealed a number of anomalous results. Collectively, these anomalies are generating significant interest in the community, as they may be interpreted as a first sign of new physics in the lepton flavour sector. In 2018, the CMS experiment recorded an unprecedented data set containing the unbiased decays of 10...
The Alpha Magnetic Spectrometer (AMS) is a particle physics experiment installed and operating on board the International Space Station (ISS) since May 2011 and expected to last through Year 2024 and beyond. Aiming to explore a new frontier in particle physics, the AMS collaboration seeks to store, manage and present its research results as well as the details of the detector and the...
Following a thorough review in 2018, the CMS experiment at the CERN LHC decided to adopt Rucio as its new data management system. Rucio is emerging as a community software project and will replace an aging CMS-only system before the start-up of LHC Run 3 in 2021. Rucio was chosen after an evaluation determined that Rucio could meet the technical and scale needs of CMS. The data management...
As part of a modernization effort at IceCube, a new unified authorization system has been developed to allow access to multiple applications with a single credential. Based on SciTokens and JWT, it allows for the delegation of specific accesses to cluster jobs or third party applications on behalf of the user. Designed with security in mind, it includes short expiration times on access...
Software defect prediction aims at detecting parts of software that are likely to contain faulty modules - e.g. in terms of complexity, maintainability, and other software characteristics - and that therefore require actual attention. Machine Learning (ML) has proven to be of great value in a variety of Software Engineering tasks, such as software defect prediction, also in the presence of...
Four years after deployment of our public web site using the Drupal 7 content management system, the ATLAS Education and Outreach group is in the process of migrating to the new CERN Drupal 8 infrastructure. We present lessons learned from the development, usage and evolution of the original web site, and how the choice of technology helped to shape and reinforce our communication strategy. We...
The need for an unbiased analysis of large complex datasets, especially those collected by the LHC experiments, is pushing for data acquisition systems where predefined online trigger selections are limited if not suppressed at all. Not only does this pose tremendous challenges for the hardware components, but it also calls for new strategies for the online software infrastructures. Open source...
For physics analyses with identical final state objects, e.g. jets, the correct sorting of the objects at the input of the analysis can lead to a considerable performance increase.
We present a new approach in which a sorting network is placed upstream of a classification network. The sorting network combines the whole event information and explicitly pre-sorts the inputs of the analysis....
The CMS experiment at the LHC features the largest crystal electromagnetic calorimeter (ECAL) ever built. It consists of about 75000 scintillating lead tungstate crystals. The ECAL crystal energy response is fundamental for both triggering purposes and offline analysis. Due to the challenging LHC radiation environment, the response of both crystals and photodetectors to particles evolves with...
The Dynafed data federator is designed to present a dynamic and unified view of a distributed file repository. We describe our use of Dynafed to construct a production-ready WLCG storage element (SE) using existing Grid storage endpoints as well as object storage. In particular, Dynafed is used as the primary SE for the Canadian distributed computing cloud systems. Specifically, we have been...
The ART system is designed to run test jobs on the Grid after an ATLAS nightly release has been built. The choice was taken to exploit the Grid as a backend as it offers a huge resource pool, suitable for a deep set of integration tests, and running the tests could be delegated to the highly scalable ATLAS production system (PanDA). The challenge of enabling the Grid as a test environment is...
In mathematics and computer algebra, automatic differentiation (AD) is a set of techniques to evaluate the derivative of a function specified by a computer program. AD exploits the fact that every computer program, no matter how complicated, executes a sequence of elementary arithmetic operations (addition, subtraction, multiplication, division, etc.) and elementary functions (exp, log, sin,...
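The observation that a program is a sequence of elementary operations and functions, each with a known derivative, can be made concrete with forward-mode AD using dual numbers. The sketch below is illustrative only and is not taken from the contribution: the `Dual` class and the helper names (`sin`, `exp`, `log`, `derivative`) are assumptions introduced here, and only a handful of operations are covered.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    val: float
    dot: float = 0.0

    @staticmethod
    def _lift(x):
        # Plain numbers are constants: their derivative is zero.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual._lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual._lift(other)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        o = Dual._lift(other)
        # Product rule.
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)

    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual._lift(other)
        # Quotient rule.
        return Dual(self.val / o.val,
                    (self.dot * o.val - self.val * o.dot) / (o.val * o.val))

    def __rtruediv__(self, other):
        return Dual._lift(other) / self


# Elementary functions: each applies the chain rule to its argument.
def sin(x):
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)


def exp(x):
    e = math.exp(x.val)
    return Dual(e, e * x.dot)


def log(x):
    return Dual(math.log(x.val), x.dot / x.val)


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input derivative with 1."""
    return f(Dual(x, 1.0)).dot
```

For instance, `derivative(lambda x: x * sin(x) + exp(x) / x, 1.3)` propagates derivatives through every arithmetic step of the lambda, with no symbolic manipulation and no finite-difference step size to tune.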
In this talk, the speaker will present the computer security risk landscape as faced by academia and research organisations; look into various motivations behind attacks; and explore how these threats can be addressed. This will be followed by details of several types of vulnerabilities and incidents recently affecting the HEP community, and lessons learnt. The talk will conclude with an outlook...
As a data-intensive computing application, high-energy physics requires storage and computing for large amounts of data at the PB level. Performance demands and data access imbalances in mass storage systems are increasing. Specifically, on one hand, traditional cheap disk storage systems have been unable to handle high IOPS demand services. On the other hand, a survey found that only a very...
We present preliminary studies of a deep neural network (DNN) "tagger" that is trained to identify the presence of displaced jets arising from the decays of new long-lived particle (LLP) states in data recorded by the CMS detector at the CERN LHC. Particle-level candidates, as well as secondary vertex information, are refined through the use of convolutional neural networks (CNNs) before being...
Athena is the software framework used in the ATLAS experiment throughout the data processing path, from the software trigger system through offline event reconstruction to physics analysis. The shift from high-power single-core CPUs to multi-core systems in the computing market means that the throughput capabilities of the framework have become limited by the available memory per process. For...
The CERN Batch Service faces many challenges in order to get ready for the computing demands of future LHC runs. These challenges require that we look at all potential resources, assessing how efficiently we use them and that we explore different alternatives to exploit opportunistic resources in our infrastructure as well as outside of the CERN computing centre.
Several projects, like...
Various studies have shown the crucial and strong impact that undergraduate research has on the learning outcome of students and its role in clarifying their career path. It was proven that promoting research at the undergraduate level is essential to build an enriched learning environment for students [1,2]. Students get exposed to the research world at an early stage, acquire new...
The SND is a non-magnetic detector deployed at the VEPP-2000 e+e- collider (BINP, Novosibirsk) for hadronic cross-section measurements in the center of mass energy region below 2 GeV. An important part of the detector is a three-layer hodoscopic electromagnetic calorimeter (EMC) based on NaI(Tl) counters. Until the recent EMC spectrometric channel upgrade, only the energy deposition...
High Performance Computing (HPC) facilities provide vast computational power and storage, but generally work on fixed environments designed to address the most common software needs locally, making it challenging for users to bring their own software. To overcome this issue, most HPC facilities have added support for HPC friendly container technologies such as Shifter, Singularity, or...
IRIS is the co-ordinating body of a UK science eInfrastructure and is a collaboration between UKRI-STFC, its resource providers and representatives from the science activities themselves. We document the progress of an ongoing project to build a security policy trust framework suitable for use across the IRIS community.
The EU H2020-funded AARC projects addressed the challenges involved in...
The European-funded ESCAPE project will prototype a shared solution to computing challenges in the context of the European Open Science Cloud. It targets Astronomy and Particle Physics facilities and research infrastructures and focuses on developing solutions for handling Exabyte scale datasets.
The DIOS work package aims at delivering a Data Infrastructure for Open Science. Such an...
The physics software stack of LHCb is based on Gaudi and comprises about 20 interdependent projects, managed across multiple GitLab repositories. At present, the continuous integration (CI) system used for regular building and testing of this software is implemented using Jenkins and runs on a cluster of about 300 cores.
LHCb CI pipelines are Python-based and relatively modern with some...
Deep neural networks (DNNs) have been applied to the fields of computer vision and natural language processing with great success in recent years. The success of these applications has hinged on the development of specialized DNN architectures that take advantage of specific characteristics of the problem to be solved, namely convolutional neural networks for computer vision and recurrent...
The INSPIRE digital library has served the scientific community for almost 50 years. Previously known as SPIRES, it was the first web site outside Europe and the first database on the web. Today, INSPIRE connects 100'000 scientists in High Energy Physics worldwide, with over 1 million scientific articles, thousands of scientific profiles of authors, data, conferences and jobs in High Energy...
The reconstruction of trajectories of the charged particles in the tracking detectors of high energy physics experiments is one of the most difficult and complex tasks of event reconstruction at particle colliders. As pattern recognition algorithms exhibit combinatorial scaling to high track multiplicities, they become the largest contributor to the CPU consumption within event reconstruction,...
The upcoming PANDA experiment is one of the major pillars of the future FAIR accelerator facility in Darmstadt, Germany. With its multipurpose detector and an antiproton beam with a momentum of up to 15 GeV/c, PANDA will be able to test QCD in the intermediate energy regime and shed light on important questions such as: Why is there a matter-antimatter asymmetry in the Universe?
Achieving its...
The Jiangmen Underground Neutrino Observatory (JUNO) is an underground 20 kton liquid scintillator detector being built in the south of China and expected to start data taking in late 2021. The JUNO physics program is focused on exploring neutrino properties by means of electron anti-neutrinos emitted from two nuclear power complexes at a baseline of about 53 km. Targeting an unprecedented...
We describe a multi-disciplinary project to use machine learning techniques based on neural networks (NNs) to construct a Monte Carlo event generator for lepton-hadron collisions that is agnostic of theoretical assumptions about the microscopic nature of particle reactions. The generator, referred to as ETHER (Empirically Trained Hadronic Event Regenerator), is trained on experimental data...
The CMS experiment relies on a substantial C++ and Python-based software release for its day-to-day production, operations and analysis needs. While very much under active development, this codebase continues to age. At the same time, CMSSW codes are likely to be used for the next two decades, in one form or another. Thus, the "cost" of bugs entering CMSSW continues to increase, both due to...
The new jAliEn (Java ALICE Environment) middleware is a Grid framework designed to satisfy the needs of the ALICE experiment for the LHC Run 3, such as providing a high-performance and high-scalability service to cope with the increased volumes of collected data. This new framework also introduces a split, two-layered job pilot, creating a new approach to how jobs are handled and executed...
Access libraries such as ROOT and HDF5 allow users to interact with datasets using high-level abstractions, like coordinate systems and associated slicing operations. Unfortunately, the implementations of access libraries are based on outdated assumptions about storage system interfaces and are generally unable to fully benefit from modern fast storage devices. For example, access libraries...
Events containing muons, electrons or photons in the final state are an important signature for many analyses being carried out at the Large Hadron Collider (LHC), including both standard model measurements and searches for new physics. Studying such events requires an efficient and well-understood trigger system. The ATLAS trigger consists of a hardware-based system...
The International Particle Physics Outreach Group (IPPOG) is a network of scientists, science educators and communication specialists working across the globe in informal science education and outreach for particle physics. Members initiate, develop and participate in a variety of activities in classrooms, public events, festivals, exhibitions, museums, institute open days, etc. The IPPOG...
The Virtual Monte Carlo (VMC) package together with its concrete implementations provides a unified interface to different detector simulation transport engines such as GEANT3 or GEANT4. However, so far the simulation of one event was restricted to the usage of one chosen engine.
We introduce here the possibility of mixing multiple engines within the simulation of one event. Depending on user...
Beginning in 2021, the upgraded LHCb experiment will use a triggerless readout system collecting data at an event rate of 30 MHz. A software-only High Level Trigger will enable unprecedented flexibility for trigger selections. During the first stage (HLT1), a subset of the full offline track reconstruction for charged particles is run to select particles of interest based on single or...
An important part of the LHC legacy will be precise limits on indirect effects of new physics, framed for instance in terms of an effective field theory. These measurements often involve many theory parameters and observables, which makes them challenging for traditional analysis methods. We discuss the underlying problem of “likelihood-free” inference and present powerful new analysis...
The dCache project provides open-source software deployed internationally to satisfy ever more demanding storage requirements of various scientific communities. Its multifaceted approach provides an integrated way of supporting different use-cases with the same storage, from high throughput data ingest, through wide access and easy integration with existing systems, including event driven...
The LHAASO (Large High Altitude Air Shower Observatory) experiment of IHEP is located in Daocheng, Sichuan province (at an altitude of 4410 m). The main scientific goal of LHAASO is to search for the origins of galactic cosmic rays through extensive spectroscopic investigations of gamma-ray sources above 30 TeV. To accomplish this goal, LHAASO contains four detector arrays, which generate huge amounts...
In view of the increasing computing needs for the HL-LHC era, the LHC experiments are exploring new ways to access, integrate and use non-Grid compute resources. Accessing and making efficient use of Cloud and supercomputer (HPC) resources present a diversity of challenges. In particular, network limitations from the compute nodes in HPC centers prevent CMS experiment pilot jobs from connecting to...
The Jiangmen Underground Neutrino Observatory (JUNO) in China is a 20 kton liquid scintillator detector, designed primarily to determine the neutrino mass hierarchy, as well as to study various neutrino physics topics. Its core part consists of O(10^4) Photomultiplier Tubes (PMTs). Computations looping through this large number of PMTs on a CPU will be very time consuming. GPU parallel computing...
The RWebWindow class provides the core functionality for web-based widgets in ROOT. It combines all necessary server-side components and provides communication channels with multiple JavaScript clients.
The following new ROOT widgets are built on RWebWindow functionality:
- RCanvas – ROOT7 canvas for drawing all kinds of primitives, including histograms and graphs
- RBrowser – hierarchical...
Belle II is a rapidly growing collaboration with members from
113 institutes spread around the globe. The software development team of
the experiment, as well as the software users, are very much
decentralised. Together with the active development of the software,
such decentralisation makes the adoption of the latest software
releases by users an essential, but quite challenging...
With the ever-increasing size of scientific collaborations and complexity of scientific instruments, the software needed to acquire, process and analyze the gathered data is growing in complexity and size too. Unfortunately, the role and career path of scientists and engineers working on software R&D and developing scientific software is neither clearly established nor defined in many fields of...
The DOMA activities gave the opportunity for DPM to contribute to
the WLCG plans for Run-3 and beyond. Here we identify the themes
that are relevant to site storage systems and explain how the
approaches chosen in DPM are relevant for features like
scalability, third party copy, bearer tokens, multi-site deployments and
volatile caching pools.
We will also discuss the status of the...
ATLAS distributed computing is allowed to opportunistically use resources of the Czech national HPC center IT4Innovations in Ostrava. The jobs are submitted via an ARC Compute Element (ARC-CE) installed at the grid site in Prague. Scripts and input files are shared between the ARC-CE and the shared file system located at the HPC, via sshfs. This basic submission system has worked there since...
Extracting information about the quark and gluon (or parton) structure of the nucleon from high-energy scattering data is a classic example of the inverse problem: the experimental cross sections are given by convolutions of the parton probability distributions with process-dependent hard coefficients that are perturbatively calculable from QCD. While most analyses in the past have been based...
Boost.Histogram, a header-only C++14 library that provides multi-dimensional histograms and profiles, is now available in Boost-1.70. It is extensible, fast, and uses modern C++ features. Using template meta-programming, the most efficient code path for any given configuration is automatically selected. The library includes key features designed for the particle physics community, such as...
Neutrinos are particles that interact rarely, so identifying them requires large detectors which produce lots of data. Processing this data with the computing power available is becoming more difficult as the detectors increase in size to reach their physics goals. In liquid argon time projection chambers (TPCs) the charged particles from neutrino interactions produce ionization electrons...
VIRGO is an interferometer for the detection of Gravitational Waves at the European Gravitational Observatory in Italy. Along with the two LIGO interferometers in the US, VIRGO is being used to collect data from astrophysical sources such as compact binary coalescences, and is currently running its third observational period, collecting gravitational wave events at a rate of more than one per...
One of the key challenges identified by the HEP R&D roadmap for software and computing is the ability to integrate heterogeneous resources in support of the computing needs of HL-LHC. In order to meet this objective, a flexible Authentication and Authorization Infrastructure (AAI) has to be in place, to allow the secure composition of computing and storage resources provisioned across...
C++ Modules come in C++20 to fix the long-standing build scalability problems in the language. They provide an I/O-efficient, on-disk representation capable of reducing build times and peak memory usage. ROOT employs the C++ Modules technology further in the ROOT dictionary system to improve its performance and reduce the memory footprint.
ROOT with C++ Modules was released as a technology...
High energy physics (HEP) experiments produce a large amount of data, which is usually stored and processed on distributed sites. Nowadays, distributed data management systems face challenges such as providing a global file namespace and efficient data access. Focusing on those problems, this paper proposes a cross-domain data access file system (CDFS), a data cache and access system across...
Developing, maintaining, and evolving the algorithms and
software implementations for HEP experiments will continue for many
decades. In particular, the HL-LHC will start collecting data 8 or
9 years from now, and then acquire data for at least another decade.
Building the necessary software requires a workforce with a mix of
HEP domain knowledge, advanced software skills, and strong...
The Level-0 Muon Trigger system of the ATLAS experiment will undergo a full upgrade for HL-LHC to meet the challenging performance requirements imposed by the increasing instantaneous luminosity. The upgraded trigger system will send RPC raw hit data to the off-detector trigger processors, where the trigger algorithms run on a new generation of Field-Programmable Gate Arrays (FPGAs). The FPGA...
The ATLAS experiment is using large High Performance Computers (HPCs) and fine-grained simulation workflows (Event Service) to produce fully simulated events in an efficient manner. ATLAS has developed a new software component (Harvester) which provides resource provisioning and workload shaping. In order to run effectively on the largest HPC machines, ATLAS developed the Yoda-Droid software to...
Experiments in Photon Science at DESY will, in future, undergo significant changes in terms of data volumes and data rates and, most importantly, will need to fully enable online (synchronous to experiment) data analysis. The primary goal is to support new types of experimental setups requiring significant computing effort to perform control and data quality monitoring, allow effective data reductions and,...
At the High Luminosity Large Hadron Collider (HL-LHC), many
proton-proton collisions happen during a single bunch crossing. This
leads on average to tens of thousands of particles emerging from the
interaction region. Two major factors impede finding charged particle
trajectories from measured hits in the tracking detectors. First,
deciding whether a given set of hits was produced by a...
During 2019 and 2020, the CERN Tape Archive (CTA) will receive new data from LHC experiments and import existing data from CASTOR, which will be phased out for LHC experiments before Run 3.
This contribution will present the status of CTA as a service, of its integration with EOS and FTS, and of the data flow chains of LHC experiments.
The latest enhancements and additions to the...
The CernVM FileSystem (CVMFS) is widely used in High Throughput Computing to efficiently distribute experiment code. However, the standard CVMFS publishing tools are designed for a small group of people from each experiment to maintain common software, and the tools don't work well for the majority of users that submit jobs related to each experiment. As a result, most user code, such as code...
For years, e-mail has been one of the main attack vectors that organisations and individuals face. Malicious actors use e-mail messages to run phishing attacks, to distribute malware, and to send around various types of scams. While technical solutions exist to filter out most of such messages, no mechanism can guarantee 100% efficiency. Recipients themselves are the next, crucial layer of...
The Deep Underground Neutrino Experiment (DUNE) will be a world-class neutrino observatory and nucleon decay detector aiming to address some of the most fundamental questions in particle physics. With a modular liquid argon time-projection chamber (LArTPC) of 40 kt fiducial mass, the DUNE far detector will be able to reconstruct neutrino interactions with an unprecedented resolution. With no...
The Mikado approach is the winning algorithm of the final phase of the TrackML particle reconstruction challenge [1].
The algorithm is combinatorial. Its strategy is to reconstruct data in small portions, each time trying not to damage the rest of the data. The idea is reminiscent of the game of Mikado, where players must carefully remove wooden sticks one by one from a heap.
The algorithm does 60...
The Solenoidal Tracker at RHIC (STAR) is a multinationally supported experiment located at Brookhaven National Lab and is currently the only remaining running experiment at RHIC. The raw physics data captured from the detector is on the order of tens of PBytes per data acquisition campaign, which makes STAR fit well within the definition of a big data science experiment. The production of the...
PODIO is a C++ toolkit for the creation of event data models (EDMs) with a fast and efficient I/O layer, developed in the AIDA2020 project. It employs plain-old-data (POD) data structures wherever possible, while avoiding deep object-hierarchies and virtual inheritance. A lightweight layer of handle classes provides the necessary high-level interface for the physicist, such as support for...
The CERN analysis preservation portal (CAP) comprises a set of tools and services aiming to assist researchers in describing and preserving all the components of a physics analysis such as data, software and computing environment. Together with the associated documentation, all these assets are kept in one place so that the analysis can be fully or partially reused even several years after the...