With the coming luminosity increase at the High Luminosity LHC, the ATLAS experiment will face a significant challenge in processing the hundreds of petabytes of data that the detector will produce.
The computing tasks faced by LHC experiments such as ATLAS are primarily throughput-limited, and our frameworks are optimized to run them on High Throughput Computing...
As we approach the high-luminosity era of the LHC, the computational requirements of the ATLAS experiment are expected to increase significantly in the coming years. In particular, the simulation of Monte Carlo (MC) events is immensely computationally demanding, and their limited availability is one of the major sources of systematic uncertainty in many physics analyses. The main bottleneck in the...
As we enter a new phase in particle physics with the High-Luminosity Large Hadron Collider (HL-LHC), and in response to rising CPU consumption and storage demands, our efforts center on enhancing the CPU processing efficiency of reconstruction within the ATLAS inner detector. The track overlay approach involves pre-reconstructing pileup tracks and subsequently running reconstruction...
Simulation of the detector response is a major computational challenge in modern High Energy Physics experiments; for example, it accounts for about two fifths of the total ATLAS computing resources. Among simulation tasks, calorimeter simulation is the most demanding, taking up about 80% of simulation resource use, a share expected to increase in the future. Solutions have been developed to...
In some fields, scientific data formats differ across experiments due to specialized hardware and data acquisition systems. Researchers need to develop, document, and maintain specific analysis software to interact with these data formats, and that software is often tightly coupled to a particular data format. This proliferation of custom data formats has been a prominent challenge for small...
The Institute of High Energy Physics' computing platform includes isolated grid sites and local clusters. The grid sites manage grid jobs from international experiments, including ATLAS, CMS, LHCb, Belle II, and JUNO, while the local cluster concurrently processes data from experiments led by IHEP, such as BES, JUNO, and LHAASO. These resources have distinct configurations, such as network segments, file...
Among the human activities that contribute to the environmental footprint of our species, the computational footprint, i.e. the environmental impact that results from the use of computing resources, may be one of the most underappreciated. While many modern scientific discoveries have been obtained thanks to the availability of ever more powerful computers and algorithms, the...
The computing challenges of collecting, storing, reconstructing, and analyzing the colossal volume of data produced by the ATLAS experiment, and of producing similar numbers of simulated Monte Carlo (MC) events, place formidable requirements on the computing resources of the ATLAS collaboration. ATLAS currently expends around 40% of its CPU resources on detector simulation, in which half of the...
Here, we present deep generative models for the fast simulation of calorimeter shower events. Using a three-dimensional, cylindrical scoring mesh, a shower event is parameterized by the total energy deposited in each cell of the scoring mesh. Due to the three-dimensional geometry, simulating a shower event requires learning a complex probability distribution over $O(10^3) \sim O(10^4)$...
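As a rough illustration (the mesh granularity below is an assumed placeholder, not that of the study), flattening the per-cell energies of a cylindrical scoring mesh shows how a single shower event becomes the high-dimensional vector such a model must learn:

```python
import numpy as np

# Assumed cylindrical scoring mesh: (radial, azimuthal, longitudinal) bins.
# The real mesh granularity is detector-specific; these numbers are placeholders.
N_R, N_PHI, N_Z = 10, 16, 45

def shower_to_vector(energy_mesh: np.ndarray) -> np.ndarray:
    """Flatten per-cell deposited energies into one training example."""
    assert energy_mesh.shape == (N_R, N_PHI, N_Z)
    return energy_mesh.reshape(-1)   # a single O(10^3)-dimensional vector

# Toy "event": random energies standing in for simulated deposits.
toy_event = np.random.exponential(scale=0.1, size=(N_R, N_PHI, N_Z))
x = shower_to_vector(toy_event)
print(x.shape)   # (7200,) -> the dimensionality the generative model must learn
```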
Detector visualization is one of the important problems in high energy physics (HEP) software. At present, the description of detectors in HEP is complicated. Professional industry visualization platforms such as Unity offer the most advanced visualization capabilities and technologies, which can help us achieve the visualization of detectors. This work aims to find an automated...
The interTwin project, funded by the European Commission, is at the forefront of leveraging 'Digital Twins' across various scientific domains, with a particular emphasis on physics and earth observation. Two of the most advanced use-cases of interTwin are event generation for particle detector simulation at CERN as well as the climate-based Environmental Modelling and Prediction Platform...
In the realm of Grid middleware, efficient job matching is paramount, ensuring that tasks are seamlessly assigned to the most compatible worker nodes. This process hinges on meticulously evaluating a worker node's suitability for the given task, necessitating a thorough assessment of its infrastructure characteristics. However, adjusting job matching parameters poses a significant challenge...
The Jiangmen Underground Neutrino Observatory (JUNO) is currently under construction in southern China, with the primary goals of determining the neutrino mass ordering and precisely measuring oscillation parameters. Data processing in JUNO is challenging. When JUNO starts data taking in late 2024, the expected event rate is about 1 kHz, which is about 31.5 billion...
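(For scale, the quoted figure follows from integrating the rate over a year of running: $1\,\mathrm{kHz} \times 3.15\times10^{7}\,\mathrm{s/yr} \approx 3.15\times10^{10}$, i.e. about 31.5 billion events per year.)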
The ATLAS experiment at the Large Hadron Collider (LHC) operated very successfully in the years 2008 to 2023. The ATLAS Control and Configuration (CC) software is the core part of the ATLAS Trigger and DAQ system; it comprises all the software required to configure and control ATLAS data taking and essentially provides the glue that holds the various ATLAS sub-systems together. During recent...
New-generation light sources, such as the High Energy Photon Source (HEPS) currently under construction, are advanced experimental platforms that facilitate breakthroughs in fundamental scientific research. These large scientific installations are characterized by numerous experimental beamlines (more than 90 at HEPS), rich research areas, and complex experimental analysis methods,...
In high-energy physics experiments, the software's visualization capabilities are crucial, aiding in detector design, assisting with offline data processing, and offering potential for improving physics analysis, among other benefits. Detailed detector geometries and architectures, described in GDML or ROOT formats, are integrated into platforms like Unity for three-dimensional modeling. In this study,...
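As a sketch of the first step of such a pipeline (the file name and tooling choices are assumptions, not necessarily those of the study), a GDML description can be loaded with PyROOT and its volume hierarchy inspected before conversion into a Unity-friendly mesh format:

```python
# Load a GDML detector description with PyROOT and walk its volume tree.
# "detector.gdml" is a placeholder file name.
import ROOT

geo = ROOT.TGeoManager.Import("detector.gdml")   # parses the GDML description
top = geo.GetTopVolume()

def walk(volume, depth=0):
    """Recursively print the logical volume tree and shape types."""
    print("  " * depth + f"{volume.GetName()} ({volume.GetShape().ClassName()})")
    for i in range(volume.GetNdaughters()):
        walk(volume.GetNode(i).GetVolume(), depth + 1)

walk(top)
# Each shape would then be tessellated and exported (e.g. to OBJ/FBX) for
# import into Unity; that exporter is not shown here.
```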
The HXMT satellite is China's first space astronomy satellite. It is a space-based X-ray telescope capable of broadband and large-field X-ray sky surveys, as well as the study of high-energy celestial objects such as black holes and neutron stars, focusing on short-term temporal variations and broadband energy spectra. It also serves as a highly sensitive all-sky monitor for gamma-ray bursts....
Considering the slowdown of Moore’s Law, an increasing need for energy efficiency, and larger and larger amounts of data to be processed in real-time and non-real-time in physics research, new computing paradigms are emerging in hardware and software. This requires greater awareness from developers and researchers to squeeze as much performance as possible out of applications. However, this is...
In a wide range of high-energy particle physics applications, machine learning methods have proven to be powerful tools for enhancing various aspects of physics data analysis. In recent years, various ML models have also been integrated into central workflows of the CMS experiment, leading to great improvements in reconstruction and object identification efficiencies. However, the continuation of...
In the 5+ years since their inception, Uproot and Awkward Array have become cornerstones for particle physics analysis in Python, both as direct user interfaces and as base layers for physicist-facing frameworks. Although this means that the software is achieving its mission, it also puts the need for stability in conflict with new, experimental developments. Boundaries must be drawn between...
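For context, a minimal sketch of the kind of user-facing interface these packages provide (file, tree, and branch names are placeholders):

```python
import uproot
import awkward as ak

# Open a ROOT file and read jagged branches into Awkward Arrays.
with uproot.open("events.root") as f:
    events = f["Events"].arrays(["Muon_pt", "Muon_eta"])

# Columnar selection: keep muons with pT > 20 GeV, then count them per event.
good_muons = events.Muon_pt[events.Muon_pt > 20.0]
print(ak.num(good_muons, axis=1))
```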
The software toolbox used for "big data" analysis has been changing fast in the last few years. The adoption of approaches able to exploit new hardware architectures plays a pivotal role in boosting data processing speed, resource optimisation, analysis portability and analysis preservation.
The scientific collaborations in the field of High Energy Physics (e.g. the LHC experiments, the...
In the era of digital twins, a federated system capable of integrating High-Performance Computing (HPC), High-Throughput Computing (HTC), and Cloud computing can provide a robust and versatile platform for creating, managing, and optimizing Digital Twin applications. One of the most critical problems involves the logistics of wide-area, multi-stage workflows that move back and forth across...
Hydra is an extensible framework for training, managing, and deploying machine learning models for near real-time data quality monitoring. It is designed to take some of the burden off shift crews by providing ‘round-the-clock’ monitoring of plots representing the data being collected. The Hydra system is backed by a database which is leveraged for near push-button training and is...
In the LHCb experiment, during Run 2, more than 90% of the computing resources available to the Collaboration were used for detector simulation. The detector and trigger upgrades introduced for Run 3 allow larger datasets to be collected which, in turn, will require larger simulated samples. Despite the use of a variety of fast simulation options, the demand for simulations will far exceed the...
The increasing computing power and bandwidth of programmable digital devices open new possibilities in the field of real-time processing of HEP data. LHCb is exploiting these technological advancements in various ways to enhance its capability for complex data reconstruction in real time. Amongst them is the real-time reconstruction of hits in the VELO pixel detector by means of cluster-finding...
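For illustration only (the production cluster finder runs in FPGA firmware and is not reproduced here), the underlying idea of grouping adjacent fired pixels into clusters can be sketched in software with connected-component labeling:

```python
import numpy as np
from scipy import ndimage

# Toy pixel matrix: 1 = fired pixel, 0 = empty.  A real VELO sensor is much
# larger; this only illustrates the cluster-finding idea.
hits = np.array([
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
])

# Label connected groups of fired pixels (8-connectivity).
labels, n_clusters = ndimage.label(hits, structure=np.ones((3, 3)))

# Cluster positions as the centre of gravity of each group of pixels.
centres = ndimage.center_of_mass(hits, labels, range(1, n_clusters + 1))
print(n_clusters, centres)   # 2 clusters with their (row, column) centres
```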
The Large Hadron Collider at CERN in Geneva is poised for a transformative upgrade, preparing to enhance both its accelerator and particle detectors. This strategic initiative is driven by the tenfold increase in proton-proton collisions anticipated for the forthcoming high-luminosity phase scheduled to start by 2029. The vital role played by the underlying computational infrastructure, the...
The integration of geographically diverse computing facilities involves dynamically allocating unused resources, relocating workflows, and addressing challenges in heterogeneous, distributed, and opportunistic compute provisioning. Key hurdles include effective resource management, scheduling, data transfer optimization, latency reduction, and ensuring security and privacy. Our proposed...
Today, the Worldwide LHC Computing Grid (WLCG) provides the majority of compute resources for the High Energy Physics (HEP) community. With its homogeneous Grid centers all around the world tuned for high data throughput, it is tailored to support typical HEP workflows, offering an optimal environment for efficient job execution.
...
The PHENIX Collaboration has actively pursued a Data and Analysis Preservation program since 2019, the first such dedicated effort at RHIC. A particularly challenging aspect of this endeavor is preservation of complex physics analyses, selected for their scientific importance and the value of the specific techniques developed as a part of the research. For this, we have chosen one of the most...
The need to ingest, process, and analyze large datasets in as short a time as possible is typical of big data use cases. Data analysis in High Energy Physics at CERN in particular will require, ahead of the next high-luminosity phase of the LHC, access to large amounts of data (on the order of 100 PB/year). However, thanks to continuous developments in resource handling and software, it...
Particle physics faces many challenges and opportunities in the coming decades, as reflected by the Snowmass Community Planning Process, which produced about 650 reports on various topics. These reports are a valuable source of information, but they are also difficult to access and query. In this work, we explore the use of Large Language Models (LLMs) and Retrieval Augmented Generation (RAG)...
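A minimal sketch of a RAG loop, assuming an off-the-shelf sentence-embedding model for retrieval; the embedding model, document snippets, and prompt format are illustrative placeholders rather than the configuration used in this work:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Placeholder stand-ins for indexed Snowmass report passages.
reports = [
    "Snowmass report snippet on accelerator R&D ...",
    "Snowmass report snippet on computing frontiers ...",
    "Snowmass report snippet on neutrino physics ...",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = model.encode(reports, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k report snippets most similar to the question."""
    q_vec = model.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec               # cosine similarity (normalized vectors)
    return [reports[i] for i in np.argsort(scores)[::-1][:k]]

question = "What computing challenges does Snowmass identify?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# `prompt` would then be sent to an LLM of choice (API call not shown).
```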
The Thomas Jefferson National Accelerator Facility (JLab) has created and is currently working on various tools to facilitate streaming readout (SRO) for upcoming experiments. These include reconstruction frameworks with support for Artificial Intelligence/Machine Learning, distributed High Throughput Computing (HTC), and heterogeneous computing which all contribute significantly to swift data...
Graph Neural Networks (GNNs) have demonstrated strong performance in addressing the particle track-finding problem in High-Energy Physics (HEP). Traditional algorithms exhibit high computational complexity in this domain as the number of particles increases. This poster addresses the challenges of training GNN models on large, rapidly evolving datasets, a common scenario given the...
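A hedged sketch of a GNN edge classifier for hit graphs, in the spirit of such track-finding pipelines; the node features, graph construction, and architecture below are illustrative assumptions, not the model of this poster:

```python
import torch
from torch import nn
from torch_geometric.nn import GCNConv

class EdgeClassifier(nn.Module):
    """Score candidate hit-to-hit edges as belonging to the same track."""
    def __init__(self, n_features=3, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(n_features, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x, edge_index):
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        src, dst = edge_index                        # endpoints of candidate edges
        score = self.edge_mlp(torch.cat([h[src], h[dst]], dim=1))
        return torch.sigmoid(score).squeeze(-1)      # P(edge belongs to a track)

# Toy graph: 100 hits with (r, phi, z) features and 300 candidate edges.
x = torch.randn(100, 3)
edge_index = torch.randint(0, 100, (2, 300))
scores = EdgeClassifier()(x, edge_index)
print(scores.shape)   # torch.Size([300])
```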
Effective data extraction has been one of the major challenges in physics analysis and will become even more important in the High-Luminosity LHC era. ServiceX provides novel data access and delivery by exploiting industry-driven software and recent high-energy physics software in the Python ecosystem. The experiment-agnostic nature of ServiceX will be described by introducing various types of transformer...
The ALICE experiment's Grid resources vary significantly in terms of memory capacity, CPU cores, and resource management. Memory allocation for scheduled jobs depends on the hardware constraints of the executing machines, system configurations, and batch queuing policies. The O2 software framework introduces multi-core tasks where deployed processes share resources. To accommodate these new...
The search for long-lived particles, which are predicted by many common extensions of the Standard Model, requires a sophisticated neural network design, one that is able to accurately discriminate between signal and background. At the LHC's ATLAS experiment, beam-induced background (BIB) and QCD jets are the two significant sources of background. For this purpose, a recurrent neural network (RNN) with an adversary was...
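A minimal sketch of an RNN classifier over padded constituent sequences (the adversarial component and the actual ATLAS input features are not reproduced here; sizes and classes are placeholder assumptions):

```python
import torch
from torch import nn

class JetRNN(nn.Module):
    """LSTM over per-jet constituent sequences, followed by a class head."""
    def __init__(self, n_features=4, hidden=32):
        super().__init__()
        self.rnn = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)   # e.g. signal vs. QCD jet vs. BIB

    def forward(self, constituents):
        # constituents: (batch, max_constituents, n_features), zero-padded
        _, (h_n, _) = self.rnn(constituents)
        return self.head(h_n[-1])          # class logits per jet

logits = JetRNN()(torch.randn(8, 20, 4))   # 8 jets, up to 20 constituents each
print(logits.shape)                         # torch.Size([8, 3])
```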
We present a framework based on Catch2 to evaluate the performance of OpenMP's target offload model via micro-benchmarks. The compilers supporting OpenMP's target offload model for heterogeneous architectures are currently undergoing rapid development. These developments influence the performance of various physics applications in different ways. This framework can be employed to track the impact of...
Diffusion models have demonstrated promising results in image generation, recently becoming mainstream and representing a notable advancement for many generative modeling tasks. Prior applications of diffusion models to both fast event and detector simulation in high energy physics have shown exceptional performance, providing a viable solution for generating sufficient statistics within a...
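For reference, the forward (noising) process that a DDPM-style diffusion model learns to invert can be sketched as follows; the noise schedule and the shape of the shower tensor are illustrative assumptions:

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)          # linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retention

def q_sample(x0, t, noise):
    """Sample x_t ~ q(x_t | x_0) = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps."""
    a = alpha_bar[t].sqrt().view(-1, 1)
    s = (1.0 - alpha_bar[t]).sqrt().view(-1, 1)
    return a * x0 + s * noise

x0 = torch.randn(16, 7200)                     # batch of flattened shower images
t = torch.randint(0, T, (16,))
eps = torch.randn_like(x0)
x_t = q_sample(x0, t, eps)
# A denoising network would be trained to predict `eps` from (x_t, t).
```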
With the increased integration of machine learning and the need for the scale offered by high-performance computing infrastructures, scientific workflows are undergoing a transformation toward greater heterogeneity. In this evolving landscape, adaptability has emerged as a pivotal factor in accelerating scientific discoveries through efficient execution of workflows. To increase resource utilization,...
Celeritas is a Monte Carlo (MC) detector simulation library that exploits current and future heterogeneous leadership computing facilities (LCFs). It is specifically designed for, but not limited to, High-Luminosity Large Hadron Collider (HL-LHC) simulations. Celeritas implements full electromagnetic (EM) physics, supports complex detector geometries, and runs on CPUs and Nvidia or AMD GPUs....
To study and search for increasingly rare physics processes at the LHC, a staggering amount of data needs to be analyzed with progressively complex methods. Analyses involving tens of billions of recorded and simulated events, multiple machine learning algorithms for different purposes, and 100 or more systematic variations are no longer uncommon. These conditions impose a complex...
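A hedged sketch of how systematic variations appear in a columnar workflow, where each variation is simply another array operation followed by a histogram fill; branch names, the ±1% scale, and the binning are placeholders:

```python
import numpy as np
import awkward as ak

jet_pt = ak.Array([[45.0, 30.2], [120.5], [60.1, 25.0, 18.3]])  # GeV, per event

# Each systematic variation is just another column derived from the nominal one.
variations = {
    "nominal":  jet_pt,
    "jes_up":   jet_pt * 1.01,
    "jes_down": jet_pt * 0.99,
}

bins = np.linspace(0, 200, 41)
histograms = {
    name: np.histogram(ak.to_numpy(ak.flatten(pts)), bins=bins)[0]
    for name, pts in variations.items()
}
print({name: counts.sum() for name, counts in histograms.items()})
```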
The High Energy Photon Source (HEPS) is a crucial scientific research facility that necessitates efficient, reliable, and secure services to support a wide range of experiments and applications. However, traditional physical server-based deployment methods suffer from issues such as low resource utilization, limited scalability, and high maintenance costs. Therefore, the objective of this study is...
When working with columnar data file formats, it is easy for users to devote too much time to file manipulation. With Python, each file conversion requires multiple lines of code and the use of multiple I/O packages. Some conversions are a bit tricky if the user isn’t very familiar with certain formats, or if they need to work with data in smaller batches for memory management. To try and...
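One hedged example of such a conversion (Parquet to a ROOT TTree, processed in batches to limit memory use); file, tree, and column names are placeholders:

```python
import pyarrow.parquet as pq
import uproot

source = pq.ParquetFile("input.parquet")

with uproot.recreate("output.root") as fout:
    first = True
    # Read the Parquet file in batches so only one chunk is in memory at a time.
    for batch in source.iter_batches(batch_size=100_000):
        columns = {name: col.to_numpy(zero_copy_only=False)
                   for name, col in zip(batch.schema.names, batch.columns)}
        if first:
            fout["Events"] = columns          # creates the TTree from the first batch
            first = False
        else:
            fout["Events"].extend(columns)    # appends subsequent batches
```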
Traditionally, data centers provide computing services to the outside world, with security policies that separate the intranet from the outside world using firewalls and other security defense boundaries. Users access the data center intranet through a VPN, and individuals or endpoints connected through remote methods receive a higher level of trust to use computing services than individuals...
The environmental impact of computing activities is starting to be acknowledged as relevant, and several scientific initiatives and research lines are gathering momentum in the scientific community to identify and curb it. Governments, industries, and commercial businesses now hold high expectations for quantum technologies, as they have the potential to create greener and faster methods...
Extensive data processing is becoming commonplace in many fields of science, especially in computational physics. Distributing data to processing sites and providing methods to share the data with others efficiently has become essential. The Open Science Data Federation (OSDF) builds upon the successful StashCache project to create a global data distribution network. The OSDF expands the...
Efficient fast simulation techniques in high energy physics experiments are crucial in order to produce the necessary amount of simulated samples. We present a new component in the Gaussino core simulation framework that facilitates the integration of fast simulation hooks in Geant4 with machine learning serving, based on Gaudi’s scheduling and data processing tools. The implementation...
Computing demands for large scientific experiments, such as the CMS experiment at the CERN LHC, will increase dramatically in the next decades. To complement the future performance increases of software running on central processing units (CPUs), explorations of coprocessor usage in data processing hold great potential and interest. Coprocessors are a class of computer processors that...
There has been a significant expansion in the variety of hardware architectures in recent years, including different GPUs and other specialized computing accelerators. For better performance portability, various programming models have been developed across these computing systems, including Kokkos, SYCL, OpenMP, and others. Among these programming models, the C++ standard parallelism (std::par) has gained...
As part of the Scientific Discovery through Advanced Computing (SciDAC) program, the Quantum Chromodynamics Nuclear Tomography (QuantOM) project aims to analyze data from Deep Inelastic Scattering (DIS) experiments conducted at Jefferson Lab and the upcoming Electron Ion Collider. The DIS data analysis is performed at the event level by combining the input from theoretical and experimental...
As the LHC continues to collect larger amounts of data, and in light of the upcoming HL-LHC, using tools that allow efficient and effective analysis of HEP data becomes more and more important. We present a test of the applicability and user-friendliness of several columnar analysis tools, most notably ServiceX and Coffea, by completing a full Run-2 ATLAS analysis. Working collaboratively with...
The APS in the USA achieved high-precision training and inference on Nvidia GPU clusters using the ptychoNN algorithm combined with the ePIE conjugate gradient method. Taking that idea as a reference, we developed a new model called W1-Net, which trains faster and achieves higher inference precision. After this development, we implemented the model on a DCU cluster. However, the performance...
The ATLAS experiment at the LHC relies on crucial tools written in C++ to calibrate physics objects and estimate systematic uncertainties in the event-loop analysis environment. However, these tools face compatibility challenges with the columnar analysis paradigm that operates on many events at once in Python/Awkward or RDataFrame environments. Those challenges arise due to the intricate...
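To illustrate the paradigm gap with a placeholder correction (not an actual ATLAS tool), compare a per-object, event-loop style call with the equivalent array-at-a-time operation:

```python
import awkward as ak

electron_pt = ak.Array([[35.1, 22.4], [50.0], []])   # GeV, jagged per event

def calibrate_single(pt: float) -> float:
    """Stand-in for a C++ tool's per-object calibration interface."""
    return pt * 1.02

# Event-loop style: explicit Python loops, one call per electron.
looped = [[calibrate_single(pt) for pt in event] for event in electron_pt]

# Columnar style: one vectorized expression over all events at once.
columnar = electron_pt * 1.02

print(looped == columnar.tolist())   # True: same result, very different interfaces
```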