The 19th edition of ACAT will bring together experts to explore and confront the boundaries of computing, automated data analysis, and theoretical calculation technologies, in particle and nuclear physics, astronomy and astrophysics, cosmology, accelerator science and beyond. ACAT provides a unique forum where these disciplines overlap with computer science, allowing for the exchange of ideas and the discussion of cutting-edge computing, data analysis and theoretical calculation technologies in fundamental physics research.
There is a fundamental shift occurring in how computing is used in research in general and data analysis in particular. The abundance of inexpensive, powerful, easy to use computing power in the form of CPUs, GPUs, FPGAs, etc., has changed the role of computing in physics research over the last decade. The rise of new techniques, like deep learning, means the changes promise to keep coming. Even more revolutionary approaches, such as Quantum Computing, are now closer to becoming a reality.
Please join us to explore these future changes, and to learn about new algorithms, ideas and trends in scientific computing for physics. Most of all, join us for the discussions and the sharing of expertise in the field.
As has often happened in the development of AI, theoretical advances in the domain have opened the door to exceptional prospects of application in the most diverse fields of science, business and society at large. Today the introduction of Machine (Deep) Learning is no exception, and beyond the hype we can already see that this new family of techniques and approaches may usher in a real revolution in various fields. However, as has happened time and again in the past, we are starting to realize that one of the limitations of Deep Learning is sheer computing power. While these techniques allow us to tackle extremely complex problems on very large amounts of data, the computational cost, particularly of the training phase, is rising fast. The introduction of meta-optimizations such as hyper-parameter scans may further enlarge the possibilities of Machine Learning, but this in turn will require substantial improvements in code performance.
At the same time, High Performance Computing is also in full evolution, offering new solutions and perspectives, both in hardware and in software. In this edition of ACAT we would like to focus on how this renewed momentum in the HPC world may provide the necessary power to fulfill the revolutionary promises offered by recent breakthroughs in AI at large and in Machine Learning in particular.
The meeting will take place in the Steinmatte Conference Centre of the Allalin Hotel, situated here in Saas-Fee, Switzerland.
Saas-Fee, a resort village in the Swiss Alps near the Italian border, is known for its proximity to mountains more than 4,000m above sea level, or 4-thousanders. It's a gateway to more than 100km of pistes for skiing and snowboarding, plus sledding and toboggan runs. The Mittelallalin Ice Pavilion is a frozen grotto carved into the Fee Glacier. In the summer, the surrounding area draws hikers and rock climbers.
Elevation: 1,800 m
Twin towns: Steamboat Springs (USA), Rocca di Cambio (Italy)
More information at https://en.wikipedia.org/wiki/Saas-Fee
Sign up for email notifications here. This list is low traffic and will only get you ACAT conference announcements and general information (for this and future conferences in the ACAT series).
Many people are working together to bring you this conference! The organization page has some details. F. Carminati chairs the Scientific Program Committee and F. Rademakers chairs the Local Organizing Committee.
Computer algebra is one of the key tools of modern physics research. In this talk I will give an overview of the main mathematical and programming concepts that lie at the basis of modern computer algebra tools and of how they are applied to solve problems in modern theoretical physics and in engineering. I will also give a brief overview of modern computer algebra software, including general-purpose systems and dedicated tools, how they compare in functionality and performance, and what the current trends are in the development and programming of computer algebra software.
The extremely low flux of ultra-high energy cosmic rays (UHECR) makes their direct observation by orbital experiments practically impossible. For this reason, all current and planned UHECR experiments detect cosmic rays indirectly, by observing the extensive air showers (EAS) initiated by cosmic-ray particles in the atmosphere. Various types of shower observables are analysed in modern UHECR experiments, including the secondary radio signal and the fluorescence light from excited nitrogen molecules. Most of the data are collected by networks of surface detectors, which allow the horizontal EAS profile to be measured directly. The raw observables in this case are the time-resolved signals of the set of adjacent triggered detectors. To recover the primary particle properties, Monte Carlo shower simulations are performed. In traditional techniques the MC simulation is used to fit synthetic observables such as the shower rise time, the shower front curvature and the particle density normalized to a given distance from the core. In this talk we consider an alternative approach based on a deep convolutional neural network that takes the detector signal time series as input and is trained on a large Monte Carlo dataset. This approach has proven its efficiency with Monte Carlo simulations of the Telescope Array Observatory surface detector. We will discuss in detail how we optimize the network architecture for this particular task.
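For illustration only (the station layout, trace length and layer sizes below are hypothetical, not those of the Telescope Array analysis), a convolutional model over the station waveforms could be sketched as:

```python
import tensorflow as tf

# Hypothetical input layout: time-resolved signals (128 time bins) from a 6x6 patch
# of adjacent surface stations, stacked as channels of a 1D convolution over time.
n_time_bins, n_stations = 128, 36

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(32, 7, activation="relu",
                           input_shape=(n_time_bins, n_stations)),
    tf.keras.layers.Conv1D(64, 5, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # e.g. light vs. heavy primary
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```

The network would be trained on the large Monte Carlo dataset and its output used in place of a hand-crafted synthetic observable.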
Posters in this session can be seen in Room B throughout Monday and Tuesday.
X-ray Free Electron Lasers (XFELs) are among the most complex accelerator projects in the world today. With large parameter spaces, sensitive dependence on beam quality, huge data rates, and challenging machine protection, there are diverse opportunities to apply machine learning (ML) to XFEL operation. This talk will summarize promising ML methods and highlight recent examples of successful applications at the Linac Coherent Light Source (LCLS).
All lunch breaks will take place in the Hotel Dom and Hotel Du Glacier (situated next to each other). The buffet is open until 14.30h.
When it comes to number-crunching, C++ is at the core of HENP’s software. But while C++17 is old news, many of us have not had the chance to use it yet. And why would we? This presentation introduces some of the main reasons to move to C++17, focusing on performant, readable code and robust interfaces.
Where C++17 has many new features that help, C++20 might come as “your next C++11”, a major step forward for C++: it will most likely introduce concepts, contracts and ranges; fairly likely the “spaceship operator”, coroutines and networking. Some of these will change the way we want to write code. As today’s compilers are already implementing many of tomorrow’s features, now is a good time to see where C++ is heading, and to learn how this affects our usage of C++.
NANOAOD is an event data format that has recently been commissioned by the CMS Collaboration to serve the needs of a substantial fraction of its physics analyses. The new format is about 20 times more compact than the MINIAOD format and only includes high level physics object information. NANOAOD is easily customisable for development activities, and supports standardised routines for content validation and automated data analysis workflows. The talk will review the current status and perspectives of NANOAOD design and implementation.
PyROOT is the name of ROOT’s Python bindings, which give access from Python to all the ROOT functionality implemented in C++. Thanks to the ROOT type system and the Cling C++ interpreter, PyROOT creates Python proxies for C++ entities on the fly, thus avoiding the need to generate static bindings beforehand.
PyROOT is in the process of being enhanced and modernised to meet the demands of the HEP Python community. In particular, the ongoing work in PyROOT comprises three areas: first, making PyROOT more pythonic, by adding the so-called “pythonisations” to make it simpler to access C++ from Python; second, improving the interoperability of PyROOT with the Python data science ecosystem tools (for instance, NumPy); third, redesigning PyROOT on top of the Cppyy library, in order to benefit from the modern C++ features supported by the latter.
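As a minimal sketch of this on-the-fly binding mechanism (the C++ helper below is an illustrative example, not part of ROOT), one can declare C++ code through Cling and call it from Python immediately:

```python
import ROOT

# Declare a C++ helper through the Cling interpreter; PyROOT exposes it dynamically,
# with no static binding-generation step.
ROOT.gInterpreter.Declare("""
#include "Math/Vector4D.h"
double invariant_mass(double pt1, double eta1, double phi1,
                      double pt2, double eta2, double phi2) {
    ROOT::Math::PtEtaPhiMVector p1(pt1, eta1, phi1, 0.105);
    ROOT::Math::PtEtaPhiMVector p2(pt2, eta2, phi2, 0.105);
    return (p1 + p2).M();
}
""")

# The C++ function is now available as a Python callable.
print(ROOT.invariant_mass(30.0, 0.1, 0.5, 25.0, -0.2, 2.8))
```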
During the past two years ROOT's analysis tools underwent a major renovation, embracing a declarative approach. This contribution explores the most recent developments in the implementation of this approach, real-life examples from LHC experiments, as well as present and future R&D lines. After an introduction to the tool offering access to declarative analysis, RDataFrame, the newly introduced syntax for the treatment of collections is described, together with examples concerning the analysis of Open Datasets. The tooling for visualising and studying computation graphs built with RDataFrame in the form of diagrams is then presented. Real-life example analyses based on RDataFrame from collider and non-collider experiments are then discussed from the programming-model and performance perspective. Finally, the status of existing R&D lines as well as future directions is discussed, most notably the integration of RDataFrame with big data technologies to distribute interactive calculations on massive computing resources.
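As a minimal sketch of the declarative style (tree, file and branch names below are placeholders, not taken from a specific analysis), an RDataFrame computation graph is built lazily and only triggered when a result is requested:

```python
import ROOT

# Build the computation graph declaratively; nothing runs yet.
df = ROOT.RDataFrame("Events", "data.root")   # placeholder tree and file names
h = (df.Filter("nMuon == 2", "exactly two muons")
       .Define("m_mumu", "InvariantMass(Muon_pt, Muon_eta, Muon_phi, Muon_mass)")
       .Histo1D(("m_mumu", ";m_{#mu#mu} [GeV];Events", 100, 0.0, 150.0), "m_mumu"))

h.Draw()  # accessing the result triggers the single event loop
```

Because the user states *what* to compute rather than *how*, the framework is free to run the event loop in parallel or, as discussed above, to distribute it over big-data backends.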
For two decades, ROOT brought its own graphics-system abstraction, based on a graphics model inspired by the popular graphics systems available at that time (X11, OpenGL, Cocoa, ...). With the emergence of modern C++ and of recent graphics systems based on client/server models, it was time to completely redefine ROOT graphics. This has been done in the context of ROOT 7, which provides the new graphics library using modern C++ and serving JavaScript-based clients over the web. This new approach re-thinks the High Energy Physics graphics language, targeting the production of plots designed for usability, with a new graphics style and optimal defaults.
Modern data processing (acquisition, storage and analysis) requires modern tools. One of the problems shared by existing scientific software is the "scripting" approach, in which the user writes an imperative script describing the stages through which the data should be processed. The main deficiency of this approach is that it is hard to automate: one usually needs scripts that manipulate, or even textually generate, other scripts in order to run complex tasks, and scripted interactions usually cannot easily be run in parallel or on distributed or cluster systems.
The DataForge metadata processing framework remedies this problem by taking a declarative approach to data processing, in which the process is described as a composition of tasks, with an automated task-tree builder based on tree-like metadata communication. DataForge allows one to write simple, flexible tasks (in any programming style) and then create automatically managed task graphs of arbitrary complexity.
In the High-Luminosity Large Hadron Collider (HL-LHC), one of the most challenging computational problems is expected to be finding and fitting charged-particle tracks during event reconstruction. The methods currently in use at the LHC are based on the Kalman filter. Such methods have been shown to be robust and to provide good physics performance, both in the trigger and offline. In order to improve computational performance, we explored Kalman-filter-based methods for track finding and fitting, adapted for many-core SIMD and SIMT architectures. Our adapted Kalman-filter-based software has obtained significant parallel speedups using such processors, e.g., Intel Xeon Phi, Intel Xeon SP (Scalable Processors) and (to a limited degree) NVIDIA GPUs.
Recently, an effort has started towards the integration of our software into the CMS software framework, in view of its exploitation for Run III of the LHC. Prior reports have shown that our software in fact allows for significant improvements over the existing framework in terms of computational performance, with comparable physics performance, even when applied to realistic detector configurations and event complexity. Here, we demonstrate that in such conditions the physics performance can be further improved with respect to our prior reports, while retaining the improvements in computational performance, by making use of the knowledge of the detector and its geometry.
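For reference, the core of a Kalman-filter track fit is a small linear-algebra kernel of the kind sketched below (a generic textbook predict-and-update step in NumPy, not the actual CMS code); the parallel adaptations typically evaluate many such small-matrix operations for many track candidates at once.

```python
import numpy as np

def kf_step(x, P, z, F, Q, H, R):
    """One Kalman-filter step: propagate the track state x (covariance P)
    to the next detector layer and update it with the measurement z."""
    # Predict: propagate through the field/material model F with process noise Q
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: combine the prediction with the measured hit (projection H, noise R)
    y = z - H @ x_pred                      # residual
    S = H @ P_pred @ H.T + R                # residual covariance
    K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```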
ConformalTracking is an open-source library created in 2015 to serve as a detector-independent solution for track reconstruction in detector-development studies at CERN. Pattern recognition is one of the most CPU-intensive tasks of event reconstruction at present and future experiments, and the current tracking programs of the LHC experiments are mostly tightly linked to individual detector descriptions or event-processing frameworks. ConformalTracking performs pattern recognition in a conformally mapped plane, where the helix trajectories of charged particles in a magnetic field are projected onto straight lines, followed by a Kalman-filter-based fit in global space. At the core of the library lies a nearest-neighbour search that is optimized by means of fast KDTrees and enhanced with a cellular automaton to reconstruct the linear paths. Being based exclusively on the spatial coordinates of the hits, this algorithm is adaptable to different detector designs and beam conditions. In the detectors at CLIC and FCC-ee, it also profits from the low-mass silicon tracking system, which reduces complications from multiple scattering and interactions. Full-simulation studies have been performed in order to validate the algorithm and assess its performance, also in the presence of beam-induced background. In this talk, recent developments and features of the track reconstruction chain, as well as results for isolated tracks and for complex events with background, will be discussed.
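The conformal transformation at the heart of the pattern recognition is simple enough to state directly (a schematic NumPy version, ignoring the z coordinate and all detector-specific details): circles through the origin in the transverse plane, i.e. the projections of helices originating at the beam line, become straight lines.

```python
import numpy as np

def conformal_map(x, y):
    """Map transverse hit coordinates (x, y) to the conformal plane (u, v).
    Circles through the origin become straight lines, so track finding
    reduces to a search for collinear points."""
    r2 = x * x + y * y
    return x / r2, y / r2

# Example: hits on a circle of radius R through the origin map to the
# vertical line u = 1/(2R) in the conformal plane.
phi = np.linspace(0.1, 1.0, 5)
R = 100.0
x, y = R * (1 - np.cos(phi)), R * np.sin(phi)
u, v = conformal_map(x, y)
```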
To address the unprecedented scale of HL-LHC data, the HEP.TrkX project has been investigating a variety of machine learning approaches to particle track reconstruction. The most promising of these solutions, a graph neural network, processes the event as a graph that connects track measurements (detector hits corresponding to nodes) with candidate line segments between the hits (corresponding to edges). This architecture enables separate input features for edges and nodes, ultimately creating a hidden representation of the graph that is used to turn edges on and off, leaving only the edges that form tracks. Due to the large scale of this graph for an entire LHC event, we present new methods that allow the event graph to be scaled to a computationally reasonable size. We report the results of the graph neural network on the TrackML dataset, detailing the effectiveness of this model on event data with large pileup. Additionally, we propose post-processing methods that further refine the result of the graph neural network, ultimately synthesizing an end-to-end machine learning solution to particle track reconstruction.
Machine learning methods are integrated into the pipelined first level track trigger of the upgraded flavor physics experiment Belle II in Tsukuba, Japan. The novel triggering techniques cope with the severe background conditions coming along with the upgrade of the instantaneous luminosity by a factor of 40 to $\mathcal{L} = 8 \times 10^{35} \text{cm}^{−2} \text{s}^{−1}$. Using the precise drift-time information of the central drift chamber, a neural network L1 trigger estimates the 3D track parameters of found single tracks. An extension of the present 2D Hough track finder to a 3D finder is proposed, where the single hit representations in the Hough plane are trained using Monte Carlo. This 3D finder enables an improvement of the track finding efficiency by including the stereo sense wires as input. The estimated polar track angle allows a specialization of the following neural networks to phase space sectors.
With the upgrade of the LHC to high luminosity, an increased rate of collisions will place a higher computational burden on track reconstruction algorithms. Typical algorithms such as the Kalman filter and Hough-like transformations scale worse than quadratically. However, the energy function of a traditional method for tracking, the geometric Denby-Peterson (Hopfield) network method, can be described as a quadratic unconstrained binary optimization (QUBO) problem. Quantum annealers have shown promise in their ability to solve QUBO problems, which are NP-hard in general. We present a novel approach for track reconstruction by applying a quantum-annealing-inspired algorithm to the Denby-Peterson method. We propose additional techniques to divide an LHC event into disjoint subgraphs in order to allow the problem to be embeddable on existing quantum annealing hardware, using multiple anneals to fit the tracks of a single event. To accommodate this dimension reduction, we use Bayesian methods and further algorithms to pre- and post-process the data. Results on the TrackML dataset are presented, demonstrating the successful application of quantum-annealing-inspired algorithms to the track reconstruction problem.
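Schematically (the coefficients below stand in for the full Denby-Peterson construction), assigning a binary variable $s_i \in \{0,1\}$ to each candidate track segment gives an energy of the QUBO form
$$
E(s) \;=\; -\sum_{i<j} a_{ij}\, s_i s_j \;+\; \lambda \sum_{i<j} c_{ij}\, s_i s_j , \qquad s_i \in \{0,1\},
$$
where $a_{ij}$ rewards pairs of segments that are connected and smoothly aligned, $c_{ij}$ penalizes conflicting pairs (e.g. segments sharing a hit), and $\lambda$ balances the two terms; because the expression is quadratic in binary variables, it can be handed directly to a quantum annealer or to an annealing-inspired classical solver.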
We present a novel general Boltzmann machine with continuous visible and discrete integer-valued hidden states, yielding a parametric density function involving a ratio of Riemann-Theta functions. After a brief overview of the theory required to define this new ML architecture, we show how the conditional expectation of a hidden state for given visible states can be used as an activation function in a feedforward neural network, thereby increasing the modeling capacity of the network. We then provide application examples for density estimation, data regression and data classification in HEP. This work is based on arXiv:1712.07581 and arXiv:1804.07768.
We report on multi-loop integral computations executed on a PEZY/Exascaler large-scale (immersion-cooling) computing system. The programming model requires a host program written in C++ with an OpenCL kernel. However, the kernel can be generated by the Goose compiler interface, which allows loops to be parallelized according to compiler directives. As an advantage, the executable derived from a program instrumented with Goose pragmas can be run on multiple devices and multiple nodes without changes to the program. We use lattice rules and lattice copy (composite) rules on PEZY to approximate integrals for multi-loop self-energy diagrams with and without masses.
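For orientation (the generic rank-1 form, with the generating vector, the number of points and any periodizing transformation left unspecified), a lattice rule approximates an $s$-dimensional integral over the unit cube as
$$
\int_{[0,1]^s} f(\mathbf{x})\, d\mathbf{x} \;\approx\; Q_N(f) \;=\; \frac{1}{N} \sum_{k=0}^{N-1} f\!\left( \left\{ \frac{k\,\mathbf{z}}{N} \right\} \right),
$$
where $\mathbf{z} \in \mathbb{Z}^s$ is the generating vector and $\{\cdot\}$ denotes the component-wise fractional part. A copy (composite) rule essentially applies the same rule on congruent subcubes of $[0,1]^s$, improving accuracy and providing an error estimate, and the $N$ independent integrand evaluations parallelize naturally across the accelerator devices.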
The high-energy community recently witnessed the first attempts at leveraging machine (deep) learning techniques for improving the efficiency of the numerical Monte-Carlo integrations that lie at the core of most high-energy physics simulations.
The first part of my talk will characterise the various types of integration necessary in these simulations, as well as the type of improvements that could significantly impact their efficiency.
The second part will focus on reviewing the objectives and achievements of the first attempts at applying modern machine learning techniques in this context.
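Much of this work revolves around the importance-sampling identity (stated here in its generic form, independently of any particular ML approach):
$$
I \;=\; \int f(x)\, dx \;=\; \mathbb{E}_{x \sim q}\!\left[ \frac{f(x)}{q(x)} \right] \;\approx\; \frac{1}{N} \sum_{i=1}^{N} \frac{f(x_i)}{q(x_i)}, \qquad x_i \sim q ,
$$
whose variance vanishes as the sampling density $q$ approaches $|f|/I$; the machine-learning approaches reviewed here essentially try to learn such a $q$ (or the corresponding change of variables) from evaluations of the integrand.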
Massively parallel simulations generate ever-increasing volumes of data, whose exploitation requires large storage resources, efficient networks and increasingly large post-processing facilities. In the coming era of exascale computations, there is an emerging need for new data analysis and visualization strategies.
Data manipulation, during the simulation and after it, considerably slows down the analysis process and is now becoming the bottleneck of high-performance computing. The traditional usage consists of running the simulation and writing the output data to disk. When dealing with three-dimensional time-dependent problems computed on thousands of cores, the volume of data generated is large and highly partitioned. As a consequence, post-processing often requires decreasing the spatial or temporal resolution so that it can be performed on a local platform with fewer resources than the computational machine. Another solution consists of coupling the analysis with the simulation, so that both are performed simultaneously.
In order to address these questions, a client-server in-situ analysis for massively parallel time-evolving computations has been developed and applied to a spectral code for the study of turbulence and transition. It is shown to have a low impact on computational time, with a reasonable increase of resource usage, while enriching data exploration. Large time sequences have been analyzed, which could not have been achieved with the traditional workflow. Moreover, computational steering has been performed with real-time adjustment of the simulation parameters, thereby getting closer to a numerical-experiment process.
Posters in this session can be seen in Room B throughout Monday and Tuesday.
As a data-intensive computing application, high-energy physics requires storage and computing for PB-scale amounts of data. Performance demands and data-access imbalances in mass storage systems are increasing. On the one hand, traditional cheap disk storage systems are unable to handle services with high IOPS demands. On the other hand, a survey found that only a very small fraction of files are active in storage over a given period; most files are never accessed at all. Some enterprises and research organizations are therefore beginning to use tiered storage architectures, combining tape, disk and solid-state drives to reduce hardware purchase costs and power consumption.
As the amount of stored data grows, tiered storage requires data management software that migrates less-active data to lower-cost storage devices, so an automated data migration strategy is needed. At present, automatic data migration strategies such as LRU, CLOCK, 2Q, GDSF, LFUDA and FIFO are usually based on a file's recent access pattern (e.g. its access frequency) and are mainly used to handle data migration between memory and disk. They need to run in the operating-system kernel, so the rules are relatively simple. Because the recent access pattern does not take the file life-cycle trend into account, some regularly accessed files are often not predicted accurately; in addition, the file's historical access records are not considered.
Data access requests are not completely random: they are driven by the behavior of users or programs, so there are associations between files that are accessed consecutively. This paper proposes a method of file-access heat prediction, in which the data heat trend is used as the basis for migration to a relatively low-cost storage device. Due to the limitations of traditional models, it is difficult to achieve good results in such nonlinear prediction scenarios, so we attempt to use a deep-learning model to predict the evolution of data-access heat. The paper discusses the implementation of some initial parts of the system, in particular the trace collector and the LSTM model, and presents preliminary experiments conducted with these parts.
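As an illustration of the prediction step (the feature set, window length and layer sizes are hypothetical, not those of the system described here), a small LSTM regressor over per-file access histories might look like:

```python
import numpy as np
import tensorflow as tf

# Hypothetical shapes: 30 past time windows, 4 features per window
# (e.g. access count, bytes read, unique users, file age).
n_steps, n_features = 30, 4

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(n_steps, n_features)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),           # predicted access heat for the next window
])
model.compile(optimizer="adam", loss="mse")

# Placeholder arrays standing in for sequences built by the trace collector:
# x has shape (n_files, n_steps, n_features), y holds the next-window heat.
x = np.random.rand(1024, n_steps, n_features).astype("float32")
y = np.random.rand(1024).astype("float32")
model.fit(x, y, epochs=2, batch_size=64, verbose=0)
```

Files whose predicted heat falls below a threshold would then be candidates for migration to the colder tier.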
DODAS stands for Dynamic On Demand Analysis Service and is a Platform-as-a-Service toolkit built around several EOSC-hub services, designed to instantiate and configure on-demand container-based clusters over public or private Cloud resources. It automates the whole workflow from service provisioning to the configuration and setup of software applications; such a solution therefore allows any cloud provider to be used with almost zero effort. In this talk, we demonstrate how DODAS can be adopted as a deployment manager to set up and manage the compute resources and services required to develop an AI solution for smart data caching. A smart caching layer may reduce the operational cost and increase flexibility with respect to the regular centrally managed storage of the current CMS computing model. The cache space should be dynamically populated with the most requested data; in addition, clustering such caching systems will allow them to be operated as a Content Delivery System between data providers and end-users. Moreover, a geographically distributed caching layer will also be functional to a data-lake based model, where many satellite computing centers might appear and disappear dynamically. In this context, our strategy is to develop a flexible and automated AI environment for smart management of the content of such a clustered cache system. In this contribution we describe the computational phases identified for the implementation of the AI environment, as well as the related DODAS integration. We start with an overview of the architecture of the pre-processing step, based on Spark, whose role is to prepare data for a machine-learning technique, with a focus on the automation implemented through DODAS. We then show how to train an AI-based smart cache and how we implemented a training facility managed through DODAS. Finally, we provide an overview of the inference system, based on CMS-TensorFlow as a Service and also deployed as a DODAS service.
A large amount of data is produced by the large-scale scientific facilities of the high-energy physics (HEP) field, and distributed computing technologies have been widely used to process these data. In traditional computing models such as grid computing, a computing job is usually scheduled to the sites where its input data were pre-staged. This model leads to problems including low CPU utilization, inflexibility, and difficulties in highly dynamic cloud environments. This paper proposes a cross-domain data access system (CDAS), which presents the same file-system view at the local and remote sites and supports direct data access on demand, so that computing jobs can run anywhere without needing to know where the data are located. The system currently implements functionalities such as native access to remote data, quick response, on-demand data transmission and management based on HTTP, data-block hashing and storage, and a uniform file view. The test results show that the performance is much better than that of a traditional file system over a high-latency WAN.
Storage has been identified as the main challenge for future distributed computing infrastructures: particle physics (HL-LHC, DUNE, Belle II), astrophysics and cosmology (SKA, LSST). In particular, the High Luminosity LHC (HL-LHC) will begin operations in 2026, with expected data volumes increasing by at least an order of magnitude compared with present systems. Extrapolating from existing trends in disk and tape pricing, and assuming flat infrastructure budgets, the implications for data handling for end-user analysis are significant. HENP experiments need to manage data across a variety of media depending on the type of data and its uses: from tapes (cold storage) to disks and solid-state drives (hot storage) to caches (including world-wide access to data in clouds and “data lakes”). The DataLake R&D project aims at exploring an evolution of distributed storage while bearing in mind the very high demands of the HL-LHC era. Its primary objective is to optimize hardware usage and the operational costs of a storage system deployed across distributed centers connected by fat networks and operated as a single service. Such storage would host a large fraction of the data and optimize the cost, eliminating inefficiencies due to fragmentation. In this talk we will highlight the current status of the project, its achievements, its interconnection with other research activities in this field such as WLCG-DOMA and the ATLAS-Google DataOcean, and future plans.
Machine learning is becoming ubiquitous across HEP, and there is great potential to improve trigger and DAQ performance with it. However, the exploration of such techniques within the field on low-latency, low-power FPGAs has just begun. We present hls4ml, a user-friendly software package based on High-Level Synthesis (HLS), designed to deploy network architectures on FPGAs. As a case study, we use hls4ml for boosted-jet tagging with deep networks at the LHC. We map out resource usage and latency versus network architecture to identify the typical problem complexity that hls4ml can deal with, and we discuss possible applications in current and future HEP experiments.
Finding tracks downstream of the magnet at the earliest LHCb trigger level is not part of the baseline plan of the Upgrade trigger, on account of the significant CPU time required to execute the search. Many long-lived particles, such as Ks and strange baryons, decay after the vertex track detector (VELO), so their reconstruction efficiency is limited. We present a study of the performance of a future innovative real-time tracking system based on FPGAs, an R&D project developed in the context of the LHCb Upgrade Ib (LHC Run 4), dedicated to reconstructing particles downstream of the magnet in the forward tracking detector (Scintillating Fibre Tracker), and capable of processing events at the full LHC collision rate of 30 MHz.
In the transition to Run 3 in 2021, LHCb will undergo a major luminosity upgrade, going from 1.1 to 5.6 expected visible Primary Vertices (PVs) per event, and will adopt a purely software trigger. This has fueled increased interest in alternative highly-parallel and GPU friendly algorithms for tracking and reconstruction. We will present a novel prototype algorithm for vertexing in the LHCb upgrade conditions.
We use a custom kernel to transform the sparse 3D space of hits and tracks into a dense 1D dataset, and then apply Deep Learning techniques to find PV locations. By training networks on our kernels using several Convolutional Neural Network layers, we have achieved better than 90% efficiency with no more than 0.2 False Positives (FPs) per event. Beyond its physics performance, this algorithm also provides a rich collection of possibilities for visualization and study of 1D convolutional networks. We will discuss the design, performance, and future potential areas of improvement and study, such as possible ways to recover the full 3D vertex information.
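A minimal sketch of such a 1D architecture (the binning and layer parameters are illustrative, not the tuned prototype): the kernel-density histogram along the beam axis enters as a single-channel 1D input and the network returns a per-bin PV probability.

```python
import tensorflow as tf

# Hypothetical binning: 4000 bins along the beam (z) axis, one KDE channel per bin.
n_bins = 4000

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(16, 25, padding="same", activation="relu",
                           input_shape=(n_bins, 1)),
    tf.keras.layers.Conv1D(16, 15, padding="same", activation="relu"),
    tf.keras.layers.Conv1D(16, 9, padding="same", activation="relu"),
    tf.keras.layers.Conv1D(1, 5, padding="same", activation="sigmoid"),  # per-bin PV probability
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```

Peaks in the output histogram above a chosen threshold would then be interpreted as PV candidates along z.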
The Belle II experiment, beginning data taking with the full detector in early 2019, is expected to produce a data volume fifty times that of its predecessor. With this dramatic increase in data comes the opportunity to study rare, previously inaccessible processes. The investigation of such rare processes in a high-data-volume environment requires a correspondingly high volume of Monte Carlo simulations to prepare analyses and to gain a deep understanding of the physics processes contributing to each individual study. This presents a significant challenge in terms of computing-resource requirements and calls for more intelligent methods of simulation, in particular for background processes with very high rejection rates. This work presents a method of predicting, in the early stages of the simulation process, the likelihood that an individual event is relevant to the target study, using convolutional neural networks. The results show a robust training that is integrated natively into the existing Belle II analysis software framework.
A key ingredient in an automated evaluation of two-loop multileg processes is a fast and numerically stable evaluation of scalar Feynman integrals. In this respect, the calculation of two-loop three- and four-point functions in the general complex-mass case so far relies on multidimensional numerical integration through sector decomposition, whereby a reliable result has a high computing cost, whereas the derivation of a fully analytic result remains beyond reach. It would therefore be useful to perform part of the Feynman parameter integrations analytically in a systematic way, leaving only a reduced number of integrations to be performed numerically. Such a working program has been initiated for the calculation of massive two-loop $N$-point functions using analytically computed building blocks. This approach is based on the implementation of two-loop scalar $N$-point functions in four dimensions $^{(2)}I_{N}^{4}$ as double integrals in the form:
$$
^{(2)}I_{N}^{4}
\sim
\sum \int_{0}^{1} d \rho \int_{0}^{1} d \xi \, P(\rho,\xi) \;
^{(1)}\widetilde{I}_{N+1}^{4}(\rho,\xi)
$$
where the building blocks $^{(1)}\widetilde{I}_{N+1}^{4}(\rho,\xi)$ involved in the integrands are similar to "generalised" one-loop $(N+1)$-point Feynman-type integrals, and where $P(\rho,\xi)$ are weighting functions. The $^{(1)}\widetilde{I}_{N+1}^{4}(\rho,\xi)$ are "generalised" in the sense that the integration domain spanned by the Feynman parameters defining them is no longer the usual simplex $\{ 0 \leq z_{j} \leq 1,\ j = 1,\cdots,N+1;\ \sum_{j=1}^{N+1} z_{j}=1 \}$ at work for the one-loop $(N+1)$-point function, but another domain (e.g. a cylinder with triangular basis) which depends on the topology of the two-loop $N$-point function considered. The generalisation also concerns the underlying kinematics, which, besides the external momenta, depends on the two extra Feynman parameters $\rho$ and $\xi$. The parameter space spanned by this kinematics is larger than the one spanned in one-loop $(N+1)$-particle processes at colliders. The only two remaining integrations over $\rho,\xi$ to be performed numerically represent a substantial gain w.r.t. a fully numerical integration of the many Feynman-parameter two-loop integrals.
As a first step in this direction, the method developed has been successfully applied to the usual one-loop four-point function for arbitrary masses and kinematics as a ``proof of concept'', showing its ability to circumvent the subtleties of the various analytic continuations in the kinematical variables in a systematic way, in a series of three articles. The target work, namely its practical implementation to compute the building blocks $^{(1)}\widetilde{I}_{N+1}^{4}(\rho,\xi)$, is to be elaborated and presented in a future series of articles.
We present a novel framework that enables efficient probabilistic inference in large-scale scientific models by allowing the execution of existing domain-specific simulators as probabilistic programs, resulting in highly interpretable posterior inference. Our framework is general purpose and scalable, and is based on a cross-platform probabilistic execution protocol through which an inference engine can control simulators in a language-agnostic way. We demonstrate the technique in particle physics, on a scientifically accurate simulation of the τ (tau) lepton decay, which is a key ingredient in establishing the properties of the Higgs boson. High-energy physics has a rich set of simulators based on quantum field theory and the interaction of particles in matter. We show how to use probabilistic programming to perform Bayesian inference in these existing simulator codebases directly, in particular conditioning on observable outputs from a simulated particle detector to directly produce an interpretable posterior distribution over decay pathways. Inference efficiency is achieved via inference compilation where a deep recurrent neural network is trained to parameterize proposal distributions and control the stochastic simulator in a sequential importance sampling scheme, at a fraction of the computational cost of Markov chain Monte Carlo sampling.
Since 2013, ETH Zürich and the University of Bologna have been working on the PULP project to develop energy-efficient computing architectures suitable for a wide range of applications, from the IoT domain, where computations have to be done within a few milliwatts, all the way to the HPC domain, where the goal is to extract the maximum number of calculations within a given power budget. For this project, we have adopted an open-source approach. Our main computation cores are based on the open RISC-V ISA, and we have developed highly optimized 32-bit and 64-bit RISC-V cores. Together with a rich set of peripherals, we have released a series of open-source computing platforms, from single-core microcontrollers to multi-cluster systems with tens of cores. So far we have designed and tested nearly 30 ASICs as part of the PULP project, and our open-source offering has been used by many companies including Google, IBM and NXP. In this talk, I will give an overview of the PULP project and show what we are currently working on.
Posters in this session can be seen in Room B throughout Monday and Tuesday.
The LHCb Upgrade experiment will start operations in LHC Run 3 from 2021 onwards. Owing to the five-times higher instantaneous luminosity and the higher foreseen trigger efficiency, the LHCb Upgrade will collect signal yields per unit time approximately ten times higher than those of the current experiment, with pileup increasing by a factor of six. This contribution presents the changes in the computing model and the associated offline computing resources needed for the LHCb Upgrade, which are defined by the significantly increased trigger output rate compared to the current situation and the corresponding necessity to generate significantly larger samples of simulated events. The update of the LHCb computing model for Run 3 and beyond is discussed, with an emphasis on the optimization that has been applied to the usage of distributed computing CPU and storage resources.
The LHCb experiment will be upgraded for data taking in Run 3 and beyond, and the instantaneous luminosity will in particular increase by a factor of five. The lowest-level trigger of the current experiment, a hardware-based trigger with a hard limit of 1 MHz on its event output rate, will be removed and replaced with a full software trigger. This new trigger needs to sustain rates of up to 30 MHz of inelastic proton-proton collisions and will thus process 5 Tb/s of data, over two orders of magnitude more than the rate processed by the current LHCb experiment, all to be achieved at the same cost as the current data processing and without compromising the physics performance. For this purpose, the Gaudi framework currently used in LHCb has been re-engineered to enable the maximally efficient usage of vector registers and of multi- and many-core architectures. In particular, a new scheduler and a re-design of the data structures were needed in order to make the most efficient usage of memory resources and speed up access patterns.
This contribution presents these and other critical points that had to be tackled as well as the current status and an outlook of the work program that will address the challenges of the software trigger in the LHCb Upgrade.
The HL-LHC program has seen numerous extrapolations of its needed computing resources that each indicate the need for substantial changes if the desired HL-LHC physics program is to be supported within the current level of computing resource budgets. Drivers include detector upgrades, large increases in event complexity (leading to increased processing time and analysis data size) and trigger rates needed (5-10 fold increases) for the HL-LHC program. In this presentation, we discuss the newly developed modeling techniques in use for improving the accuracy of CMS computing resource needs for HL-LHC. Our emphasis is on monitoring-data driven techniques for model construction, parameter determination, and importantly, model extrapolations. Additionally we focus on uncertainty quantification as a critical component for understanding and properly interpreting our results.
The ATLAS experiment has produced hundreds of petabytes of data so far and expects one order of magnitude more in the future. These data are spread among hundreds of computing Grid sites around the world. The EventIndex is the complete catalogue of all ATLAS events, real and simulated, keeping the references to all permanent files that contain a given event in any processing stage. It provides the means to select and access event data in the ATLAS distributed storage system, and provides support for completeness and consistency checks and for trigger and offline selection overlap studies. The EventIndex employs various data handling technologies, such as Hadoop and Oracle databases, and is integrated with other systems of the ATLAS distributed computing infrastructure, including those for data, metadata, and production management. The project has been in operation since the start of LHC Run 2 in 2015, and is under continuous development in order to match production and analysis demands and to follow technology evolution. The main data store in Hadoop, based on MapFiles and HBase, has worked well during Run 2, but new solutions are being explored for the future. This paper reports on the current system performance and on the studies of a new data storage prototype that can carry the EventIndex through Run 3.
Network monitoring is of great importance for every data acquisition system (DAQ), as it ensures stable and uninterrupted data flow. However, when using standard tools such as Icinga, the homogeneity of the DAQ hardware is often not exploited.
We will present the application of machine learning techniques to detect anomalies among network devices as well as connection instabilities. The former exploits the homogeneity of the network hardware to detect device anomalies, such as excessive CPU or memory utilization, and consequently uncover a pre-failure state. The latter algorithm learns to distinguish between port-speed instabilities caused by, e.g., a failing transceiver or fiber, and speed changes due to scheduled system reboots.
All the algorithms described are implemented in the DAQ network of the ATLAS experiment.
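As a simple illustration of the device-anomaly idea (the metrics are hypothetical and this particular estimator is not claimed to be the one deployed in ATLAS), an isolation forest over per-device metrics exploits exactly the kind of hardware homogeneity mentioned above:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical per-device metrics sampled from a fleet of identical network devices:
# columns = [cpu_util, mem_util, temperature, packet_error_rate]
metrics = np.random.rand(500, 4)

# Fit an outlier detector on the fleet; homogeneous hardware means most devices
# cluster tightly and deviants stand out.
clf = IsolationForest(contamination=0.01, random_state=0)
clf.fit(metrics)

labels = clf.predict(metrics)           # +1 normal, -1 anomalous (possible pre-failure state)
suspicious = np.where(labels == -1)[0]
print("devices flagged for inspection:", suspicious)
```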
Generative models, and in particular generative adversarial networks, are gaining momentum in HEP as a possible way to speed up the event simulation process. Traditionally, GAN models applied to HEP are designed to return images. On the other hand, many applications (e.g., analyses based on particle flow) are designed to take lists of particles as input. We investigate the possibility of using recurrent GANs as generators of particle lists. We discuss a prototype implementation, challenges and limitations in the context of specific applications.
At present, the most common approach to electromagnetic shower generation is Monte Carlo simulation with software packages such as GEANT4. However, one of the critical problems of Monte Carlo production is that it is extremely slow, since it involves the simulation of numerous subatomic interactions.
Recently, generative adversarial networks (GANs) have addressed the speed issue in the simulation of calorimeter response, with speed-ups of two to three orders of magnitude compared with the current approach. However, it is challenging to define a network architecture that converges within a reasonable timeframe and a proper figure of merit that yields realistic synthetic objects.
In this work, we propose a metric that deals successfully with the structure of the showers. The neural-network architecture that performs well with shower-like objects is the graph network, and the approach of generating electromagnetic showers with graph neural networks fits well into GAN-based training and produces meaningful results. The novelty of this approach lies, firstly, in the generation of a complex recursive physical process with a neural network and, secondly, in the significant speed-up compared with traditional simulation approaches.
The increasing luminosities of future LHC runs and the next generation of collider experiments will require an unprecedented amount of simulated events to be produced. Such large-scale productions are extremely demanding in terms of computing resources, so new approaches to event generation and to the simulation of detector responses are needed. In LHCb, the simulation of the RICH detector with the classical method takes a sizeable fraction of the CPU time. We generate high-level reconstruction observables using a generative neural network to bypass low-level details. This network is trained to reproduce the particle-species likelihoods based on the track kinematic parameters and detector occupancy. The fast simulation is trained using real data samples collected by LHCb during Run 2 with the help of the sWeight technique. We demonstrate that this approach provides high-fidelity results along with a significant speed increase and discuss possible implications of these results. We also present an implementation of this algorithm in the LHCb simulation software and validation tests.
An extensive upgrade programme has been developed for LHC and its experiments, which is crucial to allow the complete exploitation of the extremely high-luminosity collision data. The programme is staggered in two phases, so that the main interventions are foreseen in Phase II.
For this second phase, the main hadronic calorimeter of ATLAS (TileCal) will redesign its readout electronics but the optical signal pathway will be kept unchanged.
However, there is a technical possibility to increase the detector granularity, without changing its mechanical structure, by modifying only the calorimeter readout. In the high-luminosity regime, particle jets with high transverse momentum tend to deposit their energy in the last layers of TileCal. Therefore, dividing the current calorimeter cells into new subregions will improve the reconstruction of the momentum, mass, transverse energy and angular position of those jets, allowing future analyses to benefit from a finer-granularity detector.
The light emitted by the calorimeter tiles is collected by a set of WLS fibers grouped in a bundle, one per calorimeter cell, coupled by a light mixer to a Photomultiplier Tube (PMT).
Aiming at extracting additional information on the spatial distribution of the energy deposited within each cell, the original PMT is substituted by a Multi-Anode Photomultiplier Tube (MA-PMT) with 64 photosensors distributed in a grid of 8 x 8 pixels. This makes it possible to increase the detector granularity by means of an algorithm whose purpose is to match the image pattern formed in the grid of pixels to a topological subregion within a given cell. Calibration data, which are costly to produce in terms of time and manpower, are used for algorithm development. Therefore, a Generative Adversarial Network (GAN) is used to simulate the interaction of particles in a calorimeter cell and thereby increase the statistics available for developing the final classification model.
Using a variant of the GAN model based on deep layers (DCGAN), a substantial increase in the number of images was obtained. As a consequence, a supervised deep-learning approach based on a Convolutional Neural Network (CNN) could be developed for mapping the signal image information onto two regions of the 8 x 8 grid. During the development stage, the synthetic images produced with the generative model were used to train the CNN, and its performance was evaluated on real calibration data. The preliminary results show an accuracy of more than 95% in both splits. This is encouraging for a possible solution of such an important step in the calorimeter upgrade of the ATLAS experiment.
Complete one-loop electroweak radiative corrections to polarized Bhabha scattering are presented. Higher-order QED effects are evaluated in the leading logarithmic approximation. Numerical results are shown for the conditions of future circular and linear electron-positron colliders with polarized beams. Theoretical uncertainties are estimated.
A new Monte Carlo event generator, MCSANCee, for the simulation of processes at future $e^+e^-$ colliders is presented. Complete one-loop electroweak radiative corrections and the polarization of the initial beams are taken into account. The present generator includes the following processes: $e^+e^- \to e^+e^-,\ \mu^+\mu^-,\ \tau^+\tau^-,\ ZH,\ Z\gamma,\ \gamma\gamma$. Numerical results for all of these processes are shown, together with tuned comparisons with other existing codes. Plans for the further extension of the MCSANCee generator are discussed.
The GRACE system is an automatic system for calculating cross sections in the Standard Model and the MSSM, including one-loop corrections. I will report on recent progress of the GRACE system, including the optimization of the generated code.
A Monte Carlo generator for the simulation of single-photon annihilation to hadrons at center-of-mass energies below 2.5 GeV is described. The generator is based on existing data on the cross sections of various exclusive channels of e+e- annihilation, obtained in e+e- experiments by the scan and ISR methods. It is extensively used in the software packages for the analysis of experiments at the Novosibirsk colliders VEPP-2000 and VEPP-4, aimed at high-precision measurements of hadronic cross sections for calculations of the hadronic vacuum polarization for the muon anomaly problem.
Posters in this session can be seen in Room B throughout Monday and Tuesday.
The increasing LHC luminosity in Run III and, consequently, the increased number of simultaneous proton-proton collisions (pile-up) pose significant challenges for the CMS experiment. These challenges will affect not only the data-taking conditions, but also the data processing environment of CMS, which requires an improvement in the online triggering system to match the required detector performance. In order to mitigate the increasing collision rates and the complexity of a single event, various approaches are being investigated. Heterogeneous computing resources, which have recently become prominent and abundant, may be significantly more performant for certain types of workflows. In this work, we investigate implementations of common algorithms targeting heterogeneous platforms, such as GPUs and FPGAs. The local reconstruction algorithms of the CMS calorimeters, given their granularity and intrinsic parallelizability, are among the first candidates considered for implementation on such heterogeneous platforms. We will present the current development status and preliminary performance results. Challenges and obstacles related to each platform, together with the integration into the CMS experiment’s framework, will be further discussed.
The first LHCb upgrade will take data at an instantaneous luminosity of $2 \times 10^{33}\ \text{cm}^{-2}\text{s}^{-1}$ starting in 2021. Due to the high rate of beauty and charm signals, LHCb has chosen as its baseline to read out the entire detector into a software trigger running on commodity x86 hardware at the LHC collision frequency of 30 MHz, where a full offline-quality reconstruction will be performed. In this talk we present the challenges of triggering in the MHz signal era. We pay particular attention to the need for flexibility in the selection and reconstruction of events without sacrificing performance.
The ATLAS software infrastructure has undergone several changes towards the adoption of Continuous Integration methodology to develop and test software. The users community can benefit from a CI environment in several ways: they can develop their custom analysis, build and test it using revision control services such as GitLab. By providing targeted official base images ATLAS enables users to also build self-contained Linux container images as part of the CI pipelines, a crucial component for analysis preservation and re-use scenarios such as reinterpretation of searches for Beyond the Standard Model physics (RECAST).
However, so far, the execution of preserved analyses was constrained to dedicated cloud infrastructure and not well-integrated into the wider WLCG computing model, where software distribution has so far relied on a combination of collaboration software distributed via the CVMFS filesystem and user software distributed ad-hoc by the workflow management system.
We describe an integration of containerized workloads into the grid infrastructure enabling users to submit self-authored or externally provided container images. To that end, the pilot process executed on the worker node has been extended to utilize the userspace container runtime singularity to execute such workloads. Further, the PanDA job configuration as well as the user-facing command line interfaces have been adapted to allow a detailed specification of the runtime environment.
Through this work a continuous grid-analysis paradigm emerges, in which each change in the revision control system triggers an automated pipeline of unit testing, image building, and submission to the grid of workloads based on the freshly built image, thus further streamlining physics analyses.
Resources required for high-throughput computing in large-scale particle physics experiments face challenging demands both now and in the future. The growing exploration of machine learning algorithms in particle physics offers new solutions for simulation, reconstruction, and analysis. These new machine learning solutions often lead to increased parallelization and faster reconstruction times on dedicated hardware, here specifically field-programmable gate arrays (FPGAs). We explore the possibility that applications of machine learning simultaneously also solve the increasing computing challenges. Employing machine learning acceleration as a web service, we demonstrate a heterogeneous compute solution for particle physics experiments that requires minimal modification to the current computing model. First results with Project Brainwave by Microsoft Azure, using the ResNet-50 image classification model as an example, demonstrate inference times of approximately 50 (10) milliseconds with our experimental physics software framework using Brainwave as a cloud (edge) service. We also adapt the image classifier to example physics applications using transfer learning: jet identification in the CMS experiment and event classification in the NOvA neutrino experiment at Fermilab. The solutions explored here are potentially applicable sooner than may have been initially realized.
The Belle II experiment at the SuperKEKB e+e- collider has completed its first-collisions run in 2018. The experiment is currently preparing for physics data taking in 2019. The electromagnetic calorimeter of the Belle II detector consists of 8,736 Thallium-doped CsI crystals with PIN-photodiode readout. Each crystal is equipped with waveform digitizers that allow the extraction of energy, time, and pulse-shape information. The talk will describe the offline reconstruction algorithm and first experience with the data taken in 2018. Further optimizations towards the high-rate data taking and high-dose background environment of Belle II will be discussed. Important steps in this process are improvements of existing regression algorithms for energy and position reconstruction, improvements of neutral and charged particle identification, and refinements to clustering itself using machine learning.
The ATLAS experiment records data from the proton-proton collisions produced by the Large Hadron Collider (LHC). The Tile Calorimeter is the hadronic sampling calorimeter of ATLAS in the region |η| < 1.7. It uses iron absorbers and scintillators as active material. Jointly with the other calorimeters, it is designed for the reconstruction of hadrons, jets, tau particles and missing transverse energy, and it also assists in muon identification. The energy deposited by the particles in the Tile Calorimeter is read out by approximately 10,000 channels. The signal provided by the readout electronics for each channel is digitized at 40 MHz and its amplitude is estimated by an optimal filtering algorithm. The increase in LHC luminosity leads to signal pile-up that deforms the signal of interest and compromises the amplitude estimation performance. This work presents the proposed algorithm for energy estimation in the Tile Calorimeter under the high pile-up conditions of LHC Run 3, named the Wiener Filter. The performance of the proposed method is studied under various pile-up conditions and compared with the current optimal filtering method using proton-proton collision data and Monte Carlo simulation.
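In essence, the Wiener filter is the linear minimum-mean-square-error estimator of the pulse amplitude (a set of weights plus a bias term) applied to the digitized samples; a schematic NumPy derivation of such weights from a training set, here filled with random placeholders rather than real or simulated pulses, reads:

```python
import numpy as np

# Placeholder training set: digitized pulses (with pile-up) and true amplitudes.
# In practice these would come from simulation or collision data.
rng = np.random.default_rng(0)
n_events, n_samples = 10000, 7
samples = rng.normal(size=(n_events, n_samples))   # readout samples per event
true_amp = rng.normal(size=n_events)               # target amplitudes

# Minimise E[(a - w.x - b)^2]: append a constant column so the bias b is learned too.
X = np.hstack([samples, np.ones((n_events, 1))])
w = np.linalg.solve(X.T @ X, X.T @ true_amp)       # Wiener-Hopf / least-squares solution

def estimate_amplitude(pulse):
    """Apply the fixed weights to one digitized pulse, as an online filter would."""
    return float(np.dot(w[:-1], pulse) + w[-1])
```

Once derived, the weights are fixed, so the per-channel online cost is a single dot product per pulse, just as for the optimal filtering algorithm it is compared against.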
We introduce a novel implementation of a reinforcement learning algorithm which is adapted to the problem of jet grooming, a crucial component of jet physics at hadron colliders. We show that the grooming policies trained using a Deep Q-Network model outperform state-of-the-art tools used at the LHC such as Recursive Soft Drop, allowing for improved resolution of the mass of boosted objects. The algorithm learns how to optimally remove soft wide-angle radiation, allowing for a modular jet grooming tool that can be applied in a wide range of contexts.
A large part of the success of deep learning in computer science can be attributed to the introduction of dedicated architectures exploiting the underlying structure of a given task. As deep learning methods are adopted for high energy physics, increasing attention is thus directed towards the development of new models incorporating physical knowledge.
In this talk, we present a network architecture that utilizes our knowledge of particle combinations and directly integrates Lorentz boosting to learn relevant physical features from basic four vectors. We explore two example applications, namely the discrimination of hadronic top-quark decays from light quark and gluon jets, and the separation of top-quark pair associated Higgs boson events from a $t\bar{t}$ background. We also investigate the learned combinations and boosts to gain insights into what the network is learning.
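The Lorentz-boost operation that the architecture builds in can be written down explicitly. The numpy sketch below boosts a four-vector into the rest frame of a chosen particle combination; it illustrates the physics operation only, not the network code, and the example momenta are arbitrary.

    # Boost a four-vector (E, px, py, pz) into the rest frame of a reference
    # system, e.g. a combination of several particles (sum of their four-vectors).
    import numpy as np

    def boost_to_rest_frame(p, reference):
        E_ref, p_ref = reference[0], reference[1:]
        beta = p_ref / E_ref                      # velocity of the reference system
        b2 = beta @ beta
        if b2 == 0.0:                             # reference already at rest
            return p.copy()
        gamma = 1.0 / np.sqrt(1.0 - b2)
        E, pvec = p[0], p[1:]
        bp = beta @ pvec
        E_prime = gamma * (E - bp)
        p_prime = pvec + ((gamma - 1.0) * bp / b2 - gamma * E) * beta
        return np.concatenate([[E_prime], p_prime])

    # Example: boost one decay product into the rest frame of a two-particle system.
    p1 = np.array([50.0, 10.0, 20.0, 40.0])
    p2 = np.array([30.0, -5.0, 10.0, 25.0])
    print(boost_to_rest_frame(p1, p1 + p2))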
I describe a novel interactive virtual reality visualization of the Belle II detector at KEK and the animation therein of GEANT4-simulated event histories. Belle2VR runs on Oculus and Vive headsets (as well as in a web browser and on 2D computer screens, in the absence of a headset). A user with some particle-physics knowledge manipulates a gamepad or hand controller(s) to interact with and interrogate the detailed GEANT4 event history over time, to adjust the visibility and transparency of the detector subsystems, to translate freely in 3D, to zoom in or out, and to control the event-history timeline (scrub forward or backward, speed up or slow down). A non-expert uses the app - during public outreach events, for example - to explore the world of subatomic physics via electron-positron collision events in the Belle II experiment at the SuperKEKB colliding-beam facility at KEK in Japan. Multiple simultaneous users, wearing untethered locomotive VR backpacks and headsets, walk about a room containing the virtual model of the Belle II detector and each other's avatars as they observe and control the simulated event history. Developed at Virginia Tech by an interdisciplinary team of researchers in physics, education, and virtual environments, the simulation is intended to be integrated into the undergraduate physics curriculum. I describe the app, including visualization features and design decisions, and illustrate how a user interacts with its features to expose the underlying physics in each electron-positron collision event.
A high-precision calculation of the electron anomalous magnetic moment requires an evaluation of QED Feynman diagrams up to five independent loops. To make this calculation practically feasible it is necessary to remove all infrared and ultraviolet divergences before integration. A procedure for removing both infrared and ultraviolet divergences in each individual Feynman diagram will be presented. The procedure is based on linear operators that are applied to the Feynman amplitudes of ultraviolet-divergent subdiagrams. The use of linear operators allows us to avoid residual renormalizations after the subtraction of divergences. This procedure leads immediately to finite Feynman-parametric integrals. A method of Monte Carlo integration of these Feynman-parametric integrands will be presented. The method is based on importance sampling. The probability density function is constructed for each Feynman diagram individually, using combinatorial information from the diagram. The calculated value of the total contribution of the 5-loop QED Feynman diagrams without lepton loops to the electron anomalous magnetic moment will be presented. This result was obtained by a GPU-based computation on a supercomputer and provides an independent cross-check of the value. The contributions of nine gauge-invariant classes of 5-loop Feynman diagrams without lepton loops will be presented for the first time. The contributions of some individual 6-loop Feynman diagrams will also be given to demonstrate the method.
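Importance-sampling Monte Carlo integration, the general technique underlying the method, can be sketched in a few lines. The integrand and sampling density below are toy choices, not the Feynman-parametric integrands or the diagram-specific densities of the actual calculation.

    # Generic importance-sampling estimate of I = integral_0^1 f(x) dx,
    # sampling x from a density q that roughly follows the integrand shape.
    import numpy as np

    rng = np.random.default_rng(2)

    def f(x):                      # toy integrand, peaked near x = 0
        return 1.0 / np.sqrt(x + 1e-3)

    # Sample from q(x) = 2 (1 - x) on [0, 1] via inverse transform: x = 1 - sqrt(1 - u)
    u = rng.random(100000)
    x = 1.0 - np.sqrt(1.0 - u)
    q = 2.0 * (1.0 - x)

    weights = f(x) / q
    estimate = weights.mean()
    error = weights.std(ddof=1) / np.sqrt(len(weights))
    print(f"I is approximately {estimate:.4f} +/- {error:.4f}")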
Posters in this session can be seen in Room B throughout Wednesday and Thursday.
Women obtain more than half of U.S. undergraduate degrees in biology, chemistry, and mathematics, yet they earn less than 20% of computer science, engineering, and physics undergraduate degrees (NSF, 2014). Why are women represented in some STEM fields more than others? The STEM Paradox and the Gender Equality Paradox show that countries with greater gender equality have a lower percentage of female STEM graduates. This phenomenon as well as other factors explaining gender disparities in STEM participation will be discussed.
Modern electronic general-purpose computing has been on an unparalleled path of exponential acceleration for more than seven decades. From the 1970s onwards, this trend was driven by the success of integrated circuits based on silicon technology. The exponential growth has become a self-fulfilling (and economically driven) prophecy commonly referred to as Moore’s Law. The end of Moore’s Law has been augured many times before, but now the economic equation fueling it is increasingly broken, leading to actual technology delays. Ground-rule scaling of the underlying technology is expected to saturate in less than 10 years. If computational performance is to keep increasing beyond this horizon, alternative sources of advancement will have to be found. We will have to rely much more than before on software innovations, specialized chips and ultimately new computing paradigms. This talk will cover these challenges and discuss which role neuroscience may play in the search for novel computing paradigms, in particular neuromorphic computing.
The next generation of HPC and HTC facilities, such as Oak Ridge’s Summit, Lawrence Livermore’s Sierra, and NERSC's Perlmutter, show an increasing use of GPGPUs and other accelerators in order to achieve their high FLOP counts. This trend will only grow with exascale facilities such as A21. In general, High Energy Physics computing workflows have made little use of GPUs due to the relatively small fraction of kernels that run efficiently on GPUs, and the expense of rewriting code for rapidly evolving GPU hardware. However, the computing requirements for high-luminosity LHC are enormous, and it will become essential to be able to make use of supercomputing facilities that rely heavily on GPUs and other accelerator technologies.
ATLAS has already developed an extension to AthenaMT, its multithreaded event processing framework, that enables the non-intrusive offloading of computations to external accelerator resources, and has begun investigating strategies to schedule the offloading efficiently. The same applies to LHCb, which, while sharing the same underlying framework as ATLAS (Gaudi), has a considerably different workflow. CMS's framework, CMSSW, also has the ability to efficiently offload tasks to external accelerators. But before investing heavily in writing many kernels for specific offloading architectures, we need to better understand the performance metrics and throughput bounds of the workflows with various accelerator configurations. This can be done by simulating a diverse set of workflows, using real metrics for task interdependencies and timing, as we vary the fraction of offloaded tasks, latencies, data conversion speeds, memory bandwidths, and accelerator offloading parameters such as CPU/GPU ratios and speeds.
We present the results of these studies, performed on multiple workflows from ATLAS, LHCb and CMS, which will be instrumental in directing effort to make HEP frameworks, kernels and workflows run efficiently on exascale facilities.
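The flavour of such a study can be conveyed with a back-of-the-envelope, Amdahl-style model of per-event throughput versus offloaded fraction, accelerator speed-up and transfer latency. All numbers in the sketch below are illustrative placeholders, not the measured ATLAS, LHCb or CMS workflow metrics.

    # Toy throughput model for offloading a fraction of per-event work to an
    # accelerator (Amdahl-style; all numbers are illustrative placeholders).
    def events_per_second(cpu_time_per_event, offload_fraction,
                          accel_speedup, transfer_latency):
        cpu_part = (1.0 - offload_fraction) * cpu_time_per_event
        accel_part = offload_fraction * cpu_time_per_event / accel_speedup
        return 1.0 / (cpu_part + accel_part + transfer_latency)

    baseline = events_per_second(1.0, 0.0, 1.0, 0.0)
    for frac in (0.2, 0.5, 0.8):
        for latency in (0.0, 0.05, 0.2):          # seconds of transfer/conversion overhead
            rate = events_per_second(1.0, frac, accel_speedup=10.0,
                                     transfer_latency=latency)
            print(f"offload {frac:.0%}, latency {latency:4.2f}s: "
                  f"x{rate / baseline:.2f} speed-up")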
The ATLAS experiment at the Large Hadron Collider at CERN relies on a complex and highly distributed Trigger and Data Acquisition (TDAQ) system to gather and select particle collision data obtained at unprecedented energy and rates. The TDAQ Controls system is the component that guarantees the smooth and synchronous operations of all the TDAQ components and provides the means to minimize the downtime of the system caused by runtime failures.
Given the scale and complexity of the TDAQ system and the rates of data to be analysed, the automation of the system functionality in the areas of error detection and recovery is a strong requirement. That is why the Central Hint and Information Processor (CHIP) service was introduced in Run 2; it can truly be considered the “brain” of the TDAQ Controls system. CHIP is an intelligent system able to supervise the ATLAS data taking, take operational decisions and handle abnormal conditions. It is based on an open-source Complex Event Processing (CEP) engine, ESPER. Currently, CHIP’s knowledge base is made up of more than 300 rules organized in about 30 different contexts.
This paper will focus on the experience gained with CHIP during the whole LHC Run 2 period. Particular attention will be paid to demonstrate how the use of CHIP for automation and error recovery proved to be a valuable asset in optimizing the data taking efficiency, reducing operational mistakes, efficiently handling complex scenarios and improving the latency to react to abnormal situations. Additionally, the huge benefits brought by the CEP engine in terms of both flexibility and simplification of the knowledge base will be reported.
The ATLAS experiment at the LHC at CERN will move to use the Front-End Link eXchange (FELIX) system in a staged approach for LHC Run 3 (2021) and LHC Run 4 (2026). FELIX will act as the interface between the data acquisition; detector control and TTC (Timing, Trigger and Control) systems; and new or updated trigger and detector front-end electronics.
FELIX functions as a router between custom serial links from front-end ASICs and FPGAs and the data collection and processing components, via a commodity switched network. Links may aggregate many slower links or be a single high-bandwidth link. FELIX also forwards the LHC bunch-crossing clock, fixed-latency trigger accepts and resets received from the TTC system to the front-end electronics.
The FELIX system uses commodity server technology in combination with FPGA-based PCIe I/O cards. The FELIX servers run a software routing platform serving data to network clients. Commodity servers connected to FELIX systems via the same network run innovative multi-threaded software for event fragment building, processing, buffering and forwarding.
This presentation will describe the design and status of the FELIX based readout for the Run 3 upgrade, during which a subset of the detector will be migrated. It will also show how the same concept has been successfully introduced into the demonstrator test bench of the ATLAS Pixel Inner Tracker, acting as a proof of concept towards the longer term Run 4 upgrade in which all remaining detectors will adopt a FELIX based readout.
The ATLAS production system, ProdSys2, is used during Run 2 to define and organize workflows and to schedule, submit and execute payloads in a distributed computing infrastructure. ProdSys2 is designed to manage all ATLAS workflows: data (re)processing, MC simulation, physics-group analysis object production, High Level Trigger processing, software release building and user analysis. It simplifies the life of ATLAS scientists by offering a web-based user interface with rich options, providing a user-friendly environment for workflow management, such as a simple way of combining different data flows, and real-time monitoring optimised for presenting very large amounts of information. We present an overview of the ATLAS Production System's technical implementation: job and task definitions, the workflow manager and the web-based user interface. We show how it interfaces to different computing resources, for instance HPC systems and clouds, and how the Production System interacts with the ATLAS data management system. We describe important technical design decisions, work experience acquired during LHC Run 2, and how the Production System will evolve to be used in the heterogeneous computing environment foreseen for Run 3 and the High-Luminosity LHC.
The Level-0 Muon Trigger system of the ATLAS experiment will undergo a full upgrade for the HL-LHC to withstand the challenging performance requirements imposed by the increasing instantaneous luminosity. The upgraded trigger system foresees sending RPC raw hit data to off-detector trigger processors, where the trigger algorithms run on a new generation of Field-Programmable Gate Arrays (FPGAs). The FPGA represents an optimal solution in this context because of its flexibility, wide availability of logical resources and high processing speed. Studies and simulations of different trigger algorithms have been performed, and novel low-precision deep neural network architectures (based on ternary dense and convolutional networks), optimized to run on FPGAs and to cope with sparse data, are presented. Physics performance, in terms of efficiency and fake rates, as well as the FPGA logic resource occupancy and timing obtained with the developed algorithms, is presented.
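One ingredient mentioned above, ternary weights, can be illustrated generically: quantize each weight to {-1, 0, +1} with a magnitude threshold and an optional per-layer scale. The 0.7·mean(|w|) threshold below follows a common ternary-weight-network recipe and is an assumption, not necessarily the scheme used in this trigger work.

    # Generic ternary quantization of a weight matrix to {-1, 0, +1}.
    # The 0.7*mean(|w|) threshold is a common ternary-weight recipe and an
    # assumption here; the actual trigger firmware may use a different scheme.
    import numpy as np

    def ternarize(weights):
        delta = 0.7 * np.mean(np.abs(weights))        # per-layer threshold
        quantized = np.zeros_like(weights)
        quantized[weights > delta] = 1.0
        quantized[weights < -delta] = -1.0
        # optional common scaling factor: mean magnitude of the kept weights
        kept = np.abs(weights) > delta
        scale = np.abs(weights[kept]).mean() if kept.any() else 1.0
        return quantized, scale

    w = np.random.randn(16, 8)
    w_ternary, alpha = ternarize(w)
    print(np.unique(w_ternary), alpha)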
The CMS experiment has been designed with a two-level trigger system: the Level 1 Trigger, implemented on custom-designed electronics, and the High Level Trigger (HLT), a streamlined version of the CMS offline reconstruction software running on a computer farm. A software trigger system requires a trade-off between the complexity of the algorithms running on the available computing resources, the sustainable output rate, and the selection efficiency.
During its “Phase 2” the LHC will reach a luminosity of 7×10³⁴ cm⁻²s⁻¹ with a pileup of 200 collisions. To fully exploit the higher luminosity, the CMS experiment will increase the full readout rate from 100 kHz to 750 kHz. The higher luminosity, pileup and input rate present an unprecedented challenge to the HLT, which will require at least a factor of 20 more processing power than today. This far exceeds the expected increase in processing power of conventional CPUs, demanding an alternative approach.
Industry and HPC have been successfully using heterogeneous computing platforms, which can achieve higher throughput and better energy efficiency by matching each job to the most appropriate architecture.
The reliable use of a heterogeneous platform at the HLT since the beginning of Phase 2 requires the careful assessment of its performance and characteristics, which can only be attained by running a prototype in production already during Run 3. The integration of heterogeneous computing in the CMS reconstruction software depends upon improvements to its framework and scheduling, together with a tailoring of the reconstruction algorithms to the different architectures.
This R&D work began in 2017, and by the end of 2018 produced a demonstrator working in realistic conditions. This presentation will describe the results of the development and the characteristics of the system, along with its future perspectives.
Multivariate analyses in particle physics often reach a precision such that their uncertainties are dominated by systematic effects. While there are known strategies to mitigate systematic effects based on adversarial neural nets, applications of Boosted Decision Trees (BDTs) have so far had to ignore systematics during training.
We present a method to incorporate systematic uncertainties into a BDT, the "systematics-aware BDT" (saBDT).
We evaluate our method on open data of the ATLAS Higgs to tau tau machine learning challenge and compare our results to neural nets trained with an adversary to mitigate systematic effects.
Analyses in high-energy physics usually deal with data samples populated by several sources. One of the most widely used ways to handle this is the sPlot technique, in which the results of a maximum-likelihood fit are used to assign per-event weights that disentangle signal from background. Some events are assigned negative weights, which makes it difficult to apply machine learning methods: the loss function becomes unbounded and the underlying optimization problem non-convex. In this contribution, we propose a mathematically rigorous way to apply machine learning methods to data with weights obtained from the sPlot technique. Examples of applications are also shown.
Variable-dependent scale factors are commonly used in HEP to improve shape agreement of data and simulation. The choice of the underlying model is of great importance, but often requires a lot of manual tuning e.g. of bin sizes or fitted functions. This can be alleviated through the use of neural networks and their inherent powerful data modeling capabilities.
We present a novel and generalized method for producing scale factors using an adversarial neural network. This method is investigated in the context of the bottom-quark jet-tagging algorithms within the CMS experiment. The primary network uses the jet variables as inputs to derive the scale factor for a single jet. It is trained through the use of a second network, the adversary, which aims to differentiate between the data and rescaled simulation.
Complex computer simulations are commonly required for accurate data modelling in many scientific disciplines, including experimental High Energy Physics, making statistical inference challenging due to the intractability of the likelihood evaluation for the observed data. Furthermore, one is sometimes interested in inference on a subset of the generative model parameters while taking into account model uncertainty or misspecification through the remaining nuisance parameters. In this work, we show how non-linear summary statistics can be constructed by minimising inference-motivated losses via stochastic gradient descent, such that they provide the smallest uncertainty on the parameters of interest. As a use case, the problem of confidence-interval estimation for the mixture coefficient in a multi-dimensional two-component mixture model (i.e. signal vs background) is considered, where the proposed technique clearly outperforms summary statistics based on probabilistic classification, which are a commonly used alternative but do not account for the presence of nuisance parameters.
A large number of physics processes as seen by ATLAS at the LHC manifest as collimated, hadronic sprays of particles known as ‘jets’. Jets originating from the hadronic decay of a massive particle are commonly used both in measurements of the Standard Model and in searches for new physics. The ATLAS experiment has applied machine learning discriminants to the challenging task of identifying the origin of a given jet, but such multivariate classifiers exhibit strong non-linear correlations with the invariant mass of the jet, complicating many analyses that wish to make use of the mass spectrum. Adversarially trained neural networks (ANNs) are presented as a way to construct mass-decorrelated jet classifiers by jointly training two networks in a domain-adversarial fashion. The use of neural networks further allows this method to benefit from high-performance computing platforms for fast development. A comprehensive study of different mass-decorrelation techniques is performed on ATLAS simulated datasets, comparing ANNs to designed decorrelated taggers (DDT), fixed-efficiency k-NN regression, convolved substructure (CSS), and adaptive boosting for uniform efficiency (uBoost). Performance is evaluated using metrics for background jet rejection and mass-decorrelation.
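The joint, domain-adversarial training can be sketched schematically: a classifier is penalized whenever an adversary succeeds in regressing the jet mass from the classifier output. The PyTorch sketch below uses toy data, toy architectures and an arbitrary trade-off parameter lam; it is not the ATLAS implementation.

    # Schematic adversarial decorrelation: classifier clf vs. adversary adv that
    # tries to regress the jet mass from clf's output. Toy data, not ATLAS code.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    n, n_features = 2048, 10
    x = torch.randn(n, n_features)                 # jet substructure variables (toy)
    y = torch.randint(0, 2, (n, 1)).float()        # signal/background labels (toy)
    mass = torch.rand(n, 1)                        # normalized jet mass (toy)

    clf = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
    adv = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

    opt_clf = torch.optim.Adam(clf.parameters(), lr=1e-3)
    opt_adv = torch.optim.Adam(adv.parameters(), lr=1e-3)
    bce, mse = nn.BCELoss(), nn.MSELoss()
    lam = 10.0                                     # trade-off between accuracy and decorrelation

    for step in range(200):
        # 1) update the adversary to predict mass from the (detached) classifier output
        opt_adv.zero_grad()
        adv_loss = mse(adv(clf(x).detach()), mass)
        adv_loss.backward()
        opt_adv.step()

        # 2) update the classifier: good classification, poor mass prediction by the adversary
        opt_clf.zero_grad()
        out = clf(x)
        clf_loss = bce(out, y) - lam * mse(adv(out), mass)
        clf_loss.backward()
        opt_clf.step()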
In radio-based physics experiments, sensitive analysis techniques are often required to extract signals at or below the level of noise. For a recent experiment at the SLAC National Accelerator Laboratory to test a radar-based detection scheme for high energy neutrino cascades, such a sensitive analysis was employed to dig down into a spurious background and extract a signal. This analysis employed singular-value decomposition (SVD) to decompose the data into a basis of patterns constructed from the data itself. Expansion of data in a decomposition basis allows for the extraction, or filtration, of patterns which may be unavailable to other analysis techniques. In this talk we briefly present the results of this analysis in the context of experiment T-576 at SLAC, and detail the analysis method which was used to extract a hint of a radar signal at a significance of 2.3$\sigma$.
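The SVD-based filtering step can be illustrated generically: decompose a matrix of recorded waveforms, subtract the leading (background-like) patterns, and inspect the residual. The numpy sketch below uses toy waveforms and an arbitrary choice of how many patterns to remove; it is not the T-576 analysis code.

    # Generic SVD pattern filtering on a matrix of waveforms (rows = events,
    # columns = time samples); toy data, not the T-576 analysis.
    import numpy as np

    rng = np.random.default_rng(3)
    n_events, n_samples = 200, 512
    t = np.arange(n_samples)

    background = np.outer(rng.normal(1.0, 0.1, n_events), np.sin(2 * np.pi * t / 64.0))
    noise = rng.normal(0.0, 0.3, (n_events, n_samples))
    signal = np.zeros((n_events, n_samples))
    signal[100, 250:260] = 2.0                      # a small buried "signal"
    data = background + noise + signal

    U, s, Vt = np.linalg.svd(data, full_matrices=False)
    k = 1                                           # number of leading patterns to remove
    filtered = data - (U[:, :k] * s[:k]) @ Vt[:k]   # subtract the dominant pattern(s)

    print("max |filtered| in the signal event:", np.abs(filtered[100]).max())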
While the Higgs boson couplings to other particles are increasingly well-measured by LHC experiments, it has proven difficult to set constraints on the Higgs trilinear self-coupling $\lambda$, principally due to the very low cross-section of Higgs boson pair production. We present the results of NLO QCD corrections to Higgs pair production with full top-quark mass dependence, where the fixed-order computation up to two loops has been performed numerically, using both CPUs and GPUs. It is supplemented by parton showering within the $\texttt{POWHEG}$ event generator framework. We use the interface between the $\texttt{POWHEG-BOX-V2}$ program and both $\texttt{Pythia8}$ and $\texttt{Herwig7}$ parton showers to generate differential distributions for various values of the trilinear self-coupling $\lambda$ that are still allowed by the current experimental constraints.
In this talk, we consider some of the computational aspects encountered in recent computations of double Higgs boson production in gluon fusion. We consider the NLO virtual amplitude in the high-energy limit, and the NNLO virtual amplitude in the low-energy (or large top quark mass) limit. We discuss various optimizations which were necessary to produce our results.
We present an algorithm which allows one to solve analytically linear systems of differential equations that factorize to first order. The solution is given in terms of iterated integrals over an alphabet whose structure is implied by the coefficient matrix of the differential equations. Such systems appear in a large variety of higher-order calculations in perturbative Quantum Field Theories. As an illustration, we apply this method to calculate the master integrals of the three-loop massive form factors for different currents, and present the results for all the form factors in detail. The emerging solution space is given by the cyclotomic harmonic polylogarithms and their associated special constants. No special basis representation of the master integrals is needed. The algorithm can also be applied to more general cases factorizing at first order, which are based on more general alphabets, iterated integrals and associated constants.
In this contribution I will discuss the practicalities of storing events from an NNLO calculation on disk with a view to "replaying" the simulation for a different analysis and under different conditions, such as a different PDF fit or a different scale setting.
We present the HepMC3 library, designed to perform manipulations with event records of High Energy Physics Monte Carlo Event Generators (MCEGs). The library is the natural successor of the HepMC and HepMC2 libraries used at present and in the past. HepMC3 supports all the functionality of the previous versions and significantly extends it. In comparison to the previous versions, the default event record has been simplified, while an option to add arbitrary information to the event record has been implemented. Particles and vertices are stored separately in an ordered graph structure, reflecting the evolution of a physics event and enabling the use of sophisticated algorithms for event record analysis.
The I/O functionality of the library has been extended to support common input and output formats of HEP MCEGs, including the formats used in Fortran HEP MCEGs, the formats used by the HepMC2 library, and ROOT. The library also allows users to implement customized input or output formats. The library is already supported by popular modern MCEGs (e.g. Sherpa and Pythia8) and can replace the older HepMC versions in many others.
Posters in this session can be seen in Room B throughout Wednesday and Thursday.
The next LHC runs, nominally Run 3 and Run 4, pose problems for the offline and computing systems of CMS. Run 4 in particular will need completely different solutions, given the current estimates of the LHC conditions and trigger rates. We report on the R&D process that CMS as a whole has established in order to gain insight into the needs and the possible solutions for CMS computing in 2020 and beyond.
Certifying that the data recorded by the Compact Muon Solenoid (CMS) experiment at CERN are usable for publication of physics results is a crucial and onerous task. Anomalies caused by detector malfunctioning or sub-optimal data processing are difficult to enumerate a priori and occur rarely, making it difficult to use classical supervised classification. We base our prototype for the automation of this procedure on a semi-supervised approach using deep autoencoders. We demonstrate the ability of the model to detect anomalies with high accuracy when compared against the outcome of fully supervised methods. We show that the model provides good interpretability, ascribing the origin of problems in the data to a specific sub-detector or physics object. Finally, we tailor the approach with a systematic method for feature filtering and address the issue of feature dependency on the LHC beam intensity.
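A minimal version of the semi-supervised idea, an autoencoder trained only on good data with an anomaly score given by the reconstruction error, can be sketched as follows. The features, architecture and threshold are toy choices, not the CMS certification model.

    # Minimal autoencoder anomaly detector: train on "good" data only and flag
    # inputs with large reconstruction error. Toy data, not the CMS prototype.
    import numpy as np
    from tensorflow.keras import layers, models

    rng = np.random.default_rng(4)
    good = rng.normal(0.0, 1.0, size=(5000, 20)).astype("float32")   # certified-like features (toy)

    autoencoder = models.Sequential([
        layers.Dense(8, activation="relu", input_shape=(20,)),
        layers.Dense(3, activation="relu"),          # bottleneck
        layers.Dense(8, activation="relu"),
        layers.Dense(20, activation="linear"),
    ])
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(good, good, epochs=10, batch_size=128, verbose=0)

    def anomaly_score(x):
        reco = autoencoder.predict(x, verbose=0)
        return np.mean((x - reco) ** 2, axis=1)      # per-input reconstruction error

    threshold = np.quantile(anomaly_score(good), 0.99)
    suspect = rng.normal(3.0, 1.0, size=(10, 20)).astype("float32")  # shifted, "bad" data (toy)
    print(anomaly_score(suspect) > threshold)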
The hardware L0 trigger will be removed in LHCb Upgrade I, and the software High Level Trigger will have to process events at the full LHC collision rate (30 MHz). This is a huge task, and delegating some low-level, time-consuming tasks to FPGA accelerators can be very helpful in saving computing time that can be more usefully devoted to higher-level tasks. In particular, the 2D pixel geometry of the new LHCb VELO detector makes the cluster-finding process a particularly CPU-intensive task. We present here the first results achieved with a highly parallel clustering algorithm implemented in dedicated FPGA cards, developed in an R&D programme in the context of LHCb Upgrade I, in view of potential future applications.
ROOT has several features which interact with libraries and require implicit header inclusion. This can be triggered by reading or writing data on disk, or user actions at the prompt. Often, the headers are immutable and reparsing is redundant. C++ Modules are designed to minimize the reparsing of the same header content by providing an efficient on-disk representation of C++ Code. ROOT has released a C++ Modules-aware technology preview which intends to become the default for the next release.
In this contribution, we summarize our experience migrating the ROOT codebase to C++ Modules. We outline the challenges of migrating the CMS software stack to C++ Modules, including the integration of module support into the build system while providing better functionality and correctness. We also give insight into the continuous process of addressing performance bottlenecks for C++ Modules and evaluate the performance benefits that experiments are expected to achieve.
The Gambit collaboration is a new effort in the world of global BSM fitting -- the combination of the largest possible set of observational data from across particle, astro-, and nuclear physics to gain a synoptic view of what experimental data has to say about models of new physics. Using a newly constructed, open-source code framework, Gambit has released several state-of-the-art scans of large BSM-model parameter spaces, which have revealed structures masked by the Simplified Model approach that dominates LHC collaborations' in-house data interpretations. I will present the publicly available Gambit framework for marshalling physics calculations and assembling composite likelihoods -- including its use of OpenMP and MPI parallelisation, and novel scanning algorithms -- as well as headline results from Gambit's programme of BSM data recasting.
Data analyses based on forward simulation often require the use of a machine learning model for statistical inference of the parameters of interest. Most of the time these learned models are trained to discriminate events between background and signal to produce a 1D score, which is used to select a relatively pure signal region. The training of the model does not take into account the final objective, which is to estimate the values of the parameters of interest. Those measurements also depend on other parameters, denoted nuisance parameters, which induce systematic errors on the estimated values. We propose to explore learning methods that directly minimize the measurement error (both statistical and systematic) on a realistic case from HEP.
A common goal in the search for new physics is the determination of sets of New Physics models, typically parametrized by a number of parameters such as masses or couplings, that are either compatible with the observed data or excluded by it, where determining into which category a given model belongs requires expensive computation of the expected signal. This problem may be abstracted into the generalized problem of finding excursion sets (or, equivalently, iso-surfaces) of scalar, multivariate functions in $n$ dimensions.
We present an iterative algorithm for choosing points within the problem domain at which the functions are evaluated in order to estimate such sets at a significantly lower computational cost. The algorithm implements a Bayesian Optimization procedure, in which an information-based acquisition function seeks to maximally reduce the uncertainty on an excursion set. Further extensions of the basic algorithm to the simultaneous estimation of excursion sets of multiple functions, as well as to the batched selection of multiple points, are presented.
Finally, the Python package excursion [1], which implements the algorithm, is presented, together with performance benchmarks comparing this active-learning approach to other strategies commonly used in the high energy physics context, such as random sampling and grid searches.
[1] https://github.com/diana-hep/excursion
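A much-simplified, one-dimensional version of the idea can be sketched with a Gaussian process and a straddle-style acquisition that samples where the threshold crossing is most ambiguous. This is a generic illustration and does not use the excursion package or its acquisition functions; the function, threshold and kernel below are arbitrary.

    # Simplified 1D active learning of an excursion set f(x) > threshold with a
    # Gaussian process; illustrative only, not the excursion package's algorithm.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def expensive_f(x):                      # stand-in for a costly signal computation
        return np.sin(3.0 * x) + 0.5 * x

    threshold = 0.5
    grid = np.linspace(0.0, 5.0, 400).reshape(-1, 1)
    X = np.array([[0.5], [2.5], [4.5]])      # initial design points
    y = expensive_f(X).ravel()

    for iteration in range(15):
        gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-6)
        gp.fit(X, y)
        mean, std = gp.predict(grid, return_std=True)
        # "straddle" heuristic: large uncertainty close to the threshold crossing
        acquisition = 1.96 * std - np.abs(mean - threshold)
        x_next = grid[np.argmax(acquisition)].reshape(1, -1)
        X = np.vstack([X, x_next])
        y = np.append(y, expensive_f(x_next).ravel())

    gp.fit(X, y)
    print("function evaluations used:", len(X))
    print("grid points estimated above threshold:", int((gp.predict(grid) > threshold).sum()))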
The Belle II experiment is an e+e- collider experiment in Japan, which
begins its main physics run in early 2019. The clean environment of e+e-
collisions together with the unique event topology of Belle II, in which
an Υ(4S) particle is produced and subsequently decays to a pair of B
mesons, allows a wide range of physics measurements to be performed
which are difficult or impossible at hadron colliders. A critical
technique for many of these measurements is tag-side B meson
reconstruction, in which one B meson in the event is reconstructed. The
Full Event Interpretation is an algorithm which reconstructs tag-side B
mesons at Belle II. The algorithm trains multivariate classifiers to
classify O(100) unique decay channels, which allows it in turn to
reconstruct O(10000) decay chains. This talk presents the algorithm and
its performance relative to previous tag-side B meson reconstruction
algorithms.
I briefly review the recently finished 5-loop renormalization program of QCD, and explain the status and prospects of the computer-algebraic techniques involved.
We propose an algorithm to find a solution to an integro-differential equation of the DGLAP type to all orders in the running coupling α, with splitting functions given at a fixed order in α. Complex analysis is used extensively in the construction of the algorithm; we found a way to calculate the involved contour integrals in the complex plane in a simpler way than by any of the presently known methods. We then wrote a Mathematica code based on the proposed algorithm. We apply the algorithm and code to the DGLAP equation for the singlet parton distributions of QCD and compare our solution with the results that may be obtained using existing numerical or symbolic software tools; in this talk, for example, we compare it with the results obtained for the singlet parton distribution functions using QCDNUM.
Please note that the Conference dinner is not included in the registration fee. Please book your ticket beforehand; the cost is 100 CHF.
An important part of the LHC legacy will be precise limits on indirect effects of new physics, framed for instance in terms of an effective field theory. These measurements often involve many theory parameters and observables, which makes them challenging for traditional analysis methods. We discuss the underlying problem of “likelihood-free” inference and present powerful new analysis techniques that combine physics insights, statistical methods, and the power of machine learning. We have developed MadMiner, a new Python package that makes it straightforward to apply these techniques. In example LHC problems we show that the new approach lets us put stronger constraints on theory parameters than established methods, demonstrating its potential to improve the new physics reach of the LHC legacy measurements. While we present techniques optimized for particle physics, the likelihood-free inference formulation is much more general, and these ideas are part of a broader movement that is changing scientific inference in fields as diverse as cosmology, genetics, and epidemiology.
The HEP software ecosystem faces new challenges in 2020 with the approach of the High Luminosity LHC (HL-LHC) and the turn-on of a number of large new experiments. Current software development is organized around the experiments: No other field has attained this level of self-organization and collaboration in software development.
During 2017 the community produced a roadmap for the software R&D needed to address the software challenges of the 2020’s, with a focus on the HL-LHC. Members of the community that produced over 20 papers and strategic roadmaps included individual researchers, the US and European labs, LHC-based experiments, non-collider experiments, and members of industry. The field’s organization was apparent during this process.
HEP must now build on its past successes to address the challenges in front of it. Both technical solutions and new models of collaboration will be required. Funding agencies are ready to make new investments, such as the recently US NSF-funded “Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP)” and recent calls for funding in Europe. In this talk, I will discuss the process by which we have gotten to this point and possible ways forward.
Posters in this session can be seen in Room B throughout Wednesday and Thursday.
The HL-LHC will see ATLAS and CMS record proton-bunch collisions with track multiplicities of up to 10,000 charged tracks per event. Algorithms need to be developed to harness the increased combinatorial complexity. To engage the computer science community to contribute new ideas, we have organized a Tracking Machine Learning challenge (TrackML). Participants are provided with events containing 100k 3D points and are asked to group the points into tracks; they are also given a 100 GB training dataset including the ground truth. The challenge is run in two phases. The first, "Accuracy", phase ran on the Kaggle platform from May to August 2018; algorithms were judged only on a score related to the fraction of correctly assigned hits. The second, "Throughput", phase runs from September 2018 to March 2019 on Codalab and requires code submission; algorithms are ranked by combining accuracy and speed. The first phase saw 653 participants, with top performers using innovative approaches. The second phase will finish at the time of ACAT. The talk will report on the first lessons from the challenge.
The anomalous magnetic moment of the electron $a_e$ and that of the muon $a_\mu$ occupy special positions in precision tests of the Standard Model of elementary particles. Both have been precisely measured, to 0.24 ppb for $a_e$ and 0.5 ppm for $a_\mu$, and new experiments on both $a_e$ and $a_\mu$ are ongoing, aiming to reduce the uncertainties. Theoretical calculations of $a_e$ and $a_\mu$ starting from the Lagrangian of the Standard Model can also reach the precision of the future measurements. However, to do so, we need to carry out the five-loop QED calculation without any approximation. I will overview the computation method invented by T. Kinoshita in the 1960s that enables us to numerically calculate the entire five-loop QED contribution to the lepton anomalous magnetic moment. I also discuss the current status of the precision tests of the lepton anomalies and the fine-structure constant $\alpha$.
In particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo production. However, physicists performing data analyses are usually required to steer their individual workflows manually which is time-consuming and often leads to undocumented relations between particular workloads.
We present the luigi analysis workflow (law) Python package which is based on the open-source pipelining tool luigi, originally developed by Spotify. It establishes a generic design pattern for analyses of arbitrary scale and complexity, and shifts the focus from executing to defining the analysis logic. Law provides the building blocks to seamlessly integrate with interchangeable remote resources without, however, limiting itself to a specific choice of infrastructure. In particular, it introduces the paradigm of complete separation between analysis algorithms on the one hand, and run locations, storage locations, and software environments on the other hand.
To cope with the sophisticated demands of end-to-end HEP analyses, law supports job execution on WLCG infrastructure (ARC, gLite) as well as on local computing clusters (HTCondor, LSF), remote file access via most common protocols through the Grid File Access Library (GFAL2), and an environment sandboxing mechanism with support for Docker and Singularity containers. Moreover, the novel approach ultimately aims for analysis preservation out-of-the-box.
Law is developed open-source and entirely experiment independent. It is successfully used in ttH cross section measurements and searches for di-Higgs boson production with the CMS experiment.
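The underlying luigi pattern that law generalizes can be illustrated with a minimal two-task pipeline: tasks declare their requirements and outputs and the scheduler resolves the graph. The sketch below uses plain luigi with made-up task names and targets; law adds the remote run locations, storage abstractions and sandboxing described above.

    # Minimal luigi pipeline illustrating the pattern law generalizes:
    # tasks declare their requirements and outputs, luigi resolves the graph.
    import json
    import luigi

    class Selection(luigi.Task):
        dataset = luigi.Parameter()

        def output(self):
            return luigi.LocalTarget(f"selection_{self.dataset}.json")

        def run(self):
            events = [{"dataset": self.dataset, "pt": 42.0}]   # stand-in for a real selection
            with self.output().open("w") as f:
                json.dump(events, f)

    class Histogram(luigi.Task):
        dataset = luigi.Parameter()

        def requires(self):
            return Selection(dataset=self.dataset)

        def output(self):
            return luigi.LocalTarget(f"histogram_{self.dataset}.json")

        def run(self):
            with self.input().open() as f:
                events = json.load(f)
            with self.output().open("w") as f:
                json.dump({"n_events": len(events)}, f)

    if __name__ == "__main__":
        luigi.build([Histogram(dataset="ttbar")], local_scheduler=True)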
RooFit is the statistical modeling and fitting package used in many big particle physics experiments to extract physical parameters from reduced particle collision data, e.g. the Higgs boson experiments at the LHC.
RooFit aims to separate particle physics model building and fitting (the users' goals) from their technical implementation and optimization in the back-end.
In this talk, we outline our efforts to further optimize the back-end by automatically running major parts of user models in parallel on multi-core machines.
A major challenge is that RooFit allows users to define many different types of models, with different types of computational bottlenecks.
Our automatic parallelization framework must then be flexible, while still reducing run-time by at least an order of magnitude, preferably more.
We have performed extensive benchmarks and identified at least three bottlenecks that will benefit from parallelization.
To tackle these and possible future bottlenecks, we designed a parallelization layer that allows us to parallelize existing classes with minimal effort, but with high performance and retaining as much of the existing class's interface as possible.
The high-level parallelization model is a task-stealing approach.
Our multi-process approach uses socket-based communication, originally implemented using a custom-built, highly performant bi-directional memory-mapped pipe; we are currently considering switching to ZeroMQ for more flexibility.
Preliminary results show speed-ups of a factor of 2 to 20, depending on the exact model and parallelization strategy.
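The core idea of the multi-process approach, splitting the per-event sum of a negative log-likelihood across workers, can be sketched with Python's multiprocessing module. This is a generic illustration with a toy Gaussian likelihood (constant terms dropped), not RooFit's task-stealing layer or its communication machinery.

    # Generic sketch: evaluate a negative log-likelihood by splitting the event
    # sum across processes. Not RooFit's implementation, just the core idea.
    import numpy as np
    from multiprocessing import Pool

    DATA = np.random.default_rng(5).normal(1.0, 2.0, size=100_000)

    def partial_nll(args):
        chunk, mu, sigma = args
        # Gaussian -log L contribution of one chunk of events (constant term dropped)
        return np.sum(0.5 * ((chunk - mu) / sigma) ** 2 + np.log(sigma))

    def nll(mu, sigma, n_workers=4):
        chunks = np.array_split(DATA, n_workers)
        with Pool(n_workers) as pool:
            parts = pool.map(partial_nll, [(c, mu, sigma) for c in chunks])
        return sum(parts)

    if __name__ == "__main__":
        print(nll(1.0, 2.0))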
The LHCb detector will be upgraded in 2021, and due to the removal of the hardware-level trigger and the increase in the luminosity of the collisions, the conditions for a software High Level Trigger 1 will become more challenging, requiring the processing of the full 30 MHz collision rate. The GPU High Level Trigger 1 is a framework that permits concurrent many-event execution targeting many-core architectures. It is designed to hide data-transmission overhead with a custom memory manager and to maximize GPU resource usage employing a static scheduler. We present the core infrastructure of this R&D project on many-core architectures, developed in the context of LHCb Upgrade I. We discuss the design aspects driving it, present algorithm-specific data layout designs and evaluate their impact on performance.
Artificial neural networks are becoming a standard tool for data analysis, but their potential has yet to be widely exploited for hardware-level trigger applications. Nowadays, high-end FPGAs, as they are often used in low-level hardware triggers, offer enough performance to allow for the inclusion of networks of considerable size into these systems for the first time. Nevertheless, in the trigger context, it is necessary to highly optimize the implementation of neural networks to make full use of the FPGA capabilities.
We implemented the data processing and control flow of typical NN layers, taking into account incoming data rates of up to multiple tens of MHz and sub-microsecond latency limits, while also aiming at an efficient use of the FPGA resources. This resulted in a highly optimized neural network implementation framework, which typically reaches 90 to 100 % computational efficiency, requires few extra FPGA resources for data flow and control, and achieves latencies on the order of only tens to a few hundred nanoseconds for entire (deep) networks. The implemented layers include 2D convolutions and pooling (both with multi-channel support), as well as dense layers, all of which play a role in many physics- and detector-related applications. Significant effort was put especially into the 2D convolutional layers, to achieve a fast implementation with minimal resource usage.
A toolkit is provided which automatically creates the optimized FPGA implementation of trained deep neural network models. Results are presented, both for individual layers as well as entire networks created by the toolkit.
Nested data structures are critical for particle physics: it would be impossible to represent collision data as events containing arbitrarily many particles in a rectangular table (without padding or truncation, or without relational indirection). These data structures are usually constructed as class objects and arbitrary length sequences, such as vectors in C++ and lists in Python, and data analysis logic is expressed in imperative loops and conditionals. However, code expressed this way can thwart auto-vectorization in C++ and Numpy optimization in Python, and may be too explicit to automatically parallelize. We present an extension of the "array programming" model of APL, R, MATLAB, and Numpy, which expresses regular operations on large arrays in a concise syntax. Ordinarily, array programming only applies to flat arrays and rectangular tables, but we show that it can be extended to collections of arbitrary length lists ("jagged arrays"), nested records, polymorphic unions, and pointers. We have implemented such a library in Python called awkward-array, and we will show how it can be used to fit particle physics data into systems designed for Numpy data, such as Pandas (for analysis organization), Numba (for just-in-time compilation), Dask (for parallel processing), and CuPy (array programming on the GPU). We will also show how a proper set of primitives enables non-trivial analyses, such as combinatorial searches for particle candidates, in SIMD environments.
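The core layout can be shown with plain numpy: a jagged collection is stored as one flat content array plus offsets marking event boundaries, so per-event operations become vectorized array operations. This is a conceptual sketch only; the awkward-array library provides this structure, and far more, as a ready-made type.

    # Conceptual jagged-array layout: flat content + per-event offsets.
    # The awkward-array library implements this pattern (and much more) for real.
    import numpy as np

    # Three events with 2, 0 and 3 particles respectively.
    pt = np.array([32.1, 18.4, 51.0, 10.2, 7.7])        # flat content
    offsets = np.array([0, 2, 2, 5])                     # event boundaries

    counts = np.diff(offsets)                            # particles per event: [2, 0, 3]

    # Vectorized per-event reduction (e.g. scalar sum of pt in each event)
    event_index = np.repeat(np.arange(len(counts)), counts)
    sum_pt = np.bincount(event_index, weights=pt, minlength=len(counts))

    # Vectorized selection: keep particles with pt > 15 and rebuild the offsets
    mask = pt > 15.0
    new_counts = np.bincount(event_index[mask], minlength=len(counts))
    new_offsets = np.concatenate([[0], np.cumsum(new_counts)])
    print(counts, sum_pt, new_offsets)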
Efficient random number generation with high-quality statistical properties and exact reproducibility of Monte Carlo simulation are important requirements in many areas of computational science. VecRNG is a package providing pseudo-random number generation (pRNG) in the context of a new library, VecMath. This library bundles several general-purpose mathematical utilities, data structures and algorithms with both SIMD and SIMT (GPU) support, based on VecCore. Several state-of-the-art RNG algorithms are implemented as kernels supporting parallel generation of random numbers in scalar, vector and CUDA workflows. In this report, we present design considerations, implementation details and the computing performance of parallel pRNG engines on both CPU and GPU. Reproducibility when propagating multiple particles in parallel in HEP event simulation is demonstrated, using GeantV-based examples, for both sequential and fine-grained track-level concurrent simulation workflows. Strategies for the efficient use of vectorized pRNG and non-overlapping streams of random number sequences in concurrent computing environments will also be discussed.
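The non-overlapping-streams requirement can be illustrated with numpy's SeedSequence spawning, a host-side analogue in which each worker or track receives an independent, reproducible stream. This is a generic illustration, not the VecRNG/VecMath implementation.

    # Reproducible, independent random streams per worker/track using numpy's
    # SeedSequence spawning; a generic analogue of the non-overlapping-streams
    # requirement, not the VecRNG implementation itself.
    import numpy as np

    root_seed = np.random.SeedSequence(12345)
    children = root_seed.spawn(4)                        # one child sequence per worker
    streams = [np.random.Generator(np.random.Philox(c)) for c in children]

    # Each stream is independent and reproducible: rebuilding from the same
    # root seed gives bit-identical sequences regardless of execution order.
    samples = [g.normal(size=3) for g in streams]

    rebuilt = [np.random.Generator(np.random.Philox(c))
               for c in np.random.SeedSequence(12345).spawn(4)]
    assert all(np.array_equal(a.normal(size=3), b)
               for a, b in zip(rebuilt, samples))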
We investigate the problem of dark matter detection in an emulsion detector. We have previously shown that it is very challenging, but possible, to use the emulsion films of an OPERA-like detector in the SHiP experiment to separate electromagnetic showers from each other, thus hypothetically separating neutrino events from dark matter. In this study, we investigate the possibility of using the Target Tracker (TT) stations of the OPERA-like SHiP detector to identify the energy and position of the initial particle. The idea of such a search is that, unlike the emulsion, the TT stations are online detectors, benefiting from zero event pile-up. Thus, online observation of an excess of events with the proper energy can be a signal of dark matter.
Two different approaches were applied: a classical one, using Gaussian mixtures, and a machine learning one, based on a convolutional neural network with coordinate convolution layers for energy and longitudinal position prediction. Clustering techniques were used for the transverse coordinate estimation. The obtained resolutions are about 25% for the energy and about 0.8 cm for the position in the longitudinal direction and 1 mm in the transverse direction, without any use of the emulsion. These results are comparable to the case of multiple-shower separation in the emulsion.
The obtained results will be further used to optimise the cost and parameters of the proposed SHiP emulsion detector.
Ground-based $\gamma$-ray astronomy relies on reconstructing the primary particles' properties from the measurement of the induced air showers. Currently, template fitting is the state-of-the-art method to reconstruct air showers. CNNs represent a promising means to improve on this method in both accuracy and computational cost. Promoted by the availability of inexpensive hardware and open-source deep learning frameworks (DLFs), the applicability of CNNs for air shower reconstruction is the focus of recent and on-going studies. Thereby, the hexagonal sampling of the data, which is common for Cherenkov telescopes but does not fit the input format of DLFs, poses an obstacle. It has been addressed, e.g., by transforming the hexagonally sampled data to an approximate representation on a rectangular grid prior to the application of CNNs. Though this procedure was shown to yield promising results, it comes at the price of increased computational cost. The transformation can be omitted if convolutions are applied directly on the hexagonal grid. For this purpose a Python library, called HexagDLy, was written and made publicly available. In the present study, HexagDLy was used to build CNN models for the analysis of data from the High Energy Stereoscopic System. The performance of these models on classifying and reconstructing air-shower events will be shown and compared to alternative methods.
In recent years, the astroparticle physics community has successfully adapted supervised learning algorithms for a wide range of tasks, including event reconstruction in cosmic ray observatories [1], photon identification at Cherenkov telescopes [2], and the extraction of gravitational wave signals from time traces [3]. In addition, first unsupervised learning approaches with generative models at cosmic ray observatories have shown promising results [4]. Besides simulation acceleration, the refinement of physics simulations was investigated there by training a refiner network to make simulated time traces look like data traces. This may have groundbreaking outcomes for machine learning algorithms and shows the potential of unsupervised learning for physics research.
In this presentation we summarize the latest developments in machine learning in
the context of astroparticle physics and discuss the far-reaching scope of future applications.
[1] DOI: 10.1016/j.astropartphys.2017.10.006
[2] DOI: 10.1016/j.astropartphys.2018.10.003
[3] DOI: 10.1016/j.physletb.2017.12.053
[4] DOI: 10.1007/s41781-018-0008-x
From a breakthrough innovation, Deep Learning (DL) has grown to become a de facto standard technique in the fields of artificial intelligence and computer vision. In particular, Convolutional Neural Networks (CNNs) have been shown to be a powerful DL technique for extracting physics features from images: they were successfully applied to the data reconstruction and analysis of Liquid Argon Time Projection Chambers (LArTPC), a class of particle imaging detectors which record the trajectories of charged particles in either 2D or 3D volumetric data with breathtaking resolution (~3 mm/pixel). CNNs apply a chain of matrix multiplications and additions, and can be massively parallelized on many-core systems such as GPUs when applied to image data analysis. Yet a unique feature of LArTPC data challenges traditional CNN algorithms: it is locally dense (no gap along a particle trajectory) but globally sparse. A typical 2D LArTPC image has less than 1% of its pixels occupied with a non-zero value. This makes standard CNNs with dense matrix operations very inefficient. Submanifold sparse convolutional networks (SSCN) have been proposed to address exactly this class of sparsity challenges by keeping the same level of sparsity throughout the network. We demonstrate their strong performance on some of our data reconstruction tasks, which include 3D semantic segmentation for particle identification at the pixel level. They outperform a standard, dense CNN on an accuracy metric with substantially fewer computations. SSCN can address the problem of computing-resource scalability for the 3D DL-based data reconstruction chain R&D for LArTPC detectors.
I give an update on recent developments in FeynArts, FormCalc, and LoopTools, and show how the new features were used in making the latest version of FeynHiggs.
The software framework SModelS, which has already been presented at the ACAT 2016 conference, allows for a very fast confrontation of arbitrary BSM models exhibiting a Z2 symmetry with an ever growing database of simplified models results from CMS and ATLAS. In this talk we shall present its newest features, like the extension to include searches for heavy stable charged particles (HSCPs), or the ability to combine the results from several signal regions, exploiting the simplified likelihood framework introduced by CMS. Also, the database has been greatly extended; it now comprises almost 3000 individual results from close to 100 individual analyses. Finally, we shall also discuss ongoing developments, like the use of neural networks to further speed up the software, and the extension to an even wider set of experimental signatures.
FORM is a symbolic manipulation system, which is especially advantageous for handling gigantic expressions with many small terms. Because FORM has been developed in tackling real problems in perturbative quantum field theory, it has some features useful in such problems, although FORM applications are not restricted to any specific research field. In this talk, we discuss recent developments of FORM and its new features.
Posters in this session can be seen in Room B throughout Wednesday and Thursday.
Improving the computing performance of particle transport simulation is an important goal to address the challenges of HEP experiments in the coming decades (e.g. the HL-LHC), as well as the needs of other fields (e.g. medical imaging and radiotherapy).
The GeantV prototype includes a new transport engine, based on track-level parallelization obtained by grouping a large number of tracks in flight into "baskets", with improved use of caches and vectorisation. The main goal is to investigate what performance increase this new approach can deliver compared to the performance of the Geant4 toolkit.
We have implemented a prototype of the transport engine and auxiliary components, including a work scheduler, vectorized code for geometry, transport and magnetic field handling, as well as a complete set of vectorized electromagnetic physics models compatible with a recent Geant4 version. Based on this prototype, computing performance benchmarking and software optimization will allow us to determine the performance gains achievable in realistic conditions.
An analysis of the current performance results will be presented, both for a complete LHC detector and a simplified sampling calorimeter setup. The individual sources of the observed performance gain will be discussed together with the experienced limitations, and an estimate of the final performance will be given.
Deep Learning techniques are being studied for different applications by the HEP community; in this talk, we discuss the case of detector simulation. The need for simulated events, expected in the future for the LHC experiments and their High-Luminosity upgrades, is increasing dramatically and requires new fast simulation solutions. We will describe an R&D activity within CERN openlab aimed at providing a configurable tool capable of training a neural network to reproduce the detector response and replace standard Monte Carlo simulation. This represents a generic approach in the sense that such a network could be designed and trained to simulate any kind of detector in just a small fraction of the time. We will present the first application of three-dimensional convolutional Generative Adversarial Networks to the simulation of high-granularity electromagnetic calorimeters. We will describe detailed validation studies comparing our results to Geant4 Monte Carlo simulation, showing, in particular, the very good agreement we obtain for high-level physics quantities (such as energy shower shapes) and for the detailed calorimeter response (single-cell response). Finally we will show how this tool can easily be generalized to describe a larger class of calorimeters, opening the way to a generic machine-learning-based fast simulation approach. To achieve generalization we will leverage advanced optimization algorithms (using Bayesian and/or genetic approaches) and apply state-of-the-art data-parallel strategies to distribute the training process across multiple nodes in HPC and cloud environments. The performance of the parallelization of GAN training on HPC clusters will also be discussed in detail.
JANA2 is a multi-threaded event reconstruction framework being developed for experimental nuclear physics. It is an LDRD-funded project that will be the successor of the original JANA framework. JANA2 is a near-complete rewrite emphasizing C++ language features that have only become available since the C++11 standard. Successful and less-than-successful strategies employed in JANA, and how they are being addressed in JANA2, will be presented, as well as new features suited to modern and future trends in data analysis.
The Solenoidal Tracker at RHIC (STAR) is a multi-national supported experiment located at Brookhaven National Lab. The raw physics data captured from the detector is on the order of tens of PBytes per data acquisition campaign, which makes STAR fit well within the definition of a big data science experiment. The production of the data has typically run on standard nodes or on standard Grid computing environments. All embedding simulations (complex workflow mixing real and simulated events) have been run a standard Linux resources at NERSC aka PDSF. However, HPC resources such as Cori have become available for STAR’s data production as well as embedding, and STAR has been the very first experiment to show feasibility of running a sustainable data production campaign on this computing resource.
The use of Docker containers with Shifter is required to run on HPC @ NERSC – this approach encapsulates the environment in which a standard STAR workflow runs. From the deployment of a tailored Scientific Linux environment (requiring many of its own libraries and special configurations required to run) to the deployment of third-party software and the STAR specific software stack, it has become impractical to rely on a set of containers containing each specific software release. To this extent, solutions based on CVMFS for the deployment of software and services have been employed in HENP, but one needs to make careful scalability considerations when using a resource like Cori, such as not allowing all software to be deployed in containers or bare node. Additionally, CVMFS clients are not compatible on Cori nodes and one needs to rely on an indirect NFS mount scheme. In our contribution, we will discuss our strategies from the past and our current solution based on CVMFS. Furthermore, running on HPC is not a simple task as each aspect of the workflow must be enabled to scale, run efficiently, and the workflow needs to fit within the boundaries of the provided queue system (SLURM in this case). Lastly, we will also discuss what we have learned to be the best method for grouping jobs to maximize a single 48 core HPC node within a specific time frame and maximize our workflow efficiency.
We hope both aspects will serve the community well, as well as those following the same path.
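As an illustration of the job-grouping consideration mentioned above, a toy calculation of how many jobs fit on one 48-core node within a wall-time budget; the wall-time limit and per-job duration are hypothetical numbers chosen for this example.

    # Toy sketch: pack independent job streams onto a 48-core node within a wall-time budget.
    CORES_PER_NODE = 48
    WALLTIME_HOURS = 48.0            # hypothetical limit of the SLURM partition

    def jobs_per_core(est_hours_per_job):
        """Number of jobs a single core can chain before hitting the wall-time limit."""
        return int(WALLTIME_HOURS // est_hours_per_job)

    def jobs_per_node(est_hours_per_job):
        """Total jobs packed onto one node, one sequential job stream per core."""
        return CORES_PER_NODE * jobs_per_core(est_hours_per_job)

    print(jobs_per_node(est_hours_per_job=6.0))   # -> 384 jobs grouped onto a single node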
PICO is a dark matter experiment using superheated bubble chamber technology. One of the main analysis challenges in PICO is to unambiguously distinguish background events from nuclear recoil events due to possible WIMP scatters. The conventional discriminator, the acoustic parameter (AP), uses frequency analysis in Fourier space to compute the acoustic power, which has been shown to differ between alpha decays and nuclear recoils. In a recent machine learning development, an intern collaborator demonstrated extremely powerful discriminators using semi-supervised learning. I will present the results he achieved and provide an outlook for machine learning in future analyses.
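For illustration, a minimal NumPy sketch of an integrated acoustic power in a frequency band, in the spirit of such a discriminator; the windowing, band edges and sampling rate are assumptions for this example and do not reproduce the PICO AP definition.

    import numpy as np

    def band_power(trace, sample_rate_hz, f_lo_hz, f_hi_hz):
        """Integrated power of a digitized acoustic trace between f_lo and f_hi."""
        spectrum = np.fft.rfft(trace * np.hanning(len(trace)))     # windowed FFT
        freqs = np.fft.rfftfreq(len(trace), d=1.0 / sample_rate_hz)
        power = np.abs(spectrum) ** 2
        in_band = (freqs >= f_lo_hz) & (freqs < f_hi_hz)
        return power[in_band].sum()

    # Example: acoustic power between 40 and 120 kHz of a 2.5 MHz-sampled trace.
    # ap_like = band_power(trace, 2.5e6, 40e3, 120e3)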
Deep learning architectures in particle physics are often strongly dependent on the order of their input variables. We present a two-stage deep learning architecture consisting of a network for sorting input objects and a subsequent network for data analysis. The sorting network (agent) is trained through reinforcement learning using feedback from the analysis network (environment). A tree search algorithm is used to examine the large space of different possible orders.
The optimal order depends on the environment and is learned by the agent in an unsupervised manner. Thus, the two-stage system can choose an optimal solution that is not known to the physicist in advance.
We present the new approach and its application to various classification tasks; a toy sketch of the agent/environment feedback loop is given below.
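The sketch below scores candidate input orderings by the validation loss of a downstream classifier; for brevity it uses random search over permutations instead of the reinforcement-learning agent and tree search described in the talk, and the classifier itself is a placeholder.

    import random

    def classifier_val_loss(order, data):
        """Train a placeholder order-sensitive classifier on inputs re-ordered by
        `order` and return its validation loss (the environment's feedback)."""
        ...   # reorder the input objects, fit e.g. a recurrent network, evaluate
        return random.random()   # stand-in for the measured validation loss

    def search_ordering(n_features, n_trials, data):
        best_order, best_loss = None, float("inf")
        for _ in range(n_trials):
            order = random.sample(range(n_features), n_features)   # candidate ordering
            loss = classifier_val_loss(order, data)                # feedback from the analysis network
            if loss < best_loss:
                best_order, best_loss = order, loss
        return best_order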
Accurate particle identification (PID) is one of the most important aspects of the LHCb experiment. Modern machine learning techniques such as deep neural networks are efficiently applied to this problem and are integrated into the LHCb software. In this research, we discuss novel applications of neural network speed-up techniques to achieve faster PID under LHC upgrade conditions. We show that the best results are obtained using variational dropout sparsification, which provides a prediction speed-up of up to a factor of five, even when compared to a model based on shallow networks.
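For illustration, a compact PyTorch sketch of a linear layer with sparse variational dropout and the pruning step that yields the inference speed-up; this follows published implementations of the technique only in spirit, is not the LHCb code, and the initialization values and pruning threshold are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LinearSVDO(nn.Module):
        """Linear layer whose per-weight dropout rates are learned; weights with
        large log_alpha are dropped at inference, sparsifying the network."""
        def __init__(self, in_features, out_features):
            super().__init__()
            self.theta = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
            self.log_sigma2 = nn.Parameter(torch.full((out_features, in_features), -10.0))

        def log_alpha(self):
            return torch.clamp(self.log_sigma2 - torch.log(self.theta ** 2 + 1e-8), -10, 10)

        def forward(self, x):
            if self.training:   # local reparameterization trick
                mean = F.linear(x, self.theta)
                var = F.linear(x ** 2, torch.exp(self.log_sigma2)) + 1e-8
                return mean + torch.sqrt(var) * torch.randn_like(mean)
            mask = (self.log_alpha() < 3.0).float()   # keep only low-dropout weights
            return F.linear(x, self.theta * mask)

        def kl(self):
            # approximate KL penalty added to the training loss (drives sparsity)
            k1, k2, k3 = 0.63576, 1.87320, 1.48695
            la = self.log_alpha()
            return -(k1 * torch.sigmoid(k2 + k3 * la) - 0.5 * F.softplus(-la) - k1).sum()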
Using variational autoencoders trained on known physics processes, we develop a one-sided p-value test to isolate previously unseen event topologies as outlier events. Since the autoencoder training does not depend on any specific new-physics signature, the proposed procedure has only a weak dependence on underlying assumptions about the nature of new physics. An event selection based on this algorithm would be complementary to classic LHC searches, which are typically based on model-dependent hypothesis testing. Such an algorithm would deliver a list of anomalous events that the experimental collaborations could further scrutinize and even release as a catalog, similarly to what is typically done in other scientific domains. Repeated patterns in this dataset could motivate new scenarios for beyond-the-Standard-Model physics and inspire new searches, to be performed on future data with traditional supervised approaches. Running in the trigger system of the LHC experiments, such an application could identify anomalous events that would otherwise be lost, extending the scientific reach of the LHC.
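A minimal sketch of the one-sided p-value test, assuming the anomaly score is the per-event autoencoder reconstruction loss and that a held-out sample of standard-physics events is available as a reference; the selection threshold is a hypothetical choice.

    import numpy as np

    def p_values(scores, reference_scores):
        """One-sided p-value: fraction of reference events whose anomaly score
        is at least as large as the observed one."""
        ref = np.sort(reference_scores)
        n_at_least = len(ref) - np.searchsorted(ref, scores, side="left")
        return n_at_least / len(ref)

    # Keep events with very small p-value as anomaly candidates, e.g.:
    # anomalous = events[p_values(vae_loss(events), vae_loss(reference)) < 1e-5]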
The talk is devoted to an overview of Rings, an efficient lightweight library for commutative algebra written in Java and Scala. Polynomial arithmetic, GCDs, polynomial factorization, and Gröbner bases are implemented using modern asymptotically fast algorithms. Rings can easily be interfaced with, or embedded in, applications in high-energy physics and other research areas via a simple API with a fully typed hierarchy of algebraic structures and algorithms for commutative algebra. The use of the Scala language brings a powerful, strongly typed functional programming model that allows one to write short, expressive, and fast code. At the same time, Rings shows one of the best performances among existing software for algebraic calculations.
Over the last few years, manipulating expressions with millions of terms has become common in particle physics. Form is the de facto tool for the manipulation of extremely large expressions, but it comes with some downsides. In this talk I will discuss an effort to modernize aspects of Form, such as the language and workflow, and the introduction of bindings to C and Python. This new tool is written in Rust and is called reFORM.
The Mathematica package STR (Star-Triangle Relations) is a recently developed tool designed to solve Feynman diagrams by means of the method of uniqueness in any (Euclidean) spacetime dimension D. The method of uniqueness is a powerful technique to solve multi-loop Feynman integrals in theories with conformal symmetry by imposing certain relations between D and the powers of the propagators. Our algorithm includes identities for both scalar and Yukawa-type integrals. The package is equipped with a graphical environment in which it is possible to draw the desired diagram with mouse input, together with a set of tools to modify and compute it. Through the use of a graphical interface, the package should be easily accessible to users with little or no previous experience in diagram computation.
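For reference, one standard form of the scalar star-triangle (uniqueness) relation that underlies the method can be written (in LaTeX notation, in Euclidean position space with x_{ij} = x_i - x_j and propagator powers alpha_i) as:

    \int \frac{d^D x_0}{x_{10}^{2\alpha_1}\, x_{20}^{2\alpha_2}\, x_{30}^{2\alpha_3}}
      = \pi^{D/2}\,
        \frac{\Gamma\!\left(\tfrac{D}{2}-\alpha_1\right)\Gamma\!\left(\tfrac{D}{2}-\alpha_2\right)\Gamma\!\left(\tfrac{D}{2}-\alpha_3\right)}
             {\Gamma(\alpha_1)\,\Gamma(\alpha_2)\,\Gamma(\alpha_3)}\;
        \frac{1}{x_{12}^{2(D/2-\alpha_3)}\, x_{23}^{2(D/2-\alpha_1)}\, x_{13}^{2(D/2-\alpha_2)}},
      \qquad \alpha_1+\alpha_2+\alpha_3 = D ,

i.e. a "unique" vertex, whose propagator powers sum to D, can be traded for a triangle of propagators with complementary powers.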
Beginning in 2021, the upgraded LHCb experiment will use a triggerless readout system collecting data at an event rate of 30 MHz. A software-only High Level Trigger will enable unprecedented flexibility for trigger selections. During the first stage (HLT1), a subset of the full offline track reconstruction for charged particles is run to select particles of interest based on single- or two-track selections. After this first stage, the event rate is reduced by at least a factor of 30. Track reconstruction at 30 MHz represents a significant computing challenge, requiring a rethinking of the current algorithms and of the underlying hardware. In this talk we present work based on an R&D project in the context of LHCb Upgrade I that explores the approach of executing the full HLT1 chain on GPUs. This includes decoding the raw data, clustering of hits, pattern recognition, and track fitting. We will discuss the development of algorithms optimized for many-core architectures. Both the computing and the physics performance of the full HLT1 chain will be presented.
The LHCb experiment is dedicated to the study of c- and b-hadron decays, including long-lived particles such as the Ks and strange baryons (Lambda, Xi, etc.). These kinds of particles are difficult to reconstruct with the LHCb tracking system since they escape detection in the first tracker. A new method to evaluate the performance, in terms of efficiency and throughput, of the different tracking algorithms for long-lived particles has been developed. Special emphasis is placed on particles that hit only part of the tracking system of the new LHCb upgrade detector.