Conveners
Parallelism, Heterogeneity and Distributed Data Processing
- Javier Cervantes Villanueva (CERN)
Parallelism, Heterogeneity and Distributed Data Processing: Discussion Session
- Javier Cervantes Villanueva (CERN)
Future accelerators and detectors pose a series of challenges to HEP scientific software, among them the efficient analysis of the data collected by the experiments. In the past few years ROOT has become parallel to a large extent, but our endeavour is not complete.
This presentation is dedicated to a characterisation of the parallelisation effort carried out so far and to the lessons...
In order to take full advantage of current computer architectures and to improve performance with ever-increasing amounts of data to analyze, we developed tools for the parallelization of ROOT at task level and integrated libraries for its parallelization at data level. These tools have been extensively deployed throughout ROOT, from the vectorization and parallelization of the fit to the parallel...
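As a hedged illustration of the task-level entry point mentioned above (the histogram, model and options are invented for this sketch, not taken from the talk): once implicit multi-threading is enabled, ROOT components that support it may run their internals, such as the fit objective evaluation, in parallel.

```cpp
// Minimal sketch: enable ROOT's task-level parallelism before running a fit.
#include "TROOT.h"
#include "TH1D.h"
#include "TF1.h"

int main() {
   ROOT::EnableImplicitMT();             // let ROOT use its internal thread pool
   TH1D h("h", "toy data", 100, -5., 5.);
   h.FillRandom("gaus", 1000000);        // illustrative dataset
   TF1 f("f", "gaus", -5., 5.);
   h.Fit(&f, "Q");                       // components that support IMT may parallelize here
   return 0;
}
```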
We present an update on our recent efforts to further parallelize RooFit. We have performed extensive benchmarks and identified at least three bottlenecks that will benefit from parallelization. To tackle these and possible future bottlenecks, we designed a parallelization layer that allows us to parallelize existing classes with minimal effort, but with high performance and retaining as much...
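For orientation only: the handle through which RooFit users already request coarse-grained parallel likelihood evaluation today is the NumCPU fit option, which splits the likelihood across several processes. The sketch below uses an invented toy model, not the benchmarks from the talk.

```cpp
// Toy RooFit model fitted with the existing NumCPU parallelization option.
#include "RooRealVar.h"
#include "RooGaussian.h"
#include "RooDataSet.h"
#include "RooGlobalFunc.h"

void fit_parallel() {
   RooRealVar x("x", "x", -10, 10);
   RooRealVar mean("mean", "mean", 0, -10, 10);
   RooRealVar sigma("sigma", "sigma", 1, 0.1, 10);
   RooGaussian gauss("gauss", "gaussian pdf", x, mean, sigma);

   auto data = gauss.generate(x, 1000000);   // ownership handling varies by ROOT version
   // Split the likelihood evaluation over four workers.
   gauss.fitTo(*data, RooFit::NumCPU(4));
}
```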
Array-at-a-time processing is key to the performance of low-arithmetic-intensity calculations, such as plotting, because of its sequential memory access and SIMD parallelization. However, HEP data typically have nested structures and HEP algorithms typically require inner loops. We will present techniques for manipulating arrays of nested data to perform combinatoric calculations (e.g. pairs of...
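A hedged, self-contained sketch of the underlying data layout (a toy, not the speaker's toolkit): nested per-event lists stored as a flat, contiguous content array plus per-event offsets, with pair combinatorics computed on top of that layout.

```cpp
// Jagged (nested) data as flat content + offsets, processed without per-event allocations.
#include <cstddef>
#include <iostream>
#include <vector>

struct Jagged {
   std::vector<double> content;       // all values of all events, contiguous
   std::vector<std::size_t> offsets;  // event i spans [offsets[i], offsets[i+1])
};

// Sum of every distinct pair within each event (a simple combinatoric calculation).
std::vector<double> pair_sums(const Jagged& j) {
   std::vector<double> out;
   for (std::size_t ev = 0; ev + 1 < j.offsets.size(); ++ev)
      for (std::size_t a = j.offsets[ev]; a < j.offsets[ev + 1]; ++a)
         for (std::size_t b = a + 1; b < j.offsets[ev + 1]; ++b)
            out.push_back(j.content[a] + j.content[b]);
   return out;
}

int main() {
   Jagged j{{1., 2., 3., 10., 20.}, {0, 3, 5}};  // event 0: {1,2,3}, event 1: {10,20}
   for (double s : pair_sums(j)) std::cout << s << '\n';  // prints 3 4 5 30
}
```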
The next big upgrade of the LHC will increase data volume by an order of magnitude. This is a significant challenge that will require most HEP software to adapt in order to exploit all forms of parallelism. Portable and efficient SIMD vectorization has been a particularly difficult challenge due to incompatible programming APIs from different libraries and the rapid evolution of hardware. This...
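The portability problem is often addressed by writing kernels once against a generic backend type. The following generic C++ sketch (not the specific library presented in the talk) shows that pattern: the same kernel can be instantiated with a scalar type or with a SIMD vector type supplied by a backend.

```cpp
// "Write the kernel once, instantiate per backend" pattern for portable SIMD.
#include <cmath>
#include <iostream>

template <typename V>              // V may be a scalar (float) or a SIMD vector type
V kinetic_energy(V px, V py, V pz, V m) {
   using std::sqrt;                // unqualified call lets a vector backend supply its own sqrt
   V p2 = px * px + py * py + pz * pz;
   return sqrt(p2 + m * m) - m;
}

int main() {
   // Scalar instantiation; a SIMD backend would process several tracks per call.
   std::cout << kinetic_energy(1.f, 2.f, 3.f, 0.105f) << '\n';
}
```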
We present the results of a diploma thesis adding CUDA (runtime) C++ support to cling. Today's HPC systems are heterogeneous and get most of their computing power from so-called accelerator hardware, such as GPUs. Programming GPUs with modern C++ is a perfect match, allowing perfectly tailored and zero-overhead abstractions for performance-critical "kernels".
Nevertheless, tool complexity in...
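To make the kind of performance-critical "kernel" concrete, here is a self-contained CUDA runtime C++ example (plain offline-compiled code; how cling exposes the same functionality interactively is the subject of the talk and is not shown here).

```cpp
// Classic SAXPY kernel using the CUDA runtime API and unified memory.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void saxpy(int n, float a, const float* x, float* y) {
   int i = blockIdx.x * blockDim.x + threadIdx.x;
   if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
   const int n = 1 << 20;
   float *x, *y;
   cudaMallocManaged(&x, n * sizeof(float));
   cudaMallocManaged(&y, n * sizeof(float));
   for (int i = 0; i < n; ++i) { x[i] = 1.f; y[i] = 2.f; }

   saxpy<<<(n + 255) / 256, 256>>>(n, 3.f, x, y);
   cudaDeviceSynchronize();

   std::printf("y[0] = %f\n", y[0]);   // expect 5.0
   cudaFree(x);
   cudaFree(y);
}
```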
The simulation of complex systems can involve a high degree of dependencies and interaction between different parts of the simulation. A standard approach to handle these dependencies is the use of a double-buffered global state where all components of the simulation are processed synchronously in lockstep. Parallelization of such an approach has the drawback that it requires a synchronization...
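A hedged toy sketch (not the simulation from the talk) of the double-buffered lockstep scheme described above: every component reads only the current buffer and writes only the next one, so a global synchronization point is needed once per step.

```cpp
// Double-buffered global state updated in lockstep.
#include <cstddef>
#include <utility>
#include <vector>

struct State { std::vector<double> values; };

void step(const State& current, State& next) {
   const std::size_t n = current.values.size();
   for (std::size_t i = 0; i < n; ++i) {
      const double left  = current.values[(i + n - 1) % n];
      const double right = current.values[(i + 1) % n];
      next.values[i] = 0.5 * (left + right);   // toy interaction rule between neighbours
   }
}

int main() {
   State a{std::vector<double>(8, 0.0)};
   State b = a;
   a.values[0] = 1.0;
   for (int t = 0; t < 4; ++t) {
      step(a, b);        // all reads hit `a`, all writes go to `b`
      std::swap(a, b);   // the per-step synchronization point a parallel version must respect
   }
}
```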
DEEP-EST is the European project building a new generation of the Modular Supercomputer Architecture (MSA). The MSA is a blueprint for heterogeneous HPC systems supporting high-performance computing and data-analytics workloads with the highest efficiency and scalability.
Within the context of the project, we are working on a JVM-based implementation of the ROOT file format,...
ROOT’s Declarative Approach for Manipulation and Analysis of Datasets
ROOT proposes a modern, declarative approach to the treatment of columnar datasets: RDataFrame.
Conceived initially as a way to implement functional chains, RDataFrame has become a highly performant Swiss-army knife for dataset manipulation and analysis.
This contribution discusses RDataFrame’s minimal and modern interface...
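A short, hedged illustration of the declarative style described above; the file, tree, branch and column names ("data.root", "Events", "pt", "eta", "pt2") are invented for this sketch.

```cpp
// Declarative dataset manipulation with RDataFrame: book everything, run one event loop.
#include "ROOT/RDataFrame.hxx"
#include "TROOT.h"
#include "TH1D.h"

void analyse() {
   ROOT::EnableImplicitMT();                         // transparent multi-threaded event loop
   ROOT::RDataFrame df("Events", "data.root");       // hypothetical tree and file
   auto h = df.Filter("pt > 30 && fabs(eta) < 2.4")  // declarative, JIT-compiled selection
              .Define("pt2", "pt*pt")                // new column defined on the fly
              .Histo1D("pt2");                       // lazily booked result
   h->Draw();                                        // the single event loop runs here
}
```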
We are less than three years away from the first double-precision exaflop/s supercomputers. Already today, our scientific software stacks face the challenge of running efficiently on a potpourri of architectures. But the real troubles might await us at the choke points of extreme data rates, where traditional workflows of data acquisition, filtering, processing and subsequent long-term...