Since the last ROOT Users' Workshop in 2015, ROOT has seen many new developments, new team members and contributors, production releases and infrastructure changes. This presentation will summarize where ROOT is, from measurements (lines of code, commits per second, merge request and forum activity) to summary-level news on areas of development and infrastructure (github, Discourse).
ROOT is...
I would like to present some thoughts and ideas may be useful for the future based on one side on my previous experience as an actor and now observer of the project since a few years, and on the other side on many comments/suggestions received from many users.
The International project physicsmasterclasses.org brings the excitement of cutting-edge particle-physics research into the classroom including hands-on experience with real experimental data. In 2018 about 15 000 students, 15 to 19 years of age, participated in this popular event over 6 weeks in 52 countries, hosted by 225 institutes. The existing LHC masterclasses are used in many other...
In order to take full advantage of the current computer architectures and to improve performance with increasing amounts of data to analyze, we developed tools for the parallelization of ROOT at task-level and integrated libraries for its parallelization at data-level. This tools have been extensively deployed throughout ROOT, from vectorization and parallelization of the fit to the parallel...
We present an update on our recent efforts to further parallelize RooFit. We have performed extensive benchmarks and identified at least three bottlenecks that will benefit from parallelization. To tackle these and possible future bottlenecks, we designed a parallelization layer that allows us to parallelize existing classes with minimal effort, but with high performance and retaining as much...
Array-at-a-time processing is key for performance of low arithmetic intensity calculations, such as plotting, because of sequential memory access and SIMD parallelization. However, HEP data typically have nested structures and HEP algorithms typically require inner loops. We will present techniques for manipulating arrays of nested data to perform combinatoric calculations (e.g. pairs of...
We present the results of a diploma thesis adding CUDA (runtime) C++ support to cling. Today's HPC systems are heterogeneous and get most of their computing power from so-called accelerator hardware, such as GPUs. Programming GPUs with modern C++ is a perfect match, allowing perfectly tailored and zero-overhead abstractions for performance-critical "kernels".
Nevertheless, tool complexity in...
The simulation of complex systems can involve a high degree of dependencies and interaction between different parts of the simulation. A standard approach to handle these dependencies is the use of a double-buffered global state where all components of the simulation are processed synchronously in lockstep. Parallelization of such an approach has the drawback that it requires a synchronization...
The DEEP-EST is the European Project building a new generation of the Modular Supercomputer Architecture (MSA). The MSA is a blueprint for heterogeneous HPC systems supporting high performance compute and data analytics workloads with highest efficiency and scalability.
Within the context of the project, we are working on the JVM based implementation of the ROOT File Format,...
ROOT’s Declarative Approach for Manipulation and Analysis of Datasets
ROOT proposed a modern, declarative approach to the treatment of columnar datasets, RDataFrame.
Conceived initially as a way to implement functional chains, RDataFrame became a highly performant Swiss-Army knife for dataset manipulation and analysis.
This contribution discusses RDataFrame’s minimal and modern interface...
We are less than three years apart from the first, double precision Exa-Flop/s supercomputers. Already today, our scientific software stacks are facing the challenge to run efficiently on a potpourri of architectures. But the real troubles might await us at the choke points of extreme data rates, where traditional workflows of data acquisition, filtering, processing and subsequent long-term...
What is ROOT I/O? What are its strengths? What are its weakness? How does it compare to alternatives? This presentation will address these questions and present performance contrasts with other solutions. We will then review the latest additions, fixes and features. Finally, we will give our vision of where we can take ROOT I/O to make it even better and more relevant to today’s...
We present an investigation into the usage of compression in the ROOT I/O subsystem and its impact on ROOT-related performance and file size. The goal of the talk is to explain how compression is used in ROOT and which algorithms could be most interesting in terms of ROOT performance, while describing pro and cons of compression algorithms in details. We provide a comparison analysis of ROOT...
The ROOT TTree data format encodes hundreds of petabytes of HEP events. Its columnar layout drives rapid analyses, as only those parts (columns) that are really used in a given analysis need to be read from storage. Its unique feature is the seamless C++ integration, which allows users to directly store their event classes without explicitly defining data schemas.
In this contribution, we...
The uproot package provides access to ROOT data in several new ways: (a) natively in Python+Numpy data types, (b) installable without specialized binary dependencies, and (c) with columnar, array-at-a-time performance characteristics. This talk presents a new feature in uproot: the ability to write ROOT files, rather than just reading them. In addition to outlining the scope of which kinds of...
C++ remains one of the best languages for writing performance sensitive software. However, it has proven to pose significant challenges for developers, especially when a single codebase is evolved over a long period of time. We have taken an extremely large, complex C++ codebase at Google from being a huge risk and liability, into something that is sustainable, with both productive and...
ROOT graphics had many key improvements since the last workshop, for
instance ratio plots, automatic placement of legends, automatic
coloring, and the highlight technique. We will present the most relevant
recent developments.
The tidyverse is a collection of R packages for data science sharing an underlying design philosophy and data structures similar to RDataFrame. ggplot is the graphics library component of the tidyverse, [implementing the concepts][1] behind [The Grammar of Graphics][2]. ggplot allows to express our analysis in a declarative, layered, functional way, closely related to the programming model...
For two decades, ROOT brought its own window system abstraction (for X11, GL, Cocoa, and Windows) together with its own GUI library. X11 is nearing the end of its lifetime; new windowing systems shine with performance and features. To make best use of them, the ROOT team has decided to re-implement its graphics and GUI subsystem using web technology.
To enable development of new web-based...
In the context of ROOT 7 a new web based model for Graphics is being developed using web technologies (html, JavaScript). We will present the developments already achieved, the missing features and the foreseen functionalities. This new environment already has the basic graphics classes: RCanvas, RFrame, basic primitives and attributes and coordinates system; even ROOT6 histograms etc can be...
The Fit Panel is a GUI to interactively fit histograms using all the power of the ROOT fitting tools. It can fit histograms and graphs with a predefined function, or a function defined by the user. In addition, we can edit different choices for the fitting, select a specific section of the histogram to fit or select different drawing option. The ROOT 7 version of this important GUI element is...
Doxygen is the tool ROOT used to produce the reference guide and markdown/pandoc for the all other documents. A doxygen filter has been developed the past few years coupling closely ROOT and the reference guide generation. For instance this allows to generate on the fly high definition images for inline macros' code and tutorials, to produce the tree of needed libraries for each class. Future...
The core driving force behind the data intensive nature of research in particle physics, is the level of statistical precision achievable in measuring the parameters of the Standard Model. The RooFit and RooStats data modelling packages are the gateway that allows users to interface the complex data processing abilities provided by ROOT and the precise statistical inference tools required to...
We all learn by example. Great examples are not easy to create. I'd love to see a user driven ROOT gallery. A place where users can publish examples of beautiful plots they created in ROOT or tricky issues they solved in ROOT. The gallery should be searchable and user's should be able to comment and improve examples.
With Python already established as a key player in scientific computing, the interest for the language in HEP has been growing over the last years. In order to address that demand, the ROOT Python bindings (PyROOT) are being modernised and extended. A new experimental PyROOT has been created by leveraging Cppyy, which provides the basic bleeding edge functionality for generating automatic...
iminuit is an external Python interface to the Minuit2 C++ code,
which can be compiled standalone without the rest of ROOT. iminuit
has recently seen a boost of development which culminated in the
latest 1.3 release and will join the Scikit-HEP project in this year.
To simplify Minuit’s use as a standalone CMake package, for projects
like iminuit and GooFit, a new standalone build system...
In this talk we present PyrooFit, a fit framework based on python and pandas DataFrames built on top of the ROOT.RooFit package. It currently allows for simple fits of standard PDFs and easy setup of custom PDFs in one or more fit dimensions, but the ultimate goal is to develop it into a general framework independent of a specific back-end.
Finally in this talk we report on general feedback...
This presentation discusses our work on ROOT package manager and build features for it. The part of talk will conclude the integration of package manager into ROOT runtime. Taking in account ongoing work on C++ modules in ROOT, we will align package manager’s activity together with it to develop a simpler way to use software libraries and in the same time provides better compile-time...
After (too) many years without any ROOT 6 release on Windows, we got our first preview release of ROOT 6 on Windows (6.14/00) and Visual Studio 2017. This presentation will expose some of the issues we had, what is the current status, what is missing, what has to be done to port ROOT on Windows 64 bit (horizon 2020?), and more generally what people should think of in order to write portable code.
ROOT 6.14/00 is the first release in which CMake is used exclusively for the build system, after deprecation and removal of the old configure/make scripts. However, the CMake build system was added in 2011, before CMake 3.0 and “modern CMake” became established. This presentation will summarize how ROOT's build system is evolving to use modern CMake constructs to become more modular and to...
In recent years, static code analysis tools for C++ have made huge improvements. This talk shows how to use clang-tidy and reports about the results applying it to the ROOT source code. As the resulting log is huge, finding relevant results can be challenging. Nevertheless, a few problems in ROOT’s source code and even a bug in LLVM were found this way. In addition, it is shown how one can use...
This talk shows the status of the C++ Modules in ROOT and CMSSW. We will demonstrate performance improvement in ROOT and in CMSSW. Runtime C++ module improves correctness and simplifies the current implementation which relies on rdict_pcms and rootmap files. We would like to describe the major challenges: improving both ROOT and Clang. We will describe the current state of the experimental...
Widespread distributed processing of big datasets has been around for more than a decade now thanks to Hadoop, but only recently higher-level abstractions have been proposed for programmers to easily operate on those datasets, e.g. Spark. ROOT has joined that trend with its RDataFrame tool for declarative analysis, which currently supports local multi-threaded parallelisation. However,...
I would like to share my path of learning ROOT and acquiring software skills as a young physicist. I started learning ROOT in my 2nd year of physics bachelors with little of programming knowledge and some experience in C. I found mostly outdated docs and code from old guard physicists (overuse of pointers, new, delete, single file long macros, no OOP, …). What followed was 3 years of learning...
Most of the ATLAS analyses do their final step by processing TTree-based ntuples. As actual analysis selection optimisations take place at this step, it is crucial that one can process all events in a timely fashion. A (personal) experience writing a set of ROOT-based algorithms to process ntuples and create final histograms will be presented. More than 2 TB of data can be processed in about...
In 2016, development of an open source C++ library for multi-dimensional histograms in collaboration with the Boost community started, which leverages modern C++11 features such as variadic templates and template meta-programming (TMP). Variadic templates make code possible that works for arbitrary dimensions. TMP allows one to write histograms which compile to very efficient assembler while...
TRooFit is an extension of the RooFit toolkit, providing a set of classes that are inspired by core ROOT objects (specifically histograms and histogram stacks) but are fully-functional RooFit pdfs or functions. TRooFit's TRooH1D and TRooHStack are fillable and drawable in the same way as a TH1D and THStack, but are simultaneously concrete RooFit PDFs that can be fit to a RooFit dataset....
I have been using ROOT as an undergraduate and graduate student for about seven years now. The first scripts I ever wrote were ROOT macros. From naive histogram-filling in MakeClass.C to writing python wrapper-classes and analyzing unbinned fits, much of my development as a programmer has taken place in the context of ROOT.
In this talk, I present needs and examples of ROOT usage from fellow...
ROOT is a universal data analyzing library, which is not only used by particle physicists, but also by nuclear physicists. This presentation will focus on the data analysis of "small" datasets (10's GB) from a typical nuclear physics experiment, parallelization for beginners and a viable alternative to the current TTreeReader class. Missing features will also be addressed from a user perspective.
This talk reports on experiences and lessons learned moving from particle physics via software development to data science. It explains why we are still using ROOT from time to time and highlights some of its features making data analysis simple and fast. It also presents a few ideas to improve common tasks in ROOT and TMVA. Some annoyances and problems with ROOT are shown as well.
The aim of this talk is twofold: to summarize ROOT prompt options by simple use cases, and to introduce the newly introduced --strict flag. ROOT prompt, which runs C++ interpreter in its background, supports many features while not many of those are known to users. In this talk, we would like to summarize these options classified by simple use cases.
Second part is dedicated to an option...
This talk will give a short overview over the Belle II Software
Framework and its use in offline and online data processing. It will
highlight the use of ROOT for data storage and inter-process
communication as well as our Python 3 integration and user
interface. Finally we will present some feedback and requests,
primarily from the offline side.
We will discuss ROOT and its impact in the CMS offline production software, analysis and online systems. We will identify ways that CMS relies on ROOT components, which span a broad space that reaches most aspects of the experiment software stack. We'll focus our discussion on current development activities and lessons learned around areas of software operations, data storage, I/O, data...
CERN LHCb experiment has been using ROOT for 20 years as back-end for storing the reconstructed event data and the stripped down data used for physics analysis.
With the ongoing work to prepare the experiment software for the upgrade of the detector happening during LHC second Long Shutdown, we are reviewing the LHCb software framework Gaudi to improve the code base in terms of performance,...
The second stage of the LHCb software trigger application consists of an
up-front event reconstruction followed by approximately 500 independent
event selections. Monitoring of the performance of the reconstruction and
selections is crucial to detect and address issues that may appear as soon
as possible. To this end, each process produces approximately $3 * 10^3$
monitoring histograms,...
DD4hep is a detector geometry package which can provide
all auxiliary information necessary to process data from
particle collisions in high energy physics such as geometry
and readout information, but also interfaces to conditions
and alignment data. DD4hep supports all data processing
activities of an experiment: simulation, reconstruction
and analysis.
The modeling of the geometry of a...
The talk will present the plans for the evolution of the ROOT geometry package in relation with VecGeom development. The future integration of VecGeom performance features and the impact on external frameworks depending on TGeo (such as DD4HEP or VMC) will also be discussed.
We are going to present the N-dimensional analysis pipeline of the ALICE experiment, including creation of materialized views, n-dimensional histogramming package, statistical maps extraction, local regression and optimizer of physical models.
Besides we will demonstrate how the pipeline can be interfaced to the TMVA package.
We will show multiple examples of the pipeline usage in the ALICE...
In this talk, we will describe the latest additions to the Toolkit for Multivariate Analysis (TMVA), the machine learning package integrated into the ROOT framework. In particular, we will focus on the new deep learning module that contains robust fully-connected, convolutional and recurrent deep neural networks implemented on CPU and GPU architectures. We will present performance of these new...
Cross validation, an important tool for optimising and understanding generalisation performance of large models (many tunable parameters), was recently introduced in TMVA. Currently implemented to estimate model performance, developments are ongoing for model selection. This talk will give an introduction to what CV is and how it is used in TMVA, with particular focus on HEP...
TMVA has been a pioneering effort which set a milestone for machine-learning (ML) in high-energy physics (HEP) more than ten years ago and remains in use in numerous analyses of LHC experiments.
On the other hand, the ML landscape explosively evolved during these years and - as industry stepped in - ML became suddenly one of the most active fields in science. This talk discusses how TMVA can...
The International Linear Collider is a linear e+e- collider project expected to be built in Japan. In order to study the potential physics discoveries with such a collider, the ILCSoft framework has been developed by the ILC and CLIC collaborations. As for most of the experiment frameworks, ROOT plays a central role in ILCSoft. In particular, the DD4hep package, based on the TGeo component of...
The art framework, used by roughly a dozen HEP experiments, provides simple interactions with ROOT for its users. In addition to facilities that enable dictionary generation and ROOT-based input sources and output modules, art also provides easy user interaction with ROOT constructs from within the framework.
We describe how art and its experiments use ROOT, both within the framework...
SHiP is an experiment that is currently being designed to look for hidden particles beyond the Standard Model at CERN's SPS. As a SHiP PhD student working (among other things) on its software, I will give an overview of the SHiP software stack, with a focus on how we use ROOT, and how we would like it improve.
I will cover some aspects that are common to small experiments using ROOT, but also...