### Conveners

#### Statistical Methods for Physics Analysis in the XXI Century: H1a

- Tommaso Dorigo (Universita e INFN, Padova (IT))

#### Statistical Methods for Physics Analysis in the XXI Century: H1b

- Tommaso Dorigo (Universita e INFN, Padova (IT))

#### Statistical Methods for Physics Analysis in the XXI Century: H2a

- Tommaso Dorigo (Universita e INFN, Padova (IT))

#### Statistical Methods for Physics Analysis in the XXI Century: H2b

- Sergei Gleyzer (University of Florida (US))

#### Statistical Methods for Physics Analysis in the XXI Century: H3a

- Sergei Gleyzer (University of Florida (US))

#### Statistical Methods for Physics Analysis in the XXI Century: H3b

- Sergei Gleyzer (University of Florida (US))

Statistics plays a crucial role in the extraction of information from physics measurements, and its scope has been steadily increasing in the XXI century, driven in particular by the development of machine learning tools. This talk gives an introduction to the current status of statistical practice, relying on the driving example of HEP, and a look at the goals for the three days of talks,...

The study of the Quark-Gluon Plasma created in ultrarelativistic heavy-ion collisions at the CERN-LHC is complemented by reference measurements in proton-lead (p--Pb) and proton-proton (pp) collisions, where the effects of multiple-parton interactions and hadronization beyond independent string fragmentation can be investigated.

In this talk, we present a Bayesian unfolding procedure to...
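The abstract above is truncated, but the general idea of Bayesian unfolding can be illustrated with a minimal sketch of the common iterative (D'Agostini-style) variant; the toy binning, response matrix, and prior below are invented for illustration and are not the procedure of this specific talk:

```python
def iterative_bayesian_unfold(measured, response, prior, n_iter=4):
    """D'Agostini-style iterative Bayesian unfolding.

    measured[j]   : observed counts in smeared bin j
    response[j][i]: P(observe in bin j | true bin i)
    prior[i]      : initial guess for the true spectrum
    """
    n_true, n_meas = len(prior), len(measured)
    truth = list(prior)
    for _ in range(n_iter):
        new = [0.0] * n_true
        for j in range(n_meas):
            # Bayes' theorem: share measured[j] among true bins
            norm = sum(response[j][i] * truth[i] for i in range(n_true))
            if norm <= 0:
                continue
            for i in range(n_true):
                new[i] += measured[j] * response[j][i] * truth[i] / norm
        # correct for the efficiency of each true bin
        eff = [sum(response[j][i] for j in range(n_meas)) for i in range(n_true)]
        truth = [new[i] / eff[i] if eff[i] > 0 else 0.0 for i in range(n_true)]
    return truth

# Toy example: two true bins [100, 50] smeared by 20% bin migration
unfolded = iterative_bayesian_unfold(
    measured=[90, 60],
    response=[[0.8, 0.2], [0.2, 0.8]],
    prior=[75, 75],
)
```

The number of iterations acts as a regularisation parameter: too few iterations bias the result toward the prior, too many amplify statistical fluctuations.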

Optimization problems in HEP often involve maximizing a measure of how sensitive a given analysis is to one hypothesis with respect to another; the latter is referred to as the "null" hypothesis and, in a frequentist framework, is tested against the former, which is referred to as the "alternative" hypothesis.

In most cases, it is desirable to fully compute the expected frequentist...
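As a concrete, commonly used stand-in for the full frequentist computation mentioned above, the median expected discovery significance of a simple counting experiment is often approximated with the Asimov formula of Cowan, Cranmer, Gross and Vitells. This is a minimal sketch of that proxy, not the specific objective of the talk:

```python
import math

def asimov_significance(s, b):
    """Median expected discovery significance for a counting experiment
    with expected signal s on top of background b (Asimov approximation)."""
    if s <= 0 or b <= 0:
        return 0.0
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))

# An analysis optimization would maximize this over cut or model parameters:
z = asimov_significance(s=10.0, b=1000.0)
```

For s much smaller than b the formula reduces to the familiar s / sqrt(b) approximation.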

Much effort has been expended in deconstructing deep neural networks, that is, in trying to understand their internal representations of data. For example, understanding what convolutional neural networks are doing layer by layer has been the focus of much research. I argue that this effort is largely misplaced. Of far greater importance, in my view, is understanding what these functions...

Neutrino experiments are very challenging due to the typically very low number of collected events, the large and sometimes unknown systematics, and the sparse experimental techniques with the correspondingly critical combination of measurements. All these characteristics point to the necessity of robust, controlled and well-established data analyses. Unfortunately, the neutrino community...

In this talk, we will describe the latest additions to the Toolkit for Multivariate Analysis (TMVA), the machine learning package integrated into the ROOT framework. In particular, we will focus on the new deep learning module that contains robust fully-connected, convolutional and recurrent deep neural networks implemented on CPU and GPU architectures. We will present performance of these new...

Bayesian Gaussian Process Optimization [1,2,3] can be considered a method for determining model parameters from experimental data. In the domain of soft QCD physics, the processes of hadron and nuclear interactions require phenomenological models containing many parameters. In order to minimize the computation time, the model predictions can be parameterized using...
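For illustration only (the specific parameterization of refs. [1,2,3] is not reproduced here), the core ingredient of such an approach is a Gaussian Process regression surrogate that replaces expensive model evaluations with a cheap interpolation. A minimal pure-Python sketch with an RBF kernel:

```python
import math

def rbf(x1, x2, length=1.0):
    """Squared-exponential (RBF) kernel for scalar inputs."""
    return math.exp(-0.5 * ((x1 - x2) / length) ** 2)

def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior_mean(x_train, y_train, x_query, noise=1e-6):
    """GP regression mean: a cheap surrogate for expensive model predictions."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, list(y_train))
    return sum(rbf(x_query, x_train[i]) * alpha[i] for i in range(n))

# Train on a few expensive "model" evaluations (here y = x^2 as a stand-in),
# then query the surrogate anywhere in between:
xs, ys = [-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]
estimate = gp_posterior_mean(xs, ys, 0.5)
```

In a full Bayesian optimization loop the posterior variance would also be computed and used in an acquisition function to pick the next parameter point to evaluate.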

The plans for the second run of the LHC shift the focus in the Higgs sector from searches to precision measurements. Effective Lagrangians can be used for parameterisation. A signal morphing method has been developed to take all parameters into account simultaneously and to model interference effects. It provides a continuous description of arbitrary physical signal observables such as cross sections...

Recent statistical evaluations for High-Energy Physics measurements, in particular those at the Large Hadron Collider, require the careful evaluation of many sources of systematic uncertainty at the same time. While the fundamental aspects of the statistical treatment are now consolidated, in both frequentist and Bayesian approaches, the management of many sources of uncertainties and their...

Different situations in HEP involve the calculation of confidence intervals (CI) for linear combinations of observations that follow a Poisson distribution. Although apparently a simple problem, no precise methods exist unless asymptotic approximations can be assumed. We propose different alternatives beyond the propagation of Gaussian errors and estimate their performance in some common examples.
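A minimal sketch of two obvious baselines for this problem, naive Gaussian error propagation and a parametric bootstrap, applied to a linear combination of Poisson counts (the specific proposals of the talk may differ):

```python
import math
import random

def gaussian_propagation_ci(coeffs, counts, z=1.96):
    """Naive ~95% CI for sum_i c_i * N_i with N_i ~ Poisson:
    Var = sum_i c_i^2 * N_i, using observed counts as variance estimates."""
    est = sum(c * n for c, n in zip(coeffs, counts))
    half = z * math.sqrt(sum(c * c * n for c, n in zip(coeffs, counts)))
    return est - half, est + half

def bootstrap_ci(coeffs, counts, n_rep=5000, alpha=0.05, seed=1):
    """Parametric-bootstrap CI: resample each N_i ~ Poisson(observed N_i)."""
    rng = random.Random(seed)

    def poisson(lam):
        # Knuth's multiplication method; fine for modest lam
        L, k, p = math.exp(-lam), 0, 1.0
        while True:
            p *= rng.random()
            if p <= L:
                return k
            k += 1

    reps = sorted(sum(c * poisson(n) for c, n in zip(coeffs, counts))
                  for _ in range(n_rep))
    return (reps[int(alpha / 2 * n_rep)],
            reps[int((1 - alpha / 2) * n_rep) - 1])

# Example: interval for N1 - N2 with observed counts 100 and 50
g_lo, g_hi = gaussian_propagation_ci([1, -1], [100, 50])
b_lo, b_hi = bootstrap_ci([1, -1], [100, 50])
```

For large counts the two intervals agree well; the interesting (and hard) regime discussed in such talks is small counts, where the Gaussian approximation breaks down.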

First, I will review the significant performance gains that have been achieved for ongoing experiments by applying deep learning techniques to classification tasks in jet physics. I will also review how to extend such methods to cases where we do not have unique labels, but where the labels in simulation are themselves the product of a random simulation process. Finally, if time allows, I...

The precision determination of the parton distribution functions (PDFs) of the proton is a central component of the precision phenomenology program at the Large Hadron Collider (LHC). Pinning down the quark and gluon structure of the proton strengthens a number of LHC cornerstone measurements such as the characterisation of the Higgs sector and searches for high-mass BSM resonances. In this...

Data quality plays an important role in many high-energy physics experiments, e.g. the ALICE experiment at the Large Hadron Collider (LHC) at CERN. Currently used methods for quality assurance problems, such as quality label assignment or particle identification, rely heavily on human expert judgment and complex computations. Those tasks, however, can be easily addressed by modern machine...

Much of the work on advancing the performance of deep-learning approaches takes place in the realm of image recognition: many papers use famous benchmark datasets, such as CIFAR or ImageNet, to quantify the advantages their idea offers. However, it is not always obvious, when reading such papers, whether the concepts presented can also be applied to problems in other domains and still...

The Standard Model is currently the most widely accepted physical theory that classifies all known elementary particles and represents three out of the four fundamental forces in the universe. Despite the confirmation of the model, there is a need for its generalization or for the development of a new theory, able to complete our knowledge of the Universe. For this purpose, High Energy...

Differential cross section measurements in experimental particle physics are smeared by the finite resolution of particle detectors. Using the smeared observations to infer the true particle-level spectrum is an ill-posed inverse problem, typically referred to as unfolding or unsmearing. In this talk, I will first give an overview of the statistical techniques that are currently used...

For decades, high-energy physics (HEP) was at the forefront of big data technology, developing techniques to explore and analyze datasets too large for memory that were revolutionary when they appeared in other fields years later. Today, that dominance is ending, and I argue that this is a good thing. The rise of web-scale datasets and high-frequency trading has interested the commercial...

In this talk, I will focus on an exceptional way of doing data-driven research by engaging a networked community. Many examples of collaboration with the data-science community within competitions organised on the Kaggle or CodaLab platforms are limited by the restrictions of those platforms. Common metrics do not necessarily correspond to the goal of the original research. Constraints imposed...

Complex machine learning tools, such as deep neural networks and gradient boosting algorithms, are increasingly being used to construct powerful discriminative features for High Energy Physics analyses. These methods are typically trained with simulated or auxiliary data samples by optimising some classification or regression surrogate objective. The learned feature representations are then...

Different evaluation metrics for binary classifiers are appropriate to different scientific domains and even to different problems within the same domain. This presentation discusses the evaluation of binary classifiers in experimental high-energy physics, and in particular those used for the discrimination of signal and background events. In the introductory part of the talk, the general...
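Two of the standard evaluation quantities for signal/background discrimination can be sketched as follows; the toy scores are invented for illustration and the metrics actually compared in the talk may differ:

```python
import math

def roc_auc(scores_sig, scores_bkg):
    """Probability that a random signal event scores above a random
    background event (equivalent to the area under the ROC curve)."""
    wins = sum((s > b) + 0.5 * (s == b)
               for s in scores_sig for b in scores_bkg)
    return wins / (len(scores_sig) * len(scores_bkg))

def best_significance(scores_sig, scores_bkg, w_sig=1.0, w_bkg=1.0):
    """Scan score thresholds and return the maximum s / sqrt(s + b)."""
    best = 0.0
    for t in sorted(set(scores_sig + scores_bkg)):
        s = w_sig * sum(1 for x in scores_sig if x >= t)
        b = w_bkg * sum(1 for x in scores_bkg if x >= t)
        if s + b > 0:
            best = max(best, s / math.sqrt(s + b))
    return best

# Toy classifier outputs:
sig, bkg = [0.9, 0.8], [0.1, 0.2]
auc = roc_auc(sig, bkg)                 # threshold-independent ranking quality
sens = best_significance(sig, bkg)      # physics-motivated figure of merit
```

The point such comparisons make concrete is that a classifier with the best AUC is not necessarily the one that maximises the analysis significance, since AUC integrates over all working points while the analysis uses only one.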

The presentation will provide insight into the treatment of statistical problems by particle physicists, which is commonly driven by practical considerations much more than mathematical reasoning. Common pitfalls and their origin will be discussed using real life (but anonymized) examples, touching on topics such as unfolding and limit setting.

GPUs represent one of the most sophisticated and versatile parallel computing architectures to have recently entered the HEP field. GooFit is an open-source tool interfacing ROOT/RooFit to the CUDA platform that allows one to manipulate probability density functions and perform fitting tasks. The computing capabilities of GPUs with respect to traditional CPU cores have been explored with...

#### Summary Section H

I recall the basic properties and history of CLs, a method or procedure for deriving robust upper limits in searches for new phenomena, developed in preparation for the Higgs boson searches at LEP200 in the 1990s.
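For a simple counting experiment with a precisely known background, the CLs criterion can be sketched as follows; this is the textbook construction, not necessarily the exact LEP implementation:

```python
import math

def poisson_cdf(k, lam):
    """P(N <= k) for N ~ Poisson(lam)."""
    term = total = math.exp(-lam)
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total

def cls_excluded(n_obs, s, b, alpha=0.05):
    """Exclude the signal hypothesis s (on background b) at 1 - alpha CL
    if CLs = CLsb / CLb < alpha, with
    CLsb = P(N <= n_obs | s + b) and CLb = P(N <= n_obs | b)."""
    cl_sb = poisson_cdf(n_obs, s + b)
    cl_b = poisson_cdf(n_obs, b)
    return cl_sb / cl_b < alpha
```

With zero observed events and zero background, this reproduces the familiar 95% CL upper limit of about s = 3 signal events; dividing by CLb is what protects the limit against downward background fluctuations, at the price of a deliberate over-coverage.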

Starting from 2020, future development projects for the Large Hadron Collider will steadily increase the nominal luminosity, with the ultimate goal of reaching a peak luminosity of $5 \times 10^{34}\,\mathrm{cm}^{-2}\,\mathrm{s}^{-1}$ for the ATLAS and CMS experiments, planned for the High Luminosity LHC (HL-LHC) upgrade. This rise in luminosity will directly result in an increased number of simultaneous proton...