### Conveners

#### Special Section Statistical Methods for Physics Analysis in the XXI Century

- Tommaso Dorigo (Universita e INFN, Padova (IT))
- Eilam Gross (Weizmann Institute of Science (IL))
- Luca Lista (INFN Sezione di Napoli)

### Description

Machine learning techniques; data fitting and extraction of signals; new developments in unfolding methods; averaging and combination of results


The most accurate way to combine measurements is to build a combined likelihood function and use it to perform the desired inference. This is not always possible, for a variety of reasons, so approximate methods are often convenient. Among these, the best linear unbiased estimator (BLUE) is the most popular; it takes into account the individual uncertainties and their correlations....
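For two measurements the BLUE combination has a simple closed form. The sketch below illustrates it with invented numbers (not taken from any real analysis); `blue_two` and its inputs are hypothetical names for this example only.

```python
# Minimal sketch: BLUE combination of two correlated measurements.
# All numerical values are illustrative, not from any real analysis.

def blue_two(x1, s1, x2, s2, rho):
    """Best linear unbiased estimate of x1 +/- s1 and x2 +/- s2,
    with correlation coefficient rho between the two uncertainties."""
    cov = rho * s1 * s2
    # Weight minimising the variance of w*x1 + (1-w)*x2:
    w = (s2**2 - cov) / (s1**2 + s2**2 - 2 * cov)
    xhat = w * x1 + (1 - w) * x2
    var = w**2 * s1**2 + (1 - w)**2 * s2**2 + 2 * w * (1 - w) * cov
    return xhat, var**0.5

xhat, err = blue_two(172.5, 1.0, 173.5, 1.5, 0.3)
print(round(xhat, 3), round(err, 3))  # -> 172.734 0.933
```

With zero correlation the formula reduces to the familiar inverse-variance weighted average; with correlated inputs the combined uncertainty can even fall below both individual ones, which is one reason the correlation treatment matters.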

Physicists often need to compute a confidence interval for the ratio of two measurements, and in many cases they simply apply so-called “error propagation” to the corresponding uncertainties, unaware of the approximations involved and of the limitations of this approach. We will explore these limitations, as well as some alternative, more accurate methods. “Exact” methods for the case...
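The limitation is easy to demonstrate: the sketch below (invented numbers, independent Gaussian inputs) compares the first-order propagated uncertainty on a ratio with the spread observed in a toy Monte Carlo, where the ratio's heavier tails inflate the true dispersion beyond the first-order estimate.

```python
# Illustrative sketch (not from the talk): first-order "error
# propagation" for a ratio R = x/y, checked against a toy Monte
# Carlo for independent Gaussian x and y.
import random
import statistics

x, sx = 10.0, 1.0
y, sy = 5.0, 1.0

R = x / y
# First-order propagation for independent x, y:
sR_prop = R * ((sx / x) ** 2 + (sy / y) ** 2) ** 0.5

random.seed(1)
toys = [random.gauss(x, sx) / random.gauss(y, sy) for _ in range(200_000)]
sR_toy = statistics.stdev(toys)

# The toy spread exceeds the first-order estimate: with a 20%
# relative uncertainty on the denominator the ratio is visibly
# non-Gaussian, and the propagation formula underestimates it.
print(f"propagated: {sR_prop:.3f}  toy MC: {sR_toy:.3f}")
```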

Based on: Ofer Vitells and Eilam Gross, “Estimating the significance of a signal in a multi-dimensional search”, May 2011, 5 pp. Published in Astropart. Phys. 35 (2011) 230-234.

A method is presented for the reduction of large sets of related uncertainty sources into strongly reduced representations which retain a suitable level of correlation information for use in many cases. The method provides a self-consistent means of determining whether a given analysis is sensitive to the loss of correlation information arising from the reduction procedure. The method is...
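One common ingredient of such reductions is a principal-component rotation of the covariance of the uncertainty sources; components with small eigenvalues can then be dropped, which is where correlation information may be lost. A minimal sketch for two sources (invented covariance, closed-form 2x2 eigenvalues) is:

```python
# Illustrative sketch (invented numbers): reducing two correlated
# uncertainty sources to principal components by diagonalising the
# 2x2 covariance matrix; dropping small eigenvalues yields the
# reduced representation at the cost of some correlation detail.
import math

# Covariance of two uncertainty sources.
cov = [[1.0, 0.8],
       [0.8, 1.0]]

# Closed-form eigenvalues of a symmetric 2x2 matrix.
a, b, c = cov[0][0], cov[0][1], cov[1][1]
mean = 0.5 * (a + c)
delta = math.sqrt((0.5 * (a - c)) ** 2 + b * b)
eigvals = [mean + delta, mean - delta]

print(eigvals)  # ~ [1.8, 0.2]
# Keeping only the leading component retains
# eigvals[0] / sum(eigvals) = 90% of the total variance here.
```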

I begin with an introduction to deep learning methods and the kinds of problems in particle physics to which the methods could be usefully applied, such as searching data for evidence of new physics. Then I discuss the Bayesian connection. I conclude with a perspective on what data analysis might look like in the not too distant future.

Statistical classification models are commonly used to separate a signal from a background. In this talk we address the problem of isolating the signal of double Higgs production in the decay channel where each boson decays into a pair of b-quarks. In this context, non-parametric methods such as Random Forests or various types of Boosting are typically used. We remain in the same non...

The problem of correcting data for detector effects (unfolding) is discussed, with emphasis on practical difficulties arising in particle physics. A selection of unfolding methods commonly used in particle physics is presented, such as iterative algorithms (D'Agostini), methods based on matrix decomposition (SVD unfolding) and fits with Tikhonov regularisation (TUnfold). The...
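The iterative (D'Agostini-style) approach can be sketched on a two-bin toy problem; the response matrix and counts below are invented purely for illustration, and real implementations additionally handle efficiencies, regularisation, and uncertainty propagation.

```python
# Minimal sketch of iterative (D'Agostini-style) unfolding on a
# two-bin toy; response matrix and observed counts are invented.

# response[i][j] = P(observed in bin i | true in bin j);
# columns sum to 1, i.e. full efficiency for this toy.
response = [[0.8, 0.3],
            [0.2, 0.7]]
observed = [120.0, 80.0]

# Start from a flat prior on the true spectrum.
truth = [sum(observed) / 2.0] * 2

for _ in range(50):
    # Fold the current truth estimate through the detector response.
    folded = [sum(response[i][j] * truth[j] for j in range(2))
              for i in range(2)]
    # Bayes update: reweight each truth bin by how the folded
    # prediction over- or under-shoots the observation.
    truth = [truth[j] * sum(response[i][j] * observed[i] / folded[i]
                            for i in range(2))
             for j in range(2)]

print([round(t, 1) for t in truth])  # -> [120.0, 80.0]
```

For this invertible toy the iterations converge to the exact solution of `response * truth = observed`; in realistic, noisy problems the iteration count itself acts as a regularisation parameter.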

In ATLAS, several unfolding methods are used to correct experimental measurements for detector effects, such as acceptance and resolution. These methods take as input the raw experimental distributions, as well as Monte Carlo simulation for the description of the detector effects. The systematic uncertainties associated with the various unfolding methods are evaluated. The statistical and systematic...

Centrality, as a geometrical property of the collision, is crucial for the physical interpretation of nucleus-nucleus and proton-nucleus experimental data. However, it cannot be directly accessed in event-by-event data analysis. Contemporary methods of centrality estimation in A-A and p-A collisions usually rely on a single detector (either on the signal in zero-degree calorimeters or on...

Many challenges, such as determining significance, arise in identifying new structures in hadron spectra. This talk will summarize first-hand experience with significance determination, including the look-elsewhere effect, background determination, as well as signal extraction in hadron spectra.
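The look-elsewhere effect can be illustrated with the naive trials-factor approximation for N independent search windows (the rigorous treatment for correlated scans is the Gross-Vitells upcrossing method cited above; the numbers here are invented):

```python
# Illustrative look-elsewhere sketch: naive trials-factor correction
# for scanning n_windows independent mass windows. Numbers invented.
import math

def p_from_z(z):
    """One-sided Gaussian tail probability for significance z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def z_from_p(p):
    """Invert p_from_z by bisection (sufficient for a sketch)."""
    lo, hi = 0.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if p_from_z(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

z_local = 4.0
n_windows = 100
p_local = p_from_z(z_local)
# Probability that at least one of the windows fluctuates this far:
p_global = 1.0 - (1.0 - p_local) ** n_windows

print(f"local {z_local} sigma -> global {z_from_p(p_global):.2f} sigma")
```

A comfortable local excess can thus shrink to well under 3 sigma globally once the number of places one looked is accounted for.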

The combination of experimental results requires a careful statistical treatment. We review the methods and tools used in ATLAS for the statistical combination of measurements and of limits on new physics. We highlight the methods used in the recent combination of ATLAS and CMS measurements of the Higgs boson production/decay rates and the constraints on the Higgs coupling parameters.

A new data-driven technique for modelling the QCD multijet background component for analyses which include several jets in their final state is presented and studied in detail. By combining pairs of hemispheres from other events based on a nearest neighbour distance, a new mixed dataset can be constructed which kinematically resembles the majority component of the original mixture. Therefore,...
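The pairing step can be sketched as follows; the event representation, features, and helper names below are hypothetical stand-ins for the hemisphere observables the analysis actually uses.

```python
# Toy sketch of hemisphere mixing (features and events invented):
# each event is split into two "hemispheres", and a mixed event is
# built by pairing each hemisphere with its nearest neighbour drawn
# from OTHER events in the library.
import math
import random

random.seed(0)

# Each hemisphere summarised by a small feature vector,
# e.g. (mass, multiplicity) -- purely illustrative.
events = [[(random.gauss(50, 5), random.gauss(10, 2)) for _ in range(2)]
          for _ in range(200)]

def dist(a, b):
    return math.dist(a, b)

# Library of all hemispheres, tagged with their source event index.
library = [(h, i) for i, ev in enumerate(events) for h in ev]

def mixed_event(i):
    """Replace each hemisphere of event i by its nearest neighbour
    taken from a different event."""
    out = []
    for h in events[i]:
        nn = min((cand for cand, j in library if j != i),
                 key=lambda c: dist(h, c))
        out.append(nn)
    return out

m = mixed_event(0)
```

Because each replacement hemisphere is kinematically close to the original but statistically independent of its partner, the mixed dataset mimics the dominant (multijet-like) component while washing out any genuine two-hemisphere correlation such as a resonant signal.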

Nuclear collisions at high energies produce large numbers of secondaries. Results from the ALICE collaboration show that more than 20 000 of them are produced in a Pb-Pb collision. In view of this it is natural to analyse these with a statistical-thermal model, in which concepts such as temperature, energy density, pressure and net baryon density are useful.

A presentation will be given...

Graphics Processing Units (GPUs) are among the most sophisticated and versatile parallel computing architectures available, and they are now entering the High Energy Physics field. GooFit is an open-source tool interfacing ROOT/RooFit to the CUDA platform on nVidia GPUs. Specifically, it acts as an interface between the MINUIT minimisation algorithm and a parallel processor which allows...

For data sets populated by a very well modeled process and by another process of unknown p.d.f., a desirable feature when manipulating the fraction of the unknown process (either enhancing or suppressing it) is to avoid modifying the kinematic distributions of the well modeled one. A bootstrap technique is used to identify sub-samples rich in the well modeled process, and...
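For readers unfamiliar with the core resampling idea, here is a generic bootstrap sketch (this is not the specific sub-sample selection procedure of the talk, just the underlying resampling-with-replacement mechanism, with invented data):

```python
# Generic bootstrap sketch: estimating the uncertainty on a statistic
# by resampling the dataset with replacement. Data invented.
import random
import statistics

random.seed(2)
data = [random.gauss(0.0, 1.0) for _ in range(500)]

def bootstrap_err(sample, stat, n_boot=1000):
    """Standard error of `stat` from bootstrap replicas of `sample`."""
    replicas = [stat(random.choices(sample, k=len(sample)))
                for _ in range(n_boot)]
    return statistics.stdev(replicas)

err = bootstrap_err(data, statistics.mean)
print(f"bootstrap error on the mean: {err:.3f}")
# Should be close to stdev(data) / sqrt(len(data)) ~ 0.045
```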

The Matrix Element Method (MEM) is a powerful multivariate method that maximally exploits the experimental and theoretical information available to an analysis. The method is reviewed in depth, and several recent applications of the MEM at the LHC experiments are discussed, such as searches for rare processes and measurements of Standard Model observables in Higgs and top physics. Finally, a...

In my talk, I will present an overview of ongoing machine-learning software development in particle physics, focusing in particular on recent developments related to the Toolkit for Multivariate Analysis (TMVA). I will additionally summarize the current activities of the Inter-experimental Machine Learning Working Group (IML).