29 November 2021 to 3 December 2021
Virtual and IBS Science Culture Center, Daejeon, South Korea
Asia/Seoul timezone

Signal to background discrimination for the production of double Higgs boson events via vector boson fusion mechanism in the decay channel with four charged leptons and two b-jets in the final state at the LHC experiment

contribution ID 540
3 Dec 2021, 15:50
10m
Auditorium (Virtual and IBS Science Culture Center, Daejeon, South Korea)

55 EXPO-ro Yuseong-gu Daejeon, South Korea email: library@ibs.re.kr +82 42 878 8299
Poster Track 2: Data Analysis - Algorithms and Tools Lightning talks session

Speaker

Ms Brunella D'Anzi (Universita e INFN, Bari (IT))

Description

Artificial Neural Networks in High Energy Physics: introduction and goals

Nowadays, High Energy Physics (HEP) analyses generally take advantage of Machine Learning techniques to optimize the discrimination between signal and background while preserving as much signal as possible. A classical cut-based selection would severely reduce both signal and background candidates, which turns out to be quite inefficient, especially when signal events are rare, as is usually the case in beyond Standard Model studies.
HEP Multivariate Analysis (MVA) focuses on optimally combining a pre-determined set of independent variables to build discriminants that can effectively separate signal from background.
In this context, an Artificial Neural Network naturally implements an MVA: it is a complex computing system, loosely modelled on biological neural networks, that receives these variables as input, is trained and tested on them and, for binary classification problems, eventually produces a single output. Generally, a neural network is composed of nodes organized in layers. Each node receives either a feature of the problem or a weighted sum of the outputs of the nodes in the previous layer. Each neural network has different configuration parameters (e.g. number of layers, nodes, activation functions) and hyper-parameters (parameters set manually to help estimate the model parameters). These networks usually implement a model that performs an iterative process (the algorithm) whose aim is to minimize a given loss function, which represents the distance of the network response from the actual class of the events. Training Deep Neural Networks (DNNs) on large amounts of data for these classification problems helps the network to correctly classify data never seen before.
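
The ingredients described above (nodes fed by weighted sums, a single output for binary classification, and an iterative minimization of a loss function) can be illustrated with a minimal sketch. It is not the network used in this study: the single hidden layer, the tanh and sigmoid activations, the binary cross-entropy loss, the learning rate and the two-feature Gaussian toy events are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Each hidden node applies tanh to a weighted sum of the input features;
    the output node applies a sigmoid to a weighted sum of the hidden outputs."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    score = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, score


def binary_cross_entropy(score, label):
    """Loss: distance between the network response and the true class."""
    eps = 1e-12
    return -(label * math.log(score + eps)
             + (1 - label) * math.log(1.0 - score + eps))


def make_toy(n, seed=1):
    """Toy events (an assumption): signal (1) and background (0) are two
    Gaussian blobs in a two-feature space."""
    rng = random.Random(seed)
    events, labels = [], []
    for i in range(n):
        label = i % 2
        centre = 1.5 if label else -1.5
        events.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        labels.append(label)
    return events, labels


def train(events, labels, n_hidden=4, lr=0.05, epochs=50, seed=0):
    """Stochastic gradient descent: weights are updated event by event."""
    rng = random.Random(seed)
    n_in = len(events[0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                for _ in range(n_hidden)]
    b_hidden = [0.0] * n_hidden
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_out = 0.0
    history = []  # mean loss per epoch
    for _ in range(epochs):
        total = 0.0
        for x, y in zip(events, labels):
            hidden, score = forward(x, w_hidden, b_hidden, w_out, b_out)
            total += binary_cross_entropy(score, y)
            delta_out = score - y  # gradient w.r.t. the output pre-activation
            for j in range(n_hidden):
                # Back-propagate through tanh before updating w_out[j].
                delta_h = delta_out * w_out[j] * (1.0 - hidden[j] ** 2)
                w_out[j] -= lr * delta_out * hidden[j]
                for i in range(n_in):
                    w_hidden[j][i] -= lr * delta_h * x[i]
                b_hidden[j] -= lr * delta_h
            b_out -= lr * delta_out
        history.append(total / len(events))
    return (w_hidden, b_hidden, w_out, b_out), history
```

Calling `train(*make_toy(200))` returns the fitted weights together with the mean loss per epoch, which makes the iterative minimization visible; a real analysis would evaluate the network on an independent test sample rather than on the training events.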

The production of Double Higgs via Vector Boson Fusion

In this work we show the implementation and optimization of a DNN for signal and background event classification.
For this study we used Monte Carlo generated events from the non-resonant Higgs boson pair production analysis at LHC energies, where one of the Higgs bosons decays into the four-lepton final state and the other decays into a pair of b quarks. The signal and background datasets were generated to run the analysis on data corresponding to the integrated luminosity collected at the LHC at CERN during the full Run II period, using proton-proton collisions at a center-of-mass energy of 13 TeV. Furthermore, the performance of a Random Forest classifier, usually considered a very versatile and efficient method for this kind of problem, will be studied, compared with that of the DNN, and discussed. The results will be presented using different metrics: accuracy and purity-times-efficiency distributions, confusion matrices and ROC curves for both models.
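
The metrics listed above can be computed from any classifier score, whether it comes from a DNN or from a Random Forest. The sketch below is an illustration only: the function names are hypothetical, events are selected with `score >= threshold` by assumption, and purity and efficiency follow the usual definitions (selected signal over all selected events, and selected signal over all signal events).

```python
def confusion_matrix(scores, labels, threshold):
    """Return (tp, fp, tn, fn) for events selected with score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        selected = score >= threshold
        if selected and label == 1:
            tp += 1
        elif selected and label == 0:
            fp += 1
        elif label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def purity_times_efficiency(scores, labels, threshold):
    tp, fp, tn, fn = confusion_matrix(scores, labels, threshold)
    efficiency = tp / (tp + fn) if tp + fn else 0.0  # signal efficiency
    purity = tp / (tp + fp) if tp + fp else 0.0      # signal purity
    return purity * efficiency


def roc_curve(scores, labels):
    """(false positive rate, true positive rate) points from a threshold scan.
    Assumes both classes are present in `labels`."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp, fp, tn, fn = confusion_matrix(scores, labels, threshold)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


def area_under_curve(points):
    """Trapezoidal integral of the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

Scanning `purity_times_efficiency` over the threshold gives a working point that balances signal retention against background contamination, while the area under the ROC curve summarizes the separation power independently of any single threshold.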

Innovation and further improvements

The discovery of the Higgs boson at the Large Hadron Collider in 2012 opened a new frontier in HEP for both Standard Model (SM) and Beyond Standard Model (BSM) scenarios, and with it a new era of ambitious high-luminosity studies. After the precise measurement of the main parameters of the SM Higgs boson, one of the most important determinations still to be accomplished is the measurement of the Higgs self-couplings, which are strictly related to the shape of the Higgs potential. Higgs self-coupling studies involve the investigation of processes that contain a pair of Higgs bosons. In contrast to the dominant production mechanism of gluon-gluon fusion, the production of these two Higgs bosons via vector-boson fusion (VBF) turns out to be a particularly important process for the determination of the triple-Higgs coupling. In VBF the Higgs bosons are produced at leading order from heavy gauge bosons radiated off two quarks; the jets originating from these quarks can be used as tags to simplify the experimental identification and measurement. The innovation of this study therefore lies both in the technology used and in the originality of the physics signature considered. Unlike the single Higgs production modes, widely explored and studied in Run I and Run II at the LHC, double Higgs boson production via VBF in the 4-lepton + 2 b-jet final state has not yet been investigated. This is mainly due to the small value of its cross section times branching ratios (BRs): for HH production via VBF, with the Higgs mass set to its best-fit value of 125.09 GeV, the cross section at 13 TeV is $\sim 1.723$ fb, and the corresponding BRs are $2.79\times10^{-4}$ for $H\rightarrow{ZZ}\rightarrow{4l}$, with $l=e,\mu,\tau$, and $5.75\times10^{-1}$ for $H\rightarrow{bb}$. Such a rare signal requires an exclusive event selection in order to reject the background efficiently.
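
To make the rarity of the signal concrete, the numbers quoted above can be combined into a rough expected yield before any detector acceptance or selection. The sketch below rests on two assumptions that are not stated in the abstract: a combinatorial factor of 2 (either of the two Higgs bosons may be the one decaying to four leptons) and an illustrative integrated luminosity of 138 fb$^{-1}$, used here only as a stand-in for a Run II-sized dataset.

```python
# Values quoted in the text.
SIGMA_VBF_HH_FB = 1.723  # VBF HH cross section at 13 TeV, in fb
BR_H_TO_4L = 2.79e-4     # BR(H -> ZZ -> 4l), l = e, mu, tau
BR_H_TO_BB = 5.75e-1     # BR(H -> bb)

# Assumptions, not taken from the text.
COMBINATORIAL_FACTOR = 2         # either Higgs boson can decay to 4l
ASSUMED_LUMINOSITY_FB_INV = 138.0  # illustrative integrated luminosity


def signal_cross_section_fb(sigma_fb=SIGMA_VBF_HH_FB):
    """Cross section times branching ratios for HH -> 4l + bb, in fb."""
    return sigma_fb * COMBINATORIAL_FACTOR * BR_H_TO_4L * BR_H_TO_BB


def expected_signal_events(luminosity_fb_inv=ASSUMED_LUMINOSITY_FB_INV):
    """Expected number of produced signal events, before any selection."""
    return signal_cross_section_fb() * luminosity_fb_inv


if __name__ == "__main__":
    print(f"sigma x BR  = {signal_cross_section_fb():.3e} fb")
    print(f"signal yield = {expected_signal_events():.3f} events")
```

Under these assumptions the expected yield is well below one produced event, which is why the background rejection has to be very efficient while keeping as much signal as possible.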

Benefits and Feasibility of the study for teaching purpose

In addition to the scientific originality of the work described in the previous section and its importance for a better understanding of the SM Higgs potential, several exercises can be derived from these studies for both Analysis and Computing Schools, ensuring a deeper understanding of both the particle physics itself and the most suitable ML techniques. As an example, we studied the problem of hyper-parameter tuning using cross-validation and other approaches, trying to understand which one performs best for the different models. These techniques have already been applied by the authors to a different analysis and submitted to the Hackathon INFN 2021 Competition, which targeted master's and Ph.D. students. Letting learners practice on the very latest physics analyses during hands-on tutorials is especially valuable because it brings together different but interconnected fields such as Particle Physics, Artificial Intelligence, Big Data Analytics and the optimization of computing resources on High Performance Computing clusters.
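
As a possible starting point for such an exercise, hyper-parameter tuning with k-fold cross-validation can be sketched as follows. This is a teaching illustration and not the procedure used by the authors: a simple k-nearest-neighbours classifier stands in for the DNN and Random Forest models, the scanned hyper-parameter is the number of neighbours, the toy events are Gaussian blobs, and all function names are hypothetical.

```python
import random


def make_toy(n, seed=1):
    """Toy events (an assumption): two Gaussian blobs in two features."""
    rng = random.Random(seed)
    xs, ys = [], []
    for i in range(n):
        label = i % 2
        centre = 1.0 if label else -1.0
        xs.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        ys.append(label)
    return xs, ys


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle the event indices and split them into disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train_x, train_y, x, n_neighbours):
    """Majority vote among the nearest training events."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )[:n_neighbours]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > n_neighbours else 0


def cross_val_accuracy(xs, ys, n_neighbours, n_folds=5, seed=0):
    """Mean accuracy over folds; each fold is held out exactly once."""
    accuracies = []
    for held_out in k_fold_indices(len(xs), n_folds, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct = sum(
            knn_predict(train_x, train_y, xs[i], n_neighbours) == ys[i]
            for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


def grid_search(xs, ys, candidates, n_folds=5):
    """Pick the hyper-parameter value with the best cross-validated accuracy."""
    scores = {c: cross_val_accuracy(xs, ys, c, n_folds) for c in candidates}
    best = max(scores, key=scores.get)
    return best, scores
```

For example, `grid_search(*make_toy(100), candidates=[1, 3, 5, 9])` returns the preferred number of neighbours and the cross-validated accuracy of every candidate. Because each event is used for validation only when it is excluded from training, the comparison between candidates is less biased than one made on the training sample.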

Speaker time zone No preference

Primary authors

Ms Brunella D'Anzi (Universita e INFN, Bari (IT))
Giorgia Miniello (Universita e INFN, Bari (IT))
Prof. Nicola De Filippis (Politecnico/INFN Bari (IT))
Walaa Elmetenawee (Universita e INFN, Bari (IT))
