9-13 July 2018
Sofia, Bulgaria
Europe/Sofia timezone

Anomaly detection using Deep Autoencoders for the assessment of the quality of the data acquired by the CMS experiment

9 Jul 2018, 14:30
Hall 9 (National Palace of Culture)


Presentation, Track 6 (T6): Machine learning and physics analysis


Adrian Alan Pol (Université Paris-Saclay (FR))


The certification of CMS data as usable for physics analysis is a crucial task that ensures the quality of all physics results published by the collaboration. Currently, the certification is conducted by human experts; it is labor intensive and can only be performed at run-by-run granularity. This contribution focuses on the design and prototype of an automated certification system that assesses data quality per luminosity section (i.e. per 23 seconds of data taking). Anomalies caused by detector malfunctions or sub-optimal reconstruction are unpredictable and rare, which makes it difficult to use classical supervised classification methods such as feedforward neural networks. We base our prototype on a semi-supervised model that employs deep autoencoders. This approach has been successfully validated on CMS data collected during the 2016 LHC run: we demonstrate its ability to detect anomalies with high accuracy and a low fake rate when compared against the outcome of the manual certification by experts. A key advantage of this approach over other ML techniques is the interpretability of its results, which can be used to attribute the origin of problems in the data to a specific sub-detector or physics object.
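
The idea in the abstract can be illustrated with a minimal sketch: fit an autoencoder on good data only, flag samples whose reconstruction error is large, and read the per-feature error to see which group of inputs is responsible. This is not the authors' model. A linear autoencoder, fit in closed form through an SVD, stands in for the deep autoencoder, and the synthetic data, the 30 features, the three feature groups (standing in for sub-detectors), and the 99% threshold are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 30 features per luminosity section, in 3 groups of 10
# (each group stands in for one sub-detector), driven by 3 latent factors.
n_features, n_latent, n_groups = 30, 3, 3
mixing = rng.normal(size=(n_latent, n_features))

def make_good(n):
    """Synthetic 'good' luminosity sections: low-rank signal plus small noise."""
    latent = rng.normal(size=(n, n_latent))
    return latent @ mixing + 0.05 * rng.normal(size=(n, n_features))

train = make_good(2000)      # training uses good data only
test_good = make_good(200)
test_bad = make_good(20)
test_bad[:, 10:20] += 3.0    # simulated malfunction in feature group 1

# Linear autoencoder (stand-in for a deep one): the encoder projects onto the
# top principal directions of the good data, the decoder maps back.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:n_latent]

def reconstruct(x):
    code = (x - mean) @ components.T    # encode
    return code @ components + mean     # decode

def per_feature_error(x):
    return (x - reconstruct(x)) ** 2

def score(x):
    """Anomaly score: mean squared reconstruction error per sample."""
    return per_feature_error(x).mean(axis=1)

# Threshold chosen on good data (assumed working point: 99th percentile).
threshold = np.quantile(score(train), 0.99)
flag_good = score(test_good) > threshold
flag_bad = score(test_bad) > threshold

# Interpretability: average the error of the bad samples within each group
# and blame the group with the largest reconstruction error.
group_error = (
    per_feature_error(test_bad).mean(axis=0).reshape(n_groups, -1).mean(axis=1)
)
blamed_group = int(group_error.argmax())

print("fraction of good samples flagged:", flag_good.mean())
print("fraction of bad samples flagged:", flag_bad.mean())
print("group with largest error:", blamed_group)
```

In this toy setting the shifted feature group dominates the reconstruction error, which is the kind of attribution the abstract refers to. A real system would need a nonlinear model and a working point tuned against the expert certification.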

Primary authors

Gianluca Cerminara (CERN)
Giovanni Franzoni (CERN)
Federico De Guio (Texas Tech University)
Adrian Alan Pol (Université Paris-Saclay (FR))
Filip Siroky (Masaryk University (CZ))


Virginia Azzolini (Massachusetts Inst. of Technology (US))
Maurizio Pierini (CERN)
Jean-Roch Vlimant (California Institute of Technology (US))
