Speaker
Ben Nachman
(Lawrence Berkeley National Lab. (US))
Description
Machine learning in high energy physics relies heavily on simulation for fully supervised training. Because simulation does not describe data perfectly, classifiers trained this way are often sub-optimal when ultimately applied to (unlabeled) data. At CTD2017, we showed how to avoid this problem by training directly on data, using as input the fractions of signal and background in each training sample. We now have a new method, called Classification Without Labels (CWoLa), that does not even require these fractions. In addition to explaining this new method, we show for the first time how to apply these techniques to high-dimensional data, where significant architectural changes are required.
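The abstract does not spell out the procedure, so the following is only a minimal sketch of the idea as it is commonly described: a classifier is trained to tell apart two mixed samples with different (unknown) signal fractions, and the resulting score is then used to separate signal from background. Everything here is an assumption made for illustration: the one-dimensional Gaussian toy data, the fractions 0.7 and 0.3 (used only to generate the toy samples, never given to the classifier), the plain logistic regression, and the helper names make_mixed_sample, fit_logistic and auc.

```python
import numpy as np

def make_mixed_sample(n, signal_fraction, rng):
    """Draw n toy events; return the feature and the (hidden) truth labels."""
    is_signal = rng.random(n) < signal_fraction
    # Assumed toy model: signal ~ N(+1, 1), background ~ N(-1, 1).
    x = np.where(is_signal, rng.normal(1.0, 1.0, n), rng.normal(-1.0, 1.0, n))
    return x, is_signal.astype(int)

def fit_logistic(x, y, steps=2000, lr=0.1):
    """Gradient-descent logistic regression on a single feature."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))
        w -= lr * np.mean((p - y) * x)
        b -= lr * np.mean(p - y)
    return w, b

def auc(scores, labels):
    """Rank-based AUC: chance that a positive outscores a negative."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(0)

# Two mixed samples; the truth labels are kept only for the final check.
x1, truth1 = make_mixed_sample(5000, 0.7, rng)  # signal-enriched
x2, truth2 = make_mixed_sample(5000, 0.3, rng)  # signal-depleted

# Training target is the sample each event came from, not signal/background.
x = np.concatenate([x1, x2])
sample_label = np.concatenate([np.ones(len(x1)), np.zeros(len(x2))])
w, b = fit_logistic(x, sample_label)

# Evaluate the mixed-sample classifier against truth on an independent sample.
x_test, truth_test = make_mixed_sample(5000, 0.5, rng)
signal_auc = auc(w * x_test + b, truth_test)
print(f"AUC against truth labels: {signal_auc:.3f}")
```

In this toy setting the classifier never sees a signal or background label, nor the mixture fractions, yet its score still ranks signal above background. The high-dimensional case mentioned in the abstract would need a different architecture and is not covered by this sketch.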
Primary authors
Ben Nachman
(Lawrence Berkeley National Lab. (US))
Eric Metodiev
(Massachusetts Institute of Technology)
Patrick Komiske
(Massachusetts Institute of Technology)
Francesco Rubbo
(SLAC National Accelerator Laboratory (US))
Matthew Schwartz
Jesse Thaler
Lucio Dery
(Stanford University)