Learning to Classify from Impure Samples with High-Dimensional Data

17 Jul 2018, 15:20
25m
Charpak Amphitheater (Paris)

UPMC (Jussieu) Campus

Speakers

Ben Nachman (University of California Berkeley (US)), Eric Metodiev (Massachusetts Institute of Technology), Patrick Komiske (Massachusetts Institute of Technology)

Description

Machine learning in high energy physics relies heavily on simulation for fully supervised training. Because simulation does not perfectly model real collisions, this often results in sub-optimal classification when the classifier is ultimately applied to (unlabeled) data. In addition to describing a new method for weak supervision (learning directly from data) called Classification Without Labels (CWoLa), we show for the first time how to apply these techniques to high-dimensional data, where significant architectural changes are required. Extending weak supervision to high-dimensional inputs is critically important for learning from and about the full radiation pattern inside jets.
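
As a rough illustration of the CWoLa idea, the toy sketch below trains a classifier to tell two impure samples apart and then checks how well the same classifier separates signal from background. Everything specific here is an assumption made for illustration and is not taken from the talk: the one-dimensional Gaussian feature, the signal fractions of 0.8 and 0.2, the plain logistic regression, and the helper names `sample_mixture` and `auc`. In particular, it does not attempt the high-dimensional architectures the talk is about.

```python
import math
import random

random.seed(0)

def sample_mixture(n, signal_fraction):
    """Draw n toy events as (feature, is_signal) pairs.

    Assumed toy model: signal ~ N(+1, 1), background ~ N(-1, 1).
    """
    events = []
    for _ in range(n):
        is_signal = random.random() < signal_fraction
        mean = 1.0 if is_signal else -1.0
        events.append((random.gauss(mean, 1.0), is_signal))
    return events

# Two impure samples with different (assumed) signal fractions.
mixed_1 = sample_mixture(2000, 0.8)
mixed_2 = sample_mixture(2000, 0.2)

# Training labels say only which mixed sample an event came from;
# the per-event truth flag is deliberately discarded here.
train = [(x, 1.0) for x, _ in mixed_1] + [(x, 0.0) for x, _ in mixed_2]

# Logistic regression fitted by batch gradient descent on cross-entropy.
w, b, lr = 0.0, 0.0, 0.5
for _ in range(200):
    grad_w = grad_b = 0.0
    for x, y in train:
        p = 1.0 / (1.0 + math.exp(-(w * x + b)))
        grad_w += (p - y) * x
        grad_b += p - y
    w -= lr * grad_w / len(train)
    b -= lr * grad_b / len(train)

def auc(scored):
    """Area under the ROC curve from (score, is_signal) pairs, via rank sums."""
    ranked = sorted(scored)  # ascending by score
    n_sig = sum(1 for _, s in ranked if s)
    n_bkg = len(ranked) - n_sig
    rank_sum = sum(i + 1 for i, (_, s) in enumerate(ranked) if s)
    return (rank_sum - n_sig * (n_sig + 1) / 2) / (n_sig * n_bkg)

# Evaluate against truth labels, which were never used in training.
test_events = sample_mixture(2000, 0.5)
signal_auc = auc([(w * x + b, s) for x, s in test_events])
print(f"w = {w:.2f}, AUC on truth labels = {signal_auc:.3f}")
```

The point of the sketch is that the classifier never sees a signal or background label, yet its score still orders events by how signal-like they are, because the two mixed samples differ only in their signal fraction.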

Primary authors

Ben Nachman (University of California Berkeley (US)), Eric Metodiev (Massachusetts Institute of Technology), Patrick Komiske (Massachusetts Institute of Technology), Matthew Schwartz
