11–15 Mar 2024
Charles B. Wang Center, Stony Brook University
US/Eastern timezone

Fair Universe: HiggsML Uncertainty Challenge

12 Mar 2024, 12:50
20m
Theatre (Charles B. Wang Center, Stony Brook University)

100 Circle Rd, Stony Brook, NY 11794
Oral
Track 1: Computing Technology for Physics Research

Speaker

Wahid Bhimji

Description

The Fair Universe project is building a large-compute-scale AI ecosystem for sharing datasets, training large models, and hosting challenges and benchmarks. The project is also using this ecosystem for an AI challenge series focused on minimizing the effects of systematic uncertainties in High-Energy Physics (HEP) and on predicting accurate confidence intervals. This talk will describe the challenge platform we have developed, which builds on the open-source benchmark ecosystem Codabench and interfaces it to the NERSC HPC center and its Perlmutter system with over 7000 A100 GPUs. This presentation will also launch the first of our Fair Universe public challenges hosted on this platform, the Fair Universe: HiggsML Uncertainty Challenge. A pilot phase of the challenge will run concurrently with ACAT, so that attendees will be able to enter the competition, interact with organizers, and have their uncertainty-aware ML methods evaluated on large datasets.

This challenge will present participants with a much larger training dataset than previous competitions, corresponding to a measurement of the Higgs decay to tau leptons at the Large Hadron Collider. Participants should design an advanced analysis technique that not only measures the signal strength but also provides a confidence interval, whose coverage will be evaluated automatically from pseudo-experiments. The confidence interval should account for statistical uncertainty as well as systematic uncertainties (including, for example, detector calibration and background levels). It is expected that advanced analysis techniques able to control the impact of systematics will perform best, thereby advancing the field of uncertainty-aware AI techniques for HEP and beyond.
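
As a rough illustration of what checking coverage with pseudo-experiments can look like, the Python sketch below uses a toy counting experiment. It is not the challenge's scoring code or dataset: the function names (`coverage`, `toy_intervals`), the signal and background yields, the Gaussian approximation to the event count, and the 5% background uncertainty are all assumptions made for illustration.

```python
import random
from statistics import mean


def coverage(mu_true, intervals):
    """Fraction of (low, high) intervals that contain the true signal strength."""
    return mean(1.0 if lo <= mu_true <= hi else 0.0 for lo, hi in intervals)


def toy_intervals(mu_true, n_pseudo, s=100.0, b=1000.0,
                  b_syst=0.0, include_syst=False, seed=0):
    """Build approximate 68% intervals from toy pseudo-experiments.

    Hypothetical model: the observed count is drawn from a Gaussian
    approximation to a Poisson with mean mu*s + b*(1 + alpha), where
    alpha ~ Normal(0, b_syst) is an unknown background mis-modelling.
    """
    rng = random.Random(seed)
    intervals = []
    for _ in range(n_pseudo):
        alpha = rng.gauss(0.0, b_syst) if b_syst > 0 else 0.0
        expected = mu_true * s + b * (1.0 + alpha)
        n_obs = rng.gauss(expected, expected ** 0.5)
        # The analyst only knows the nominal background b.
        mu_hat = (n_obs - b) / s
        variance = max(n_obs, 0.0)           # statistical part
        if include_syst:
            variance += (b * b_syst) ** 2    # background systematic part
        half_width = variance ** 0.5 / s
        intervals.append((mu_hat - half_width, mu_hat + half_width))
    return intervals


if __name__ == "__main__":
    mu = 1.0
    stat_only = toy_intervals(mu, 20000, b_syst=0.05, include_syst=False)
    with_syst = toy_intervals(mu, 20000, b_syst=0.05, include_syst=True)
    print("coverage, statistical only:", coverage(mu, stat_only))
    print("coverage, with systematic: ", coverage(mu, with_syst))
```

In this toy setting, an interval built from the statistical uncertainty alone covers the true signal strength noticeably less often than the nominal 68% once the background is mis-modelled, while including the systematic term restores the coverage at the cost of a wider interval.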

The Codabench/NERSC platform also allows for hosting challenges from other communities, and we intend to make our benchmark designs available as templates so that similar efforts can be easily launched in other domains.

Significance

This contribution describes work that pushes the state of the art in the ML challenge platform, in the ML challenge itself, and in the evaluation of uncertainty-aware methods.
For the platform, we describe a system capable of operating at a much larger scale than other approaches: it handles large datasets and allows models to be trained and evaluated on multiple GPUs in parallel. The platform also provides a leaderboard and ecosystem for long-lived benchmarks, as well as capabilities to not only evaluate different models but also test models against new datasets.
For the “Fair Universe: HiggsML Uncertainty Challenge”, we provide larger datasets with multiple systematic uncertainties applied, and we evaluate uncertainties as part of the challenge, using multiple pseudo-experiments. As far as we are aware, all of these aspects are novel to HEP ML challenges.
Furthermore, we will present methodological innovations, including novel metrics for the evaluation of uncertainty-aware methods as well as improvements in the uncertainty-aware methods themselves.

Primary authors

Aishik Ghosh (University of California Irvine (US))
Ben Nachman (Lawrence Berkeley National Lab. (US))
Chris Harris (NERSC, Lawrence Berkeley National Laboratory)
Daniel Whiteson (University of California Irvine (US))
David Rousseau (IJCLab-Orsay)
Elham E Khoda (University of Washington (US))
Ihsan Ullah (ChaLearn)
Isabelle Guyon (ChaLearn/Google)
Paolo Calafiura (Lawrence Berkeley National Lab. (US))
Peter Nugent (Lawrence Berkeley National Laboratory)
Ragansu Chakkappai (Université Paris-Saclay (FR))
Sascha Diefenbacher (Lawrence Berkeley National Lab. (US))
Shih-Chieh Hsu (University of Washington Seattle (US))
Steven Farrell (Lawrence Berkeley National Laboratory)
Wahid Bhimji
Yuan-Tang Chou (University of Washington (US))
