Anomaly Detection topical meeting
Virtual
Minutes & Key Points
1. Opening and Meeting Goals
-
First topical meeting dedicated to Anomaly Detection (AD) within the Prompt BSM WG.
-
Goals of the topical meeting series:
-
Brainstorm what has been done so far, next steps, and perspectives.
-
Form task forces to agree on:
-
Data preservation & reinterpretation
-
Uncertainties
-
Results presentation
-
Benchmarks and validation tests
-
Collect inputs and define guidelines to be summarized in a public document.
-
Upcoming topical meetings:
-
Oct 7, 2025 – Heavy Resonances
-
Later: pMSSM SUSY, VLF, LQ, HNL.
Action Items:
-
Contact conveners if interested in leading task forces on specific topics.
2. Theoretical Overview – David Shih
Topic: ML-powered model-agnostic anomaly detection searches at the LHC.
Key Points:
-
Motivation: The LHC experiments have produced thousands of model-specific searches, yet new physics remains elusive. Model-agnostic ML searches likely offer untapped discovery potential.
-
Types of anomalies:
-
Outliers – rare, extreme deviations (autoencoders effective here).
-
Overdensities – excesses over smooth backgrounds (weak supervision, density estimation). Both approaches are complementary.
-
Autoencoders (AE):
-
Learn to reconstruct background events → large reconstruction error indicates anomaly.
-
Demonstrated sensitivity to QCD jets vs. anomalous tops/gluinos.
-
Overdensity methods:
-
Learn the ratio R(x) = p_data(x) / p_bg(x).
-
Techniques: CWoLa, ANODE, SALAD, CATHODE, etc.
-
Proof-of-concept results: e.g., CATHODE enhances dijet anomalies from ~2σ to ~30σ significance.
-
Resonant AD: Combining anomaly scores with bump hunts (already applied in CMS/ATLAS dijet analyses).
-
Non-resonant AD:
-
More challenging; requires robust background estimation (ABCD with decorrelated autoencoders, latent space overdensity scores).
-
Early proof-of-concepts (e.g., CONRAD, dual autoencoders) show promise.
-
Trigger-level AD:
-
Fast autoencoder-based triggers (CICADA, AXOL1TL, GELATO).
-
Potential complementary approaches with online generative modeling.
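Illustration of the overdensity idea above: a minimal sketch that estimates R(x) = p_data(x) / p_bg(x) with plain histograms on a one-dimensional toy. This is not how CWoLa, ANODE, SALAD, or CATHODE work internally (those use classifiers or density estimators). The function name, the toy feature, and the injected bump are all assumptions made for illustration.

```python
import random

def binned_density_ratio(data, background, n_bins=20, lo=0.0, hi=1.0):
    """Estimate R(x) = p_data(x) / p_bg(x) in equal-width bins (toy stand-in)."""
    width = (hi - lo) / n_bins

    def normalized_hist(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp under/overflow into the first/last bin.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        return [c / len(values) for c in counts]

    p_data = normalized_hist(data)
    p_bg = normalized_hist(background)
    # Empty background bins carry no information; report a ratio of 1 there.
    return [d / b if b > 0 else 1.0 for d, b in zip(p_data, p_bg)]

rng = random.Random(0)
background = [rng.random() for _ in range(20000)]      # background template (toy)
data = [rng.random() for _ in range(20000)]            # background-like part of the data
data += [rng.gauss(0.525, 0.02) for _ in range(600)]   # small injected overdensity (toy)

ratio = binned_density_ratio(data, background)
most_anomalous_bin = max(range(len(ratio)), key=lambda i: ratio[i])
print(most_anomalous_bin, round(ratio[most_anomalous_bin], 2))
```

Bins where the ratio sits well above 1 mark the overdensity. In a realistic search the feature space is multi-dimensional and the background template is itself learned from sidebands.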
Suggestions for Reinterpretation:
-
AE-based searches: publish anomaly score function → theorists can inject signals and reweight.
-
Overdensity-based searches: more difficult; require publishing background models/events and compressed data features so theorists can retrain anomaly scores.
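Illustration of the AE-based reinterpretation suggestion: once an anomaly score function is public, a signal efficiency follows from evaluating it on simulated signal events, optionally with per-event weights. The sketch below assumes a hypothetical `published_score` (squared distance from the origin in two toy features) and a hypothetical working point. Neither comes from any experiment.

```python
import random

def published_score(event):
    """Hypothetical published anomaly score: squared distance from the origin."""
    return sum(x * x for x in event)

def selection_efficiency(events, score_fn, cut, weights=None):
    """Weighted fraction of events whose score exceeds the cut."""
    if weights is None:
        weights = [1.0] * len(events)
    passed = sum(w for e, w in zip(events, weights) if score_fn(e) > cut)
    return passed / sum(weights)

rng = random.Random(3)
# Toy samples in two features; the signal is displaced from the background.
background = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(20000)]
signal = [(rng.gauss(2.5, 1.0), rng.gauss(2.5, 1.0)) for _ in range(20000)]

cut = 9.21  # hypothetical working point, keeps roughly 1% of this toy background
bkg_eff = selection_efficiency(background, published_score, cut)
sig_eff = selection_efficiency(signal, published_score, cut)
print(round(bkg_eff, 4), round(sig_eff, 3))
```

The `weights` argument is where a reweighting to a different signal hypothesis would enter. The score function itself stays fixed, which is what makes this case simpler than the overdensity-based one.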
Q&A Highlights
-
Mario Campanelli: Asked for clarification on the generative model placed before the trigger.
-
Response (D. Shih): Idea is to train a generative model on buffered data pre-trigger, then generate synthetic events for offline searches/scouting-like analysis.
-
Javier Jiménez Peña: Asked about “double independent autoencoders.”
-
Response: By training two decorrelated autoencoders, anomalies manifesting in multiple features can be flagged in both → enabling ABCD background estimation.
-
Jack Harrison (ATLAS): Mentioned ATLAS recently published a non-resonant AD search in multilepton final states; this should be added to references.
-
Vilius Čepaitis: Raised concern about the computational cost of retraining overdensity models for reinterpretations.
-
Response (D. Shih): Agreed this is an important challenge; possible need for heuristics or surrogate models to reduce computational overhead.
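Illustration of the ABCD estimate mentioned in the answer on double autoencoders: if two anomaly scores are statistically independent for background, the background yield in the region where both are high (A) can be predicted from the three other regions as N_A ≈ N_B · N_C / N_D. The sketch below uses two independent uniform random numbers as stand-ins for the two decorrelated scores. The cut values and function name are assumptions.

```python
import random

def abcd_estimate(score1, score2, cut1, cut2):
    """Return (observed N_A, predicted N_A = N_B * N_C / N_D)."""
    n_a = n_b = n_c = n_d = 0
    for s1, s2 in zip(score1, score2):
        high1, high2 = s1 > cut1, s2 > cut2
        if high1 and high2:
            n_a += 1      # signal region: anomalous in both scores
        elif high1:
            n_b += 1      # anomalous in score 1 only
        elif high2:
            n_c += 1      # anomalous in score 2 only
        else:
            n_d += 1      # background-like in both
    predicted = n_b * n_c / n_d if n_d > 0 else float("nan")
    return n_a, predicted

rng = random.Random(1)
# Toy background: two statistically independent scores, uniform in [0, 1).
score1 = [rng.random() for _ in range(100000)]
score2 = [rng.random() for _ in range(100000)]

observed, predicted = abcd_estimate(score1, score2, cut1=0.9, cut2=0.9)
print(observed, round(predicted, 1))
```

For background-only toys the prediction closes on the observation within statistical fluctuations. A signal that is anomalous in both scores would show up as an excess of observed over predicted in region A.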
3. ATLAS Anomaly Detection Overview – Vilius Čepaitis (on behalf of ATLAS)
Key Points:
-
ATLAS has completed six public AD analyses with Run-2 data; no significant excess observed.
-
Covered a spectrum of techniques: unsupervised (autoencoders, normalizing flows), weakly-supervised (CWoLa), semi-supervised (ANTELOPE), and dedicated AD triggers (GELATO).
-
Examples presented:
-
Y→XH analysis: VRNN-AE anomaly score alongside dedicated Higgs tagging regions.
-
jet+X states: AE on rapidity–mass matrix; selection of the most anomalous 1% of events; ADFilter tool released for public reinterpretation.
-
Multilepton anomalies: Normalizing flow with kinematic features; 16 anomaly regions.
-
Semi-visible jets (SVJ): Semi-supervised with ANTELOPE.
-
CWoLa round 1 & 2: Iterative improvements using CURTAINS and SALAD for background templates.
-
Feature sensitivity: Input choice strongly affects performance; BDTs may be more robust than NNs in some cases.
-
Validation strategies: ATLAS uses combinations of MC validation, topological control regions, low-anomaly CRs, and pseudo-data with generative models.
-
Benchmarking: No single optimal AD method; suggests using “standard candle” BSM signals or mixed validation sets to benchmark new techniques.
-
Result presentation: Different combinations used (BumpHunter p-values, model-dependent and model-independent limits). Model-independent results and public tools (e.g., ADFilter) are especially valuable.
-
Uncertainties: Besides normal systematics, AD methods bring stochastic uncertainty. Ensembles (multiple trainings with different seeds) can quantify this.
-
ATLAS is preparing internal AD guidelines covering scope, validation, reinterpretation, and result presentation.
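Illustration of the ensemble idea in the uncertainties point above: repeat the training with different seeds and take the spread of the selected yield as a measure of the stochastic uncertainty. The "training" below is a deliberately trivial stand-in (fitting a centre on a random subsample), not an ATLAS procedure. All names and numbers are assumptions.

```python
import random
import statistics

def train_toy_scorer(background, seed):
    """Stand-in for one stochastic training: fit a centre on a random subsample."""
    rng = random.Random(seed)
    subsample = rng.sample(background, k=200)
    centre = statistics.fmean(subsample)
    return lambda x: abs(x - centre)   # anomaly score: distance from the fitted centre

def selected_yield(scorer, events, cut):
    """Number of events with anomaly score above the cut."""
    return sum(1 for x in events if scorer(x) > cut)

data_rng = random.Random(42)
background = [data_rng.gauss(0.0, 1.0) for _ in range(5000)]  # training sample (toy)
events = [data_rng.gauss(0.0, 1.0) for _ in range(5000)]      # evaluation sample (toy)

# One "training" per seed; the evaluation sample is identical for all of them.
yields = [selected_yield(train_toy_scorer(background, seed), events, cut=2.0)
          for seed in range(10)]
mean_yield = statistics.fmean(yields)
seed_spread = statistics.stdev(yields)   # training-to-training (stochastic) spread
print(yields, round(mean_yield, 1), round(seed_spread, 1))
```

Because the evaluation events are fixed, any variation in `yields` comes from the seed-dependent training alone, which is the component the ensemble is meant to quantify.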
4. CMS Anomaly Detection Overview – Louis Moureaux (on behalf of CMS)
Key Points:
-
Scope of CMS AD efforts:
-
Data quality monitoring: ECAL autoencoder flags local detector anomalies (not physics).
-
Triggers:
-
AXOL1TL (global trigger objects) and CICADA (calorimeter towers), both autoencoder-based. Both run at Level-1 with ~µs latency and are sensitive across benchmark signals.
-
By the end of 2025, the expected datasets are ~200 fb⁻¹ (AXOL1TL) and ~100 fb⁻¹ (CICADA). Next question: how to analyze these datasets.
-
Offline analyses:
-
Dijet resonance anomaly search (Run-2, 138 fb⁻¹) [2412.03747]: applied multiple AD methods (CWoLa Hunting, TNT, CATHODE(-b), VAE-QR, QUAK).
-
Strategy: retain the ~1% most anomalous events, then bump-hunt in mjj.
-
Results: No significant excess. Limits improve over inclusive fits; dedicated searches still stronger.
-
Methodology details:
-
VAE-QR: quantile regression to remove mjj sculpting.
-
QUAK: hybrid flows with signal priors, complementary to others.
-
Weak supervision requires signal injection to determine the signal efficiency; the retraining involved is expensive.
-
Other studies: Boosted top quarks found with weak supervision; H(bb)+anomalous selection with ParticleNet.
-
Complementarity: The AD methods show small correlations with one another and are therefore complementary.
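Illustration of why the threshold on the anomaly score is made mjj-dependent before the bump hunt: a single global cut on a score that correlates with mjj keeps mostly high-mass events and sculpts the spectrum, whereas a per-mjj threshold keeps a flat ~1% everywhere. The sketch below uses per-bin empirical quantiles as a simple stand-in for quantile regression. It is not the VAE-QR implementation, and the toy score and binning are assumptions.

```python
import random

def quantile_threshold(scores, keep_fraction):
    """Score value above which roughly `keep_fraction` of the events lie."""
    ordered = sorted(scores)
    return ordered[int((1.0 - keep_fraction) * len(ordered))]

def pass_fraction_per_bin(mjj, scores, edges, thresholds):
    """Fraction of events above the given threshold in each mjj bin."""
    fractions = []
    for i in range(len(edges) - 1):
        in_bin = [s for m, s in zip(mjj, scores) if edges[i] <= m < edges[i + 1]]
        passed = sum(1 for s in in_bin if s > thresholds[i])
        fractions.append(passed / len(in_bin))
    return fractions

rng = random.Random(7)
mjj = [rng.uniform(1000.0, 5000.0) for _ in range(40000)]   # toy spectrum (GeV)
scores = [m / 5000.0 + rng.gauss(0.0, 0.1) for m in mjj]    # toy score correlated with mjj
edges = [1000.0 + 500.0 * i for i in range(9)]              # 8 equal-width mjj bins

# Global cut: one threshold for all events, which sculpts mjj.
global_cut = quantile_threshold(scores, 0.01)
global_frac = pass_fraction_per_bin(mjj, scores, edges, [global_cut] * 8)

# Per-bin cut: the threshold follows the score distribution in each mjj bin.
binned_cuts = [
    quantile_threshold(
        [s for m, s in zip(mjj, scores) if edges[i] <= m < edges[i + 1]], 0.01)
    for i in range(8)
]
flat_frac = pass_fraction_per_bin(mjj, scores, edges, binned_cuts)
print([round(f, 4) for f in global_frac])
print([round(f, 4) for f in flat_frac])
```

With the per-bin thresholds the selection efficiency no longer depends on mjj, so the selected spectrum keeps the smooth shape that the bump-hunt fit assumes.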
Open Issues Highlighted by CMS:
-
Non-resonant AD: Not yet attempted in CMS; natural link to EFTs suggested.
-
Reinterpretation:
-
Limits from weak supervision might be easier to provide.
-
Key question: what information should CMS publish if an excess is found?
-
How to evaluate performance without benchmark models?
-
Methodology & uncertainties:
-
Best input features not yet clear.
-
Background estimation uncertainties critical.
-
Can weak supervision be extended to triggers?