11–15 Mar 2024
Charles B. Wang Center, Stony Brook University
US/Eastern timezone

Reducing Systematic Differences between Data and Simulation with Generative Models

13 Mar 2024, 14:50
20m
Lecture Hall 2 (Charles B. Wang Center, Stony Brook University)

100 Circle Rd, Stony Brook, NY 11794
Oral
Track 2: Data Analysis - Algorithms and Tools

Speaker

Dmitrii Torbunov

Description

High Energy Physics (HEP) experiments rely on scientific simulation to develop reconstruction algorithms. Despite the remarkable fidelity of modern simulation frameworks, residual discrepancies between simulated and real data introduce a challenging domain shift problem: a model trained on simulation may behave differently when applied to real detector data. This raises significant concerns about the feasibility of applying Deep Learning (DL) methods to detector analysis and impedes their adoption.

We present our ongoing research on new DL strategies to mitigate the differences between simulation and real data. Our approach uses a combination of Generative Adversarial Networks (GANs) to translate simulated samples into the real data domain, thereby reducing the magnitude of domain shift effects.
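
The abstract does not spell out the training objective. As an illustration only, one widely used formulation for unpaired translation between two domains is the cycle-consistent GAN objective, sketched below. The symbols are assumptions introduced here and are not taken from the talk: $G_{s \to r}$ and $G_{r \to s}$ are generators mapping simulation to real data and back, $D_r$ and $D_s$ are the corresponding discriminators, $p_s$ and $p_r$ are the simulated and real data distributions, and $\lambda$ is a weighting hyperparameter.

$$
\mathcal{L} = \mathcal{L}_{\mathrm{GAN}}(G_{s \to r}, D_r) + \mathcal{L}_{\mathrm{GAN}}(G_{r \to s}, D_s) + \lambda \, \mathcal{L}_{\mathrm{cyc}}
$$

$$
\mathcal{L}_{\mathrm{cyc}} = \mathbb{E}_{x \sim p_s} \left\lVert G_{r \to s}(G_{s \to r}(x)) - x \right\rVert_1 + \mathbb{E}_{y \sim p_r} \left\lVert G_{s \to r}(G_{r \to s}(y)) - y \right\rVert_1
$$

The adversarial terms push translated samples toward the target domain, while the cycle term discourages the generators from discarding the content of the input sample. Whether the presented method uses exactly this form is not stated in the abstract.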

We discuss our progress in applying this approach to particle detectors based on Liquid Argon Time Projection Chambers (LArTPCs). We first demonstrate the effectiveness of our method on a simplified, cropped LArTPC benchmark dataset. We then highlight the performance and computational challenges encountered in adapting the method to realistic LArTPC datasets of high-resolution images ($\sim 6000 \times 960$ pixels). By systematically addressing these challenges, our research aims to bridge the gap between simulated and real data and to advance the applicability of DL methods in HEP experiments.
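
To give a sense of the scale gap between cropped benchmark images and full-resolution readouts, the following minimal sketch splits a $6000 \times 960$ array into fixed-size tiles. The tile size of $256 \times 256$, the zero padding at the edges, the array orientation, and the helper name `crop_tiles` are all assumptions chosen for illustration; the abstract does not say how the authors crop or process the images.

```python
import numpy as np


def crop_tiles(image, tile_h, tile_w):
    """Split a 2D image into non-overlapping tiles.

    The bottom and right edges are zero-padded so that both
    dimensions become exact multiples of the tile size.
    (Hypothetical helper, not part of any published code.)
    """
    h, w = image.shape
    pad_h = (-h) % tile_h  # rows needed to reach a multiple of tile_h
    pad_w = (-w) % tile_w  # columns needed to reach a multiple of tile_w
    padded = np.pad(image, ((0, pad_h), (0, pad_w)))
    rows = padded.shape[0] // tile_h
    cols = padded.shape[1] // tile_w
    # Reshape to (rows, tile_h, cols, tile_w), then bring the two
    # tile-index axes to the front before flattening them.
    tiles = padded.reshape(rows, tile_h, cols, tile_w).swapaxes(1, 2)
    return tiles.reshape(rows * cols, tile_h, tile_w)


# Placeholder array with the approximate size quoted in the abstract.
image = np.zeros((6000, 960), dtype=np.float32)
tiles = crop_tiles(image, 256, 256)
print(tiles.shape)  # (96, 256, 256)
```

Under these assumptions a single full-size image corresponds to 96 tiles, which illustrates why a method tuned on cropped samples can meet additional memory and compute constraints when moved to full readouts.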

References

https://arxiv.org/abs/2304.12858

Significance

We previously applied the proposed method successfully to a toy problem (https://arxiv.org/abs/2304.12858). The novelty here is that the method now works on a realistic dataset of high-resolution images and has been made computationally efficient.

Experiment context, if any

ProtoDUNE, DUNE
