29 November 2021 to 3 December 2021
Virtual and IBS Science Culture Center, Daejeon, South Korea
Asia/Seoul timezone

Parallel processing in data analysis of the JUNO experiment

Contribution ID: 609
1 Dec 2021, 17:20
20m
S221-A (Virtual and IBS Science Culture Center)

55 EXPO-ro Yuseong-gu Daejeon, South Korea email: library@ibs.re.kr +82 42 878 8299
Oral, Track 1: Computing Technology for Physics Research

Speaker

Yixiang Yang (Institute of High Energy Physics)

Description

The JUNO experiment is being built mainly to determine the neutrino mass hierarchy by detecting neutrinos generated at the Yangjiang and Taishan nuclear power plants in southern China. The detector will record 2 PB of raw data every year, yet it can collect only about 60 neutrino events each day, scattered among a huge number of background events. Selecting such extremely sparse neutrino events poses a major challenge for offline data analysis. A typical neutrino physics event spans a number of consecutive readout events and is flagged by a fast positron signal followed by a slow neutron signal within a time window of varying size. To facilitate this analysis, a two-step data processing scheme has been proposed. In the first step (called data preparation), event index data is produced and skimmed; it contains only a minimal set of physics quantities for each event, together with the event's address in the original reconstructed data file. In the second step (called time correlation analysis), the event index data is further selected with stricter criteria. Then, for each selected event, the time correlation analysis is performed by reading all associated events within a pre-defined time window from the original data file, according to the selected event's address and timestamp.
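The two-step scheme described above can be illustrated with a minimal Python sketch. It is not the JUNO or SNiPER implementation: the names `Event`, `IndexEntry`, `build_index`, `select_candidates` and `events_in_window`, the energy cuts, the nanosecond units, and the use of a list position as the "address" are all assumptions made for illustration, and the original data file is modelled as a time-ordered in-memory list.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """One readout event in the (hypothetical) original data file."""
    timestamp_ns: int
    energy_mev: float


class IndexEntry(NamedTuple):
    """Minimal quantities of an event plus its address in the original data."""
    timestamp_ns: int
    energy_mev: float
    address: int  # assumption: the address is the position in the data list


def build_index(data: List[Event], loose_min_mev: float) -> List[IndexEntry]:
    """Step 1 (data preparation): produce a skimmed index with a loose cut."""
    return [
        IndexEntry(ev.timestamp_ns, ev.energy_mev, pos)
        for pos, ev in enumerate(data)
        if ev.energy_mev >= loose_min_mev
    ]


def select_candidates(index: List[IndexEntry], strict_min_mev: float) -> List[IndexEntry]:
    """Step 2a: apply a stricter criterion to the index data only."""
    return [entry for entry in index if entry.energy_mev >= strict_min_mev]


def events_in_window(data: List[Event], entry: IndexEntry, window_ns: int) -> List[Event]:
    """Step 2b: jump to the candidate's address, then read the following
    events until the time window after the candidate is exhausted."""
    found = []
    pos = entry.address + 1
    while pos < len(data) and data[pos].timestamp_ns - entry.timestamp_ns <= window_ns:
        found.append(data[pos])
        pos += 1
    return found
```

In this sketch only the small index is scanned in full; the original data is touched only around each selected candidate, which is the reason the address is stored alongside the timestamp.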
This contribution will first introduce the design of the above data processing scheme and then focus on the multi-threaded implementation of time correlation analysis, based on Intel Threading Building Blocks (TBB), in the SNiPER framework. Second, it will describe the implementation of distributed analysis using MPI, in which the time correlation analysis task is divided into subtasks running on multiple computing nodes. Finally, it will present detailed performance measurements made on a multi-node test bed. By using both skimming and indexing techniques, the total amount of data read by the time correlation analysis is reduced significantly, so that the processing time could be reduced by two orders of magnitude.
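The idea of dividing the time correlation analysis into subtasks can be sketched as follows. This is only an illustration under stated assumptions: Python's standard `ThreadPoolExecutor` stands in for both the TBB threads and the MPI ranks mentioned in the abstract, and `split_into_subtasks`, `analyse_range` and `run_parallel` are hypothetical names. The per-event work is a placeholder, not the actual analysis.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def split_into_subtasks(n_items: int, n_workers: int) -> List[Tuple[int, int]]:
    """Give each worker a contiguous, near-equal (start, stop) range."""
    base, extra = divmod(n_items, n_workers)
    ranges = []
    start = 0
    for rank in range(n_workers):
        # The first `extra` workers take one additional item each.
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def analyse_range(selected_times: List[int], start: int, stop: int) -> int:
    """Placeholder for the per-event analysis: counts even timestamps."""
    return sum(1 for t in selected_times[start:stop] if t % 2 == 0)


def run_parallel(selected_times: List[int], n_workers: int) -> int:
    """Run one subtask per worker and combine the partial results."""
    ranges = split_into_subtasks(len(selected_times), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(lambda r: analyse_range(selected_times, *r), ranges))
    return sum(partial)
```

Because each selected event is analysed independently of the others, the subtasks share no mutable state and their results can simply be summed at the end; the same partitioning function could in principle be evaluated by each process of a distributed job using its own rank.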

Speaker time zone: Compatible with Asia

Authors

Yixiang Yang (Institute of High Energy Physics)
Weidong Li (IHEP, Beijing)
Jiaheng Zou (Chinese Academy of Sciences (CN))
Tao Lin
Teng LI (Shandong University, CN)
Xingtao Huang (Shandong University)
