Description
The development of techniques based on machine learning (ML) relies on the availability of datasets. Many studies are carried out within the context of a particular experiment, using, for example, that experiment's own simulation data. This limits the possibilities for collaboration and publication, since only a few datasets are published for open access.
This gap can be bridged by datasets produced with the Open Data Detector (ODD), a detector designed for algorithm research and development. Its goal is to provide a benchmark detector with publicly released simulation data available for algorithm studies. Such data can be used for ongoing work in areas such as fast simulation and reconstruction.
The tracking system of the ODD is an evolution of the detector used in the successful Tracking Machine Learning Challenge, offering a more complex and realistic design. It is complemented by granular calorimetry and will be completed with a muon system. The magnetic field in the detector can be created with a solenoid located either in front of or behind the calorimeters, providing two alternative options for detector studies.
The Calo Challenge, the first ML challenge focused on the development of ML-based fast shower simulation, provided valuable feedback regarding datasets for ML-based fast simulation studies. Offering several representations of the shower data is among the most important features of the ODD dataset, as this should avoid bias towards a particular choice of ML architecture. A wider range of particles and wider pseudorapidity coverage will better reflect the realistic complexity of the problem that experiments face. Ultimately, users of the ODD dataset will be able to insert their models into the simulation framework, allowing a fair comparison of full and fast simulation in terms of accuracy as well as time and memory performance.
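As an illustration of what different representations of shower data can mean in practice, the sketch below converts a point-cloud style list of energy deposits into a regular cylindrical grid of summed energies. It is not the ODD dataset format or tooling: the function name `voxelize`, the hit layout `(x, y, z, energy)`, the grid sizes, and the convention that the shower axis runs along `z` from the calorimeter entry point are all assumptions made for this example.

```python
import math
from collections import defaultdict


def voxelize(hits, n_r, n_phi, n_z, r_max, z_max):
    """Sum hit energies into a cylindrical (r, phi, z) grid.

    hits: iterable of (x, y, z, energy) tuples (hypothetical layout),
    with the shower axis along z and z measured from the entry point.
    Returns a dict mapping (ir, iphi, iz) -> summed energy.
    """
    grid = defaultdict(float)
    for x, y, z, energy in hits:
        r = math.hypot(x, y)
        if r >= r_max or not (0.0 <= z < z_max):
            continue  # hit lies outside the grid; dropped in this sketch
        phi = math.atan2(y, x) % (2.0 * math.pi)  # map angle to [0, 2*pi]
        # Clamp each index to guard against floating-point edge cases.
        ir = min(int(r / r_max * n_r), n_r - 1)
        iphi = min(int(phi / (2.0 * math.pi) * n_phi), n_phi - 1)
        iz = min(int(z / z_max * n_z), n_z - 1)
        grid[(ir, iphi, iz)] += energy
    return dict(grid)


# Toy point cloud: two nearby hits, one separate hit, one outside the grid.
hits = [
    (1.0, 0.0, 5.0, 0.2),
    (1.2, 0.1, 5.5, 0.3),
    (-2.0, -2.0, 12.0, 0.5),
    (50.0, 0.0, 1.0, 1.0),
]
grid = voxelize(hits, n_r=4, n_phi=4, n_z=2, r_max=4.0, z_max=20.0)
```

The point cloud keeps every deposit at its exact position, which suits graph- or set-based models, while the fixed grid suits convolutional or fully connected models at the cost of granularity. Publishing data so that both forms can be derived leaves the architecture choice to the user.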