Tracking at the HL-LHC in ATLAS and CMS will be very challenging. In particular, the pattern recognition will be extremely resource-hungry if current algorithms are extrapolated to HL-LHC conditions. A large ongoing effort aims to optimise the current software; in parallel, completely different approaches should be explored.
To reach out to computer science specialists, a Tracking Machine Learning challenge (TrackML) is being set up, building on the experience of the successful Higgs Machine Learning challenge in 2014, which brought together ATLAS and CMS physicists and computer scientists. A few relevant points:
• A dataset has been created from the simulation of a typical full-silicon LHC experiment, listing for each event the measured 3D points and the list of 3D points associated with each true track. The dataset is large enough to train data-hungry machine learning methods; the orders of magnitude are one million events, 10 billion tracks, and 1 terabyte. Traditional algorithms typically spend about 100 s of CPU time per event.
• The participants in the challenge should find the tracks in an additional test dataset, i.e. build the list of 3D points belonging to each track (deriving the track parameters is not the topic of the challenge).
• A figure of merit is to be defined that combines CPU time, efficiency, and fake rate, with an emphasis on CPU time.
• The challenge platform should allow the figure of merit to be measured and the submitted algorithms to be ranked.
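The exact scoring formula is left open above; purely as an illustration of how such a figure of merit might work, the sketch below matches predicted tracks to true tracks by majority vote over shared points and folds efficiency, fake rate, and CPU time into a single number. The matching rule, the function names, and the multiplicative weighting are all assumptions for this sketch, not the challenge's actual metric.

```python
# Toy illustration of a tracking figure of merit (NOT the official challenge
# metric). Tracks are represented as sets of 3D-point identifiers; a predicted
# track is "matched" to the true track that contributes the majority of its
# points, and otherwise counts as a fake.

def match_tracks(true_tracks, pred_tracks):
    """Return (set of matched true-track ids, number of fake predicted tracks).

    true_tracks, pred_tracks: dicts mapping a track id to a set of point ids.
    """
    matched_true = set()
    fakes = 0
    for points in pred_tracks.values():
        best_id, best_overlap = None, 0
        for tid, tpoints in true_tracks.items():
            overlap = len(points & tpoints)
            if overlap > best_overlap:
                best_id, best_overlap = tid, overlap
        # Majority rule: more than half the predicted points from one true track.
        if best_id is not None and best_overlap > len(points) / 2:
            matched_true.add(best_id)
        else:
            fakes += 1
    return matched_true, fakes

def toy_score(true_tracks, pred_tracks, cpu_seconds, cpu_ref=100.0):
    """Combine efficiency, fake rate, and CPU time into one number in [0, 1].

    cpu_ref = 100 s echoes the quoted per-event cost of traditional algorithms;
    the way the three terms are combined here is purely illustrative.
    """
    matched, fakes = match_tracks(true_tracks, pred_tracks)
    efficiency = len(matched) / len(true_tracks)
    fake_rate = fakes / max(len(pred_tracks), 1)
    return efficiency * (1.0 - fake_rate) * cpu_ref / (cpu_ref + cpu_seconds)
```

With this weighting, a perfect reconstruction at negligible CPU time scores close to 1, and the score decays as CPU time grows past the 100 s reference, reflecting the stated emphasis on speed.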
The emphasis is on exposing innovative approaches rather than hyper-optimising known ones. Machine learning specialists have shown a deep interest in participating in the challenge, with new approaches such as convolutional neural networks, deep neural networks, Monte Carlo tree search, and others.
A slimmed-down 2D version of the challenge will be proposed right after this talk and is introduced here.