Description
We demonstrate the use of the MLP-Mixer architecture for fast jet classification in high-energy physics. The MLP-Mixer is a simple and efficient architecture consisting of MLP blocks applied alternately along the different axes of the input tensor. It was first proposed by Tolstikhin et al. and shown to be competitive with state-of-the-art architectures such as Vision Transformers (ViTs) for image classification tasks.
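For illustration, a minimal Mixer block can be sketched as below in Keras. The layer widths, activation, and the toy particle/feature counts are placeholders and not the configuration used in this work.

```python
import keras
from keras import layers


def mixer_block(x, tokens_mlp_dim, channels_mlp_dim):
    # Token mixing: an MLP applied across the particle (token) axis.
    y = layers.LayerNormalization()(x)
    y = layers.Permute((2, 1))(y)                    # (batch, channels, tokens)
    y = layers.Dense(tokens_mlp_dim, activation="gelu")(y)
    y = layers.Dense(x.shape[1])(y)                  # restore token count
    y = layers.Permute((2, 1))(y)                    # (batch, tokens, channels)
    x = layers.Add()([x, y])

    # Channel mixing: an MLP applied across the feature (channel) axis.
    y = layers.LayerNormalization()(x)
    y = layers.Dense(channels_mlp_dim, activation="gelu")(y)
    y = layers.Dense(x.shape[-1])(y)
    return layers.Add()([x, y])


# Toy usage: 16 particles with 8 features each, 5 jet classes.
inputs = keras.Input(shape=(16, 8))
h = mixer_block(inputs, tokens_mlp_dim=16, channels_mlp_dim=32)
h = layers.GlobalAveragePooling1D()(h)
outputs = layers.Dense(5, activation="softmax")(h)
model = keras.Model(inputs, outputs)
```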
We benchmark the MLP-Mixer model on the hls4ml LHC Jet dataset, which is meant to represent the type of Level-1 trigger objects expected in the future High-Luminosity LHC (HL-LHC) experiments. The trained models are compared with JEDI-net, a GNN model that is the state of the art for this task and has been shown to be implementable on FPGAs. We show that our MLP-Mixer model can surpass the accuracy of the 50-particle JEDI-net (>81.2%) with 1/8 of the LUTs, no DSPs, 100x the throughput, and 1/10 the latency (~10k LUTs, an initiation interval of 1 at 200 MHz, and sub-100 ns latency) when trained with HGQ and synthesized with hls4ml using the unrolled MAC-tree dot-product optimization.
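As a rough illustration of the synthesis step, the sketch below converts a trained Keras model with hls4ml. The FPGA part number is an assumption (not taken from this abstract), and the HGQ-specific proxy-model conversion is omitted; only the clock period follows the 200 MHz target quoted above.

```python
import hls4ml

# 'model' is a trained (and, in this work's flow, HGQ-quantized) Keras model.
config = hls4ml.utils.config_from_keras_model(model, granularity="name")

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir="mixer_hls",
    part="xcvu13p-flga2577-2-e",   # assumed FPGA part, not from the abstract
    clock_period=5,                # 5 ns, i.e. the 200 MHz target
)
hls_model.compile()                # build the C simulation model
# hls_model.build(csim=False)      # run the full HLS synthesis
```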