The Beyond Leading Order Calculations on HPCs workshop brought experts from the Argonne and Oak Ridge Leadership Computing Facilities and NERSC together with theoretical physicists. These theorists are writing the (N)NLO parton interaction Monte Carlo generators and predictions that the LHC experiments depend on for comparison to measurements. (N)NLO calculations are becoming computationally intensive, requiring on the order of weeks to perform phase-space integrals before events can be generated. As these calculations move to NNLO, the computational complexity increases further.
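The phase-space integrals behind these calculations are typically evaluated by Monte Carlo sampling, whose error shrinks only as 1/sqrt(N); this is why the integration step can take weeks. A minimal sketch of plain Monte Carlo integration over a unit hypercube, with a toy integrand standing in for a real matrix-element weight (the function and dimensions here are illustrative, not a real (N)NLO integrand):

```python
import random

def mc_integrate(f, dim, n_samples, seed=0):
    """Plain Monte Carlo estimate of the integral of f over the
    unit hypercube [0,1]^dim, with the standard 1/sqrt(N) error."""
    rng = random.Random(seed)
    total = total_sq = 0.0
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]
        fx = f(x)
        total += fx
        total_sq += fx * fx
    mean = total / n_samples
    var = total_sq / n_samples - mean * mean
    err = (var / n_samples) ** 0.5
    return mean, err

# Toy stand-in for a squared matrix element (NOT a real physics weight);
# its exact integral over [0,1]^8 is 4.0.
toy_weight = lambda x: sum(x)

estimate, error = mc_integrate(toy_weight, dim=8, n_samples=100_000)
```

Since each sample of a real (N)NLO integrand is itself expensive, pushing the error down by a factor of ten costs a hundred times more samples, which is what makes large-scale parallelism attractive.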

    Radja Boughezal, a theorist at Argonne, showed how she and her collaborators are using the Mira supercomputer at Argonne to perform NNLO calculations. They ran V+jet calculations on the entire supercomputer for 3 hours 30 minutes and produced the results for one publication (Physics Letters B (2016), pp. 6-13), illustrating how supercomputers enable particle physics calculations that would be extremely difficult, if not impossible, on traditional computing infrastructure.

    We had presentations from the three DOE supercomputer facilities (Argonne, NERSC, and Oak Ridge) and from three Monte Carlo event generator teams (Sherpa, MadGraph5_aMC@NLO, and Pythia). These presentations were followed by three discussion sessions: how to build codes that target both GPUs and many-core CPUs, common scalable theory and Monte Carlo tools, and experience from the LHC experiments in scaling existing codes.

    The common tools discussion concluded that a scalable Monte Carlo integrator would be the best target for community development. The discussion of multi-architecture codes focused on how to support both GPUs (like Titan) and many-core CPUs (like Cori and Theta). The consensus favored an approach similar to that used in the HACC computational cosmology code: identify the core computational kernels of the (N)NLO calculations and write multiple versions of them, each optimized for a particular architecture, while the top-level framework remains the same. ATLAS and CMS described their experiences scaling generators. Taylor Childers described the experiences within ATLAS of scaling Alpgen to the entire Mira supercomputer (1.5M processes) and running Sherpa at the scale of one third of Mira (128k processes). Josh Bendavid described the plans of CMS to perform similar scaling of the Sherpa and MadGraph5_aMC@NLO generators in the near future.
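The HACC-style separation of portable driver code from architecture-specific kernels can be sketched as follows. This is a hedged illustration of the pattern, not code from any of the generators discussed; the backend names and the trivial kernel are invented for the example:

```python
# Sketch of the kernel-specialization pattern discussed at the workshop:
# the top-level driver is architecture-agnostic, while the hot kernel
# has one implementation per target architecture.

def kernel_reference(weights):
    # Portable fallback implementation: plain Python loop.
    return sum(w * w for w in weights)

def kernel_manycore(weights):
    # Stand-in for a many-core-optimized version; in real code this
    # would be vectorized C/C++ tuned for, e.g., Cori or Theta.
    return sum(w * w for w in weights)

KERNELS = {
    "reference": kernel_reference,
    "manycore": kernel_manycore,
    # "gpu": kernel_gpu,  # a CUDA version targeting Titan would go here
}

def integrate(weights, target="reference"):
    """Top-level framework code: identical on every machine,
    only the kernel it dispatches to changes."""
    return KERNELS[target](weights)

result = integrate([0.5, 1.0, 2.0], target="manycore")  # -> 5.25
```

The point of the pattern is that every backend must produce the same result for the same inputs, so the physics code above the kernel boundary never needs to know which architecture it is running on.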