Speaker
Description
We report on performance measurements and optimizations of the event-builder software for the CMS experiment at the CERN Large Hadron Collider (LHC). The CMS event builder collects event fragments from several hundred sources and assembles them into complete events, which are then handed to the High-Level Trigger (HLT) processes running on O(1000) computers. We use a test system of 16 dual-socket Skylake-based computers interconnected with 100 Gbps InfiniBand and Ethernet networks. The main challenge is the message rate and memory performance that an event-builder node needs in order to fully exploit the network capabilities. Each event-builder node has to process several TCP/IP streams from the detector backends at an aggregated bandwidth of 100 Gbps, distribute event fragments to other event-builder nodes at the first-level trigger rate of 100 kHz, verify and build complete events using fragments received from all other nodes, and finally make the complete events available to the HLT processors. We describe the performance achievable on today's hardware and on different system architectures. Furthermore, we compare native InfiniBand with RoCE (RDMA over Converged Ethernet). We discuss the required optimizations and highlight some of the problems encountered. We conclude with an outlook on the baseline CMS event-builder design for LHC Run 3, starting in 2021.
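To put the quoted figures in proportion, the following minimal sketch works out what 100 Gbps at a 100 kHz trigger rate implies for a single event-builder node. Only the two input numbers come from the abstract; the function names are hypothetical, and the calculation assumes the link is fully used and ignores protocol overhead.

```python
# Back-of-envelope budget for one event-builder node.
# Inputs are the figures quoted in the abstract; everything else is illustrative.

LINK_GBPS = 100            # aggregated input bandwidth per node (from the abstract)
TRIGGER_RATE_HZ = 100_000  # first-level trigger rate (from the abstract)


def bytes_per_event(link_gbps: float, rate_hz: float) -> float:
    """Average data volume one node handles per event if the link is saturated.

    Assumption: full link utilisation, no protocol overhead.
    """
    bytes_per_second = link_gbps * 1e9 / 8
    return bytes_per_second / rate_hz


def time_budget_us(rate_hz: float) -> float:
    """Average time available per event, in microseconds."""
    return 1e6 / rate_hz


if __name__ == "__main__":
    size = bytes_per_event(LINK_GBPS, TRIGGER_RATE_HZ)
    budget = time_budget_us(TRIGGER_RATE_HZ)
    print(f"data per event per node: {size / 1e3:.0f} kB")
    print(f"time budget per event:   {budget:.0f} us")
```

Under these assumptions each node handles about 125 kB per event and has on average 10 µs per event, which gives a sense of why message rate and memory performance are the limiting factors.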
| Consider for promotion | Yes |
|---|---|