CMS DAQ Current and Future Hardware Upgrades up to Post Long Shutdown 3 (LS3) Times

12 Sept 2017, 12:20
25m
Thimann I Lecture Hall (UCSC)

Oral Systems, Planning, Installation, Commissioning and Running Experience

Speaker

Attila Racz (CERN)

Description

Following the first LHC collisions seen and recorded by CMS in 2009, the DAQ hardware went through a major upgrade during LS1 (2013-2014), and new detectors were connected during the 2016-2017 winter shutdown. LS2 (2019-2020) and LS3 (2024 to mid-2026) are now being actively prepared. This paper shows how the CMS DAQ hardware has evolved from the beginning and how it will continue to evolve in order to meet the future challenges posed by the High Luminosity LHC (HL-LHC) and the evolution of the CMS detector. In particular, it focuses on post-LS3 DAQ architectures.

Summary

The initial requirement on the central CMS DAQ was to read out ~1 MB of data per event at a 100 kHz Level-1 trigger rate (~1 Tb/s). The main peculiarity of its architecture was to profit from the rapid evolution of networking technologies in order to build complete events at 100 kHz directly in computer memory, without requiring a Level-2 trigger. This unique feature remains to this day.
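The quoted figure of about 1 Tb/s can be checked with a short calculation. The sketch below is only illustrative: the function name is hypothetical and it assumes decimal units (1 MB = 10^6 bytes), so the result of 0.8 Tb/s is what the text rounds to ~1 Tb/s.

```python
def aggregate_throughput_tbps(event_size_bytes: float, trigger_rate_hz: float) -> float:
    """Sustained event-building throughput in terabits per second.

    Hypothetical helper for illustration; assumes decimal units
    (1 MB = 1e6 bytes, 1 Tb = 1e12 bits).
    """
    bits_per_second = event_size_bytes * 8 * trigger_rate_hz
    return bits_per_second / 1e12


# Initial requirement from the text: ~1 MB per event at 100 kHz Level-1 rate.
initial_tbps = aggregate_throughput_tbps(1e6, 100e3)
print(f"{initial_tbps:.1f} Tb/s")  # 0.8 Tb/s, i.e. of order 1 Tb/s
```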

Initially, DAQ1 was interfaced to the detector front-end via custom modular electronics (FRL) receiving the data over copper cables at 200 MB/s on average. The event builder (EVB) was implemented in stages using different technologies, with 640 servers acting as a bridge between them. The assembled events were analyzed and classified by the High Level Trigger (HLT) running on ~1000 multicore servers.

During LS1, all commercial parts in the EVB and analysis farm were replaced due to their obsolescence. Moreover, part of the FRL was upgraded to accommodate new front-end systems with optical-only data interfaces. The FRL now produces a TCP/IP-compliant output at 10 Gb/s. The new single-stage event builder is based on InfiniBand FDR technology (56 Gb/s). The analysis farm has been replaced with state-of-the-art multicore servers. With these changes, the post-LS1 DAQ is now able to read out up to ~2 MB of data per trigger.

For LS2, the baseline plan is a box-for-box replacement; no change in the hardware architecture is foreseen. However, we will take advantage of faster network technologies such as Omni-Path or InfiniBand EDR, both running at 100 Gb/s, and of more powerful analysis machines.
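To give a feel for what the move from 56 Gb/s to 100 Gb/s links means, the sketch below computes a lower bound on the number of links needed to carry the aggregate event data. It is a rough illustration only: it assumes the post-LS1 event size of ~2 MB at an unchanged 100 kHz trigger rate, decimal units, and no protocol overhead, headroom, or redundancy. The function name is hypothetical and the result is not the actual CMS link count.

```python
import math


def min_links(event_size_bytes: float, trigger_rate_hz: float, link_speed_gbps: float) -> int:
    """Lower bound on the number of links needed to carry the aggregate data.

    Hypothetical helper: ignores protocol overhead, load imbalance and
    redundancy, so real systems need more links than this.
    """
    required_gbps = event_size_bytes * 8 * trigger_rate_hz / 1e9
    return math.ceil(required_gbps / link_speed_gbps)


# Assumed operating point: ~2 MB per trigger at 100 kHz (1600 Gb/s aggregate).
EVENT_BYTES = 2e6
L1_RATE_HZ = 100e3

links_fdr = min_links(EVENT_BYTES, L1_RATE_HZ, 56)    # InfiniBand FDR
links_100g = min_links(EVENT_BYTES, L1_RATE_HZ, 100)  # Omni-Path or InfiniBand EDR
print(links_fdr, links_100g)  # 29 16
```

Under these assumptions, the same aggregate throughput fits on roughly half as many links at 100 Gb/s, which is consistent with keeping the architecture unchanged while replacing the network hardware.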

For LS3, due to HL-LHC design parameters, all detector readout systems must be upgraded, leading to a completely new central DAQ design, while maintaining the initial architecture supporting a single level of hardware trigger.

The upgrade of the LHC will result in a factor ~50 increase in the total data throughput and a factor ~20 increase in the requirements on the on-line computing power.

After an overview of the new readout requirements per sub-detector, we review in detail the possible hardware topologies for the data-to-surface (D2S) funneling into the event builder. The key element of the D2S will be the future DAQ and Trigger Hub (DTH), in charge of translating and aggregating many custom sub-detector data streams into standard network streams. It will also provide the timing and trigger signals to the sub-systems.
As currently planned, the DTH will be an ATCA form-factor card sitting in the hub slots of the sub-system shelves. Its architecture will allow optimal data collection within its shelf, taking into account the data volumes and throughput of each sub-system, which can differ greatly from one sub-system to another.
A full program of prototyping and development is presented, with a view to making functional DTHs and a companion software stack available well before LS3 and, closer to LS3, the definitive full-performance DTHs.
