Real-Time System for Distribution of Clock and Control Commands with Fixed Latency



**Maurício Féo –** CERN on behalf of the LHCb Online team



IEEE 23<sup>rd</sup> Real Time Conference



## The LHCb Upgrade



- **When:** LHC Long Shutdown 2 (by 2022), for Runs 3 & 4 (2022 - 2029)
- **Why:** To increase statistics: 9 fb<sup>-1</sup> (Runs 1-2) $\rightarrow$ 50 fb<sup>-1</sup> (Runs 1-4)
- **How:**
  - Increasing the instantaneous luminosity 5x, to $L_{inst} = 2 \times 10^{33} \text{ cm}^{-2}\text{s}^{-1}$
  - Increasing the readout rate from $1 \text{ MHz}$ to $40 \text{ MHz}$

This requires some **main changes:** 

- Replace many sub-detectors:
  - New Tracking System (VeLo, UT, SciFi)
  - Partially new Particle ID System (RICH1 + RICH2)
- Replace ALL the electronics:
  - No more hardware trigger
  - Event selection in software
  - Completely new DAQ system



# The Online DAQ System





# From Collision to Memory



The data path:

- LHC beams are divided into bunches that cross in synchrony with a ~40 MHz clock signal (one crossing every 25 ns)
  - Not all bunches are filled → no collision
  - We need to select which bunch crossings to save
- Particles arrive at different times depending on the subdetector
  - We need to phase-align the clocks per subdetector and have fixed latency

#### The TFC System: A Real-Time Architecture





## **GBT Project: The GigaBit Transceiver**



- The GBT Project provides a radiation hard chipset for handling control and data acquisition on frontend boards
- Designed at CERN, it is widely used on the upgrade of its experiments
- It also provides a firmware component (GBT-FPGA) that allows FPGAs to interface directly with the GBTx chip.




# **The TTC-PON Project**



Timing, Trigger and Control for Passive Optical Networks

- FPGA-based system for TFC
- Master/Follower architecture: OLT = Master, ONU = Follower
- Implemented distribution with fixed latency
- Everything controllable from the master
- Downstream: 9.6 Gbps with FEC (8.0 for the user)
  - One word per BC period (25 ns); fields shown: HDR, user payload (200b), FEC (28b)
- Upstream: 2.4 Gbps, 8b10b (Time Division Multiplexing)
  - One burst per ONU: preamble 58.3 ns (140b), data 41.7 ns (100b), gap 25.0 ns (60b)
  - User payload: 56b; total: 80b (100b with 8b10b)
- Chain shown in the figure: RF2TTC (BC, orbit), Trigger Unit and Slow Control → OLT → ONUs → GBT-FPGA (off-detector) → GBTx → FE chips (detectors)

OLT: Optical Line Terminal / ONU: Optical Network Unit
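The bit counts and durations quoted for the TTC-PON links can be cross-checked with a few lines of arithmetic. This is only a sanity check of the numbers on the slide; the variable names are illustrative, and the split of the downstream overhead between header and FEC is not derived here.

```python
BC_PERIOD_NS = 25.0  # one bunch-crossing (BC) period

def bits_in(rate_gbps: float, duration_ns: float) -> int:
    """Line bits sent at rate_gbps during duration_ns (Gbps x ns = bits)."""
    return round(rate_gbps * duration_ns)

# Downstream: one word per BC period.
downstream_word_bits = bits_in(9.6, BC_PERIOD_NS)
downstream_user_bits = bits_in(8.0, BC_PERIOD_NS)
# Whatever is not user payload is left for header and FEC.
downstream_overhead_bits = downstream_word_bits - downstream_user_bits

# Upstream: one burst per ONU at 2.4 Gbps.
burst_fields_ns = {"preamble": 58.3, "data": 41.7, "gap": 25.0}
burst_fields_bits = {name: bits_in(2.4, ns) for name, ns in burst_fields_ns.items()}
burst_clocks = round(sum(burst_fields_ns.values()) / BC_PERIOD_NS)

# 8b10b sends 10 line bits for every 8 data bits.
data_bits_after_8b10b = 80 * 10 // 8

print("downstream word:", downstream_word_bits, "bits,",
      downstream_user_bits, "for the user,",
      downstream_overhead_bits, "overhead")
print("upstream burst fields (bits):", burst_fields_bits)
print("upstream burst length:", burst_clocks, "BC clocks")
print("80 data bits after 8b10b:", data_bits_after_8b10b, "line bits")
```

One upstream burst therefore occupies 125 ns, i.e. five 25 ns clock periods per ONU.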

#### **TTC-PON:** Clock Recovery

- The TTC-PON firmware manages clock and data recovery, making use of the FPGA transceivers and an external SFP+ module
- The OLT sets the beginning of the transmission of a new word from a strobe pulse provided by the user
- The ONU identifies the header position in the serial stream and generates a strobe pulse phase-aligned with the header\*
- The recovered clock can be generated from the ONU Rx strobe

\*This is achieved by using the FPGA transceivers' rxslide/bitslip functionality to shift the recovered parallel clock until it is aligned with the header. For more details:

Eduardo Mendes, Sophie Baron, Csaba Soos, Jan Troska, and Paolo Novellini, "Achieving Picosecond-Level Phase Stability in Timing Distribution Systems With Xilinx Ultrascale Transceivers", https://ieeexplore.ieee.org/document/8967127



The Tx and Rx waveforms look the same, but:

- On the Tx, the strobe defines the position of the data word
- On the Rx, the data word defines the position of the strobe
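The Rx side of this can be pictured with a small software model: the word boundary is slipped one bit at a time until the header sits at the start of every word, and the strobe is then placed at that position. This is a sketch only. The real alignment is done in the FPGA transceiver, the 8-bit `HEADER` pattern below is hypothetical, and the 240-bit word length is simply 9.6 Gbps x 25 ns.

```python
WORD_BITS = 240                      # downstream bits per 25 ns word
HEADER = [1, 0, 1, 1, 0, 0, 1, 0]    # hypothetical header pattern, illustration only

def make_stream(n_words: int, offset: int) -> list[int]:
    """Serial stream of n_words words, each starting with HEADER,
    preceded by `offset` idle bits (the unknown phase at the receiver)."""
    payload = [0] * (WORD_BITS - len(HEADER))  # all-zero payload keeps the header unambiguous
    return [0] * offset + (HEADER + payload) * n_words

def header_aligned(stream: list[int], slip: int, n_check: int = 4) -> bool:
    """True if HEADER starts n_check consecutive words for this slip value."""
    return all(
        stream[slip + k * WORD_BITS: slip + k * WORD_BITS + len(HEADER)] == HEADER
        for k in range(n_check)
    )

def align(stream: list[int]) -> int:
    """Slip the word boundary bit by bit until the header is locked."""
    for slip in range(WORD_BITS):
        if header_aligned(stream, slip):
            return slip
    raise RuntimeError("header not found")

stream = make_stream(n_words=8, offset=37)
slip = align(stream)
# Once aligned, the Rx strobe is generated at the start of every word.
strobe_positions = [slip + k * WORD_BITS for k in range(3)]
print("slips needed:", slip, "strobe positions:", strobe_positions)
```

Requiring the header in several consecutive words before declaring lock is a design choice of the sketch, made to avoid locking onto payload bits that happen to look like the header.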



Maurício Féo - m.feo@cern.ch

#### **TTC-PON:** Integration into the Firmware

- The TTC-PON firmware manages clock and data recovery, making use of the FPGA transceivers and an external SFP+ module
- It provides an abstraction layer to the user
- The user still needs to handle the configuration and calibration of the transceiver, provide the interfaces of the TTC-PON core to the transceiver and the external SFP+ module and provide a way of accessing the TTC-PON registers.
  - The strobe pulse is an input to the OLT and output from the ONU
  - The SFP+ control signals differ between the ONU and the OLT







## **TFC System: Clock Distribution**

- The TFC System uses the TTC-PON firmware component to distribute the clock between the Supervisor and Control cards, and between the Control and DAQ cards
- In the Control card, the strobe pulse used by the OLT is generated from the 40MHz clock recovered from the ONU. This way the clock propagated downstream to the DAQ cards is phase-locked with the one received from the Supervisor card
- The clock to the FrontEnd electronics is instead distributed using the GBT protocol
  - The BackEnd firmware uses instances of the GBT-FPGA core
  - The FrontEnd boards use the GBTx ASIC






#### **TFC System: Control Commands**





August 04, 2022



- The cards of the TFC System can also send data upstream using the TTC-PON
- Upstream burst per ONU: preamble 58.3 ns (140b), data 41.7 ns (100b), gap 25.0 ns (60b)
  - User payload: 56b; total: 80b (100b with 8b10b)
- The readout period is limited by the duration of a round-robin, which is proportional to the [...] levels of hierarchy, the readout period would be 32 x 2 x 5 = **320 clocks** or **8 µs** (**125 kHz**)
- This is the worst-case scenario when reading bits from ALL cards in the system
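The 32 x 2 x 5 product can be reproduced as follows. The reading of the factors is an assumption: 5 is taken to be the length of one upstream burst in 25 ns clocks (58.3 + 41.7 + 25.0 ns), 2 the levels of hierarchy, and 32 the number of round-robin slots per level.

```python
BC_PERIOD_NS = 25.0
BURST_NS = 58.3 + 41.7 + 25.0    # preamble + data + gap of one ONU burst

clocks_per_burst = round(BURST_NS / BC_PERIOD_NS)  # assumed origin of the factor 5
slots_per_level = 32                                # first factor of the slide's product (assumed meaning)
levels_of_hierarchy = 2                             # second factor

readout_clocks = slots_per_level * levels_of_hierarchy * clocks_per_burst
readout_period_us = readout_clocks * BC_PERIOD_NS / 1000.0
readout_rate_khz = 1000.0 / readout_period_us

print(f"{readout_clocks} clocks = {readout_period_us} us ({readout_rate_khz} kHz)")
```

Reading fewer cards shortens the round-robin, so this figure is an upper bound on the readout period.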








#### TFC System Upstream: Real-Time Monitoring

Application example: Throttle

Throttle is a "bit" in the DAQ card that indicates when it can't handle many more events (could be caused by any bottleneck down the data path)

- The throttle bit is sent upstream
- The Readout Supervisor automatically reacts to it by reducing the trigger rate
- This happens orders of magnitude faster than what a SCADA system would allow, preventing the whole DAQ system from going into error
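A toy model makes the benefit concrete. Everything in it is hypothetical (buffer depth, threshold, drain rate, and the policy of vetoing triggers while any throttle bit is set); the slides do not describe how the Readout Supervisor actually reduces the rate.

```python
from dataclasses import dataclass

@dataclass
class DaqCard:
    capacity: int = 16      # hypothetical buffer depth, in events
    threshold: int = 12     # occupancy at which the throttle bit is raised
    drain_every: int = 2    # the card ships one event every `drain_every` ticks
    occupancy: int = 0

    @property
    def throttle(self) -> bool:
        return self.occupancy >= self.threshold

def run(n_ticks: int, n_cards: int, use_throttle: bool) -> tuple[int, int]:
    """Offer one trigger per tick; return (accepted triggers, lost events)."""
    cards = [DaqCard() for _ in range(n_cards)]
    accepted = lost = 0
    for tick in range(n_ticks):
        # Supervisor side: OR of all throttle bits received upstream.
        veto = use_throttle and any(card.throttle for card in cards)
        if not veto:
            accepted += 1
            for card in cards:
                card.occupancy += 1
                if card.occupancy > card.capacity:  # buffer overflow: event lost
                    card.occupancy = card.capacity
                    lost += 1
        # DAQ side: buffers drain more slowly than triggers arrive.
        for card in cards:
            if tick % card.drain_every == 0 and card.occupancy > 0:
                card.occupancy -= 1
    return accepted, lost

accepted_free, lost_free = run(200, n_cards=3, use_throttle=False)
accepted_thr, lost_thr = run(200, n_cards=3, use_throttle=True)
print("without throttle:", accepted_free, "accepted,", lost_free, "lost")
print("with throttle:   ", accepted_thr, "accepted,", lost_thr, "lost")
```

With the throttle enabled, fewer triggers are accepted but no buffer overflows, which is the behaviour the bullets above describe.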





### TFC System: Link Quality & Slow Monitoring

- Custom data generator + data checker
  - Pattern / Counter / PRBS
- Firmware implemented counters
  - PLLs Locked
  - Transceivers Locked / Ready
  - Header Aligned / Locked to Burst
  - FECs (downstream) / 8b10b errors (upstream)
- SCADA software for monitoring and control
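As an illustration of the generator/checker pair in PRBS mode, here is a minimal software sketch. The polynomial is an assumption: PRBS-7 ($x^7 + x^6 + 1$) is used because it is short, and the slides do not say which PRBS the firmware implements. The checker assumes it starts in sync with the generator.

```python
from itertools import islice

def prbs7(seed: int = 0x7F):
    """Yield PRBS-7 bits (x^7 + x^6 + 1) from a 7-bit LFSR."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit

def generate(n_bits: int, seed: int = 0x7F) -> list[int]:
    """Data generator: first n_bits of the sequence."""
    return list(islice(prbs7(seed), n_bits))

def count_errors(received: list[int], seed: int = 0x7F) -> int:
    """Data checker: compare against a locally regenerated reference."""
    reference = islice(prbs7(seed), len(received))
    return sum(rx != ref for rx, ref in zip(received, reference))

tx = generate(1000)

# Inject three bit errors, similar in spirit to an error-injection test.
rx = list(tx)
for position in (10, 500, 999):
    rx[position] ^= 1

clean_errors = count_errors(tx)
injected_errors = count_errors(rx)
print("errors on clean link:", clean_errors)
print("errors after injection:", injected_errors)
```

The sequence repeats every 127 bits, so a real long-duration test would normally use a longer polynomial.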



#### **OLT Monitoring Panel**

[Screenshot: SCADA panel "TFC Monitoring - Channel 1: OLT" for card lab516. It shows status flags (Tx PLL Locked, Tx_Ready, Rx_Ready), counters (8b10b errors, cnt_phase_good, cnt_olt_locked, cnt_address_good, max onu address), measured clock frequencies (CDR 40 MHz, Ref 240), data source selection (Pattern / PRBS / Counter), error injection, and a spy FIFO listing T Arrival, dT, Error, Addr and TFC Data for words arriving in turn from ONU addresses 1, 2 and 3 with dT = 30 and 0 errors.]

#### **ONU Monitoring Panel**





#### **TFC System: Link Quality & Slow Monitoring**

- Testing recovered clock phase
  - 36,000 resets of the Supervisor (SODIN)
  - Clocks recovered with Std. Dev. of 80ps



- Longest PRBS Run: 32 DAQ cards for almost 4 days
  - 0 errors over 2.6x10<sup>15</sup> bits transmitted
  - 1 single Forward Error Correction
  - Equivalent of 95% CL for having < 1 error per 2h for 500 cards

(Not all bits would actually cause an error but more statistics is still desirable)
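The confidence-level statement can be approximately reproduced with the standard zero-event Poisson limit. The inputs are assumptions read off the slides: 8.0 Gbps of user bandwidth per link, a run of 322569 s (the checker time on the monitoring panel), and the 32 links pooled together. Under those assumptions 2.6x10<sup>15</sup> corresponds to the bits sent on a single link.

```python
import math

USER_RATE_BPS = 8.0e9    # user bandwidth per downstream link (assumed: 8.0 Gbps)
RUN_SECONDS = 322_569    # checker time shown on the monitoring panel
N_LINKS = 32             # DAQ cards in the PRBS run

bits_per_link = USER_RATE_BPS * RUN_SECONDS
total_bits = N_LINKS * bits_per_link

# With zero observed errors, P(0 | mu) = exp(-mu). The 95% CL upper limit
# on the expected number of errors mu solves exp(-mu) = 1 - 0.95.
confidence_level = 0.95
mu_limit = -math.log(1.0 - confidence_level)   # about 3 errors
ber_limit = mu_limit / total_bits              # upper limit on the bit error rate

# Scale to the full system: 500 cards running for 2 hours.
bits_500_cards_2h = 500 * USER_RATE_BPS * 2 * 3600
errors_limit_500_cards_2h = ber_limit * bits_500_cards_2h

print(f"bits per link:   {bits_per_link:.2e}")
print(f"BER upper limit: {ber_limit:.2e} (95% CL)")
print(f"errors per 2 h for 500 cards: < {errors_limit_500_cards_2h:.2f}")
```

Under these assumptions the limit comes out close to one error per 2 h for 500 cards, in line with the figure quoted above.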

#### **System Monitoring Panel**

[Screenshot: SCADA system panel listing SODIN, 5 SOL40 cards (r2sol011 to r2sol014, tfsol011) and 32 TELL40 cards (r2tel011 to r2tel122). Columns: Flavor, Status, FEC, Data Mode, Data Checker, Configure, Panels, Resets. The TELL40 cards run in PRBS mode with 0 data-checker errors over 322569 s; r2tel122 shows a count of 1 and a checker time of 322557 s. A bottom row applies actions (clear counters, clear FEC, data mode, data checker, configure, reset) to all cards.]



#### **Summary**







- Downstream system fully functional and it is currently in use
  - Since October/2021 (for commissioning of the subdetectors)
  - Since 30 days ago for data taking of LHC Run3
- Upstream system tested and being commissioned
  - Tested in a single sub-detector, expected deployment to the whole experiment by Sept/2022
  - Full characterization still pending
  - Results are, however, very promising :)

# Obrigado! (Thank you!)

Questions are welcome :)