A new methodology of clock phase adjustment in a large-scale clock distribution system for HL-LHC ATLAS TGC front-end electronics

**IEEE real time conference, 23 April 2024** 

#### 1. Abstract

 In the High Luminosity LHC (HL-LHC) ATLAS experiment, the Thin Gap Chamber (TGC) system performs fast online muon reconstruction using a coincidence algorithm as part of the first hardware-based triggering stage.

For precise bunch-crossing identification in the TGC trigger electronics, clock tuning well below O(1) ns is necessary. The clock phase adjustment process involves 1,434 front-end electronics boards, with the clock distributed via 1,434 optical fibers and reconstructed individually in the TGC system.
We present a methodology developed to ensure low-skew clock distribution, focusing on keeping clock signals aligned across the large-scale electronics system. A measurement-driven estimate of the skew to be absorbed and of the expected uncertainties on the alignment is also discussed.

## 2. TGC electronics system for the HL-LHC ATLAS

#### **TGC electronics**

The Thin Gap Chamber (TGC) system, responsible for the endcap muon trigger of ATLAS (1.05 <  $|\eta|$  < 2.4), will upgrade its readout and trigger electronics for the HL-LHC. The TGC electronics system consists of:

- ASD: Amplifier-Shaper-Discriminator (~23K in TGC)

## 5. Demonstration of clock phase adjustment between 1/12 sectors

The demonstration consists of two steps, described in Sections 5 and 6.

## Setup

Two TAMs and a Sector Logic board are used. An oscilloscope is used for validation.

## Result

The clock phases of the two TAMs are matched with sufficient accuracy.



- PS board: Processing Board (1,434 boards)
  - Kintex-7 FPGA
  - Assigns bunch-crossing identification to hit signals by using the LHC clock distributed from SL
- JATHub: JTAG Assistance Hub (148 boards)
  - Zynq 7000 SoC
- TAM: Timing Alignment Master (24 boards)
  - Kintex-7 FPGA
Fig: Overview of TGC electronics system.

- SL: Sector Logic (48 boards)
  - Virtex UltraScale+ FPGA, Zynq UltraScale+ MPSoC
  - Distributes the LHC clock to all 1,434 PS boards independently via optical fiber

The TGC is located in both endcaps (40 m away) of the ATLAS detector. Each side is divided into 12 sectors (1/12 sectors) that are independent in terms of infrastructure and functionality. For precise bunch-crossing identification, the phases of the clock signals distributed to the 1,434 individual PS boards must be matched with an accuracy of 1 ns.

## 3. Methodology of clock phase adjustment

#### MEASURE clock phase at the endpoints (PS boards) and ADJUST in PS boards.

1. SL distributes data (including the timing signals) and the clock to PS boards and TAMs.
2. All 24 TAMs match the clock phase differences between 1/12 sectors (green line).
3. Each TAM distributes the reference clock to 6 JATHubs (blue line).
   - $\rightarrow$  All 148 ($\approx$ 24 $\times$ 6) JATHubs have the same-phase reference clock.
4. Each JATHub monitors the distributed LHC clock on PS boards independently (red line).
   - At most 11 PS boards are connected to one JATHub (148  $\times$  11 > 1,434).
5. SL updates the delay parameters of the PS boards.
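As a minimal sketch of the measure-and-adjust idea in steps 4 and 5, the delay each PS board needs can be computed from its measured phase and a common target phase, modulo one 25 ns clock period. The function name, the board labels, the example phases, and the convention that added delay moves the clock edge later are assumptions made for illustration; they are not taken from the TGC firmware or software.

```python
UI_PS = 25_000  # one 40 MHz clock period (1 UI) in picoseconds


def delay_corrections(measured_ps: dict[str, int], target_ps: int) -> dict[str, int]:
    """Return the delay (in ps) each board must add to reach target_ps.

    Assumes that adding delay moves the clock edge later and that phases
    wrap around after one UI.
    """
    return {
        board: (target_ps - phase) % UI_PS
        for board, phase in measured_ps.items()
    }


# Hypothetical measured phases for three PS boards, in ps.
measured = {"ps00": 1_200, "ps01": 24_900, "ps02": 3_000}
corrections = delay_corrections(measured, target_ps=3_000)

for board, phase in measured.items():
    aligned = (phase + corrections[board]) % UI_PS
    print(f"{board}: add {corrections[board]} ps -> phase {aligned} ps")
```

Because the corrections are taken modulo one UI, a board that is slightly ahead of the target is delayed by almost a full period rather than advanced, which matches a delay-only adjustment.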



Fig: Setup of demonstration of clock phase adjustment between 1/12 sectors. SL distributes the clock to two TAMs via optical fiber. LEMO cables are aligned so that TAM 2 can match its clock phase to that of TAM 1.

Fig: Result of demonstration of clock phase adjustment between 1/12 sectors. One division corresponds to 1 ns. The clock phase of TAM 2 is matched to that of TAM 1 with sufficient accuracy.

## 6. Demonstration of clock phase adjustment within 1/12 sectors

## Setup

Eleven PS boards, a TAM, a JATHub, and a Sector Logic board are used.

Fig: Setup of demonstration of clock phase adjustment within a 1/12 sector. SL distributes the clock to 11 PS boards and the TAM via optical fiber. The TAM distributes the reference clock to the JATHub. The JATHub monitors the clock phase of the 11 PS boards. Results of the clock phase measurement are obtained via Ethernet. SL rewrites the optimal delay parameters of the 11 PS boards.

## Result

The clock phases of the 11 PS boards are matched with an accuracy of ~ 50 ps.

This ~ 50 ps uncertainty comes from the reproducibility under soft reset (details in Section 7).



Fig: Results of demonstration of clock phase adjustment within a 1/12 sector. (a) shows the full 25 ns window and (b) shows a zoomed-in window.



Step 2 is the clock phase adjustment between 1/12 sectors. Steps 3 and 4 are the clock phase adjustment within a 1/12 sector.



Fig: Overview of clock phase adjustment methodology.

## 4. Low-level implementation in FPGA and SoC

## Clock distribution and reconstruction with fixed latency

• Data and clock information are encoded into serial data with 8b/10b encoding.

• PS boards and TAMs decode the data and reconstruct the clock with fixed latency, which means that the clock phase does not change after FPGA reconfiguration or soft reset.

• This cannot be achieved when the Xilinx GT transceiver modules (GTX/GTY) are used as they are.

• We deactivate the 8b/10b decoder in the GT transceiver and use custom modules (8b/10b decoder, etc.) to achieve fixed latency.



## **Clock phase measurement**

The clock phase measurement is implemented in the TAM and the JATHub.
Each TAM monitors the clock of the TAM in the next 1/12 sector and compares it with its own clock.

Each JATHub monitors the PS board clocks against the reference clock distributed from the TAM.

The phase measurement procedure is as follows:

- Latch the clock with the reference clock 1,000 times and count the number of high and low samples.
- Shift the phase of the reference clock by one step. One step is 18 ps for the fine delay and 25 ns for the coarse delay.
- Repeat and scan 1 UI of the clock.

Fig: Reproducibility of clock phase of a PS board. Each corresponds to plots with all modules' reset steps.

Fig and Tab: Fig (a) shows schematic view of each component related to clock
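The scan procedure described in this section can be illustrated with a small simulation. This is only a sketch under stated assumptions: the monitored clock is modeled as an ideal 50% duty-cycle square wave, the sampling jitter is Gaussian with an arbitrary 20 ps width, and the rising edge is taken as the first step where the high count crosses half of the latches. None of these modeling choices, nor the function names, come from the actual TAM or JATHub firmware; only the 18 ps step, the 25 ns UI, and the 1,000 latches are taken from the text.

```python
import random

UI_PS = 25_000       # 1 UI of the 40 MHz clock, in ps
FINE_STEP_PS = 18    # fine delay step quoted in the text


def count_high(edge_ps, offset_ps, n_latch, jitter_ps, rng):
    """Latch a 50% duty-cycle clock n_latch times and count high samples.

    edge_ps is the (unknown) rising-edge position of the monitored clock,
    offset_ps the current phase shift of the reference clock.
    """
    high = 0
    for _ in range(n_latch):
        t = (offset_ps + rng.gauss(0.0, jitter_ps) - edge_ps) % UI_PS
        if t < UI_PS / 2:
            high += 1
    return high


def scan_rising_edge(edge_ps, n_latch=1000, jitter_ps=20.0, seed=0):
    """Scan 1 UI in fine steps and return the estimated rising edge in ps."""
    rng = random.Random(seed)
    n_steps = UI_PS // FINE_STEP_PS
    counts = [
        count_high(edge_ps, step * FINE_STEP_PS, n_latch, jitter_ps, rng)
        for step in range(n_steps)
    ]
    half = n_latch / 2
    for step in range(n_steps):
        # counts[step - 1] wraps to the last step when step == 0.
        if counts[step - 1] < half <= counts[step]:
            return step * FINE_STEP_PS
    raise ValueError("no rising edge found in the scan")


# Hypothetical edge position; fewer latches than in the text to keep it fast.
estimate = scan_rising_edge(edge_ps=7_300, n_latch=200)
print(f"estimated rising edge: {estimate} ps")
```

In this toy model the estimate can only land on the 18 ps grid, so the step size sets the resolution of a single scan.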

## 7. Reproducibility and part-to-part skew

## Reproducibility

The reproducibility of the fixed-latency clock reconstruction and of the clock phase measurement is checked by rebooting all modules (PS board, TAM, JATHub, and SL), which includes all phase-changing effects (resetting each component individually is also checked).

The uncertainty of reproducibility is ~ 50 ps. We regard this as a systematic uncertainty.

## Part-to-part skew

Part-to-part skew is estimated by a measurement-driven method.

Skew related to the reconstructed clock itself:

✓ Observed by clock phase adjustment.

Skew related only to the clock phase measurement:

• JATHub (skew of the LVDS receiver is dominant)
 ✓ Measure the skew of the LVDS receivers of all JATHubs and take it into account.

• Cat 6 cable (it is difficult to measure the skew of all Cat 6 cables in our commissioning plans)

We can conclude that the total systematic uncertainty is about 600 ps in our commissioning plan (acceptable).



| Component               | Max [ns] | Typical [ns] |
|-------------------------|----------|--------------|
| SL port (same SLR)      | 1.5      | 0.7          |
| SL port (different SLR) | 3        | 1.5          |
| MPO24-MPO24 (1 m)       | 0.2      | 0.1          |
| MPO24-LC24 (1 m-1 m)    | <0.01    | -            |
| SFP+ (same type)        | <0.01    | -            |
| SFP+ (different type)   | 0.1      | 0.1          |
| PS board                | 0.1      | 0.1          |
| Cat 6 cable (10 m)      | 0.6      | 0.4          |
| JATHub (LVDS receiver)  | 0.4      | 0.2          |

Figure labels: serial data (8 Gbps), RXREFCLK (200 MHz), 8b/10b decoder, parallel data (40 bit, 200 MHz), parallel data (32 bit, 200 MHz), RX packet deformer, LHC CLK (40 MHz), clock jitter cleaner (zero delay mode).

Fig: Implementation of clock distribution with fixed latency. The data shifts right by one bit for every RXSLIDE (red line) pulse issued. The 8b/10b decoder decodes 40-bit data to 32-bit data and asserts RXSLIDE as long as the comma word is not detected. The RX packet deformer reconstructs the 40 MHz clock from the 200 MHz clock with fixed latency, using the header word defined in the data format.
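The word-alignment idea in the caption can be sketched with a toy bit stream: the receiver window is moved by one bit per slide until a comma sits at the word boundary, so the number of slides depends only on the initial bit offset. This is a simplified model, not the actual firmware. It assumes that the comma is the standard 8b/10b K28.5 symbol and that the comma occupies the first 10 bits of each 40-bit word; the function name and the toy payload are hypothetical.

```python
COMMAS = ("0011111010", "1100000101")  # K28.5 for both running disparities
WORD_BITS = 40  # parallel word width before 8b/10b decoding


def slides_to_align(stream: str, start: int) -> int:
    """Count one-bit slides until the 40-bit window starts with a comma."""
    for n_slides in range(WORD_BITS):
        begin = start + n_slides
        window = stream[begin:begin + WORD_BITS]
        if window[:10] in COMMAS:
            return n_slides
    raise ValueError("no comma found within one word")


# Toy stream: each 40-bit word is a comma followed by three data symbols.
word = COMMAS[0] + "0101010101" * 3
stream = word * 4

for start in (0, 13, 39):
    print(start, slides_to_align(stream, start))
```

Doing the slide and comma search in user logic, rather than inside the transceiver, keeps the number of applied slides visible, which is one way to keep the resulting latency deterministic.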

## Variable phase shift (coarse and fine delay)

• The coarse delay is implemented with a shift register in the Kintex-7 FPGA.

• The fine delay is implemented with the Mixed-Mode Clock Manager (MMCM) in the Kintex-7 FPGA.
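To illustrate how a required delay could map onto the two mechanisms, the sketch below splits a delay into coarse and fine steps, using the step sizes quoted for the phase measurement (25 ns coarse, 18 ps fine). That the delay registers combine in exactly this way is an assumption, and the function name and example value are hypothetical.

```python
COARSE_STEP_PS = 25_000  # one coarse step (shift register), in ps
FINE_STEP_PS = 18        # one fine step (MMCM phase shift), in ps


def split_delay(delay_ps: int) -> tuple[int, int]:
    """Split a non-negative delay into (coarse steps, fine steps)."""
    if delay_ps < 0:
        raise ValueError("delay must be non-negative")
    coarse, remainder = divmod(delay_ps, COARSE_STEP_PS)
    fine = (remainder + FINE_STEP_PS // 2) // FINE_STEP_PS  # round to nearest
    return coarse, fine


coarse, fine = split_delay(62_518)
applied = coarse * COARSE_STEP_PS + fine * FINE_STEP_PS
print(coarse, fine, applied - 62_518)
```

With rounding to the nearest fine step, the residual between the requested and applied delay stays within half a fine step (9 ps) in this model.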

In the phase scan, we use a 40 MHz clock (1 UI = 25 ns) for determining the fine delay parameter and a 200 kHz clock (1 UI = 5 µs) for determining the coarse delay parameter.

This method has been validated with an oscilloscope.



Fig: Clock phase measurement. (a) shows the principle. (b) shows the results, which have been validated with an oscilloscope.

#### 8. Conclusion

In the HL-LHC ATLAS TGC system, the clock phase measurement process involves a large number of front-end electronics boards (1,434), with clock signals distributed via optical fibers and reconstructed individually. Effective bunch-crossing identification in the TGC trigger electronics requires clock tuning well below O(1) ns.
A methodology has been developed for measuring the clock phase of each reconstructed signal across the entire system and aligning the clock phases remotely and automatically in situ. The fixed-latency clock distribution is reproducible to ~ 50 ps, and all systematic skews are either observed by the clock phase adjustment or taken into account in the clock phase measurement.
We conclude that the clock phase adjustment of the 1,434 front-end electronics boards can be done with an accuracy of ~ 700 ps (= ~ 600 ps: skew of Cat 6 cable + ~ 50 ps: uncertainty of reproducibility) in our plan. This is sufficient for our requirement (O(1) ns).
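The arithmetic behind the quoted accuracy can be checked in a few lines. The two inputs are the values stated above; the linear sum reflects the way the total is written in the text, while the quadrature sum is shown only as an alternative combination and is not taken from the source. The 1,000 ps threshold is an assumed stand-in for the O(1) ns requirement.

```python
import math

cable_skew_ps = 600       # Cat 6 cable skew quoted in the text
reproducibility_ps = 50   # reproducibility uncertainty quoted in the text
requirement_ps = 1_000    # assumed stand-in for the O(1) ns requirement

linear_ps = cable_skew_ps + reproducibility_ps
quadrature_ps = math.hypot(cable_skew_ps, reproducibility_ps)

print(f"linear sum:     {linear_ps} ps")
print(f"quadrature sum: {quadrature_ps:.0f} ps")
print("within requirement:", linear_ps < requirement_ps)
```

Either way of combining the two terms is dominated by the cable skew and stays below 1 ns.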

Ren Nagasaka, (The University of Tokyo, ICEPP) on behalf of the ATLAS Collaboration

