Karlsruhe Institute of Technology

Sponsored by the Federal Ministry of Education and Research



# Time and Clock Distribution Over a Hierarchy of Deterministic Optical Links

V. Sidorenko, W. F. J. Müller, W. Zabolotny, I. Fröhlich, D. Emschermann, J. Becker



www.kit.edu

# **CBM\* experiment overview**



Data rate: up to 1 TB/s.

- Peak interaction rate R<sub>int</sub> is 10 MHz for Au+Au collisions.
- Fast and radiation-hard detectors.
- 4D tracking (space, time).



\* CBM – Compressed Baryonic Matter

V. Sidorenko et al. "Time and Clock Distribution Over a Hierarchy of Deterministic Optical Links", TWEPP 2023



# Streaming data acquisition in CBM





- Distribute a synchronous system-wide <u>clock</u> signal to the CRI endpoints.
- Synchronise the local <u>time</u> counters across the CRI boards.


02.10.23



Protect the DAQ system from congestion through data throttling.



## **Requirements for CBM TFC**



- Scalability to serve > 200 endpoints with a common clock and time.
  - Based on the configuration of the data readout chain.
- < 200 ps synchronization accuracy.
  - Roughly based on the timing resolution of RICH, the fastest subsystem apart from ToF.
- < 6 µs fast control response time.
  - Estimate based on the timing constants in the readout system and throttling strategy simulations\*.

\* X. Gao, D. Emschermann, J. Lehnert, and W. F. J. Müller, "Throttling strategies and optimization of the trigger-less streaming DAQ system in the CBM experiment," Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, vol. 978. Elsevier BV, p. 164442, Oct. 2020. doi: 10.1016/j.nima.2020.164442.



# TFC timing

- The Master node receives the external clock reference.
- The time counter in the Master node defines the 64-bit experiment-wide TFC time.
- The TFC time is initialized via the PCIe control interface.
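As a rough illustration of the 64-bit TFC time, the sketch below converts a counter value to seconds and estimates how long the counter runs before wrapping around. It assumes that the counter increments once per cycle of a 40 MHz clock (25 ns per tick); the actual tick rate of the TFC time counter is not stated on this slide, so `TICK_HZ` and both function names are hypothetical.

```python
# Assumption: the TFC time counter ticks at 40 MHz (25 ns per tick).
TICK_HZ = 40_000_000
COUNTER_BITS = 64


def ticks_to_seconds(ticks: int) -> float:
    """Convert a TFC counter value to seconds since initialization."""
    return (ticks % (1 << COUNTER_BITS)) / TICK_HZ


def wraparound_years() -> float:
    """Time until a 64-bit counter overflows at TICK_HZ, in years."""
    seconds = (1 << COUNTER_BITS) / TICK_HZ
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    print(ticks_to_seconds(40_000_000))  # one second worth of ticks
    print(f"{wraparound_years():.0f} years until wrap-around")
```

Under this assumption the counter would not wrap within any realistic experiment lifetime, which is presumably why a single 64-bit value can serve as the experiment-wide time.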



EP – Endpoint; CRI – Common Readout Interface

# CRI local time












# TFC fast control

- Each CRI board issues status information on FIFO fill level.
- The status information is aggregated and passed to the TFC Master.

TFC links must be bidirectional.








# TFC fast control

- When the buffers are dangerously occupied, this information is propagated to the Master node.
- With an upstream link ratio of 47:1 (47 downstream links per upstream link), Submasters must aggregate the status data.
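One way a Submaster could reduce 47 downstream status reports to a single upstream report is to forward only the worst-case fill level. The sketch below illustrates that idea only; the aggregation rule, the 0.0 to 1.0 fill-level encoding, the throttle threshold, and all names are hypothetical and are not taken from the TFC firmware.

```python
from typing import Iterable

# Hypothetical level at which a buffer counts as "dangerously occupied".
THROTTLE_THRESHOLD = 0.8


def aggregate_fill_levels(levels: Iterable[float]) -> float:
    """Reduce downstream fill levels (0.0..1.0) to the worst case."""
    return max(levels, default=0.0)


def needs_throttling(levels: Iterable[float]) -> bool:
    """True if the worst-case downstream buffer reached the threshold."""
    return aggregate_fill_levels(levels) >= THROTTLE_THRESHOLD


if __name__ == "__main__":
    downstream = [0.10] * 46 + [0.85]  # 47 downstream links, one nearly full
    print(aggregate_fill_levels(downstream))
    print(needs_throttling(downstream))
```

Taking the maximum keeps the upstream message size constant regardless of fan-out, at the cost of hiding which downstream link is congested.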





[Figure: diagram labels recovered from the slide: external clock source (optional), CBM computing infrastructure, PTP, FEE buffer fill level]

# **TFC architecture**

- Platform board: BNL-712
- Mezzanine cards:
  - Master WR TMC
  - Endpoint TTC-PON TMC





- Board highlights:
  - Developed by Brookhaven National Lab, USA
  - Xilinx XCKU115FLVF1924-2E FPGA
  - Si5345 jitter cleaner for recovered clock
  - 48 optical connections (1 SFP, 47 Broadcom MiniPOD)
  - PCIe Gen3 x16 lane interface
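The link count above determines how many network layers a given number of endpoints needs. The following sketch assumes that every node uses its single SFP as the uplink and all 47 MiniPOD links as downlinks, which matches the 47:1 link ratio quoted for the Submasters; the function name and the idealized full fan-out are assumptions.

```python
def layers_needed(endpoints: int, fanout: int = 47) -> int:
    """Smallest number of hops h such that fanout**h >= endpoints."""
    if endpoints < 1 or fanout < 2:
        raise ValueError("need endpoints >= 1 and fanout >= 2")
    hops, capacity = 0, 1
    while capacity < endpoints:
        capacity *= fanout  # each extra layer multiplies reachable nodes
        hops += 1
    return hops


if __name__ == "__main__":
    print(layers_needed(47))   # fits below a single Master
    print(layers_needed(200))  # needs a Submaster layer
```

With this fan-out, one hop covers at most 47 endpoints and two hops cover up to 47 × 47 = 2209, so a system of about 200 endpoints falls in the two-hop range.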





# **TFC architecture**



- The clock signal is embedded into the data communication.
- Each node recovers the clock from its upstream link and uses it for its own logic and, if needed, for further downstream links.
- All components in the clock cascading chain have a deterministic input-to-output delay.

# **TFC architecture**



- Wishbone as the system bus with AGWB\* infrastructure.
- GBT-FPGA provides latency-deterministic communication over fibre.



\* W. M. Zabołotny, M. Gumiński, M. Kruszewski, and W. F. J. Müller, "Control and Diagnostics System Generator for Complex FPGA-Based Measurement Systems," Sensors, vol. 21, no. 21. MDPI AG, p. 7378, Nov. 06, 2021. doi: 10.3390/s21217378.


# **Clock jitter and skew in the system**

Latency determinism over one hop has been previously evaluated\*.

It is still unclear how the timing error will scale in a larger system.

Goals of the current study:

- Evaluate clock jitter in the system nodes.
- Estimate how clock jitter changes with added network layers and endpoint nodes.
- Estimate how clock skew changes with added network layers and endpoint nodes.

Hardware used:

- 3x BNL-712 + SFP mezzanines.
- Tektronix TDS6154C (4 ch, 15 GHz, 40 GSa/s) + TDSJIT3 Advanced jitter measurement app.

\* V. Sidorenko, W. F. J. Müller, W. Zabolotny, I. Fröhlich, D. Emschermann, and J. Becker, "Evaluation of GBT-FPGA for timing and fast control in CBM experiment," Journal of Instrumentation, vol. 18, no. 02. IOP Publishing, p. C02052, Feb. 01, 2023. doi: 10.1088/1748-0221/18/02/c02052.









# **Test conditions**



- Direct measurement of the 40 MHz system clock.
- Clock recovery with Silabs Si5345 (Rev B) at ~87 Hz loop bandwidth.
- Air-conditioned room with insignificant temperature variation.

Measurements:

- Jitter measurement on each node.
- Clock skew between nodes at each hop.
- 3 measurements, >1M samples each.
- Two setup configurations: 2 hops and 2 endpoints.
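The two quantities measured here can be illustrated with a short sketch that derives them from rising-edge timestamps. This is not the algorithm of the TDSJIT3 application; the function names, the use of the population standard deviation, and the assumption that edges of the two clocks are already paired one-to-one are all illustrative choices.

```python
import statistics
from typing import Sequence, Tuple


def period_stats(edges_ns: Sequence[float]) -> Tuple[float, float]:
    """Mean period (ns) and period jitter sigma (ps) from edge timestamps."""
    if len(edges_ns) < 2:
        raise ValueError("need at least two edges")
    periods = [b - a for a, b in zip(edges_ns, edges_ns[1:])]
    return statistics.fmean(periods), statistics.pstdev(periods) * 1000.0


def skew_stats(ref_edges_ns: Sequence[float],
               dut_edges_ns: Sequence[float]) -> Tuple[float, float]:
    """Mean skew (ns) and skew sigma (ps) between paired edges of two clocks."""
    skews = [d - r for r, d in zip(ref_edges_ns, dut_edges_ns)]
    if not skews:
        raise ValueError("need at least one edge pair")
    return statistics.fmean(skews), statistics.pstdev(skews) * 1000.0


if __name__ == "__main__":
    master = [0.0, 25.0, 50.0, 75.0]      # ideal 40 MHz clock edges, in ns
    endpoint = [14.0, 39.0, 64.0, 89.0]   # same clock, delayed by 14 ns
    print(period_stats(master))
    print(skew_stats(master, endpoint))
```

In the result tables, μ corresponds to the first returned value and σ to the second.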








# Test results: 2 hop configuration

**Master period jitter**

| Test no. | μ, ns  | σ, ps |
|----------|--------|-------|
| 1        | 25.001 | 9.696 |
| 2        | 25.001 | 9.893 |
| 3        | 25.001 | 9.902 |
| AVG      | 25.001 | 9.831 |

**Submaster period jitter**

| Test no. | μ, ns  | σ, ps |
|----------|--------|-------|
| 1        | 25.001 | 9.691 |
| 2        | 25.001 | 9.524 |
| 3        | 25.001 | 9.616 |
| AVG      | 25.001 | 9.610 |

**Endpoint period jitter**

| Test no. | μ, ns  | σ, ps  |
|----------|--------|--------|
| 1        | 25.001 | 12.684 |
| 2        | 25.001 | 10.182 |
| 3        | 25.001 | 10.527 |
| AVG      | 25.001 | 11.131 |



**Master-Submaster skew**

| Test no. | μ, ns  | σ, ps  |
|----------|--------|--------|
| 1        | 13.756 | 24.377 |
| 2        | 14.298 | 21.311 |
| 3        | 14.284 | 21.527 |
| AVG      | 14.113 | 25.738 |




# Test results: 2 endpoint configuration




# Applying the results to the TFC system



Link latency measurements: 50 runs of 1000 samples each, with a full power cycle between runs.



- With the current hardware platform, 2 hops are required to serve 200 CRI boards.
- Clock skew $\sigma$ over 2 hops, assuming the per-hop contributions are independent and add in quadrature:

$$\sigma_{M\text{-}EP} = \sigma\sqrt{2} \approx 43.322\,\text{ps}$$

Before hypothesis 2 (horizontal scaling is defined by the worst-case link) can be applied, that worst-case link must be identified.
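The two-hop estimate can be reproduced with a small sketch. It assumes that every hop contributes an independent skew with the same standard deviation, so the contributions add in quadrature. The per-hop value used below is back-computed from the 43.322 ps two-hop figure rather than taken from a measurement table, and comparing a one-sigma value against the 200 ps synchronization requirement is only a rough plausibility check. All names are hypothetical.

```python
import math

# Synchronization accuracy requirement for the TFC system, in ps.
SYNC_BUDGET_PS = 200.0


def skew_sigma_over_hops(sigma_hop_ps: float, hops: int) -> float:
    """Skew sigma after `hops` independent, identical hops (quadrature sum)."""
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return sigma_hop_ps * math.sqrt(hops)


if __name__ == "__main__":
    # Assumption: per-hop sigma implied by the quoted 2-hop result.
    sigma_hop = 43.322 / math.sqrt(2)
    total = skew_sigma_over_hops(sigma_hop, 2)
    print(f"sigma over 2 hops: {total:.3f} ps")
    print(f"below {SYNC_BUDGET_PS:.0f} ps: {total < SYNC_BUDGET_PS}")
```

The same function gives the expected growth for deeper hierarchies, for example a factor of √3 for three hops, as long as the independence assumption holds.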



## Conclusions



- Vertical scaling (adding layers or hops) appears to be predictable.
- Horizontal scaling appears to be defined by the worst-case link, which must first be identified.
- Insight has been gained into how the timing distribution error scales as network nodes and layers are added, although there is more to gain.

Although more accurate estimates have yet to be made, the performance of the timing distribution system looks very promising for the needs of the experiment.











# Thank you!





# **Backup slides**






50 runs, 1000 samples each (full power cycle between runs).