



# The ALICE Central Trigger Processor (CTP) Upgrade

Marian Krivda<sup>1)</sup> and <u>Jan Pospíšil<sup>2)</sup></u>

On behalf of the ALICE Collaboration

1) University of Birmingham, Birmingham, United Kingdom

2) Nuclear Physics Institute ASCR, Řež, Czech Republic

29 September 2015, TWEPP 2015

#### Content

- CTP for LHC Run 2
  - ALICE experiment after LS1
  - Trigger challenges for Run 2
  - New LM0 board
  - Main features of upgraded CTP
  - Integration and commissioning
- CTP proposal for LHC Run 3
  - Requirements
  - Design Proposal
  - CTP emulator for Local Trigger Unit (LTU)
  - Low Latency Interface


### ALICE Experiment After LS1

- CENTRAL TRACKER
  - Silicon pixel, Silicon Drifts, Silicon Microstrips, TPC, TRD (fully completed), TOF
- FORWARD DETECTORS
  - T0, V0, FMD, PMD
- SPECIAL DETECTORS
  - ACORDE, PHOS, EMCAL, HMPID
  - New: DCAL, CPV
- DIMUON TRACKER
  - Absorber, Tracking chambers, Trigger chambers



### ALICE Trigger Challenges for LHC Run 2

Trigger – selects interesting physics events (based on different triggering detectors)

- Optimise for different running scenarios (p-p, p-Pb, Pb-Pb) with different interaction rates
- Optimise rates according to physics requirements (downscaling)
- Optimise use of detectors with widely different busy times
  - Detector grouping into trigger clusters
- Different latency requirements (4 trigger levels)
- Protect detectors from pile-up (detector protection interval)

# Requirements for CTP for LHC Run 2

- Add LM trigger level (before L0 trigger level) for TRD pre-trigger
- Increase number of classes from 50 to 100
- Trigger input switch integrated into CTP
- New snapshot/generator memory (using 1+1 GB of DDR3 memory)
- Trigger functions (a special type of trigger input) computed from 8 inputs (previously only 4)
- LM and L0 interactions (definition of the basic trigger) from the first 8 inputs
- Past-Future protection for the LM and L0 levels (protects detectors from pile-up)
- Second link to DAQ for extended Interaction record

## ALICE CTP for LHC Run 2

Changes with respect to LHC Run 1

- New LM0 board with Kintex-7 FPGA
- 96 diff. IOs at front panel
- New octopus cable (blue) for CTP inputs
- New Detector Data Link (DDL2) to DAQ
- New FPGA designs for all CTP and LTU boards
- Installed new trigger cables from T0, V0, CPV and DCAL
- Repaired many old cables
- Timing on CTP backplane, CTP-LTU and LTU-TTCex connections re-checked after upgrade of all FPGA designs



# LM0 Board





2 GB DDR3 memory

#### Kintex-7 FPGA:

- LM + L0 trigger levels
- LM + L0 interaction
- 100 classes
- Integrated trigger inputs
- Increased number of inputs for each trigger function (from 4 to 8)

#### Integration and Commissioning for LHC Run 2

- All detectors tested with 100 classes (logical combination of triggers)
- LM trigger tested with TRD detector
- New functions with 8 trigger inputs tested for CTP
- DDL2 link basic functionality (physical layer) tested with DAQ
- New improved downscaling successfully tested
- New Past-Future protection tested


# **CTP** Proposal for LHC Run 3

#### Requirements for LHC Run 3

- The main "interaction" trigger via the "FIT" detector
  - Not selecting events, just announcing an interaction
  - Not every bunch crossing has an interaction in ALICE
- No trigger levels; each detector selects the maximum trigger latency it can accept
- 2 modes of running for detectors: triggered and continuous
- Triggers sent to all detectors which are not busy
- Each detector as a separate cluster, while retaining the possibility of clustering
- No CTP Dead-time
- Interaction Rates: 50 kHz for Pb-Pb, and up to 200 kHz for p-p and p-Pb

#### Requirements for LHC Run 3 (cont.)

- 3 types of trigger distribution
  - Directly on detector (ITS detector)
  - Via Common Readout Unit (CRU)
  - Via detector specific readout system
- 2 types of link layer
  - GBT (PON) for upgraded detectors
  - TTC system for old detectors
- 12 detectors (7 with GBT system, 4 with TTC system, 1 with GBT+TTC system)
- 6 Triggering detectors (FIT, ACO, EMC, PHO, TOF, ZDC)
- Trigger latencies
  - 775 ns (contributing detector: FIT) → wake-up signal for TRD electronics
  - 1.5 µs (contributing detectors: FIT, ACO, EMC (only some inputs), PHO, TOF, ZDC (only some inputs))
  - 6.4 µs (contributing detectors: all)

#### ALICE System Block Diagram for LHC Run 3: 3 Types of Trigger Distribution



- D – Detector
- CTP – Central Trigger Processor
- FO – Fan-Out board
- LTU – Local Trigger Unit
- CRU – Common Readout Unit
- DSR – Detector Specific Readout
- O<sup>2</sup> – Online + Offline Computing
- DCS – Detector Control System




#### Design Proposal for Trigger System



#### Trigger Protocol

- Detectors with fast links (GBT)
  - Synchronous 188-bit message, which can be transmitted every BC, i.e. at 40 MHz
- Detectors with TTC (old detectors can run at max. 100 kHz)
  - One synchronous pulse in channel A
  - Asynchronous message in channel B
    - The size of the channel B message is restricted by the TTC bandwidth, leaving 2 options:
      - Short message, 16 bits long, carrying 8 bits of data (188 bits/trigger at 100 kHz)
      - Long message, 42 bits long, carrying 16 bits of data (148 bits/trigger at 100 kHz)

#### LHC Clock Distribution

- Clock distributed to each board as a low-jitter clock directly from the LHCmi crate
- 2 clock domains (clock from CTP and clock from LHCmi) synchronized at the LTU board with timing constraints
- Common Readout Units (CRU) get clock via GBT/10G PON



#### Design Proposal of LTU Board

- LTU: Local Trigger Unit
  - Global mode: interface between the detector and the CTP
  - Standalone mode: programmable FSM generating CTP sequences at an adjustable fixed or pseudorandom rate
- 6U VME board (only power taken from VME)
- 1 slot for Samtec FireFly cable with 12 diff. links (Low Latency Interface from CTP)
- PLL with fixed IN/OUT phase for 120/280 MHz clock (Silicon Labs 5338)
- 6 GBT (10G PON) links (trigger distribution)
- 1 GBT link to DAQ (control and monitoring)
- Compatibility with 10G PON

#### **CTP Emulator on LTU Board**




# Low Latency Interface

(minimization of trigger latency for CTP-FO-LTU connections)

### Assignment, Constraints

- Propose communication interface for future CTP-FO-LTU connection
  - Low latency of link (to minimize trigger latency)
- Xilinx Kintex-7 HR pins (maximal speed: 1250 Mbit/s)
- Samtec FireFly connector/cable (UEC5, UCC8)
  - 12 differential pairs
- More bandwidth than in present system
  - Requested bandwidth: 280 bit @ 40 MHz (11.2 Gbit/s)
- Base design clock: LHC (~40 MHz)
  - Or an integer multiple of it
  - Synchronous to LHC clock

# FPGA Design Proposal (2 links)




#### Configurations at Maximal Rates

|                                  | Variant 1   | Variant 2    |
|----------------------------------|-------------|--------------|
| Bus width @ 40 MHz               | 330 bits    | 308 bits     |
| Intermediate Freq. (Frame Clock) | 200 MHz     | 280 MHz      |
| SERDES                           | 6:1:6       | 4:1:4        |
| Link frequency (Bit Clock)       | 600 MHz     | 560 MHz      |
| One Link speed                   | 1200 Mbit/s | 1120 Mbit/s  |
| Throughput (all 11 links)        | 13.2 Gbit/s | 12.32 Gbit/s |
| Bandwidth Utilisation            | 96 %        | 90 %         |
| Latency Rx+Tx (first fragment)   | 45 ns       | 39.29 ns     |
| Latency Rx+Tx (whole message)    | 70 ns       | 64.29 ns     |

Mbit =  $10^6$  bit, Gbit =  $10^9$  bit
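
The table entries follow from a few relations between frame clock, SERDES ratio and the number of data links. The Python sketch below reproduces them under stated assumptions: 11 data links, the 1250 Mbit/s HR-pin limit quoted in the constraints, a DDR bit clock, and a SER/DES delay of 4 frame-clock cycles. The last figure is inferred here from the quoted 20 ns and 14.29 ns contributions and is not a stated design parameter; all names are illustrative.

```python
BC_MHZ = 40.0              # nominal LHC bunch-crossing clock used on the slides
DATA_LINKS = 11            # 12 differential pairs minus one clock pair
PIN_LIMIT_MBIT_S = 1250.0  # Kintex-7 HR pin limit quoted in the constraints
SERDES_CYCLES = 4          # assumption: SER/DES delay in frame-clock cycles


def link_config(frame_clock_mhz, serdes_ratio):
    """Derive the table quantities for one configuration variant."""
    link_speed = frame_clock_mhz * serdes_ratio      # Mbit/s per data link
    throughput = link_speed * DATA_LINKS             # Mbit/s over all data links
    conv_ns = 1e3 / BC_MHZ                           # one BC period (25 ns)
    serdes_ns = SERDES_CYCLES * 1e3 / frame_clock_mhz
    return {
        "bus_width_bits": throughput / BC_MHZ,       # bits available per BC
        "bit_clock_mhz": link_speed / 2,             # DDR: 2 bits per clock cycle
        "link_speed_mbit_s": link_speed,
        "throughput_gbit_s": throughput / 1e3,
        "utilisation": link_speed / PIN_LIMIT_MBIT_S,
        "latency_first_ns": conv_ns + serdes_ns,
        "latency_whole_ns": conv_ns + serdes_ns + conv_ns,
    }


for name, (clk, ratio) in {"Variant 1": (200.0, 6), "Variant 2": (280.0, 4)}.items():
    cfg = link_config(clk, ratio)
    print(name, {key: round(value, 2) for key, value in cfg.items()})
```

Both variants give a bus width above the requested 280 bits per bunch crossing.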

# Test Setup

- Tested at the maximum datasheet speed
  - 1250 Mbit/s on one pin/link
- DDR, SERDES ratio 1:8
  - Frame clock 156.25 MHz, bit clock 625 MHz
- PRBS test data
- Two identical LM0 boards
  - FPGA Xilinx Kintex-7 (XC7K325T-2FFG900C)
  - Inserted in VME crate, controlled by VME processor
- Samtec FireFly cable with ECUE connectors
  - Various types and lengths
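
The slides do not say which PRBS pattern was used. Purely as an illustration, the sketch below assumes PRBS-7 (polynomial x^7 + x^6 + 1), generates it with a linear-feedback shift register, and counts bit errors by comparing received bits against a locally regenerated reference. The function names and the seed are hypothetical.

```python
def prbs7(n_bits, seed=0x7F):
    """Yield n_bits of a PRBS-7 sequence (x^7 + x^6 + 1, period 127)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero")
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # XOR of taps 7 and 6
        state = ((state << 1) | new_bit) & 0x7F       # shift in the feedback bit
        yield new_bit


def count_bit_errors(received, seed=0x7F):
    """Compare received bits with the regenerated reference sequence."""
    reference = prbs7(len(received), seed)
    return sum(r != e for r, e in zip(received, reference))


sent = list(prbs7(1000))
received = sent.copy()
received[10] ^= 1       # inject two bit errors
received[500] ^= 1
print(count_bit_errors(received))  # 2
```

In a hardware test the receiver would also have to synchronise to the incoming pattern first; that step is omitted here.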



#### **Error Rates**



#### Low Latency Interface Summary

- Latency measured:
  - TX parallel data → RX parallel data
  - Without medium (only transmitter + receiver)
- Our interface latency, Frame Clock = 280 MHz
  - 39.29 ns for the first fragment (25 ns 40/280 MHz conv., 14.29 ns SER/DES)
  - 64.29 ns for the whole message (25 ns 40/280 MHz conv., 14.29 ns SER/DES, 25 ns 280/40 MHz conv.)
- Good scalability (separate clock and data links)
- Comparison: Latency optimized GBT
  - Xilinx Kintex-7, high-speed serial pins
  - Latency 71.2 ns for the first fragment
  - H. Chen, K. Chen, F. Lanni: "The testing of the GBT-FPGA on Xilinx FPGA", 2014

# Summary

- CTP upgrade for LHC Run 2 is integrated and commissioned
  - LM implemented and fully tested
  - 100 classes implemented and fully tested
  - New Past-Future implemented and fully tested
  - DDL2 physical layer implemented and tested
- CTP upgrade for LHC Run 3 is ongoing
  - Design proposal well advanced
  - LTU requirements being collected
  - Low Latency Interface between CTP-FO-LTU tested

# **Back-up Slides**

#### Trigger Data Format

- Detectors with GBT or PON:
  - Event ID (Orbit 32 bits + BCID 12 bits): 44 bits
  - Input Mask: 48 bits
  - Detector Mask: 72 bits (= 24 x 3 different trigger latencies)
  - Event type: 24 bits (= 8 x 3 different trigger latencies)
  - Message/Spare: 64 bits
  - TOTAL: 252 bits per BC
- Detectors with TTC:
  - At 100 kHz there appears to be enough bandwidth to transmit everything except Message/Spare:
  - Event ID (Orbit 32 bits + BCID 12 bits): 44 bits
  - Input Mask: 48 bits
  - Detector Mask: 24 bits
  - Event type: 8 bits
  - TOTAL: 124 bits per BC
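
The field widths above can be checked and exercised with a small packing routine. This is a sketch only: the slide gives the widths but not the bit layout, so the field order, the MSB-first packing and all names below are assumptions.

```python
GBT_FIELDS = [               # (name, width in bits); the order is an assumption
    ("orbit", 32),
    ("bcid", 12),
    ("input_mask", 48),
    ("detector_mask", 72),   # 24 x 3 trigger latencies
    ("event_type", 24),      # 8 x 3 trigger latencies
    ("message", 64),
]
TTC_FIELDS = [
    ("orbit", 32),
    ("bcid", 12),
    ("input_mask", 48),
    ("detector_mask", 24),
    ("event_type", 8),
]


def total_bits(fields):
    """Sum of all field widths in a message format."""
    return sum(width for _, width in fields)


def pack(fields, values):
    """Pack field values MSB-first into one integer; missing fields are 0."""
    word = 0
    for name, width in fields:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(fields, word):
    """Inverse of pack: split an integer back into named fields."""
    values = {}
    for name, width in reversed(fields):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


print(total_bits(GBT_FIELDS), total_bits(TTC_FIELDS))  # 252 124
```

For example, `unpack(TTC_FIELDS, pack(TTC_FIELDS, {"orbit": 123456, "bcid": 3563}))` returns the same orbit and BCID with the remaining fields set to zero.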

# Main CTP Connections

- CTP board to FO board → custom protocol (12 diff. links: 1 clock link + 11 data links at 1200 Mbit/s each, 330 bits per BC, ORBIT is one bit in the data)
- FO board to old LTU+TTCex boards → modified protocol from LHC Run 1 (less data through TTC in order to increase trigger rates)
- Old LTU board to FEE → L0 cable (LVDS) + TTC system
- FO board to new LTU boards → custom protocol
- New LTU board to CRU → GBT (10G PON) (data protocol defined by ALICE)
- New LTU board to ITS → GBT
- CTP board + FO board + new LTU boards to DAQ → control and monitoring via a bidirectional GBT protocol instead of VME (data defined by ALICE)

#### Error Rates

| Frame clock [MHz] | Bit clock [MHz] | Throughput [Mbit/s] | Error rate [1/s]     | BER [-]                |
|-------------------|-----------------|---------------------|----------------------|------------------------|
| 156.25            | 625             | 1250                | 0                    | $3.92 \times 10^{-14}$ |
| 175               | 700             | 1400                | 0                    | $3.50 \times 10^{-14}$ |
| 180               | 720             | 1440                | 0                    | $3.99 \times 10^{-14}$ |
| 181.25            | 725             | 1450                | 0                    | $1.91 \times 10^{-14}$ |
| 185               | 740             | 1480                | $3.00 \times 10^{6}$ | $2.03 \times 10^{-3}$  |
| 187.5             | 750             | 1500                | $1.25 \times 10^{7}$ | $8.33 \times 10^{-3}$  |
| 190               | 760             | 1520                | $1.30 \times 10^{7}$ | $8.58 \times 10^{-3}$  |
| 193.75            | 775             | 1550                | $4.48 \times 10^{7}$ | $2.89 \times 10^{-2}$  |
| 195               | 780             | 1560                | $7.52 \times 10^{7}$ | $4.82 \times 10^{-2}$  |

For measurements with 0 errors, the BER is an upper limit estimated at the 95% confidence level.
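
For the rows with errors, the BER is the error rate divided by the throughput. For the error-free rows, a common way to quote an upper limit at confidence level CL after N error-free bits is BER < -ln(1 - CL) / N. The slides do not state which estimator was used, so the sketch below treats that formula as an assumption; the function names are illustrative.

```python
import math


def ber_from_rate(errors_per_s, throughput_mbit_s):
    """BER as errors per second divided by bits per second."""
    return errors_per_s / (throughput_mbit_s * 1e6)


def ber_upper_limit(n_bits, confidence=0.95):
    """Upper limit on the BER after n_bits received with zero errors (assumed estimator)."""
    return -math.log(1.0 - confidence) / n_bits


def bits_needed(ber_limit, confidence=0.95):
    """Error-free bits needed to claim a given BER upper limit."""
    return -math.log(1.0 - confidence) / ber_limit


print(f"{ber_from_rate(3.00e6, 1480):.2e}")   # 2.03e-03, as in the 185 MHz row
n = bits_needed(3.92e-14)
print(f"{n:.2e} bits, {n / 1250e6 / 3600:.1f} h at 1250 Mbit/s")
```

Under this assumed estimator, the limit quoted for 1250 Mbit/s would correspond to roughly 7.6 × 10^13 error-free bits, i.e. about 17 hours of running.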