The ATLAS Level-1 Topological Processor: Phase-I upgrade and Phase-II plans

Outline:

- Why a topological trigger?
- New Topo for Run 3
- Topological Trigger in Run 4

Emanuel Meuser on behalf of the ATLAS Collaboration ICHEP2024 | Prague | 17.07.2024 - 24.07.2024









## **Concept of a topological trigger**

Most triggers are just based on multiplicities:

- Works, but rate scales with luminosity
- L1 Rate limited to 100 kHz
- Would need to increase thresholds...



Add topological requirements such as $\Delta \eta$, $\Delta \phi$, $\Delta R$, invariant mass, ...

=> Small reduction of signal, huge reduction of rate!



- 3 dual-width, **custom-designed** ATCA boards
- 2 processor FPGAs (VU9P) + 1 Zynq SOM per board
- 12 Minipod opto/electrical transceivers per FPGA
  - 10 RX and 2 TX per FPGA at 11.2 Gbps
  - Total receiving bandwidth of the system: <u>8 Tbit/s</u>
- System fully installed in January 2022
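
The quoted receiving bandwidth can be cross-checked from the numbers above. This is a rough estimate that assumes 12 channels per Minipod (a device property not stated on the slide) and reads "10 RX" as 10 receiving Minipods per FPGA:

$$
B_{\mathrm{RX}} = \underbrace{3 \times 2}_{\text{FPGAs}} \times \underbrace{10}_{\text{RX Minipods}} \times \underbrace{12}_{\text{channels}} \times 11.2\ \mathrm{Gbps} = 8064\ \mathrm{Gbps} \approx 8\ \mathrm{Tbit/s}
$$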

Commissioning challenges in ...

- ... hardware testing, installation and verification
- ... correct fiber mapping
- ... conversion of various input data formats into a common format
- ... validation by bit-wise comparison to software simulation for 80+ algorithms
  - Always the question of whether the bug is in the firmware or the software...
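
A bit-wise comparison of this kind can be sketched as follows. The function name and the packing of decision bits into a single integer are assumptions for illustration, not the ATLAS validation code:

```python
def mismatched_bits(fw_word, sim_word, n_bits):
    """Return the indices of decision bits where firmware and simulation differ.

    fw_word / sim_word: decision bits of one bunch crossing packed into an int
    (hypothetical packing: bit i = result of algorithm i).
    """
    diff = fw_word ^ sim_word  # XOR leaves a 1 wherever the two disagree
    return [i for i in range(n_bits) if (diff >> i) & 1]


# Toy example: only bit 1 disagrees between firmware and simulation.
print(mismatched_bits(0b1011, 0b1001, n_bits=4))
```

A mismatch list only says that the two disagree on a given bit; it cannot tell which side holds the bug.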



## L1Topo - Algorithmic firmware overview

L1Topo firmware challenges:

- Spread over 6 FPGAs
- 80+ algorithms, i.e. different firmware blocks
  - Different multiplicities
  - Topological calculations: $\Delta R$, invariant mass, ...
  - More complex algorithms: Kalman-MET, LLP trigger, ...
- 200 results per BC
- Highly parallel, pipelined firmware
  - Limited to a 75 ns latency budget
- => Firmware is generated by script from the Trigger Menu, using "hand-written" firmware blocks
- => Firmware blocks are structured into 3 categories: Multiplicities, Sort/Selects and Decisions
- => Parameters can be configured at run time via software (IPbus)





## L1Topo - B-physics triggers

Topological B-physics trigger:

- Select up to 10 µ **T**rigger **OB**jects (TOBs)
- For all 45 combinations of 2 µ, calculate ...
  - ... $\Delta \eta$
  - ... $\Delta \phi$
  - ... the invariant mass, treating the $\mu$ as massless:

    $M^2 = 2 \, E_{T,1} \, E_{T,2} \left( \cosh(\Delta \eta) - \cos(\Delta \phi) \right)$

- Fire the trigger bit if any combination falls within the mass window of the $J/\psi$ or the $\Upsilon$

**<u>L1Topo provides ~70 % of the unique rate</u> for $J/\psi$ and $\Upsilon$ candidates!**
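
The pair loop of the B-physics trigger can be sketched in software as below. This is a minimal illustration, not the firmware: the TOB tuple layout, the function names and the mass window values are assumptions. The 45 combinations are the $\binom{10}{2}$ unordered pairs of 10 TOBs.

```python
import math
from itertools import combinations


def inv_mass_sq(et1, eta1, phi1, et2, eta2, phi2):
    """M^2 = 2 * Et1 * Et2 * (cosh(d_eta) - cos(d_phi)) for massless objects."""
    return 2.0 * et1 * et2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2))


def dimuon_trigger(tobs, windows):
    """Fire if any pair of (up to 10) muon TOBs has a mass inside any window.

    tobs:    list of (et, eta, phi) tuples, Et in GeV (hypothetical layout)
    windows: list of (m_low, m_high) in GeV (hypothetical values)
    """
    for a, b in combinations(tobs[:10], 2):
        m2 = inv_mass_sq(*a, *b)
        # Compare M^2 against squared window edges, so no square root is needed.
        if any(lo * lo <= m2 <= hi * hi for lo, hi in windows):
            return True
    return False


# Toy event: two back-to-back 5 GeV muons at eta = 0 give M^2 = 100 GeV^2.
muons = [(5.0, 0.0, 0.0), (5.0, 0.0, math.pi)]
print(dimuon_trigger(muons, windows=[(8.0, 12.0)]))
```

Since $\cos$ is periodic, $\Delta \phi$ needs no wrapping for the mass calculation, and working with $M^2$ avoids a square root.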



## L1Topo - more topological triggers

More noteworthy topological algorithms:

- Simple Cone:
  - Cluster small-R jets over threshold within R = 1 into a large-R jet
  - Fire trigger if the clustered energy is over threshold
- VBF triggers look for a combination of 2 jets with:
  - high $\eta$
  - high $\Delta \phi$
  - high invariant mass
  - Fire trigger if all 3 requirements are met
- ZAFB (Z -> ee) requires:
  - One central electron (eFEX)
  - One forward electron (jFEX)
  - $\Delta \phi$ and invariant mass within a window
  - Specifically targets the measurement of the electroweak mixing angle
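
The Simple Cone idea from the list above can be sketched as follows. This is an illustrative reading, not the L1Topo implementation: seeding a cone on every jet over threshold, the tuple layout and all threshold values are assumptions.

```python
import math


def delta_r(a, b):
    """Angular distance between two jets given as (et, eta, phi)."""
    dphi = abs(a[2] - b[2]) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi  # wrap into [0, pi]
    return math.hypot(a[1] - b[1], dphi)


def simple_cone_fires(jets, jet_et_min, cone_et_min, radius=1.0):
    """Fire if any cone of small-R jets carries at least cone_et_min.

    Only jets over jet_et_min take part; each of them is tried as a seed
    (assumption), and the Et of all jets within `radius` of it is summed.
    """
    selected = [j for j in jets if j[0] >= jet_et_min]
    for seed in selected:
        cone_et = sum(j[0] for j in selected if delta_r(seed, j) <= radius)
        if cone_et >= cone_et_min:
            return True
    return False


# Toy event: two nearby jets (30 + 25 GeV) and one isolated 40 GeV jet.
jets = [(30.0, 0.0, 0.0), (25.0, 0.5, 0.3), (40.0, 2.5, 3.0)]
print(simple_cone_fires(jets, jet_et_min=20.0, cone_et_min=50.0))
```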









## L1Topo - Long Lived Particle (LLP) triggers

Multiple events are 'in-flight' on L1Topo at the same time, which allows for special triggers using 2 consecutive BCs. Examples are:

Late-Muon Trigger:

- A faster multiplicity algorithm checks for a high-p<sub>t</sub> muon and sends its trigger bit to the CTP 1 BC earlier
- The CTP forms a coincidence of the late-muon trigger bit with a single-jet trigger bit or a MET trigger bit

Jet + MET:

- Delay MET by additional pipelining
- $\Delta \phi$  calculation of delayed MET and Jet
- Fire trigger bit for small  $\Delta \phi$
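
The Jet + MET scheme can be modelled in software with a one-stage delay line. This is a behavioural sketch only: the class name, the use of `None` for "no MET above threshold" and the cut value are assumptions.

```python
import math
from collections import deque


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into [0, pi]."""
    dphi = abs(phi1 - phi2) % (2.0 * math.pi)
    return 2.0 * math.pi - dphi if dphi > math.pi else dphi


class DelayedMetJet:
    """Compare the MET of an earlier BC with the jets of the current BC."""

    def __init__(self, dphi_max, delay_bcs=1):
        self.dphi_max = dphi_max
        # Fixed-length pipeline: appending a new MET pushes out the oldest one.
        self.pipeline = deque([None] * delay_bcs, maxlen=delay_bcs)

    def clock(self, met_phi, jet_phis):
        """Advance by one BC; return True if the trigger bit fires."""
        delayed = self.pipeline[0]     # MET from delay_bcs crossings ago
        self.pipeline.append(met_phi)  # store this BC's MET for later
        if delayed is None:
            return False
        return any(delta_phi(delayed, p) <= self.dphi_max for p in jet_phis)


trig = DelayedMetJet(dphi_max=0.5)
print(trig.clock(met_phi=1.0, jet_phis=[1.1]))   # pipeline still empty
print(trig.clock(met_phi=None, jet_phis=[1.2]))  # jet close to previous MET
```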





## LHC schedule - Towards Higher Luminosities



High-Lumi LHC brings challenges for the trigger:

- Luminosity of up to $7.5 \cdot 10^{34}$ cm<sup>-2</sup>s<sup>-1</sup>
- Pileup of up to 200 (60 in Run 3)

**=> Adapt the trigger system for High-Lumi**



## ATLAS Level-0 Trigger System - Run 4 and beyond



Changes to the first-level trigger for Run 4:

- Overall latency increases from 2.5 μs to 10 μs
- Full cell-level granularity of the whole detector combined on a single FPGA of L0Global
  - TOBs from L0Calo and L0Muon
  - Runs its own e, j, tau, XE, TE algorithms to improve TOB efficiencies
  - Hosts the topological trigger
- => Combines 1 event onto a single FPGA at full granularity using <u>time multiplexing</u>



## L0Global - time multiplexed topological firmware

Time-multiplexed system using a round-robin scheme:

- 1.2 µs until next event
- Data moves serially through L0Global
- Variety of algorithms running on L0Global
- => Tight resource budget: 100k LUTs allocated for the topological part (3.3M LUTs on VP1802)
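
As a rough cross-check of the 1.2 µs figure: assuming one event arrives every 25 ns bunch crossing and each event is routed to the next processing node in turn, the number of nodes $N$ in the round-robin would be

$$
N = \frac{1.2\ \mu\mathrm{s}}{25\ \mathrm{ns}} = 48 .
$$

This node count is inferred from the two numbers and is not stated on the slide.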



## **L0Global - Minimization of Resources**



L1Topo firmware is highly parallelized due to tight latency:

- Decision logic is built for each combination of TOBs
- Topological algorithm forms its decision within a single BC (25 ns)

Completely different boundary conditions (L1Topo -> L0Global):

- 2.5 M LUTs -> 100 k LUTs in resource occupation
- 75 ns -> 1.2 μs in latency budget (400 ns for muons)
- => Can't just copy & paste L1Topo firmware onto L0Global!



## **L0Global - Minimization of Resources**



Resource minimization through serialization:

- Trade resources vs. time
- Process one combination per clock tick
- Requires additional logic to provide all combinations

For this example algorithm, serialization reduces resource costs from 31,732 to 636 LUTs: the 60 combinations are processed sequentially in 60 sub-ticks (60 × 3.125 ns) and combined by a sequential OR.

- 95 % of Run 3 algorithms are already serialized and fit into the 100k budget
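
The trade of resources against time can be illustrated with a small behavioural model. It is a sketch under assumptions: the function names are hypothetical, and the split of the 60 combinations into 6 × 10 TOBs is chosen only to reproduce the count. With the numbers above, 60 sub-ticks take 60 × 3.125 ns = 187.5 ns, and the LUT count drops by a factor of 31,732 / 636 ≈ 50.

```python
from itertools import product


def parallel_decision(tobs_a, tobs_b, passes):
    """L1Topo-like: one decision unit per combination, all evaluated at once."""
    return any(passes(a, b) for a, b in product(tobs_a, tobs_b))


def serialized_decision(tobs_a, tobs_b, passes, tick_ns=3.125):
    """One decision unit reused: one combination per sub-tick, results ORed.

    Returns the decision and the time spent, in ns.
    """
    fired = False
    ticks = 0
    for a, b in product(tobs_a, tobs_b):
        fired |= bool(passes(a, b))  # sequential OR; every tick is still spent
        ticks += 1
    return fired, ticks * tick_ns


# Toy cut (hypothetical): fires for exactly one of the 6 x 10 combinations.
cut = lambda a, b: a + b == 14
fired, latency_ns = serialized_decision(range(6), range(10), cut)
print(fired, latency_ns)
```

Both functions give the same decision; the serialized one needs a single decision unit but spends one sub-tick per combination.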

## L0Global - time multiplexed topological firmware

Having a serialized firmware allows us to reduce the trigger-menu dependency of the topological firmware (a.k.a. Hypothesis) further than was possible in L1Topo. We aim for:

- A generic 64-bit TOB (type independent)
- Type-independent selection and sorting blocks using CFGLUT5s
- Overbuilt decision algorithm logic to fulfill most requirements
  - Some special-case algorithms will have to stay hardcoded
- Add (pruned) switches to allow for flexible data flow

**=> Will greatly reduce menu dependence**



## Conclusion

- Low-p<sub>t</sub> physics data taking benefits immensely from the topological trigger
- The Run 3 Topo system is installed and fully commissioned
- First physics results from Run 3 show great performance!
- Run 4: Completely different boundary conditions for topological firmware
  - Serialized topological algorithms fit into the 100k LUT budget
  - Potential for decoupling the firmware further from the Trigger Menu

