# Implementation of a PC-based Level 0 Trigger Processor for the NA62 Experiment

A. Gianoli

INFN - Sezione di Ferrara

## The NA62 experiment

- A fixed target experiment
- Precision kaon physics program at CERN
- Ultra rare K decays: $K \rightarrow \pi \nu \nu$
- How rare is it? 1 in $10^{10}$ - $10^{11}$ particle decays
- Aim to get O(100) events in 2-3 years
- Very intense primary beam: $10^{13}$ protons/s
- Very intense secondary beam: $10^{9}$ particles/s
- Many (uninteresting) events: $10^{7}$ decays/s

## Trigger/DAQ key requirements

- Ultra-rare decays
- Not limited by proton flux
- Reliability of vetoing power
- High trigger efficiency (>95%)
- Low random veto (<5%)
- High online time + double pulse resolution
- High data bandwidth
- DAQ reliability (undetected losses $<10^{-8}$)
- Trigger reproducibility
- Integrated Trigger + DAQ (40 MHz common coherent clock)
- Completely digital data stream from FE to TDAQ
- Fully monitored system (inefficiency and flow control recording)
- Uniformity for most subdetectors
- Custom hardware minimized: L0 hardware + L1/2 software
- Bandwidth scalability
- Flexibility: higher intensities, additional physics channels, upgrades

## Trigger/TDAQ overview

## L0 Trigger Processor (L0TS)

#### Tasks:

- merge primitive lists (collect them via Ethernet)
- re-synchronize L0 trigger to drive TTC
- provide trigger data for readout

#### Requirements:

- cut 10 MHz → 1 MHz (up to 7 detectors; CHOD, MUV, LKR, RICH should suffice)
- fixed delay response (< 1 ms)

## L0TS: how to do it?

#### Classical way

- custom module
- FPGA based
- real-time

#### What we would like

- off-the-shelf components
- flexibility
- simplicity (to program and to maintain)

Do we really need real-time? Where?
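To see where real-time behaviour might matter, the stated requirements can be turned into a rough budget. The sketch below is only arithmetic on the numbers quoted above (10 MHz in, 1 MHz out, a fixed delay below 1 ms, a 40 MHz clock). It assumes the 10 MHz figure is the total rate the processor has to examine and takes the 1 ms bound itself as the latency; the function and field names are illustrative.

```python
# Rough budget from the quoted requirements (integer arithmetic throughout).
INPUT_RATE_HZ = 10_000_000    # rate entering L0 (10 MHz), assumed to be the total
OUTPUT_RATE_HZ = 1_000_000    # maximum L0 accept rate (1 MHz)
MAX_LATENCY_US = 1_000        # upper bound on the fixed delay (1 ms), used as the latency
CLOCK_HZ = 40_000_000         # common coherent clock (40 MHz)


def l0_budget(input_hz, output_hz, latency_us, clock_hz):
    """Return the headline numbers implied by the rate and latency requirements."""
    return {
        # factor by which the selection must reduce the rate
        "rejection_factor": input_hz // output_hz,
        # mean time available per input event on a single processing thread
        "per_event_ns": 1_000_000_000 // input_hz,
        # events that arrive while one decision is still pending
        "events_in_flight": input_hz * latency_us // 1_000_000,
        # the latency bound expressed in clock periods
        "latency_clock_ticks": clock_hz * latency_us // 1_000_000,
    }


budget = l0_budget(INPUT_RATE_HZ, OUTPUT_RATE_HZ, MAX_LATENCY_US, CLOCK_HZ)
for name, value in budget.items():
    print(f"{name}: {value}")
```

Under these assumptions the selection itself only has to keep up on average (about 100 ns per event), while the fixed-delay output is the part that has to be deterministic.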
## L0TS: how to do it?

- use a high-performance PC to run the selection algorithm
- use an FPGA board to handle the fixed-delay output to TTC (needs real-time)
- avoid memory-to-memory copy: use the FPGA board to collect primitives (UDP packets) and put them into PC RAM

#### HW used:

- Core i7 920, 2.67 GHz
- Core i7 3930K, 3.2 GHz
- Terasic DE4-230 board (Altera Stratix IV, PCIe Gen2 x8, 4 Ethernet ports)

## Matching algorithm v 0.1

- Test computation requirements
- Dummy primitives already loaded in RAM
- Primitives are time aligned
- Single "smart" trigger
- Trigger condition: $CHOD \land MUV \land LAV \land LKR \land RICH$

## Round Trip Time v 0.1

- Add the FPGA (no net): FPGA → CPU → FPGA
- Primitives are "time aligned"
- FPGA adds its "own" timestamp to data

## RTT v 0.1

## Matching algorithm v 0.9

- Primitives won't be time aligned: more sophisticated match
- "One" trigger is not enough: need at least 8
- More realistic primitives, not time aligned, "smart" trigger
- Average matching time: ~16 ns/event
- Modify algorithm to accommodate more triggers
- With 8 triggers: ~25 ns/event

## CPU-DE4 synchronization

- Problems arise with "not time aligned" primitives
- Tested two synchronization methods:
  - active polling
  - credit buffer

## RTT v 0.9: polling

## RTT v 0.9: credit buffer

## Summary

- We are investigating the feasibility of a mixed PC-FPGA system as an L0 trigger
- Latencies and rates pose a challenge in several respects
- Results are good, no fundamental show-stoppers
- The complete system will be ready for data-taking starting in 2014
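The slides do not show the matching code, so the following is only a minimal sketch of one way to match primitives that are not time aligned against several trigger conditions. It assumes each primitive is a `(detector, timestamp)` pair, groups primitives into coarse time slots of a hypothetical width `slot_width`, and describes each trigger as a bitmask of required detectors. The bit assignments, `required_mask`, and `match` are illustrative names and not the author's implementation. Coincidences that straddle a slot boundary are missed by this simple binning.

```python
from collections import defaultdict

# Detector names come from the trigger condition; the bit positions are hypothetical.
DETECTORS = ("CHOD", "MUV", "LAV", "LKR", "RICH")
BIT = {name: 1 << i for i, name in enumerate(DETECTORS)}


def required_mask(*names):
    """Build the bitmask of detectors that a trigger requires."""
    mask = 0
    for name in names:
        mask |= BIT[name]
    return mask


def match(primitives, trigger_masks, slot_width):
    """Return sorted (slot, trigger_index) pairs for every satisfied trigger.

    primitives: iterable of (detector_name, timestamp), in any order.
    trigger_masks: list of bitmasks built with required_mask().
    slot_width: coincidence slot width, in the same units as the timestamps.
    """
    seen = defaultdict(int)  # slot index -> OR of detector bits seen in that slot
    for detector, timestamp in primitives:
        seen[timestamp // slot_width] |= BIT[detector]

    fired = []
    for slot in sorted(seen):
        for index, mask in enumerate(trigger_masks):
            if (seen[slot] & mask) == mask:  # all required detectors present
                fired.append((slot, index))
    return fired


# Two illustrative triggers: the five-detector condition and a looser one.
triggers = [
    required_mask("CHOD", "MUV", "LAV", "LKR", "RICH"),
    required_mask("CHOD", "MUV"),
]
# Primitives arrive unordered, as they would from independent detectors.
example = [
    ("RICH", 103), ("CHOD", 101), ("MUV", 100), ("LKR", 104), ("LAV", 102),
    ("CHOD", 250), ("MUV", 251),
]
result = match(example, triggers, slot_width=25)
print(result)
```

Using one bitmask per trigger keeps the cost of adding triggers to one AND and one compare per slot, which is consistent in spirit with the modest increase quoted for going from one to eight triggers, although the real algorithm may be organized quite differently.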