### Prompt Trigger Primitives for a Self-Seeded Track Trigger

#### Mitch Newcomer, Nandor Dressnandt

Inspiration from Maurice Garcia-Sciveres & real work from Amogh Halgeri, Vinata Koppal, Madhura Kamat

### **Prompt Trigger Objective**





Challenge: Provide cluster information within a few bunch crossings (BC) of the event time, so that the beam-synchronous first-level trigger can be seeded with high-PT-track-enhanced triggers.

### Hardware Prototype Development

Target Location - Outer layers of the ATLAS silicon strip tracker.

Purpose - Provide the Central Trigger Processor with low-latency, intra-layer coincidence information, with a strong preference for primitives (stubs) likely to reconstruct into high-PT tracks.

Readout Granularity - Blocks of 128 contiguous strips bonded to the FEICs, two 128-strip blocks per FEIC. One dedicated LVDS output per block transmits serialized cluster data to an intra-layer cluster correlator chip.



### Preliminary Tracking Layer Assumptions

#### Outer barrel layer tracking at R ~70cm to 100cm

- Strip occupancy: 2% for long strips  $\rightarrow$  0.5% for short strips
- The number of clusters depends on cluster width  $\rightarrow$  fewer clusters than hit strips.



## **Cluster Resolution**

- ATLAS strip pitch ~75 µm. Strip length: short 2.5 cm, long 10 cm.
- High-PT tracks leave ionization in 1 or 2 strips.
- A cluster position resolution of ~40 µm is possible by identifying the number of strips over threshold (1 or 2).

#### Rate

- Assume typically 2 or fewer clusters / bank of 128 strips.
  - This gets frozen early on in the FEIC design. Need to be sure it makes sense now.

#### Occupancy vs ModuleID for pileup 200



### FEIC Fast Clustering Conceptual Block Diagram



- Store data as two banks of 128 strips at the rising BC edge, with one buffer register per bank.
- Run the combinational-logic clustering algorithm (6 ns) using tightly restricted acceptance rules.
- Send fixed-length cluster information at a fixed number of BCs (2) after the event to each dedicated LVDS output.

#### This is being implemented in the ABC130



- Treat each 128-strip bank independently.
- A cluster may have 1 or 2 strips only → 40 µm resolution (8 bits/cluster).
   A cluster of 3+ strips is vetoed → prefers energetic tracks.
- Cluster finding proceeds from both ends towards the middle.
- The first 2 clusters found are reported for each bank (2 × 8-bit clusters = 16 bits).
- New cluster information is locked out until the serialized cluster data has been sent, to prevent overwrite for serializer clocks slower than 640 MHz.
  - The cluster processor restarts as soon as the overwrite condition is avoidable.
- Data  $\rightarrow$  two 8-bit values, 16 bits total = 4 BCs at 160 Mbps
  - = 1 BC at 640 Mbps
  - "Null" cluster = "FF"; no clusters = "FFFF"
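The acceptance rules above can be sketched in software. This is a minimal behavioural model, not the ABC130 logic: the half-strip position code (`2 * strip` for a 1-strip cluster, `2 * strip + 1` for a 2-strip cluster), the order of the two bytes in the 16-bit word, and the function names are all assumptions made for illustration. With this assumed code the value `FF` is never produced by a real cluster, so it stays free as the null marker.

```python
NULL = 0xFF  # "null" cluster code quoted on the slide


def runs_of_hits(hits):
    """Yield (first_strip, width) for each run of adjacent hit strips."""
    start = None
    for strip, hit in enumerate(list(hits) + [0]):  # sentinel closes a trailing run
        if hit and start is None:
            start = strip
        elif not hit and start is not None:
            yield start, strip - start
            start = None


def encode(first_strip, width):
    """Assumed 8-bit code: cluster centre in half-strip units (max 254)."""
    return 2 * first_strip + (width - 1)


def fast_cluster(hits):
    """Return the 16-bit word for one 128-strip bank."""
    if len(hits) != 128:
        raise ValueError("a bank has 128 strips")
    # Clusters of 3 or more strips are vetoed.
    accepted = [run for run in runs_of_hits(hits) if run[1] <= 2]
    # Scanning from both ends: first accepted cluster from the low end,
    # first accepted cluster from the high end (if it is a different one).
    low = encode(*accepted[0]) if accepted else NULL
    high = encode(*accepted[-1]) if len(accepted) > 1 else NULL
    return (low << 8) | high  # byte order is an assumption
```

For example, a bank with a single hit on strip 5, a 3-strip cluster on strips 50-52 and a 2-strip cluster on strips 100-101 gives `0x0AC9` in this model: the wide cluster is dropped and the other two are reported.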

#### **ABC130 Fast Cluster Implementation**

One 256-channel FEIC  $\rightarrow$  two 128-strip physical banks



### TIMING DIAGRAM





## FAST CLUSTER FINDER in FEIC

#### Fast Read Out at 40 MHz

- No. of Gates 10600
- Total Power 3.88 mW

#### **Hit Location Latch**

- No. of Gates 101
- Total Power 0.129 mW

#### Serializer at 160MHz

- No. of Gates 106 (upper) + 106 (lower)
- Total Power 0.445 mW

#### Serializer at 640MHz

- No. of Gates 106 (upper) + 106 (lower)
- Total Power 1.45 mW

#### Totals

- Total # of gates 10913
- Total Area 27200 µm² (165 × 165 µm)
- Total Power at 160 MHz 4.45 mW
- Total Power at 640 MHz 5.459 mW
- Two Drivers and 1 Receiver add ~6 mW
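The totals quoted on this slide follow from the per-block figures. The short check below only re-adds numbers copied from the slide; the variable and function names are hypothetical.

```python
# Per-block figures copied from the slide.
FAST_READOUT_GATES, FAST_READOUT_MW = 10600, 3.88
HIT_LATCH_GATES, HIT_LATCH_MW = 101, 0.129
SERIALIZER_GATES = 106 + 106              # upper + lower
SERIALIZER_MW = {160: 0.445, 640: 1.45}   # serializer clock in MHz -> power in mW
IO_MW = 6.0                               # two drivers and one receiver, "~6 mW"


def total_gates():
    return FAST_READOUT_GATES + HIT_LATCH_GATES + SERIALIZER_GATES


def core_power_mw(serializer_mhz):
    """Power of the fast-cluster logic without the LVDS I/O."""
    return FAST_READOUT_MW + HIT_LATCH_MW + SERIALIZER_MW[serializer_mhz]


print(total_gates())                  # gate count
print(core_power_mw(160))             # mW with the 160 MHz serializer
print(core_power_mw(640))             # mW with the 640 MHz serializer
print(core_power_mw(640) + IO_MW)     # mW including I/O
```

Adding the I/O to the 640 MHz figure gives about 11.5 mW, which is in line with the "~12 mW" of additional power quoted in the Summary.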

### Fast Cluster in ABC130 NC Verilog Simulation Output



### **Correlator Chip**

#### Target Design: 1 Correlator / Hybrid with 1 output @ 640Mbps

#### How long does it take to get coincidence data into the correlator chip?

→ 16 'bit times' are needed to get candidate coincident data into the correlator chip.
 @640Mbps → one BC, assuming both sides transmit in parallel.

However, there are many parallel inputs into the correlator chip:

- Each ABC130 sends clusters from 2 banks of 128 strips.
- 10 ABC130s per hybrid → 20 correlator inputs / hybrid / side.
- One correlator per hybrid position serves both sides of a barrel layer.
- 40 inputs → 20 unique hybrid positions: 5 bits to uniquely identify the hybrid address (@).
- Stub data = 16 bits + 5 bits of hybrid address, and all of it arrives in parallel.

→ Can't be sent out raw in one BC @640Mbps! Needs an intelligent packing algorithm or ... ???
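The bandwidth mismatch can be made explicit with a few lines of arithmetic. This is only a back-of-envelope sketch: the 25 ns BC period follows from the 40 MHz readout clock quoted earlier, and the names are hypothetical.

```python
BC_NS = 25            # one bunch crossing at 40 MHz
CLUSTER_WORD_BITS = 16
HYBRID_ADDRESS_BITS = 5
INPUTS = 40           # 20 banks per side, two sides


def bits_per_bc(rate_mbps):
    """Bits one serial line can carry in a single bunch crossing."""
    return rate_mbps * BC_NS // 1000


out_capacity = bits_per_bc(640)                       # one output line
raw_input = INPUTS * CLUSTER_WORD_BITS                # everything arriving per BC
raw_stub = CLUSTER_WORD_BITS + HYBRID_ADDRESS_BITS    # one stub plus its address

print(out_capacity, raw_input, raw_stub)
```

In this model the single 640 Mbps output carries 16 bits per BC, while 640 bits arrive per BC and even one raw stub with its address needs 21 bits, so some reduction before transmission is unavoidable.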

## How many Stubs / Hybrid / BC

- Raw rates per 128-strip bank:
  - The probability of 2 or fewer clusters / bank is >50% in the inner detector.
  - Using short strips in an outer-layer tracking region raises this (good) to ~90% probability of 2 or fewer clusters / bank.
  - This is good for the ABC130s: they will keep up.

 $\rightarrow$  But the raw rate is large for a hybrid-based Correlator with 10 chips & one output.
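The per-bank probabilities above can be compared with a rough estimate. The sketch below assumes that hits are independent, Poisson distributed, and that every hit strip counts as its own cluster; none of this is stated on the slides, and it is not the simulation behind the quoted numbers. The 2% and 0.5% occupancies are the long-strip and short-strip values from the tracking-layer assumptions.

```python
import math


def prob_at_most(k, mean):
    """Poisson probability of observing k or fewer hits for a given mean."""
    return sum(math.exp(-mean) * mean**n / math.factorial(n) for n in range(k + 1))


def prob_two_or_fewer(occupancy, strips=128):
    """Probability that a bank holds at most 2 hits (assumed ~ clusters)."""
    return prob_at_most(2, occupancy * strips)


print(prob_two_or_fewer(0.02))    # long strips, 2% occupancy
print(prob_two_or_fewer(0.005))   # short strips, 0.5% occupancy
```

Under these assumptions the model gives roughly 0.53 at 2% occupancy and roughly 0.97 at 0.5%, the same trend as the ">50%" and "~90%" figures; real clusters are correlated, so the slide values should be taken as the reference.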

- (Intelligent) correlated coincident high-PT tracks
  - Expect a data reduction of ~×10 to ×100 from the coincidence requirement.
  - Given the target correlator design, two regimes need to be examined:
  - 1) If # interesting coincidences ~0.1 / BC / hybrid: must link stub data with the BCID to be efficient, and institute FIFO-based transmission.
     The probability of more than 1 coincident pair is much larger than zero.
  - 2) If # interesting coincidences < 0.05 / BC / hybrid: sending BC-synchronous data
     @640Mbps is OK if a stub is defined uniquely by a 16-bit word.
  - → This rate has significant implications for the correlator chip (*later*): beam synchronous or BCID tagged?

### **Pre-informed...** Data Reduction

In a bank of 256 half-spaced strips (2<sup>8</sup>), if each strip has 8 interesting matches (2<sup>3</sup>), then 2<sup>11</sup> unique tags can be defined, one per unique coincidence. If these reside in a large enough memory, say 2<sup>12</sup>, then each correlator can be programmed for the interesting coincidences at its unique position. A hybrid-based correlator could be designed with 20 identical memories to cover all interesting possibilities.

The combination of inner and outer addresses can itself be used as a unique address to access the tag in memory, but the memory doesn't need to be as large as the list of all possible addresses.
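One way to see why the memory can be much smaller than the 2<sup>16</sup> possible (inner, outer) address pairs is to store the outer position as an offset inside a small window around the inner position. The sketch below is only an illustration of that idea: the 8-wide window comes from the "8 interesting matches" above, but the windowing scheme itself, the `base_offset` parameter and the function name are assumptions, not the chip design.

```python
WINDOW = 8  # 2**3 interesting outer matches per inner position


def tag_for(inner, outer, base_offset=-4):
    """Return an 11-bit tag for an interesting pair, or None otherwise.

    inner, outer: 8-bit half-strip positions on the two sides.
    base_offset:  hypothetical per-correlator setting that places the
                  8-wide window relative to the inner position.
    """
    delta = outer - inner - base_offset
    if 0 <= delta < WINDOW:
        return (inner << 3) | delta   # 8 bits of position + 3 bits of offset
    return None


print(tag_for(100, 100))   # inside the window
print(tag_for(100, 110))   # outside the window
```

Every tag produced this way fits in 11 bits, so together with a 5-bit hybrid address it fills exactly one 16-bit word.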



Only 20 bank pairs come into each correlator chip.

### Connection of ABCN130 in a Seeded Track Layer



### Geometric Hybrid Mapping Outer or Inner track layer



WIT2012, May 4, 2012

### Single Bank (128 strip) Serial Data Stream



### Single Bank (128 strip) Serial Data Streams



## **Correlator Chip Approach**

Interesting Coincidence Memory Search

20 searches are done in parallel after the data is acquired.

Outer: 2 × 8 bits / 128-strip bank / BC. Include consideration for nearest-neighbor stiff tracks.



- 3 possible memory address (@) checks for each cluster position
- Total for both coincidences / inner hybrid position:
   six memory cycles. @640MHz ?? @160MHz ??
   The tradeoff is power vs. delay, not throughput.

If the number of interesting coincidences can be limited to 2<sup>11</sup> possibilities per 2560 strip pairs, then one "stub" with 5 hybrid address bits can be sent up to the Trigger Processor per BC @640Mbps, with a Correlator latency of ~5 BC.
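The open question about the memory clock can be put in numbers. The sketch below lists the six tests for one bank position in the order used in the sequential-test table (each inner index checked against the outer index one above, equal, and one below) and converts six memory cycles into time. The function names are hypothetical, and one memory lookup per clock cycle is an assumption.

```python
BC_NS = 25.0  # one bunch crossing at 40 MHz


def neighbour_tests(n):
    """Six (inner, outer) index pairs tested for one bank position."""
    return [(inner, inner + step) for inner in (n + 1, n) for step in (1, 0, -1)]


def search_time_ns(cycles, clock_mhz):
    """Time for a number of sequential memory cycles at a given clock."""
    return cycles * 1000.0 / clock_mhz


tests = neighbour_tests(5)
for clock_mhz in (640, 160):
    t = search_time_ns(len(tests), clock_mhz)
    print(clock_mhz, "MHz:", t, "ns =", t / BC_NS, "BC")
```

In this model the six cycles take 9.375 ns at 640 MHz and 37.5 ns (1.5 BC) at 160 MHz, so either clock fits inside a ~5 BC correlator latency and the choice is mainly about power.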

### **Correlator Chip Approach**

Memory Addresses 2<sup>16</sup>

| Test group                            | Inner             | Outer             | Result  |
|---------------------------------------|-------------------|-------------------|---------|
| 6 sequential tests, one bank position | Inner Cluster n+1 | Outer Cluster n+2 | Tag 331 |
|                                       | Inner Cluster n+1 | Outer Cluster n+1 | No Tag  |
|                                       | Inner Cluster n+1 | Outer Cluster n   | No Tag  |
|                                       | Inner Cluster n   | Outer Cluster n+1 | No Tag  |
|                                       | Inner Cluster n   | Outer Cluster n   | No Tag  |
|                                       | Inner Cluster n   | Outer Cluster n-1 | No Tag  |
| Sequential tests, next bank position  | Inner Cluster n-1 | Outer Cluster n   | No Tag  |
|                                       | Inner Cluster n-1 | Outer Cluster n-1 | No Tag  |
|                                       | Inner Cluster n-1 | Outer Cluster n-2 | No Tag  |

Tag Data only to Trigger Processor

# Clocking Issues

**1**. Framing the serialized data between the ABC130 and the Correlator chip will be a significant issue without some kind of marker or clock alignment procedure.

- Our answer is to add a training mode in which the ABC130 Cluster Finders send 1's during the positive half of the BC clock and 0's during the negative half.

 $\rightarrow$  The serializer then essentially sends out a copy of the BC clock. The Correlator chip can use it to set up the phasing of its local clock, so that its serializer clock frames the 16-bit data.
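The training mode can be modelled as a simple edge search. The sketch below assumes a 640 Mbps link and a 50% duty-cycle BC clock, so that the training pattern is 8 ones followed by 8 zeros per BC; the receiver starts sampling at an unknown bit phase and recovers the frame boundary from the first 0 → 1 transition. The function names are hypothetical and this is not the Correlator chip's actual alignment logic.

```python
BITS_PER_BC = 16  # 640 Mbps x 25 ns


def training_stream(n_bc, phase=0):
    """Bits seen by a receiver that starts sampling `phase` bit times into a BC."""
    frame = [1] * (BITS_PER_BC // 2) + [0] * (BITS_PER_BC // 2)
    bits = frame * (n_bc + 1)
    return bits[phase:phase + n_bc * BITS_PER_BC]


def frame_offset(bits):
    """Sample index (mod 16) at which a new 16-bit frame starts."""
    for i in range(1, len(bits)):
        if bits[i - 1] == 0 and bits[i] == 1:   # rising edge of the BC clock copy
            return i % BITS_PER_BC
    raise ValueError("no rising edge found in training data")


print(frame_offset(training_stream(4, phase=5)))
```

Once the offset is known, the receiver can group every following 16 bits starting at that offset into one cluster word.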

**2**. The ABC130 serializer needs a 640 MHz clock to report out its 2-cluster, 40 µm precision, 16-bit word at the beam crossing rate. Currently the HCC is envisioned to have a 160 MHz clock output.

→ We are considering optionally sending the available 640 MHz clock out of the HCC to simplify the self-seeded track trigger prototyping. This would require an additional output from the HCC, which would be dormant during the envisioned round of stave prototyping.

## First Verilog Simulations of Correlator Lookup block

#### The "Correlator" working for 2 Serial Input lines



Test case: Serial data coming in within one beam crossing on each of the two serial lines is converted to 16-bit words, 007E and 7E00, available at the next crossing. The four candidate locations (Address1 through Address4) for interesting hits are formed and looked up in a memory preloaded with possible interesting triggers. If there is a match (here 7E7E), the interesting trigger's ID (here 03D) is sent out and the Trigger-Flag is raised.
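The test case can be reproduced with a small behavioural model. The slide does not say how Address1 through Address4 are built, so the sketch below assumes the simplest scheme: each of the two 8-bit clusters on one line is paired with each of the two clusters on the other line. Which line is "inner" and which is "outer" is also an assumption, as are the function names; only the words 007E and 7E00, the matching address 7E7E and the ID 03D come from the slide.

```python
def candidate_addresses(inner_word, outer_word):
    """Form four 16-bit lookup addresses from two 16-bit cluster words."""
    inner = (inner_word >> 8, inner_word & 0xFF)
    outer = (outer_word >> 8, outer_word & 0xFF)
    return [(i << 8) | o for i in inner for o in outer]


def lookup(inner_word, outer_word, memory):
    """Return (trigger_flag, trigger_id) for the first candidate found in memory."""
    for address in candidate_addresses(inner_word, outer_word):
        if address in memory:
            return True, memory[address]
    return False, None


# Memory preloaded with one interesting coincidence, as in the test case.
memory = {0x7E7E: 0x03D}
flag, trigger_id = lookup(0x007E, 0x7E00, memory)
print(flag, hex(trigger_id))
```

With these assumptions the four candidates are 007E, 0000, 7E7E and 7E00, and the match on 7E7E raises the flag with ID 03D, as in the simulation output described above.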

## Summary

- The Fast Cluster design for the ABC130 is complete. The area it takes up is quite small except for the I/O (2 drivers, 1 receiver). It enables prototyping the self-seeded track trigger.
- When operational it requires ~12 mW of additional power in the ABC130.
- The fast cluster finder holds out the promise of reducing the overhead in detecting intra-layer coincidences by providing the capacity to report out up to two clusters every BC for each bank of 128 strips. By differentiating between 1-strip and 2-strip clusters, it will offer an effective 40 µm cluster location precision across a ~4-5 mm distance between layers: 1 mrad angular precision in  $\phi$ .
- An early look at its partner, the Correlator chip, suggests that high-momentum coincidence information from two aligned hybrids (10240 strips) could be sent off detector on a single line @640Mbps with a total latency of ~200 ns.
- A single 3.8Gbps GBT could aggregate coincidences for 40K strips /hybrid.