Summary 500 words
We present a new VLSI processor for pattern recognition based on Content Addressable Memory (CAM), optimized for on-line track finding in high-energy physics experiments. A large CAM bank stores all trajectories of interest and extracts the ones compatible with a given event.
This task is naturally parallelized by a CAM architecture that can output the identified trajectories, recognized among a huge number of possible combinations, in just a few 100 MHz clock cycles. The device is optimized for the planned Fast Tracker (FTK) processor, an ATLAS trigger upgrade project at the LHC.
The CAM memory array, organised in macro blocks, has been designed with a full-custom approach to minimize area and power consumption. The full-custom macro block contains 8 sub-blocks of 32 CAM words of 18 bits (cells) each. The distinctive feature of this CAM device is that a match is built up from multiple matches of different CAM words at different times. Eight CAM words are organized into a “pattern”. The matches of the single CAM words are stored in latches and kept until an init is issued. The pattern matches if all or a majority of its CAM words are matched. Typical applications use this feature to perform pattern recognition for detectors with up to 8 layers.
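The latch-and-majority behaviour described above can be illustrated with a small software model. This is only a sketch: the class name `Pattern`, its methods and the `threshold` parameter are hypothetical names chosen for illustration, and the model ignores all circuit-level details (matchlines, timing, don't-care bits).

```python
from dataclasses import dataclass, field
from typing import List

WORDS_PER_PATTERN = 8           # eight CAM words per pattern (from the text)
WORD_BITS = 18                  # 18 bits per CAM word (from the text)
WORD_MASK = (1 << WORD_BITS) - 1


def _cleared_latches() -> List[bool]:
    return [False] * WORDS_PER_PATTERN


@dataclass
class Pattern:
    """Hypothetical software model of one pattern (not the chip's logic)."""
    words: List[int]                                  # one stored word per layer
    latches: List[bool] = field(default_factory=_cleared_latches)

    def init(self) -> None:
        """Clear all match latches, as the init signal does."""
        self.latches = _cleared_latches()

    def present(self, layer: int, hit: int) -> None:
        """Compare a hit with the stored word of one layer.

        A match sets the latch; a mismatch leaves it untouched, so matches
        found at different times accumulate until init() is called.
        """
        if (hit & WORD_MASK) == (self.words[layer] & WORD_MASK):
            self.latches[layer] = True

    def matched(self, threshold: int = WORDS_PER_PATTERN) -> bool:
        """True if at least `threshold` words have matched.

        threshold=8 models "all words"; a lower value models a majority.
        """
        return sum(self.latches) >= threshold
```

With `threshold=7`, for example, a pattern still fires when one of the eight layers has no matching hit, which is the majority case mentioned above.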
Six dedicated bits of each CAM word can be used to implement 3 ternary bits (“don’t care” bits) and thus variable-size patterns.
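The effect of the ternary bits can be sketched as follows. The value/care-mask representation used here is an assumption made for readability; the chip spends two CAM bits per ternary bit, and its actual encoding is not described in this summary. The function names are hypothetical.

```python
TERNARY_BITS = 3  # three ternary bits per CAM word (from the text)


def ternary_match(stored: int, care_mask: int, hit: int,
                  width: int = TERNARY_BITS) -> bool:
    """True if `hit` equals `stored` on every bit where `care_mask` is 1.

    A 0 in `care_mask` marks a "don't care" bit (assumed representation).
    """
    full = (1 << width) - 1
    return ((stored ^ hit) & care_mask & full) == 0


def covered_values(care_mask: int, width: int = TERNARY_BITS) -> int:
    """Number of distinct hit values accepted by one stored ternary field."""
    dont_care = width - bin(care_mask & ((1 << width) - 1)).count("1")
    return 2 ** dont_care


# Stored field 0b100 with the least significant bit set to "don't care":
# both 0b100 and 0b101 match, so the pattern is twice as wide on this layer.
assert ternary_match(0b100, 0b110, 0b100)
assert ternary_match(0b100, 0b110, 0b101)
assert not ternary_match(0b100, 0b110, 0b110)
```

Each additional don't-care bit doubles the range of hit values accepted on that layer, which is how a single stored pattern can take a variable size.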
To reduce power consumption, we have used a mixed solution of current-race and selective-precharge matchline sensing techniques. The area of the sub-block is 55.42 μm × 57.60 μm. Between sub-block pairs we have placed a dummy row which controls the timing of the enable signals of the current generators and the matchline resets.
To prevent malfunctions due to process and mismatch variations, the timing of control signals can be trimmed by means of a programmable delay. The total area of a macro block containing 4.6 kbits is 225.40 μm × 122.40 μm.
The estimated worst-case power consumption is about 0.5 mW at 1.44 V, corresponding to 15 μW for each pattern of 8 × 18 bits. Finally, the full-custom block frame has been designed to be compatible with the standard cell environment in order to allow integration with interface and control logic. We describe the design of a 12 mm² MPW prototype and of the final AMchip, whose parameters are summarized in table 1.
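The macro block figures quoted in this summary can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 0.5 mW estimate refers to one macro block, which the text suggests but does not state explicitly; the variable names are illustrative.

```python
# Figures taken from the text
SUB_BLOCKS_PER_MACRO = 8
WORDS_PER_SUB_BLOCK = 32
BITS_PER_WORD = 18
WORDS_PER_PATTERN = 8
MACRO_POWER_MW = 0.5        # assumed to be the power of one macro block

words_per_macro = SUB_BLOCKS_PER_MACRO * WORDS_PER_SUB_BLOCK    # 256 words
bits_per_macro = words_per_macro * BITS_PER_WORD                # 4608 bits
patterns_per_macro = words_per_macro // WORDS_PER_PATTERN       # 32 patterns
power_per_pattern_uw = MACRO_POWER_MW * 1000 / patterns_per_macro

print(f"{bits_per_macro} bits, {patterns_per_macro} patterns, "
      f"{power_per_pattern_uw:.1f} uW per pattern")
```

Under this assumption the macro block holds 4608 bits (the quoted 4.6 kbits) in 32 patterns, and 0.5 mW shared among them gives about 15.6 μW per pattern, in line with the quoted 15 μW.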
We also discuss possible future extensions based on 3-D technology. This processor has a flexible and easily configurable structure that makes it suitable for applications in other experimental environments as well. Most applications are expected to benefit from the variable-resolution feature.
For the FTK application we expect a gain equivalent to a factor of 5 more patterns, at a silicon area cost of just under 20%.
 A. Annovi et al., “The fast tracker architecture for the LHC baseline luminosity,” PoS, vol. EPS-HEP2009, p. 136, 2009.