The upgrade of the LHCb experiment calls for pixel detectors capable of handling larger amounts of data. The detectors will be exposed to higher luminosity, and triggers are generated further downstream in the system. All hit data must be transported off chip, which requires the acquisition system to perform zero suppression and to remove redundant data. Efficient use of silicon area is a key requirement, since the available space is limited to the region beneath the sensitive area of the sensor and a small strip of silicon at the end of the pixel columns.
In the proposed architecture a token ring scans the pixels in the columns for a hit. Secondary token rings at the End-of-Column serve the pixel columns at a variable, programmable pace. Highly illuminated columns are served every 40 MHz bunch crossing interval, while less illuminated columns are read out at lower speeds; i.e. the secondary End-of-Column token rings are longer for these columns. This approach allows the hardware beneath each pixel to be used as a large buffer. The readout method is also inherently zero-suppressing. Furthermore, loss of hit data (from pixels that are hit again before they have been read out) is distributed evenly across the pixel array, resulting in a loss rate that depends only weakly on the illumination gradient.
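The service-rate behaviour of the secondary token rings can be illustrated with a minimal sketch: if the token advances one position per 40 MHz bunch crossing, a column placed in a ring of length N is served every N crossings. The class and field names below are assumptions for illustration, not the on-chip implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of a secondary End-of-Column token ring: the token advances one
// position per bunch crossing, so a column in a ring of length N is served
// every N crossings. A ring of length 1 is served every crossing.
struct EocTokenRing {
    std::vector<int> columns;  // column indices served by this ring
    std::size_t token = 0;     // current token position

    // Advance one bunch crossing; returns the column served this crossing.
    int tick() {
        int served = columns[token];
        token = (token + 1) % columns.size();
        return served;
    }
};
```

A highly illuminated column would sit alone in its ring (served every crossing), while several quiet columns can share one longer ring.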
The hit data flows out of the columns as pairs. The pixel logic can pair up two neighbouring pixels if they are both hit in the same bunch crossing. This is a first step in redundancy reduction. Pairing only takes place within a column; i.e. hits in neighbouring rows can be paired.
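The in-column pairing step can be sketched as follows. The packet layout and function name are assumptions for illustration; the source only states that two neighbouring hits in a column, from the same bunch crossing, leave as one pair.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch of in-column pairing: hits in adjacent rows of the same
// column, occurring in the same bunch crossing, are merged into one packet.
struct ColumnPacket {
    int row;      // lower of the two rows when paired
    bool paired;  // true => rows `row` and `row + 1` were both hit
};

// `rows` holds the hit rows of one column for one bunch crossing, ascending.
std::vector<ColumnPacket> pairHits(const std::vector<int>& rows) {
    std::vector<ColumnPacket> out;
    for (std::size_t i = 0; i < rows.size();) {
        if (i + 1 < rows.size() && rows[i + 1] == rows[i] + 1) {
            out.push_back({rows[i], true});   // two neighbours, one packet
            i += 2;
        } else {
            out.push_back({rows[i], false});  // isolated hit leaves as-is
            ++i;
        }
    }
    return out;
}
```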
In the following stage a collection of FIFOs further smooths out the data flow. The FIFO stage can be of limited size since the pixel array is already used as buffer space. In a subsequent stage the data is sorted by the four LSBs of the Time-Of-Arrival (TOA) of the hit and fed into sixteen lanes, one for each 'sort bin'. This operation 'derandomizes' the data stream, which is required for the redundancy reduction later on.
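The lane selection reduces to masking the four TOA LSBs, so each lane receives a time-ordered stream of hits whose TOA values differ only in the upper bits. A minimal sketch, with hit fields and names assumed for illustration:

```cpp
#include <array>
#include <cassert>
#include <deque>

// Sketch of the sort stage: the four TOA LSBs select one of sixteen lanes.
struct Hit {
    unsigned toa;  // Time-Of-Arrival (bunch crossing counter)
    int col, row;
};

struct SortStage {
    std::array<std::deque<Hit>, 16> lanes;  // one FIFO per 'sort bin'

    void push(const Hit& h) { lanes[h.toa & 0xF].push_back(h); }
};
```

Because the sorting is modulo 16, hits whose TOA values are 16 crossings apart land in the same lane, in arrival order.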
Redundancy in the data is mainly present in TOA and position information; i.e. multiple clustered hits occur in one bunch crossing. After the sort stage there are two compression stages. The first stage collects hit data belonging to clusters (location redundancy). Each lane is treated separately. Hits are grouped into packets of 1 to 4 occurring in a 2x2 area of the pixel array. When data enters the data lane, the new data is compared to data already stored in existing packets and is stored in a matching or new packet. When no new data arrives in the data lane - which is possible because of the modulo-16 nature of the sorting stage - the oldest cluster data packet is sent out to the second compression stage. The second compression stage removes TOA redundancy by performing run-length encoding on the cluster data packets.
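Both compression stages can be sketched in software: the first groups hits of one lane into packets keyed on (TOA, 2x2 area), the second emits each TOA once with a run length over consecutive packets. Data layout, names, and the packet-matching policy are assumptions for illustration, not the chip's logic.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

struct Hit { unsigned toa; int col, row; };

// First stage: packets of 1-4 hits sharing a bunch crossing and a 2x2 area.
struct Packet { unsigned toa; int col2, row2; std::vector<Hit> hits; };

std::vector<Packet> cluster(const std::vector<Hit>& lane) {
    std::map<std::tuple<unsigned, int, int>, std::size_t> index;
    std::vector<Packet> packets;
    for (const Hit& h : lane) {
        auto key = std::make_tuple(h.toa, h.col / 2, h.row / 2);
        auto it = index.find(key);
        if (it != index.end() && packets[it->second].hits.size() < 4) {
            packets[it->second].hits.push_back(h);  // matching packet found
        } else {
            index[key] = packets.size();            // open a new packet
            packets.push_back({h.toa, h.col / 2, h.row / 2, {h}});
        }
    }
    return packets;
}

// Second stage: run-length encode the TOA over consecutive cluster packets.
struct Run { unsigned toa; std::size_t count; };

std::vector<Run> rleToa(const std::vector<Packet>& packets) {
    std::vector<Run> runs;
    for (const Packet& p : packets) {
        if (!runs.empty() && runs.back().toa == p.toa) ++runs.back().count;
        else runs.push_back({p.toa, 1});
    }
    return runs;
}
```

Three clustered hits thus cost one packet header instead of three, and packets sharing a bunch crossing share a single TOA field.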
The proposed architecture has been modeled and statistically simulated in C++. A translation to VHDL has been made, on which further simulations were performed and from which area and power estimates were derived. Radiation hardness was investigated, especially in the compression stages, where a radiation-induced upset may cause long-lasting corruption of data.
C++ simulations show a hit loss of around 1% at a peak illumination of 300 Mhits/cm^2/s. The simulations also show that the data rate after compression for the pixel arrays closest to the beam stays below 10 Gb/s, given a 10-bit TOA, 8+8-bit coordinates, 4-bit energy data, an average cluster size of 3, and a 1.4 mm x 1.4 mm pixel array.