

# DepFET Movie Chip (DMC)

Enabling ultra-high-speed megapixel full-frame data acquisition for the DepFET direct electron detector (EDET DH80K) system

> 21<sup>st</sup> IEEE NPSS Real Time Conference Williamsburg (VA,USA)

> > 9-15 June 2018

### EDET DH80K camera



- Direct-hit electron detector based on a DepFET sensor array for fast imaging experiments in transmission electron microscopy (TEM)
  - 4 quadrants, each with 512x512 pixels of $60 \times 60\,\mu m^2$; sensor area thinned to $50\,\mu m$
  - Single-electron resolution and large dynamic range via signal compression
  - Stroboscopic imaging with 1M pixels (8-bit resolution) at up to 80kHz frame rate
  - Local buffering of bursts (movies) with 100 frames, maximum burst rate 100Hz





L. Andricek, M. Ibrahim, C. Koffmane, S. Krivokuca, J. Ninkovic, M. Polovykh, M. Predikaka, E. Prinker, R. Richter, G. Schaller, F. Schopper, E. Tafelmayer, J. Treis, A. Wassatsch, C. Zirr (MPG HLL, Munich)

I. Dourki, S. Epp, D. Gitaric, D. Miller (MPI for the Structure and Dynamics of Matter, Hamburg)

I. Peric (KIT Karlsruhe)



### **Detector all-silicon module (ASM)**



Depfet Movie Chip @ 21<sup>st</sup> IEEE NPSS Real Time Conference Williamsburg, VA (USA) 9-15 June 2018 (3)(PS)



### **DMC** specification



- Base clock: 80MHz, local pll for 320MHz
- JTAG IEEE 1149.1 control interface
- DCD interface
  - DCDclk (320MHz via LVDS), sync\_reset
  - JTAG "slave" interface, dynamic chain integration
  - 8x8 @ 320Mb/s LVS time multiplexed (32 values per conversion)
  - 8x2 @ 320Mb/s LVS time multiplexed pedestals for the DCD (incl. storage on DMC)
- Switcher interface (JTAG, sequence control)
- DAQ interface
  - 8x serial link @ 320Mb/s, custom protocol
  - Manchester encoded trigger bus

### DMC specification (cont.)



- Footprint compatible with DHP1.1 (major pins: DCD, slow control, DAQ)
  - 3280µm × 5000µm = 16.4mm<sup>2</sup>
  - Additional I/Os (serial links) on unused bump balls of the DHP
- Capable of capturing 100 frames of 512x64 8-bit pixels at 80kHz frame rate
  - $\rightarrow$ 26Mbit SRAM $\rightarrow$ TSMC 40nm $\rightarrow$ $\sim70\%$ of the die area
- Local sequencer to generate DCD, Switcher and internal control signals
- Memory for at least 512x64 2-bit DCD pedestals
- Power budget
  - Core 1.1V 500mA
  - IO 1.8V 500mA
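The 26Mbit SRAM figure follows directly from the burst geometry given in the specification. A minimal arithmetic sketch; the variable names are illustrative and not taken from the DMC design:

```python
# Pixel memory needed by one DMC for a full burst (numbers from the specification).
FRAMES_PER_BURST = 100
PIXELS_PER_FRAME = 512 * 64      # pixel region captured by one DMC
BITS_PER_PIXEL = 8
FRAME_RATE_HZ = 80_000

bits_per_frame = PIXELS_PER_FRAME * BITS_PER_PIXEL
burst_bits = FRAMES_PER_BURST * bits_per_frame
pedestal_bits = PIXELS_PER_FRAME * 2          # one set of 2-bit DCD pedestals
burst_duration_ms = FRAMES_PER_BURST / FRAME_RATE_HZ * 1e3

print(f"frame size     : {bits_per_frame} bit")
print(f"burst memory   : {burst_bits} bit (~{burst_bits / 1e6:.1f} Mbit)")
print(f"pedestal memory: {pedestal_bits} bit")
print(f"burst duration : {burst_duration_ms:.2f} ms")
```

One burst of 100 frames therefore occupies 26,214,400 bit and is captured in 1.25 ms at the 80kHz frame rate.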

#### DMC structure





### DMC building blocks: pll

- Standard architecture
- 1:4 system clock multiplication ( $80MHz \rightarrow 320MHz$ )
- Very small VCO with f<sub>osc</sub>>2GHz
- Divider and PFD (Phase Frequency Detector) based on very fast dynamic circuit technique TSPC (True Single Phase Clock)
- area 60x60µm<sup>2</sup>
- Digital lock detect circuit
  - Lock time  $\sim 2\mu s$
- Silicon proven (mini asic test chip)
- Clk bypass via JTAG ctrl









### DMC building blocks: delay7

- Compensates length mismatch on all bused signals on the EDET module
- 7-stage binary-weighted bypass mux architecture
- 128 steps (~32ps per step)
- Signal duty cycle 48-52%
- Area: 30x30µm<sup>2</sup>
- Integration between DMC core and io pad ring
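The binary-weighted bypass structure can be illustrated with an idealized model. This is a sketch under assumptions: stage `i` is taken to add `2**i` steps when it is not bypassed, every step is exactly the nominal 32ps, and the insertion delay of the muxes themselves is ignored.

```python
STEP_PS = 32     # nominal delay per step
N_STAGES = 7     # 7 binary-weighted stages -> 2**7 = 128 settings

def delay7_ps(code: int) -> int:
    """Ideal added delay in ps for a 7-bit setting (assumed stage weights 1, 2, 4, ..., 64)."""
    if not 0 <= code < 2 ** N_STAGES:
        raise ValueError("code must be in 0..127")
    total = 0
    for stage in range(N_STAGES):
        if (code >> stage) & 1:              # stage is inserted, not bypassed
            total += (2 ** stage) * STEP_PS
    return total

print(delay7_ps(0), delay7_ps(1), delay7_ps(127))
```

In this ideal model the setting maps linearly onto delay, giving an adjustment range of 127 x 32ps, roughly 4ns.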





### DMC building blocks: LVS receiver

- Interface to the DCD (8x8 @ 320Mb/s)
- Receiving the weak low voltage swing data stream signals
- Selectable voltage and current mode
- Also used as standard LVDS receiver (CLK, TRIG, JTAG, ...)
- Integration as compatible io pad cell for standard io pad ring (pio style)









### DMC building blocks: LVDS driver

- Interface to the DCD/Switcher and to the DAQ (JTAG, fast links)
- Selectable voltage and current mode
- Also used to send the LVS pedestal signals for the DCD
- Integration as compatible io pad cell for standard io pad ring (pio style)







### DMC building blocks: TSMC SRAM IP

- IP block generated by TSMC memory compiler
- Max size of single memory block (256kx128bit) @ minimum area
- Target operating frequency 400MHz
- Single port architecture → time multiplexed user logic to realize multi port access
- No MBIST (memory built-in self-test) in the IP (to save area) → a single global user-logic MBIST serves all memory blocks
- No redundancy
  - $\rightarrow$  Pre-qualification of the DMC necessary (MBIST enabled via JTAG)
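The slides do not state which test algorithm the global MBIST implements. As an illustration of what such a pre-qualification pass does, here is a sketch of a March C- test, a common memory test pattern, run against a simulated 128-bit-wide memory. The `Sram` class and its stuck-at-1 fault model are hypothetical.

```python
WIDTH = 128
ZERO, ONES = 0, (1 << WIDTH) - 1

class Sram:
    """Simulated single-port memory; `stuck` maps addr -> mask of bits stuck at 1 (hypothetical fault model)."""
    def __init__(self, words, stuck=None):
        self.cells = [0] * words
        self.stuck = stuck or {}

    def write(self, addr, value):
        self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr] | self.stuck.get(addr, 0)

def march_c_minus(mem, words):
    """Run March C- with all-zero / all-one data backgrounds; return the failing addresses."""
    failing = set()
    up, down = range(words), range(words - 1, -1, -1)
    for addr in up:                                   # (w0)
        mem.write(addr, ZERO)
    # up(r0,w1); up(r1,w0); down(r0,w1); down(r1,w0)
    for order, expect in ((up, ZERO), (up, ONES), (down, ZERO), (down, ONES)):
        for addr in order:
            if mem.read(addr) != expect:
                failing.add(addr)
            mem.write(addr, expect ^ ONES)
    for addr in up:                                   # (r0)
        if mem.read(addr) != ZERO:
            failing.add(addr)
    return sorted(failing)

print(march_c_minus(Sram(64), 64))
print(march_c_minus(Sram(64, stuck={5: 1 << 17}), 64))
```

Because the memory has no redundancy, any non-empty result would disqualify the chip rather than trigger a repair.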





### DMC building blocks: JTAG



- JTAG interface conforming to IEEE 1149.1
- Used for test and slow-control tasks within the DMC
- Dynamic extension of the JTAG chain to the corresponding DCD and Switcher by special control bits
- Full JTAG chain of DMC described in a BSDL (Boundary Scan Description Language) file
- Python tools to generate SVF (Serial Vector File) data for JTAG operations based on BSDL file
  - Memory pre-load (sequencer, pedestals, test data)
  - Configuration register pre-load or test
  - Common SVF data for HDL verification, emulation and final chip configuration

### DMC building blocks: JTAG (cont.)

- Minimization of long JTAG chains to speed up access
- Auto address increment for faster memory scan
- Hardware emulation on FPGA board to check standard conformance
  - OpenOCD-based access via SVF files
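To make the SVF-based flow concrete, the sketch below formats a register write as SVF scan commands. It is not the project's Python tooling: the function names are hypothetical, and the instruction length, opcode and register width in the usage line are made-up values that a real tool would take from the BSDL file.

```python
def svf_scan(kind: str, length: int, tdi: int) -> str:
    """Format one SVF scan command (SIR or SDR) with a zero-padded hex TDI field."""
    if kind not in ("SIR", "SDR"):
        raise ValueError(f"unsupported scan type {kind!r}")
    if tdi < 0 or tdi >> length:
        raise ValueError("value does not fit into the scan length")
    digits = (length + 3) // 4                 # hex digits needed for `length` bits
    return f"{kind} {length} TDI ({tdi:0{digits}X});"

def svf_register_write(ir_len: int, opcode: int, dr_len: int, value: int) -> list:
    """Return SVF lines that select a data register and shift a value into it."""
    return [
        "STATE RESET;",
        "ENDIR IDLE;",
        "ENDDR IDLE;",
        svf_scan("SIR", ir_len, opcode),
        svf_scan("SDR", dr_len, value),
    ]

# Hypothetical example: 8-bit instruction register, opcode 0x2A, 32-bit config register.
print("\n".join(svf_register_write(8, 0x2A, 32, 0xDEADBEEF)))
```

A file built from such lines can be replayed by any SVF player, for example with OpenOCD's `svf` command, which is what allows the same data to serve simulation, emulation and the final chip.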








### DMC building blocks: memInterface

- Time multiplexed round robin\* operation to realize virtual multi port access
  - 2 steps for pixel memory access (capturing or read out)
  - 1 step for DCD pedestal generation
  - 1 step for basic DMC sequencer operation
- All 100+1 memory blocks are read/write accessible via JTAG
  - Separate control and data JTAG chains, together with the auto address increment feature, allow fast access to large memory areas
  - Separate address spaces for the memory blocks of the different streams (DCD column pairs)

\*algorithm assigning regular time slices in circular order
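The four-step round robin can be sketched as a fixed slot table that repeats forever. The slides give only the number of steps per client (2 + 1 + 1); the slot order used here is an assumption.

```python
from itertools import cycle, islice

# Assumed slot order; only the counts per client come from the slides.
SLOTS = ("pixel", "pedestal", "pixel", "sequencer")

def schedule(n_cycles: int) -> list:
    """Return which client owns the single SRAM port in each of the first n_cycles slots."""
    return list(islice(cycle(SLOTS), n_cycles))

def share(client: str, n_cycles: int = 400) -> float:
    """Fraction of memory slots granted to one client."""
    return schedule(n_cycles).count(client) / n_cycles

print(schedule(8))
print(share("pixel"), share("pedestal"), share("sequencer"))
```

With this table the pixel path gets half of the memory bandwidth, while the pedestal generator and the base sequencer each get every fourth slot.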





### DMC building blocks: dmcSeq

- Separate 256k x 128bit SRAM block for sequencer operation
  - Limited size of sequencer and pedestal data words allows the virtual memory reorganization to 512k x 64bit, which doubles the usable line count
  - Minimizes the number of SRAM IP block variants: the same block as in the pixel memory is used
- Dynamic assignment of the memory space to the base sequencer or the DCD pedestal sequencer
- 7 independent loop counters (16-bit)
  - Decrement-and-jump-if-not-zero operation
  - Auto reload on zero crossing
- 4 trigger-event jump address registers
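The loop counter behaviour (decrement, jump while not zero, reload at zero) can be modelled in a few lines. The exact hardware semantics, for example whether a reload value of N yields N or N+1 passes, are not given in the slides; this sketch assumes N passes, and the class and function names are hypothetical.

```python
class LoopCounter:
    """16-bit loop counter with decrement-and-jump-if-not-zero and auto reload."""
    def __init__(self, reload_value: int):
        if not 0 < reload_value < 2 ** 16:
            raise ValueError("reload value must be non-zero and fit in 16 bits")
        self.reload_value = reload_value
        self.value = reload_value

    def djnz(self) -> bool:
        """Decrement; return True to take the jump, or reload and return False at zero."""
        self.value -= 1
        if self.value != 0:
            return True
        self.value = self.reload_value       # auto reload: ready for the next use
        return False

def run(body_len: int, jump_target: int, counter: LoopCounter) -> list:
    """Tiny interpreter: lines 0..body_len-1 form the body; the last line does DJNZ to jump_target."""
    trace, pc = [], 0
    while pc < body_len:
        trace.append(pc)
        if pc == body_len - 1 and counter.djnz():
            pc = jump_target
        else:
            pc += 1
    return trace

counter = LoopCounter(3)
print(run(2, 0, counter), counter.value)
```

The auto reload means a loop can be re-entered, for instance on the next trigger, without reprogramming the counter via JTAG.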





### DMC building blocks: dmcSeq (cont.)

- Base and pedestal sequencer border check register (min/max)
- Fast pedestal block change via direct register access
- Separate tracks for DCD and DMC synchronisation
- Emulated multi-port memory interface
  - → each sequencer can access the memory only every 4th clock cycle
  - → 4 consecutive bits per sequencer line
  - → reduced memory footprint of the dmcSeq (18 sequencer lines compared to >100 in, for example, a DHP sequence)
- Python-based sequencer compiler (generates memory content from an ASCII sequence "program" file)
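One reading of "4 consecutive bits per sequencer line" is that each line stores, per control track, the signal values for the four clock cycles that pass until the sequencer gets its next memory slot. The sketch below expands a line under that assumption; the track names, the dictionary layout and the MSB-first bit order are all hypothetical.

```python
CYCLES_PER_LINE = 4   # one memory access per sequencer every 4th clock cycle

def expand_line(line: dict) -> list:
    """Turn one sequencer line {track: 4-bit value} into per-cycle {track: bit} dicts, MSB first."""
    cycles = []
    for k in range(CYCLES_PER_LINE):
        shift = CYCLES_PER_LINE - 1 - k
        cycles.append({track: (value >> shift) & 1 for track, value in line.items()})
    return cycles

# Hypothetical line with two control tracks.
for cycle_no, signals in enumerate(expand_line({"clear": 0b1000, "gate": 0b0110})):
    print(cycle_no, signals)
```

Packing four cycles into one line is what lets a sequencer keep driving its outputs on every clock edge although it reads memory only once per round-robin turn.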







### DMC building blocks: dmcOutputMux

- Different readout configurations need different assignments of the up to 8 available readout ports to the 8 readout streams
  - EDET standard: 2 times 4 stream readout via the standard DMC-link port
  - 8 streams time multiplexed readout only via the "DHP"-link port or another DMC-link port
  - Full parallel 8 streams via the 7 DMC-link ports + "DHP"-link port
- Fully programmable link port assignment
  - Dynamic assignment of the readout streams to the available link ports
  - Partial stream readout supported
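A programmable link port assignment amounts to a stream-to-port table. The sketch below validates such a table and groups streams by port; the port names and the configuration format are hypothetical, and only the fully parallel, single-port and partial cases are modelled.

```python
PORTS = [f"DMC{i}" for i in range(7)] + ["DHP"]   # 7 DMC-link ports + "DHP"-link port (names assumed)
N_STREAMS = 8

def ports_in_use(assignment: dict) -> dict:
    """Validate a stream -> port map; return port -> streams (several streams = time multiplexed)."""
    by_port = {}
    for stream in sorted(assignment):
        port = assignment[stream]
        if not 0 <= stream < N_STREAMS:
            raise ValueError(f"unknown stream {stream}")
        if port not in PORTS:
            raise ValueError(f"unknown port {port}")
        by_port.setdefault(port, []).append(stream)
    return by_port

full_parallel = {s: PORTS[s] for s in range(N_STREAMS)}   # one stream per port
single_port = {s: "DHP" for s in range(N_STREAMS)}        # all streams time multiplexed on one port
partial = {0: "DHP", 3: "DHP"}                            # partial stream readout

for name, cfg in (("full", full_parallel), ("single", single_port), ("partial", partial)):
    print(name, ports_in_use(cfg))
```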





### DMC edet quadrant digital test-bench



- Configurable test-bench with hardware emulation models for DCD, Switcher and DepFET matrix
  - Matrix emulation via pixel function f(x,y,t), used for stimulation and readout verification
  - JTAG config via SVF files generated by a Python library, based on BSDL files
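Using a closed-form pixel function means the expected value of every pixel in every frame is known without storing reference images. A minimal sketch of the idea; the particular function and the 64 x 512 orientation are assumptions, not the test-bench's actual choice.

```python
def pixel(x: int, y: int, t: int) -> int:
    """Hypothetical deterministic 8-bit test pattern for column x, row y, frame t."""
    return (x + 3 * y + 7 * t) & 0xFF

def make_frame(t: int, cols: int = 64, rows: int = 512) -> list:
    """Stimulus side: generate frame t as a list of rows."""
    return [[pixel(x, y, t) for x in range(cols)] for y in range(rows)]

def verify(readout: list, t: int) -> list:
    """Readout side: return the (x, y) positions where the data differ from f(x, y, t)."""
    return [(x, y)
            for y, row in enumerate(readout)
            for x, value in enumerate(row)
            if value != pixel(x, y, t)]

frame = make_frame(5)
print(verify(frame, 5))      # clean readout
frame[10][3] ^= 1            # inject a single bit error
print(verify(frame, 5))
```

Because the function depends on x, y and t, swapped columns, shifted rows and misordered frames all show up as mismatches.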




### DMC test chip



- TSMC 40nm mini asic for pll/delay7/SRAM IP run 4685 (CMOS LP MS RF with 1P8M\_5X2Z\_RDL)
- Mini asic testboard (MAT) with controllable power-supply for core and io voltage on board
  - Power monitoring with TI INA chip (via I<sup>2</sup>C)
- separate FPGA configurations to address the different test scenarios
- Delay7
  - scan over the whole shift value range generated by the soft core μC in the FPGA
  - Nominal 32ps per step
  - successfully tested over a wide range (0.9V - 1.3V) of the core voltage



### Overall chip design flow






### The final layout



- pio style io cell ring
- 3 Layers for RDL (io cell → bump pad)
- fine granular power mesh for core voltage
  - direct connection from ball pad to core net
  - additional ESD protection structures
- fully script-based synthesis and chip assembly; one run needs ~48h on a multicore system
  - 1.2TByte peak RAM demand



#### Current status



- Preparation of the final layout
  - Delayed by ESD DRC issues in the full-custom LV(D)S I/O cells that show up only in the fully assembled layout
  - Not shown in the base cell view
  - Dependent on the external wiring (single-ended or differential use)





### Conclusions



- 40nm Technology enables the integration of the necessary 26Mbit SRAM in the DMC for the EDET DH80K camera
- With a careful characterization (Liberate tool-set) of the full-custom analog parts, a digital-on-top design flow can be used
  - Static timing and power analysis reduce the final verification effort before tape-out

 Thanks to the Europractice Cadence support team @ RAL/UK and the Europractice TSMC support team @ IMEC/Belgium for the valuable help on tool and technology issues



# Questions ?


Andreas Wassatsch, MPG HLL