September 28, 2015 to October 2, 2015
Europe/Zurich timezone

Hardware evaluation of Xilinx High Level Synthesis for building data readout systems – a CMS ECAL Data Concentrator Card case

Sep 30, 2015, 5:08 PM
Hall of Civil Engineering (Lisbon)

IST (Instituto Superior Técnico), Alameda Campus, Av. Rovisco Pais, 1, 1049-001 Lisboa, Portugal
Poster Systems Poster


Michal Husejko (CERN)


The current production version of the CMS ECAL Data Concentrator Card (DCC) is implemented with 11 FPGA devices (Altera and Xilinx), all placed on a single 9U VME card. The development and verification of the DCC FPGA firmware have consumed a considerable amount of engineering resources. In recent years, we have observed a growing availability of new engineering tools for FPGA programming, most of which promise higher productivity than the widely used hardware description languages such as VHDL and Verilog/SystemVerilog. In this study, we present an implementation of the CMS ECAL DCC using C/C++ and Xilinx Vivado HLS. Emphasis is placed on the hardware evaluation of the results.


The CMS ECAL Data Concentrator Card (DCC) is built with 11 FPGA devices: nine Virtex II Pro FPGAs and two Stratix FPGAs. This high number of FPGA devices is needed to provide all the data readout and data reduction functionality required by CMS ECAL. The functionality is split among three main building blocks: Input Handler (IH), Event Merger (EM) and Event Builder (EB). There are nine IHs on the DCC, each of which receives data from the Front-end Electronics (FE) links and applies a Zero Suppression (ZS) algorithm (6-tap FIR filter). The EB receives trigger information, performs packet building, and sends data to the DAQ over a Slink64 interface. The EM buffers the data between the IHs and the Slink64 interface. Each IH is implemented in a single Xilinx FPGA. The EB and the EM are each placed in a single Altera FPGA. In addition, one of the IH blocks receives Selective Readout (SR) flags from a Selective Readout Processor (SRP). These flags are used to control the ZS blocks inside the IHs.
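To make the zero suppression step concrete, the following is a minimal C++ sketch of a 6-tap FIR filter followed by a threshold comparison, written in the plain, fixed-bound style that HLS tools generally accept. It is not the DCC firmware. The number of samples per channel (kSamples), the weights, the threshold, and the function names fir6 and keep_channel are all assumptions introduced for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical sizes, chosen only for illustration.
constexpr int kTaps = 6;      // 6-tap FIR, as stated in the abstract
constexpr int kSamples = 10;  // assumed number of time samples per channel

using Samples = std::array<int16_t, kSamples>;
using Weights = std::array<int16_t, kTaps>;

// Weighted sum of kTaps consecutive samples, starting at index `first`.
// The loop bound is a compile-time constant, so an HLS tool could fully
// unroll it; any tool-specific directives are deliberately left out here.
int32_t fir6(const Samples& s, const Weights& w, int first) {
    assert(first >= 0 && first + kTaps <= kSamples);
    int32_t acc = 0;
    for (int i = 0; i < kTaps; ++i) {
        acc += static_cast<int32_t>(w[i]) * static_cast<int32_t>(s[first + i]);
    }
    return acc;
}

// Zero suppression decision (assumed form): keep the channel only if the
// filter output reaches the threshold, otherwise drop it from the readout.
bool keep_channel(const Samples& s, const Weights& w, int32_t threshold) {
    return fir6(s, w, 0) >= threshold;
}
```

With the illustrative weights {-1, -1, -1, 1, 1, 1}, whose sum is zero so that a constant baseline cancels, a flat channel gives a filter output of 0 and is suppressed, while the samples {200, 200, 200, 260, 300, 280, 250, 230, 215, 205} give 240 and are kept for a threshold of 100.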

We first describe the data and packet processing algorithms employed by the DCC and how they can be implemented in HLS. We then present the implementation of the IH, EB and EM blocks in HLS, with emphasis on optimizing the design for both latency and throughput. Finally, we present the hardware evaluation of the IP cores generated by Vivado HLS. For the evaluation we selected Xilinx Virtex-7 and Zynq FPGAs, with the DCC emulated on the respective development kits.

We conclude with lessons learned and future plans.

Primary author


John Evans (CERN) Jose Carlos Rasteiro Da Silva (LIP Laboratorio de Instrumentacao e Fisica Experimental de Part)
