AutoFDO, FPGAs

Name: AutoFDO, FPGAs
Start: 2016-11-21T17:30:00+01:00
End: 2016-11-21T18:30:00+01:00
Location: CERN

Monday 21 Nov 2016, 17:30 → 18:30 Europe/Zurich

13/2-005 (CERN)


Description

Nathalie will share her experience and findings on the effectiveness of an optimization technique called AutoFDO.

Christian will give a technical prequel to the next session's "Streaming DAQ" theme, introducing state-of-the-art FPGAs, their use cases, and FPGA-related coding.

- 17:30 → 18:00
  
  AutoFDO 30m
  
  AutoFDO is a tool (developed by Google) that converts perf profiles into a format that compilers such as gcc and clang can use to perform Feedback Directed Optimization (FDO). The presentation will discuss the advantages of AutoFDO compared to traditional instrumentation-based FDO. Benchmark tests with Geant4 have shown that performance speedups of up to 13% can be reached. The tests also showed that significant speedups remain measurable even when the job configuration and job type of the training run differ from those of the measured run. This indicates that the improvements are stable against a change of simulation scenario.
  
  Speaker: Nathalie Rauschmayr (CERN)
  
  slides.pdf
- 18:00 → 18:30
  
  Validation of FPGA computing acceleration for the LHCb Upgrade 30m
  
  The LHCb experiment at the LHC will upgrade its detector by 2018/2019 to a 'triggerless' readout scheme, in which all the readout electronics and several sub-detector parts will be replaced. The new readout electronics will be able to read out the detector at 40 MHz. This increases the data bandwidth from the detector down to the event filter farm to 40 Tbit/s, all of which has to be processed to select the interesting proton-proton collisions for later storage. Designing a computing farm that can process this amount of data as efficiently as possible is a challenging task, and several compute accelerator technologies are being considered for use inside the new event filter farm.
  
  In high performance computing, FPGA compute accelerators are increasingly used to improve compute performance and reduce power consumption (e.g. in the Microsoft Catapult project and the Bing search engine). For the LHCb upgrade, the use of an experimental FPGA-accelerated computing platform in the event building or in the event filter farm (trigger) is likewise being considered and therefore tested. This two-socket platform from Intel hosts a Xeon CPU and a high performance FPGA connected via a high-speed QPI link, with the accelerator implemented on the FPGA. The FPGA has cache-coherent access to the main memory of the server and can collaborate with the CPU. Furthermore, the system is compared to a conventional PCIe FPGA accelerator, the Nallatech 385, and to an Nvidia GeForce GTX 690 card.
  
  As a first step, a compute-intensive algorithm that reconstructs Cherenkov angles for the LHCb RICH particle identification was successfully ported to the Intel Xeon/FPGA platform and accelerated by a factor of 35. Another FPGA accelerator and a GPU were also tested for performance and power consumption. How the FPGA is programmed is an important issue for making these devices accessible to a larger community, which is why the performance of OpenCL was also tested. The results show that the Intel Xeon/FPGA platforms, which are built for high performance computing in general, are also very interesting for the High Energy Physics community.
  
  Finally, an outlook on FPGA acceleration in the near future is given.
  
  Speaker: Christian Faerber (CERN)
  
  Validation of FPGA computing acceleration for the LHCb Upgrade.pdf
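
The AutoFDO abstract above describes a cycle of profiling with perf, converting the profile, and recompiling. A minimal sketch of that cycle for gcc follows; it only assembles and prints the command lines and does not run them. The file names (`./sim`, `sim.c`, `perf.data`, `app.gcov`) and the helper `autofdo_commands` are hypothetical, the sampling event is left at perf's default, and the exact options needed may differ between perf, AutoFDO, and gcc versions.

```python
import shlex


def autofdo_commands(binary, source, profile="perf.data", gcov="app.gcov"):
    """Assemble the three commands of one AutoFDO cycle with gcc (illustrative only)."""
    return [
        # 1. Sample the ordinary optimized binary; -b asks perf to record taken branches.
        ["perf", "record", "-b", "-o", profile, "--", binary],
        # 2. Convert the perf profile into the format gcc reads for AutoFDO.
        ["create_gcov", f"--binary={binary}", f"--profile={profile}",
         f"--gcov={gcov}", "-gcov_version=1"],
        # 3. Rebuild with the converted profile as optimization feedback.
        ["gcc", "-O2", f"-fauto-profile={gcov}", source, "-o", binary + ".fdo"],
    ]


commands = autofdo_commands("./sim", "sim.c")
for cmd in commands:
    print(shlex.join(cmd))
```

Unlike instrumentation-based FDO, step 1 runs the normal production binary rather than a specially instrumented build, which is the main practical difference the abstract refers to.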
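
The two readout figures quoted in the LHCb abstract, 40 MHz and 40 Tbit/s, together imply an average data volume per bunch crossing. The short calculation below makes that explicit. It assumes that every crossing is read out, that the bandwidth is spread evenly over crossings, and that "T" means 10^12; the abstract itself does not state an event size, so the result is a derived estimate and not a quoted detector parameter.

```python
def implied_event_size_bytes(bandwidth_bit_per_s, readout_rate_hz):
    """Average bytes per readout, assuming the bandwidth is shared evenly."""
    return bandwidth_bit_per_s / readout_rate_hz / 8


# Figures taken from the abstract: 40 Tbit/s at a 40 MHz readout rate.
size_bytes = implied_event_size_bytes(40 * 10**12, 40 * 10**6)
print(f"{size_bytes / 1e3:.0f} kB per bunch crossing")
```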