ACAT 2021

Name: ACAT 2021
Start: 2021-11-29T08:30:00+09:00
End: 2021-12-03T19:30:00+09:00
Location: Virtual and IBS Science Culture Center, Daejeon, South Korea

29 November 2021 to 3 December 2021

Asia/Seoul timezone

Demonstration of FPGA Acceleration of Monte Carlo Simulation

contribution ID 731

30 Nov 2021, 19:00

20m

S221-A (Virtual and IBS Science Culture Center)

55 EXPO-ro Yuseong-gu Daejeon, South Korea email: library@ibs.re.kr +82 42 878 8299

Oral

Track 1: Computing Technology for Physics Research

Marco Barbone (Imperial College London)

We present results from a stand-alone simulation of electron single Coulomb scattering, implemented entirely on an FPGA architecture and compared with an identical simulation on a standard CPU. FPGA architectures offer unprecedented speed-up capability for Monte Carlo simulations, with the caveats of lengthy development cycles and limited resources, particularly on-chip memory and DSP blocks. As a proof of principle of acceleration on an FPGA, we chose a single scattering process of electrons in water at an energy of 6 MeV. The initial code base was implemented in C++ and optimised for CPU processing. To measure the potential performance gains of FPGAs compared to modern multi-core CPUs, we computed 100M histories of a 6 MeV electron interacting in water. The FPGA bitstream was implemented using MaxCompiler 2021.1 and Vivado 2019.2. MaxCompiler is a High-Level Synthesis (HLS) language that facilitates implementation across CPUs and FPGAs; it greatly reduces development time but does not achieve the same performance as manually optimised VHDL. We did not perform any hardware-specific optimisation. We also limited the clock frequency to 200 MHz, which is easily achievable by any HLS implementation on a modern FPGA. The same arithmetic precision was used in the FPGA and CPU implementations. The system configuration comprises an AMD Ryzen 5900X 12-core CPU running at 3.7 GHz and boosting up to 4.8 GHz, with a Xilinx Alveo U200 Data Center accelerator card. The Alveo U200 incorporates a VU9P FPGA device, with a capacity of 1,182,240 LUTs, 2,364,480 FFs, 6,840 DSPs, 4,320 BRAMs and 960 URAMs. The results show that the FPGA implementation is over 110 times faster than an optimised parallel implementation running on 12 cores and over 270 times faster than a sequential single-core implementation. At today's market prices, this corresponds to a cost-equivalent speed-up of more than 10. The results on both architectures were statistically equivalent.
The successful implementation and the measured acceleration are very encouraging for future exploitation of more generic Monte Carlo simulation on FPGAs for High Energy Physics applications.
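
To make the idea of independent single-scattering histories concrete, the following is a minimal Python sketch, not the authors' code. The abstract does not state which differential cross section was used; the sketch assumes a screened Rutherford form purely for illustration, and the names `sample_mu`, `analytic_mean_mu`, `simulate_histories` and the screening value are hypothetical. It samples the reduced angular variable mu = (1 - cos(theta)) / 2 by inverting the cumulative distribution.

```python
import math
import random


def sample_mu(xi: float, screening: float) -> float:
    """Map a uniform number xi in [0, 1] to mu = (1 - cos(theta)) / 2.

    Assumption: screened Rutherford density p(mu) = A(1 + A) / (mu + A)^2
    on [0, 1], where A is the (hypothetical) screening parameter.
    Inverting its CDF gives mu = A * xi / (1 + A - xi).
    """
    return screening * xi / (1.0 + screening - xi)


def analytic_mean_mu(screening: float) -> float:
    """Closed-form mean of mu for the assumed density, used as a cross-check."""
    a = screening
    return a * (1.0 + a) * math.log((1.0 + a) / a) - a


def simulate_histories(n_histories: int, screening: float, seed: int = 1) -> float:
    """Average mu over n_histories independent single-scattering events."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    total = 0.0
    for _ in range(n_histories):
        total += sample_mu(rng.random(), screening)
    return total / n_histories
```

Each history in this sketch depends only on its own random number, which is the property that lets such a loop be spread over CPU cores or pipelined in hardware. Comparing `simulate_histories` against `analytic_mean_mu` is one simple way to check that two implementations agree statistically.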

Significance

A Monte Carlo simulation of electrons scattering in water on an FPGA, together with a direct comparison of the same code base on a conventional CPU, has not been demonstrated before. In addition, a significant speed-up has been measured, which equates to more than a factor of 10 versus the CPU in a like-for-like cost comparison.
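
The relation between raw and cost-equivalent speed-up can be sketched in a few lines of Python. This is an illustration only: the abstract does not give hardware prices, so `price_ratio` (accelerator price divided by CPU price) is a hypothetical input, and the quoted speed-ups are lower bounds ("over 110", "over 270"), so any ratio derived from them is indicative rather than exact.

```python
def cost_equivalent_speedup(speedup: float, price_ratio: float) -> float:
    """Speed-up per unit cost.

    price_ratio is accelerator price / CPU price (hypothetical input;
    the abstract does not state the prices used).
    """
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    return speedup / price_ratio


def implied_parallel_scaling(speedup_vs_single: float,
                             speedup_vs_parallel: float) -> float:
    """Speed-up of the parallel CPU run over the single-core run,
    implied by the two FPGA speed-up figures."""
    return speedup_vs_single / speedup_vs_parallel


# A raw speed-up of 110 with a cost-equivalent speed-up of 10 would
# correspond to a price ratio of about 11 (taking both bounds at face value).
implied_price_ratio = 110.0 / 10.0
```

Taken at face value, the two quoted figures also suggest that the 12-core run was roughly 270 / 110, or about 2.5 times, faster than the single-core run.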

Speaker time zone: Compatible with Europe

Marco Barbone (Imperial College London), Dingyu Chen (Imperial College London), Alexander Howard (Imperial College London), Mihaly Novak (CERN), Wayne Luk (Imperial College London)

ACAT 2021.pdf

ACAT 2021.pptx

Recording
