Oct 14 – 18, 2013
Amsterdam, Beurs van Berlage
Europe/Amsterdam timezone

Architectural improvements and 28nm FPGA implementation of the APEnet+ 3D Torus network for hybrid HPC systems

Oct 14, 2013, 3:00 PM
Grote zaal (Amsterdam, Beurs van Berlage)

Poster presentation
Software Engineering, Parallelism & Multi-Core
Poster presentations


Dr Roberto Ammendola (INFN Roma Tor Vergata)


Modern Graphics Processing Units (GPUs) are now widely used as accelerators for general-purpose computation. Exploiting the full capability-computing potential of a multi-GPU system on a large HPC cluster requires tight interaction between the GPUs and the interconnection network, which makes an efficient and scalable interconnect a key technology for delivering GPUs to scientific HPC. In this paper we present the latest architectural and performance improvements of the APEnet+ network fabric, an FPGA-based PCIe board with 6 fully bidirectional off-board links with 34 Gbps of raw bandwidth per direction, and a PCIe x8 Gen2 interface towards the host PC. The board implements a Remote Direct Memory Access (RDMA) protocol that leverages the peer-to-peer (P2P) capabilities of Fermi- and Kepler-class NVIDIA GPUs to obtain true zero-copy, low-latency GPU-to-GPU transfers. Finally, we report on the development activities for 2013, which focus on adopting the latest-generation 28 nm FPGAs, and on the preliminary results obtained with synthetic benchmarks that exercise the state-of-the-art signalling capabilities of the PCI Express Gen3 host interface.
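As an illustration of the topology and host-interface figures quoted in the abstract, the following minimal Python sketch shows how the six off-board links map onto the nearest neighbours of a node in a 3D torus, and how the usable PCIe bandwidth follows from the standard per-lane signalling rates and line encodings. The function names, the 4x4x4 torus size in the usage note, and the x8 lane count for Gen3 are assumptions made for this example; none of this is the APEnet+ firmware or software stack.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six nearest neighbours (X+, X-, Y+, Y-, Z+, Z-) of `node`
    in a 3D torus of size `dims`, with wrap-around on every axis."""
    neighbors = []
    for axis in range(3):
        for step in (+1, -1):
            coord = list(node)
            # The modulo implements the wrap-around link that closes each ring.
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def pcie_data_gbps(lanes: int, gen: int) -> float:
    """Per-direction PCIe bandwidth after line encoding, in Gbps.

    Gen2 signals at 5 GT/s per lane with 8b/10b encoding;
    Gen3 signals at 8 GT/s per lane with 128b/130b encoding.
    Protocol overhead (packet headers, flow control) is not modelled.
    """
    rate_gtps, efficiency = {2: (5.0, 8 / 10), 3: (8.0, 128 / 130)}[gen]
    return lanes * rate_gtps * efficiency


if __name__ == "__main__":
    # Hypothetical 4x4x4 torus, chosen only for illustration.
    print(torus_neighbors((0, 0, 0), (4, 4, 4)))
    print(pcie_data_gbps(lanes=8, gen=2))
    # x8 for Gen3 is an assumption; the abstract does not state the lane count.
    print(pcie_data_gbps(lanes=8, gen=3))
```

Under these assumptions, an x8 Gen2 host interface carries 32 Gbps per direction after encoding, slightly below the 34 Gbps raw figure quoted for the off-board links, whereas an x8 Gen3 interface would carry roughly 63 Gbps.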

Primary authors

Dr Alessandro Lonardo (INFN Roma)
Dr Andrea Biagioni (INFN Roma)
Dr Davide Rossetti (INFN Roma)
Dr Francesca Lo Cicero (INFN Roma)
Dr Francesco Simula (INFN Roma)
Dr Laura Tosoratto (INFN Roma)
Dr Ottorino Frezza (INFN Roma)
Dr Pier Stanislao Paolucci (INFN Roma)
Prof. Piero Vicini (INFN Roma)
Dr Roberto Ammendola (INFN Roma Tor Vergata)
