Multicore-Aware Data Transfer Middleware (MDTM)

14 Apr 2015, 15:45
15m
B250 (B250)

Oral presentation

Track 4: Middleware, software development and tools, experiment frameworks, tools for distributed computing

Track 4 Session

Speaker

Dr Wenji Wu (Fermi National Accelerator Laboratory)

Description

Multicore and manycore platforms have become the norm in scientific computing environments. Their architectures provide advanced capabilities and features that can be exploited to enhance data movement performance in large-scale distributed computing environments, such as those of the LHC. However, existing data movement tools do not take full advantage of these capabilities and features. The result is inefficient data movement, particularly over wide area networks. These inefficiencies will become more pronounced as networks upgrade to 100GE infrastructure and host systems correspondingly migrate to 10GE, Nx10GE, and 40GE technologies.

To address these inefficiencies and limitations, DOE’s Advanced Scientific Computing Research (ASCR) office has funded Fermilab and Brookhaven National Laboratory to collaborate on the Multicore-Aware Data Transfer Middleware (MDTM) project. MDTM aims to accelerate data movement toolkits on multicore systems. The project consists of two major components:

- MDTM middleware services, which harness multicore parallelism and make intelligent decisions that align CPU, memory, and I/O device operations in a way that optimizes performance for higher-layer applications.
- MDTM-enabled data transfer applications (client or server), which use the MDTM middleware to reserve and manage multiple CPU cores, memory, network devices, and disk storage as an integrated resource entity, thus achieving higher throughput and enhanced quality of service compared with existing approaches.

A prototype version of MDTM is currently undergoing testing and evaluation. This talk will describe MDTM’s architectural and design principles, its implementation, and initial test results in comparison with standard GridFTP and BBCP operations.
In addition, future directions for the project will be discussed, including the notion of enabling an external resource scheduling capability for MDTM, thus making it a reservable component for application-driven end-to-end path resource reservations.
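
As an illustration of the idea of aligning CPU cores with I/O devices, the following minimal sketch selects cores for a transfer thread pool by preferring those on the same NUMA node as the device being served. It is not MDTM's actual interface or scheduling algorithm: the `Topology` class, the `pick_cores` function, and the device names `nic0` and `disk0` are hypothetical names invented for this example, and the topology is hard-coded rather than discovered from the host.

```python
from dataclasses import dataclass


@dataclass
class Topology:
    """Hypothetical host description (an assumption for this sketch)."""
    node_cores: dict   # NUMA node ID -> list of core IDs on that node
    device_node: dict  # device name -> NUMA node the device is attached to


def pick_cores(topo, device, count):
    """Return `count` core IDs, preferring cores local to the device's NUMA node."""
    local = topo.device_node[device]
    # Cores on the device's own node come first.
    ordered = list(topo.node_cores[local])
    # Cores on other nodes are used only as a fallback, in node order.
    for node in sorted(topo.node_cores):
        if node != local:
            ordered.extend(topo.node_cores[node])
    if count > len(ordered):
        raise ValueError("not enough cores on this host")
    return ordered[:count]


# Hard-coded two-node example host (illustrative values only).
topo = Topology(
    node_cores={0: [0, 1, 2, 3], 1: [4, 5, 6, 7]},
    device_node={"nic0": 1, "disk0": 0},
)

network_cores = pick_cores(topo, "nic0", 2)   # cores near the network device
storage_cores = pick_cores(topo, "disk0", 3)  # cores near the storage device
```

On Linux, a selection like this could then be applied to the calling process with the standard library call `os.sched_setaffinity(0, network_cores)`; a real middleware would additionally need to account for memory placement and interrupt routing, which this sketch leaves out.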

Primary author

Dr Wenji Wu (Fermi National Accelerator Laboratory)

Co-authors

Liang Zhang (Fermilab)

Mr Phil Demar (Fermilab)