4–11 Jul 2018
COEX, SEOUL
Asia/Seoul timezone

Exploitation of heterogeneous resources for ATLAS Computing

7 Jul 2018, 11:15
15m
209 (COEX, Seoul)

Parallel session: Computing and Data Handling

Speaker

Jiri Chudoba (Acad. of Sciences of the Czech Rep. (CZ))

Description

LHC experiments require significant computational resources for Monte Carlo simulations and real data processing, and the ATLAS experiment is no exception. In 2017, ATLAS steadily used almost 3M HS06 units, which corresponds to about 300 000 standard CPU cores. The total disk and tape capacity managed by the Rucio data management system exceeded 350 PB.
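
As a rough illustration of the arithmetic behind these figures, the sketch below derives the average HS06 score per core implied by the two quoted totals. The per-core value is not stated in the abstract; it is only inferred from the rounded numbers above, and the function names and the 50 000 HS06 example site are hypothetical.

```python
# Figures quoted in the abstract (2017, both approximate and rounded).
TOTAL_HS06 = 3_000_000      # "almost 3M HS06 units"
EQUIVALENT_CORES = 300_000  # "about 300 000 standard CPU cores"


def hs06_per_core(total_hs06: float, cores: float) -> float:
    """Average HS06 score per core implied by the two totals."""
    return total_hs06 / cores


def cores_for(hs06: float, per_core: float) -> float:
    """Number of standard cores equivalent to a given HS06 capacity."""
    return hs06 / per_core


# Inferred, not quoted: roughly 10 HS06 per standard core.
implied = hs06_per_core(TOTAL_HS06, EQUIVALENT_CORES)
print(implied)  # 10.0

# Hypothetical site delivering 50 000 HS06 (illustrative value only).
print(cores_for(50_000, implied))  # 5000.0
```

Because both inputs are rounded, the result should be read as an order-of-magnitude conversion factor rather than a benchmark of any particular processor.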

Resources are provided mostly by Grid computing centers in geographically separated locations, connected by Grid middleware. The ATLAS collaboration developed several systems to manage computational jobs, data files and network transfers. The ATLAS solutions for job and data management (PanDA and Rucio) have been generalized and are now also used by other collaborations.

More components are needed to include new resources such as private and public clouds, volunteers' desktop computers and, above all, supercomputers in major HPC centers.
Workflows and data flows differ significantly for these less traditional resources, and some components of the ATLAS distributed computing software stack required extensive redesign. HPC systems may not allow direct internet connections to or from their compute nodes. Some provide hundreds of thousands of cores, each several times slower than a standard Grid core; others require jobs to run in parallel on many cores using MPI; still others accept ATLAS jobs only as backfill.
A newly developed and commissioned ATLAS software framework, the Event Service, has been put in place to exploit these highly volatile resources.

The volunteer computing project ATLAS@Home is based on the BOINC platform. Virtualization technologies enabled the use of various platforms and simplified installation. The project adds up to several tens of thousands of computing cores for ATLAS simulations and serves as a unique tool for outreach activities. It is not limited to desktop computers: servers in computing clusters can also increase their total utilization by running ATLAS@Home on top of standard jobs.

We will discuss the current usage of ATLAS pledged and opportunistic resources, the evolution of the software used to manage the huge number of distributed jobs, and the need for a significant upgrade of the computational infrastructure for the HL-LHC.

Primary author

Jiri Chudoba (Acad. of Sciences of the Czech Rep. (CZ))
