Speaker
Tadashi Maeno
(Brookhaven National Laboratory (US))
Description
Experiments at the Large Hadron Collider (LHC) face unprecedented computing challenges: heterogeneous resources are distributed worldwide across hundreds of sites, thousands of physicists analyze the data remotely, the volume of processed data exceeds the exabyte scale, and data processing requires more than a few billion hours of computing per year. The PanDA (Production and Distributed Analysis) system was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. In the process, the old batch job paradigm of locally managed computing in high energy physics (HEP) was discarded in favor of a far more automated, flexible, and scalable model. The success of PanDA in ATLAS is leading to widespread adoption and testing by other experiments. PanDA is the first exascale workload management system in HEP, already operating at more than a million computing jobs per day and having processed over an exabyte of data in 2013. In the near future PanDA will face many new challenges beyond those of growing scale, heterogeneity, and user base. It will need to handle rapidly changing computing infrastructure, factorize its code for easier deployment, incorporate additional information sources, including network metrics, into decision making, control network circuits, handle dynamically sized workload processing, and provide improved visualization, among other challenges. In this talk we will focus on the new features, planned or recently implemented, that are relevant to the next decade of distributed computing workload management using PanDA.
Primary author
Kaushik De
(University of Texas at Arlington (US))
Co-authors
Alexandre Vaniachine
(ATLAS)
Dr Alexei Klimentov
(Brookhaven National Laboratory (US))
Artem Petrosyan
(Joint Inst. for Nuclear Research (RU))
Danila Oleynik
(Joint Inst. for Nuclear Research (RU))
Jaroslava Schovancova
(Brookhaven National Laboratory (US))
Paul Nilsson
(Brookhaven National Laboratory (US))
Sergey Panitkin
(Brookhaven National Laboratory (US))
Tadashi Maeno
(Brookhaven National Laboratory (US))
Dr Torre Wenaus
(Brookhaven National Laboratory (US))