28–29 May 2013
CERN
Europe/Zurich timezone

Extending the ATLAS PanDA Workload Management System for New Big Data Applications

29 May 2013, 10:50
20m
60/6-015 - Room Georges Charpak (Room F) (CERN)

Speaker

Dr Alexei Klimentov (Brookhaven National Laboratory (US))

Description

The LHC experiments are today at the leading edge of large-scale distributed data-intensive computational science. The LHC's ATLAS experiment processes particularly extreme data volumes: over 130 PB to date, distributed worldwide across more than 120 sites. An important element in the success of the exciting physics results from ATLAS is the highly scalable integrated workflow and dataflow management afforded by the PanDA workload management system, used for all the distributed computing needs of the experiment. The PanDA design is not experiment-specific, and PanDA is now being extended to support other data-intensive scientific applications. The Alpha Magnetic Spectrometer, an astroparticle experiment on the International Space Station, and the Compact Muon Solenoid, an LHC experiment, have successfully evaluated PanDA and are pursuing its adoption. PanDA was cited as an example of "a high performance, fault tolerant software for fast, scalable access to data repositories of many kinds" during the "Big Data Research and Development Initiative" announcement, a $200 million U.S. government investment in tools to handle huge volumes of digital data needed to spur science and engineering discoveries. This talk describes the new program of work to develop a generic version of PanDA, as well as the progress in extending PanDA's capabilities to support supercomputers and clouds and to leverage intelligent networking, while accommodating the ever-growing needs of current users. PanDA has already demonstrated at a very large scale the value of automated data-aware dynamic brokering of diverse workloads across distributed computing resources. The next generation of PanDA will allow many data-intensive sciences employing a variety of computing platforms to benefit from ATLAS' experience and proven tools in highly scalable processing.

Author

Dr Alexei Klimentov (Brookhaven National Laboratory (US))

Co-authors

Alexandre Vaniachine (ATLAS)
Dr Dantong Yu (BROOKHAVEN NATIONAL LABORATORY)
Kaushik De (University of Texas at Arlington (US))
Paul Nilsson (University of Texas at Arlington (US))
Sergey Panitkin (Brookhaven National Laboratory (US))
Tadashi Maeno (Brookhaven National Laboratory (US))
Dr Torre Wenaus (Brookhaven National Laboratory (US))
