Speaker
Dr Randy Sobie (University of Victoria (CA))
Description
The computing model of the ATLAS experiment was designed around the concept of grid computing and, since the start of data taking, this model has proven very successful. However, new cloud computing technologies offer attractive features that can improve the operations and elasticity of scientific distributed computing. ATLAS sees grid and cloud computing as complementary technologies that will coexist at different levels of resource abstraction, and two years ago created an R&D working group to investigate the different integration scenarios. The ATLAS Cloud Computing R&D group has demonstrated the feasibility of offloading work from grid to cloud sites and can now transparently integrate various cloud resources into the PanDA workload management system. The group operates various PanDA queues on private and public resources and has provided several hundred thousand CPU days to the experiment. The HammerCloud grid site testing framework is used to evaluate the performance of cloud resources and, where appropriate, to compare it with bare-metal performance in order to measure virtualization penalties. As a result, the ATLAS Cloud Computing R&D group has gained deep insight into the cloud computing landscape and has identified points that still need to be addressed in order to benefit fully from this young technology.
This contribution will explain the cloud integration models that are being evaluated and will discuss the lessons ATLAS has learned in collaborating with leading commercial and academic cloud providers.
Primary author
Sergey Panitkin (Brookhaven National Laboratory (US))