Speaker
Subir Sarkar
(INFN-CNAF)
Description
The higher instantaneous luminosity of the Tevatron Collider is forcing large increases in
the computing requirements of the CDF experiment, which must be able to cover future needs
for data analysis and Monte Carlo (MC) production. CDF can no longer afford to rely on dedicated
resources to cover all of its needs and is therefore moving toward shared Grid
resources. CDF has so far relied on a set of CDF Analysis Farms (CAFs): dedicated
pools of commodity nodes managed as Condor pools, with a small CDF-specific software
stack on top. We have extended this model by using the Condor glide-in
mechanism that allows for the creation of dynamic Condor pools on top of existing
batch systems, without the need to install any additional software. The GlideCAF is
essentially a CAF plus the tools needed to keep the dynamic pool alive. All the
monitoring tools supported on the dedicated-resource CAFs, including semi-interactive
access to the running jobs and detailed monitoring, have been preserved. In this
talk, we present the problems we encountered while implementing
glide-in-based Condor pools and the challenges we face in maintaining them. We also
show the amount of resources we manage with this technology and how much we have
gained through it.
Primary authors
Igor Sfiligoi
(INFN-Frascati)
Subir Sarkar
(INFN-CNAF)
Co-authors
Donatella Lucchesi
(INFN-Padova)
Elliot Lipeles
(UCSD)
Frank Wuerthwein
(UCSD)
Mark Neubauer
(UCSD)
Shih-Chieh Hsu
(UCSD)
Stefano Belforte
(INFN-Trieste)