Speaker
Dr
Marian Zvada
(Fermilab)
Description
Many members of large science collaborations already have specialized grids available to advance their research in the need of getting more computing resources for data analysis. This has forced the Collider Detector at Fermilab (CDF) collaboration to move beyond the usage of dedicated resources and start exploiting Grid resources.
Nowadays, CDF experiment is increasingly relying on glidein-based computing pools for data reconstruction. Especially, Monte Carlo production and user data analysis, serving over 400 users by central analysis farm middleware (CAF) on the top of Condor batch system and CDF Grid infrastructure. Condor is designed as distributed architecture and its glidein mechanism of pilot jobs is ideal for abstracting the Grid computing by making a virtual private computing pool.
We would like to present the first production use of the generic pilot-based Workload Management System (glideinWMS), which is an implementation of the pilot mechanism based on the Condor distributed infrastructure. CDF Grid computing uses glideinWMS for its data reconstruction on the FNAL campus Grid, user analysis and Monte Carlo production across Open Science Grid (OSG). We review this computing model and setup used including CDF specific configuration within the glideinWMS system which provides powerful scalability and makes Grid computing working like in a local batch environment with ability to handle more than 10000 running jobs at a time.
Presentation type (oral | poster) | oral |
---|