Speaker
L. GOOSSENS
(CERN)
Description
In order to validate the Offline Computing Model and the
complete software suite, ATLAS is running a series of Data
Challenges (DC). The main goals of DC1 (July 2002 to April
2003) were the preparation and the deployment of the
software required for the production of large event samples,
and the production of those samples as a worldwide
distributed activity.
DC2 (May 2004 to October 2004) is divided into three
phases: (i) Monte Carlo data are produced using GEANT4 on
three different Grids: LCG, Grid3 and NorduGrid; (ii) the
first-pass reconstruction of the data expected in 2007 is
simulated using this MC sample (the so-called Tier0
exercise); and (iii) the Distributed Analysis model is
tested.
A new automated data production system has been developed
for DC2. The major design objectives are minimal human
involvement, maximal robustness, and interoperability with
several grid flavors and legacy systems. A central
component of the production system is the production
database holding information about all jobs. Multiple
instances of a 'supervisor' component pick up unprocessed
jobs from this database, distribute them to 'executor'
processes, and verify them after execution. Each 'executor'
component interfaces to one particular grid flavor or
legacy system.
The job distribution model is a combination of push and
pull. A data management system keeps track of all produced
data and allows for file transfers.
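To make the supervisor/executor design concrete, the
following is a minimal sketch in Python of one supervision
cycle under assumed interfaces. The class names
(ProductionDB, Executor), the job states, and the SQLite
schema are illustrative inventions, not the actual DC2
components: supervisors pull unprocessed jobs from the
central database, push them to an executor, and verify the
outcome, returning failed jobs to the pool.

# Illustrative sketch of the supervisor/executor pattern described
# above; all names and the schema are assumptions, not DC2 code.
import sqlite3


class ProductionDB:
    """Stand-in for the central production database holding all jobs."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            " id INTEGER PRIMARY KEY, spec TEXT, state TEXT)"
        )

    def add_job(self, spec):
        self.conn.execute(
            "INSERT INTO jobs (spec, state) VALUES (?, 'unprocessed')",
            (spec,),
        )
        self.conn.commit()

    def pick_unprocessed(self, limit=10):
        # Supervisors 'pull' unprocessed jobs from the database.
        rows = self.conn.execute(
            "SELECT id, spec FROM jobs WHERE state = 'unprocessed' LIMIT ?",
            (limit,),
        ).fetchall()
        for job_id, _ in rows:
            self.set_state(job_id, "submitted")
        return rows

    def set_state(self, job_id, state):
        self.conn.execute(
            "UPDATE jobs SET state = ? WHERE id = ?", (state, job_id)
        )
        self.conn.commit()


class Executor:
    """Stand-in for a grid-flavor-specific executor (one per grid
    or legacy system)."""

    def run(self, spec):
        # A real executor would translate the job spec into the
        # submission language of its grid flavor and track the job.
        print(f"executing: {spec}")
        return True  # pretend the job succeeded


def supervise(db, executor):
    """One cycle: pull jobs, 'push' them to the executor, verify."""
    for job_id, spec in db.pick_unprocessed():
        ok = executor.run(spec)
        # Verification after execution: failed jobs return to
        # 'unprocessed' so any supervisor instance can retry them.
        db.set_state(job_id, "done" if ok else "unprocessed")


if __name__ == "__main__":
    db = ProductionDB()
    db.add_job("simul, dataset=dc2.A, events=1000")
    supervise(db, Executor())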
We describe the basic elements of the production system and
report on our experience with its use in the worldwide DC2
production of ten million events. We also present how the
three Grid flavors are operated and monitored. Finally, we
discuss the first attempts at using the Distributed
Analysis system.
Primary authors
K. DE
(UNIVERSITY OF TEXAS AT ARLINGTON)
L. GOOSSENS
(CERN)