Conveners
Distributed Processing and Analysis: Monday
- Jerome Lauret (BNL)
- Pablo Saiz (CERN)
Distributed Processing and Analysis: Tuesday
- Fons Rademakers (CERN)
- Patricia Mendez Lorenzo (CERN)
Distributed Processing and Analysis: Thursday
- Joel Snow (Langston University)
- Fabrizio Furano (CERN)
Dr Jakub Moscicki (CERN IT/GS), Dr Patricia Mendez Lorenzo (CERN IT/GS)
23/03/2009, 14:00
Distributed Processing and Analysis
oral
Recently a growing number of applications have been quickly and successfully enabled on the Grid by the CERN Grid application support team. This has allowed the applications to achieve and publish large-scale results in a short time that would otherwise not have been possible.
Examples of successful Grid applications include medical and particle physics simulations (Geant4, Garfield),...
Valentin Kuznetsov
(Cornell University)
23/03/2009, 14:20
Distributed Processing and Analysis
oral
The CMS experiment has a distributed computing model, supporting thousands of physicists at hundreds of sites around the world. While this is a suitable solution for "day to day" work in the LHC era, there are edge use-cases that Grid solutions do not satisfy. Occasionally it is desirable to have direct access to a file on a user's desktop or laptop, for code development, debugging or examining...
Dr
Daniel van der Ster
(CERN)
23/03/2009, 14:40
Distributed Processing and Analysis
oral
Ganga has been widely used for several years in Atlas, LHCb and a handful of other communities in the context of the EGEE project. Ganga provides a simple yet powerful interface for submitting and managing jobs on a variety of computing backends. The tool helps users configure applications and keep track of their work. With the major release of version 5 in summer 2008, Ganga's main...
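The submit-and-manage pattern that a job-management interface like Ganga exposes can be sketched as follows; the Job class below is a hypothetical stand-in for illustration, not the real Ganga API:

```python
# Minimal sketch of a Ganga-style job interface: configure an application
# and a backend on a job object, then submit and track its status.
# NOTE: this Job class is an illustrative stand-in, not Ganga's actual API.

class Job:
    """Toy job object: pick an application and a backend, then submit."""
    def __init__(self, application, backend="Local"):
        self.application = application   # e.g. the executable to run
        self.backend = backend           # e.g. "Local", "Batch", "LCG"
        self.status = "new"

    def submit(self):
        # A real tool would hand the job to the chosen backend here.
        self.status = "submitted"
        return self.status

j = Job(application="echo hello", backend="LCG")
j.submit()
print(j.status)  # -> submitted
```

The point of the pattern is that the same few lines work regardless of whether the backend is a local machine, a batch farm, or the Grid.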
Dr
Douglas Smith
(STANFORD LINEAR ACCELERATOR CENTER)
23/03/2009, 15:00
Distributed Processing and Analysis
oral
The Babar experiment produced one of the largest datasets in high
energy physics. To provide for many different concurrent analyses,
the data is skimmed into many data streams before analysis can begin,
multiplying the size of the dataset both in terms of bytes and number
of files. As a large-scale problem of job management and data
control, the Babar Task Manager system was...
Dr
Fabrizio Furano
(Conseil Europeen Recherche Nucl. (CERN))
23/03/2009, 15:20
Distributed Processing and Analysis
oral
The Scalla/Xrootd software suite is a set of tools and suggested methods useful to build scalable, fault tolerant and high performance storage systems for POSIX-like data access. One of the most important recent development efforts is to implement technologies able to deal with the characteristics of Wide Area Networks, and find solutions in order to allow data analysis applications to...
Dr
Alexei Klimentov
(BNL)
23/03/2009, 15:40
Distributed Processing and Analysis
oral
We present our experience with distributed reprocessing of the LHC beam
and cosmic ray data taken with the ATLAS detector during 2008/2009.
Raw data were distributed from CERN to ATLAS Tier-1 centers, reprocessed
and validated. The reconstructed data were consolidated at CERN and ten WLCG
ATLAS Tier-1 centers and made available for physics analysis.
The reprocessing was done...
Sophie Lemaitre
(CERN)
23/03/2009, 16:30
Distributed Processing and Analysis
oral
One of the current problem areas for sustainable WLCG operations is
data management and data transfer. The systems involved (e.g.
Castor, dCache, DPM, FTS, gridFTP, the OPN network) are rather complex and have
multiple layers - failures can and do occur in any layer and due to the
diversity of systems involved, the differences in the information they have
available and their...
Mr
Levente HAJDU
(BROOKHAVEN NATIONAL LABORATORY)
23/03/2009, 16:50
Distributed Processing and Analysis
oral
Processing datasets on the order of tens of terabytes is an onerous task faced by production coordinators everywhere. Users solicit data productions and, especially for simulation data, the vast number of parameters (and sometimes incomplete requests) point to the need to track, control and archive all requests made, so that the production team can handle them in a coordinated way.
With...
Norbert Neumeister
(Purdue University)
23/03/2009, 17:10
Distributed Processing and Analysis
oral
We present a Web portal for CMS Grid submission and management. Grid portals can deliver complex grid solutions to users without the need to download, install and maintain specialized software, or to worry about setting up site-specific components. The goal is to reduce the complexity of the user grid experience and to bring the full power of the grid to physicists engaged in LHC analysis...
Mr
Marco Meoni
(CERN)
23/03/2009, 17:30
Distributed Processing and Analysis
oral
The ALICE experiment at CERN LHC is intensively using a PROOF cluster for fast analysis and reconstruction. The current system (CAF - CERN Analysis Facility) consists of some 120 CPU cores and about 45 TB of local space. One of the most important aspects of the data analysis on the CAF is the speed with which it can be carried out. Fast feedback on the collected data can be obtained, which...
Dr
James Letts
(Department of Physics-Univ. of California at San Diego (UCSD))
23/03/2009, 17:50
Distributed Processing and Analysis
oral
During normal data taking CMS expects to support potentially as many as 2000 analysis users. In 2008 there were more than 800 individuals who submitted a remote analysis job to the CMS computing infrastructure. The bulk of these users will be supported at the over 40 CMS Tier-2 centers. Supporting a globally distributed community of users on a globally distributed set of computing clusters is...
Dr
Douglas Smith
(STANFORD LINEAR ACCELERATOR CENTER)
23/03/2009, 18:10
Distributed Processing and Analysis
oral
The Babar experiment has been running at the SLAC National Accelerator
Laboratory for the past nine years, and has recorded 500 fb-1 of data.
The final data run for the experiment finished in April 2008. Once
data taking was finished, the final processing of all Babar data was started.
This was the largest computing production effort in the history of
Babar, including a reprocessing of...
Mr
Andrew Hanushevsky
(SLAC National Accelerator Laboratory)
24/03/2009, 14:00
Distributed Processing and Analysis
oral
Scalla (also known as xrootd) is quickly becoming a significant part of LHC data analysis as a stand-alone clustered data server (US Atlas T2 and CERN Analysis Farm), a globally clustered data sharing framework (ALICE), and an integral part of PROOF-based analysis (multiple experiments). Until recently, xrootd did not fit well in the LHC Grid infrastructure as a Storage Element (SE) largely...
Dr
Sergey Panitkin
(Department of Physics - Brookhaven National Laboratory (BNL))
24/03/2009, 14:20
Distributed Processing and Analysis
oral
The Parallel ROOT Facility (PROOF) is a distributed analysis system that makes it possible to exploit the inherent event-level parallelism of high energy physics data.
PROOF can be configured to work with centralized storage systems, but it is especially effective together with distributed local storage systems, like Xrootd, where data are distributed over the computing nodes.
It works efficiently on...
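The event-level parallelism PROOF exploits can be illustrated with a minimal sketch: since events are independent, a dataset is split into packets, each packet is processed separately, and the partial results are merged. The packet splitting and the toy per-event computation below are illustrative assumptions, not PROOF code:

```python
# Sketch of event-level parallelism: independent events are grouped into
# packets, each packet yields a partial result, and partial results merge
# into the same answer as a single sequential pass.

def process_packet(events):
    """Per-worker step: a toy per-event computation, summed over a packet."""
    return sum(e * e for e in events)

def split_into_packets(events, n_workers):
    # Round-robin split; in PROOF a packetizer assigns packets dynamically.
    return [events[i::n_workers] for i in range(n_workers)]

events = list(range(1000))
packets = split_into_packets(events, n_workers=4)
partials = [process_packet(p) for p in packets]  # in PROOF these run in parallel
total = sum(partials)                            # merge step
assert total == sum(e * e for e in events)       # same result as sequential
```

Because merging is associative here, the workers never need to coordinate during processing, which is what makes the approach scale.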
Lassi Tuura
(Northeastern University)
24/03/2009, 14:40
Distributed Processing and Analysis
oral
In the last two years the CMS experiment has commissioned a full end
to end data quality monitoring system in tandem with progress in the
detector commissioning. We present the data quality monitoring and
certification systems in place, from online data taking to delivering
certified data sets for physics analyses, release validation and offline
re-reconstruction activities at Tier-1s. We...
Dr
Stuart Paterson
(CERN)
24/03/2009, 15:00
Distributed Processing and Analysis
oral
DIRAC, the LHCb community Grid solution, uses generic pilot jobs to obtain a virtual pool of resources for the VO community. In this way agents can request the highest priority user or production jobs from a central task queue and VO policies can be applied with full knowledge of current and previous activities. In this paper the performance of the DIRAC WMS will be presented with emphasis...
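The central-task-queue idea behind the pilot model, where agents pull the highest-priority waiting job so that VO policy is applied with a global view, can be sketched as follows. This is an illustration, not DIRAC code; the class and task names are made up:

```python
# Toy central task queue: pilots pull the highest-priority waiting task,
# with FIFO order among tasks of equal priority.
import heapq

class TaskQueue:
    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker keeps FIFO order within a priority

    def add(self, priority, task):
        # heapq is a min-heap, so negate priority: larger number served first
        heapq.heappush(self._heap, (-priority, self._counter, task))
        self._counter += 1

    def pull(self):
        """Called by a pilot agent once it has a free slot."""
        return heapq.heappop(self._heap)[2]

q = TaskQueue()
q.add(1, "production-job")
q.add(5, "urgent-user-job")
q.add(1, "production-job-2")
print(q.pull())  # -> urgent-user-job
```

Because every pilot pulls from the same queue, a priority change takes effect for the very next slot that frees up anywhere, rather than only at the site where a job was originally sent.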
Mr
Andrei Gheata
(CERN/ISS)
24/03/2009, 15:20
Distributed Processing and Analysis
oral
The ALICE offline group has developed a set of tools that formalize data access patterns and impose certain rules on how individual data analysis modules have to be structured in order to maximize data processing efficiency at the scale of the whole collaboration. The ALICE analysis framework was developed and extensively tested on MC reconstructed data during the last 2 years in the ALICE...
Dr
Giacinto Donvito
(INFN-Bari)
24/03/2009, 16:30
Distributed Processing and Analysis
oral
The Job Submitting Tool provides a solution for submitting a large number of jobs to the grid in an unattended way. The tool manages grid submission, bookkeeping and the resubmission of failed jobs.
It also allows real-time monitoring of the status of each job within the same framework.
The key elements of this tool are:
A Relational Db that contains all the...
Gerardo GANIS
(CERN)
24/03/2009, 16:50
Distributed Processing and Analysis
oral
PROOF-Lite is an implementation of the Parallel ROOT Facility (PROOF)
optimized for many-core machines. It gives ROOT users a straightforward
way to exploit all the available cores in parallel for a data
analysis or generic computing task controlled via the ROOT TSelector
mechanism.
PROOF-Lite is, as the name suggests, a lite version of PROOF, where the
multi-tier...
Dr
Sanjay Padhi
(UCSD)
24/03/2009, 17:10
Distributed Processing and Analysis
oral
With the evolution of various grid federations, the Condor glide-ins represent a key
feature in providing a homogeneous pool of resources using late-binding technology.
The CMS collaboration uses the glide-in based Workload Management System, glideinWMS,
for production (ProdAgent) and distributed analysis (CRAB) of the data. The Condor
glide-in daemons traverse to the worker nodes,...
Dr
Dantong Yu
(BROOKHAVEN NATIONAL LABORATORY)
24/03/2009, 17:30
Distributed Processing and Analysis
oral
PanDA, the ATLAS Production and Distributed Analysis framework, has been identified as one of the most important services provided by the ATLAS Tier 1 facility at Brookhaven National Laboratory (BNL), and enhanced to what is now a 24x7x365 production system. During this period, PanDA has remained under active development for additional functionality and bug fixes, and processing requirements have...
Johannes Elmsheuser
(Ludwig-Maximilians-Universität München)
24/03/2009, 17:50
Distributed Processing and Analysis
oral
Distributed data analysis using Grid resources is one of the
fundamental applications in high energy physics to be addressed
and realized before the start of LHC data taking. The needs for
resource management are very high: in every experiment, up to a
thousand physicists will be submitting analysis jobs to the Grid.
Appropriate user interfaces and helper applications have to be...
Bjoern Hallvard Samset
(Fysisk institutt - University of Oslo)
24/03/2009, 18:10
Distributed Processing and Analysis
oral
A significant amount of the computing resources available to the ATLAS experiment at the LHC are connected via the ARC grid middleware. ATLAS ARC-enabled resources, which consist of both major computing centers at Tier-1 level and lesser, local clusters at Tier-2 and 3 level, have shown excellent performance running heavy Monte Carlo (MC) production for the experiment. However, with the...
Dr
Guido NEGRI
(CERN)
26/03/2009, 14:00
Distributed Processing and Analysis
oral
Within the ATLAS hierarchical, multi-tier computing infrastructure,
the Tier-0 centre at CERN is mainly responsible for prompt processing
of the raw data coming from the online DAQ system, for archiving the
raw and derived data on tape, for registering the data with the relevant
catalogues, and for distributing them to the associated Tier-1 centres.
The Tier-0 is already fully functional. It has...
Fabrizio Furano
(Conseil Europeen Recherche Nucl. (CERN))
26/03/2009, 14:20
Distributed Processing and Analysis
oral
Performance, reliability and scalability in data access are key issues in the context of Grid computing and High Energy Physics (HEP) data analysis. We present the technical details and the results of a large scale validation and performance measurement achieved at the INFN Tier1, the central computing facility of the Italian National Institute for Nuclear Research (INFN). The aim of this work...
Dr
David Mason
(FNAL)
26/03/2009, 14:40
Distributed Processing and Analysis
oral
CMS' infrastructure to process, store and analyze data is based on worldwide distributed tiers of computing resources. Monitoring and troubleshooting of all parts of the computing infrastructure, and importantly of the experiment-specific data flows and workflows running on this infrastructure, are essential to guarantee timely delivery of processed data to the physicists. This is especially...
Dr
Andrew Maier
(CERN)
26/03/2009, 15:00
Distributed Processing and Analysis
oral
Ganga (http://cern.ch/ganga) is a job-management tool that offers a simple, efficient and consistent user experience in a variety of heterogeneous environments: from local clusters to global Grid systems. Experiment specific plugins allow Ganga to be customised for each experiment. This paper will describe these LHCb plugins of Ganga. For LHCb users, Ganga is the job submission tool of choice...
Daniel Colin Van Der Ster
(Conseil Europeen Recherche Nucl. (CERN))
26/03/2009, 15:20
Distributed Processing and Analysis
oral
Effective distributed user analysis requires a system which meets the demands of running arbitrary user applications on sites with varied configurations and availabilities. The challenge of tracking such a system requires a tool to monitor not only the functional statuses of each grid site, but also to perform large-scale analysis challenges on the ATLAS grids. This work presents one such...
Pablo SAIZ
(CERN)
26/03/2009, 15:40
Distributed Processing and Analysis
oral
WLCG relies on the SAM (Service Availability Monitoring) infrastructure to monitor the behaviour of sites and as a powerful debugging tool. SAM is also used by individual experiments and VOs (Virtual Organisations) to submit application-specific tests to the grid. This degree of specificity implies additional requirements in terms of visualisation and manipulation of the test results provided...
Ulrich Schwickerath
(CERN)
26/03/2009, 16:30
Distributed Processing and Analysis
oral
Instrumenting jobs throughout their lifecycle is not obvious, as
they are quite independent after being submitted, crossing multiple
environments and locations until landing on a worker node. In
order to measure correctly the resources used at each step, and to compare
them with the view from a Fabric Infrastructure, we propose a solution
using Messaging System for the Grids (MSG)...
Prof.
Joel Snow
(Langston University)
26/03/2009, 16:50
Distributed Processing and Analysis
oral
DZero uses a variety of resources on four continents to pursue a
strategy of flexibility and automation in the generation of simulation
data. This strategy provides a resilient and opportunistic system
which ensures an adequate and timely supply of simulation data to
support DZero's physics analyses. A mixture of facilities, dedicated
and opportunistic, specialized and generic, large...
Daniele Spiga
(Universita degli Studi di Perugia & CERN)
26/03/2009, 17:10
Distributed Processing and Analysis
oral
CMS has a distributed computing model, based on a hierarchy of tiered regional computing centres. However, the end physicist is not interested in the details of the computing model nor in the complexity of the underlying infrastructure, but only in accessing and using the remote services easily and efficiently. The CMS Remote Analysis Builder (CRAB) is the official CMS tool that allows access to...
Dr
Alessandra Doria
(INFN Napoli)
26/03/2009, 17:30
Distributed Processing and Analysis
oral
An optimized use of the grid computing resources in the ATLAS experiment requires the enforcement of a mechanism of job priorities and of resource sharing among the different activities inside the ATLAS VO. This mechanism has been implemented through the publication of VOViews in the information system and a per-UNIX-group fair-share implementation in the batch system. The VOView concept...
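The per-group fair-share idea can be illustrated with a toy calculation: each group's slot target is proportional to its configured share. The function and group names below are assumptions for illustration, not the batch-system implementation:

```python
# Toy fair-share calculation: convert relative group weights into
# slot targets proportional to each group's share of the total weight.

def fair_share_targets(total_slots, shares):
    """shares: group -> relative weight; returns group -> slot target."""
    weight_sum = sum(shares.values())
    return {g: total_slots * w / weight_sum for g, w in shares.items()}

targets = fair_share_targets(1000, {"production": 70, "analysis": 25, "software": 5})
print(targets)  # production gets 700.0 slots, analysis 250.0, software 50.0
```

A real scheduler enforces such targets dynamically, letting a group exceed its share when others are idle, but the proportional targets are the policy input.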
Mr
Anar Manafov
(GSI)
26/03/2009, 17:50
Distributed Processing and Analysis
oral
"PROOF on demand" is a set of utilities that allows users to start a PROOF
cluster on request, on a batch farm or on the Grid. It provides a
plug-in based system which supports different job submission
frontends, such as LSF or gLite WMS. The main components of "PROOF on
demand" are the PROOFAgent and the PAConsole. PROOFAgent provides the
communication layer between the xrootd...
Ian Fisk
(Fermi National Accelerator Laboratory (FNAL))
26/03/2009, 18:10
Distributed Processing and Analysis
oral
CMS is in the process of commissioning a complex detector and a globally distributed computing model simultaneously. This represents a unique challenge for the current generation of experiments. Even at the beginning, there are not sufficient analysis or organized processing resources at CERN alone. In this presentation we will discuss the unique computing challenges CMS expects to face during...