Speaker
Dr Pavel Nevski (BNL)
Description
In addition to challenges in computing and data handling, ATLAS and other
LHC experiments place a great burden on users to configure and manage the
large number of parameters and options needed to carry out distributed
computing tasks.
Management of distributed physics data is being made more transparent by
dedicated ATLAS grid computing technologies, such as PanDA (a pilot-based
job control system).
The laborious procedure of steering the data processing application by
providing physics parameters and software configurations has remained beyond
the scope of large grid projects.
This error-prone manual procedure does not scale to the LHC challenges.
To reduce human errors and automate the process of populating the ATLAS
production database with millions of jobs per year, we developed a system
for ATLAS knowledge management ("Knowledgement") of Task Requests (AKTR).
AKTR manages the configuration parameters used for massive grid data
processing tasks (groups of similar jobs). The system ensures scalable
management of ATLAS-wide knowledge of distributed production conditions
and guarantees reproducibility of results.
Use of the AKTR system has resulted in major gains in the efficiency and
productivity of the ATLAS production infrastructure.
Summary
This talk presents details of the AKTR system for knowledge management of configurations and parameters used for distributed data processing in the ATLAS experiment.
Presentation type
oral