Dr Pavel Nevski (BNL)
In addition to the challenges of computing and data handling, ATLAS and other LHC experiments place a great burden on users, who must configure and manage the large number of parameters and options needed to carry out distributed computing tasks. Management of distributed physics data is being made more transparent by dedicated ATLAS grid computing technologies, such as PanDA (a pilot-based job control system). The laborious procedure of steering the data processing application by providing physics parameters and software configurations, however, has remained beyond the scope of large grid projects. This manual procedure is error-prone and does not scale to the LHC challenges. To reduce human errors and automate the process of populating the ATLAS production database with millions of jobs per year, we developed a system for ATLAS knowledge management ("Knowledgement") of Task Request (AKTR). AKTR manages the configuration parameters used for massive grid data processing tasks (groups of similar jobs). The system provides scalable management of ATLAS-wide knowledge of distributed production conditions and guarantees reproducibility of results. Use of the AKTR system has resulted in major gains in the efficiency and productivity of the ATLAS production infrastructure.
This talk presents details of the AKTR system for knowledge management of the configurations and parameters used for distributed data processing in the ATLAS experiment.
Presentation type: oral