
Sep 2 – 9, 2007
Victoria, Canada
Europe/Zurich timezone

Performance Measurement and Monitoring for HENP Applications

Sep 3, 2007, 8:00 AM
10h 10m
Victoria, Canada


Board: 38
Poster | Software components, tools and databases | Poster 1

Speaker

Dr Sebastien Binet (LBNL)

Description

LHC experiments are entering a phase where optimization in view of data taking and improvements in robustness are of major importance. Any reduction in event data size can bring very significant savings in the amount of hardware (disk and tape in particular) needed to process data. Another area of concern, and of potentially major gains, is reducing the memory size and I/O bandwidth requirements of processing nodes, especially with the increasing usage of multi-core CPUs. LHC experiments are already collecting abundant performance information about event size, memory and CPU usage, I/O compression and bandwidth requirements. What is missing is a coherent set of tools to present this information in a tailored fashion to release coordinators, package managers and physics algorithm developers. This paper describes such a toolkit, which we are developing in the context of ATLAS computing to harvest performance monitoring information from an extensible set of sources. The challenge is to map performance data with an immediate impact on hardware costs onto entities that are relevant for the various users. For example, the toolkit allows an ATLAS data model developer to evaluate the impact of their design decisions for an event data object on resource usage throughout the entire software pipeline. We present the data in a way that highlights potential areas of concern, allowing experts to drill down to the level of detail they need (the size of a data member of a class, the CPU usage of a component method). A configurable monitoring system can raise alarms when a quantity or a histogram goes out of a specified range. This allows a release coordinator to monitor, for example, the global size of a data stream throughout a development cycle and have the developers correct a problem well before a release goes into production.
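The range-based alarm idea mentioned in the abstract could be sketched roughly as follows. This is a minimal illustration in Python, not the toolkit's actual interface: the names `RangeCheck` and `find_alarms`, the quantity names, and all numbers are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RangeCheck:
    """Allowed range for one monitored quantity (hypothetical structure)."""
    name: str
    low: Optional[float] = None   # None means no lower bound
    high: Optional[float] = None  # None means no upper bound

    def violated_by(self, value: float) -> bool:
        """Return True if the value falls outside the configured range."""
        if self.low is not None and value < self.low:
            return True
        if self.high is not None and value > self.high:
            return True
        return False


def find_alarms(checks: List[RangeCheck],
                measurements: Dict[str, float]) -> List[str]:
    """Return one message per measured quantity that is outside its range."""
    alarms = []
    for check in checks:
        if check.name not in measurements:
            continue  # quantity not measured this time; nothing to check
        value = measurements[check.name]
        if check.violated_by(value):
            alarms.append(
                f"{check.name}={value} outside [{check.low}, {check.high}]"
            )
    return alarms


# Illustrative numbers only, not ATLAS measurements.
checks = [
    RangeCheck("stream_size_kb_per_event", high=100.0),
    RangeCheck("cpu_ms_per_event", low=0.0, high=500.0),
]
measurements = {"stream_size_kb_per_event": 112.5, "cpu_ms_per_event": 340.0}
alarms = find_alarms(checks, measurements)
for message in alarms:
    print(message)
```

In a setup like this, a release coordinator would keep the list of checks as configuration and run it against each build's measurements, so that a data stream growing past its limit is flagged during the development cycle rather than in production.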
Submitted on behalf of Collaboration (e.g. BaBar, ATLAS): ATLAS

