The ATLAS physics analysis model and production of derived datasets
Presented by Amir FARBIN on 24 May 2012 from 13:30 to 18:15
Session: Poster Session
Track: Event Processing (track 2)
The ATLAS experiment has collected vast amounts of data with the arrival of the inverse-femtobarn era at the LHC. ATLAS has developed an intricate analysis model with several types of derived datasets, including their grid storage strategies, in order to make data from O(10^9) recorded events readily available to physicists for analysis. Several use cases have been considered in the ATLAS analysis model, with a few distinct classes of analyses that need to look at different parts of the overall data. A first class of analysis needs very detailed information in order to study detector and reconstruction performance, as well as to perform physics analyses for non-standard scenarios. For this case, specialized Derived Event Summary Data (DESD) are produced at the Tier 0, right after the main reconstruction has been performed. These DESDs contain only specific events, and sometimes also very specific per-event content, in order to keep their total size manageable. They are distributed on the grid in order to allow easy access for physicists. All other types of analysis could in principle be performed on the Analysis Object Data (AOD), which has a size of approximately 150 kByte/event. It is generally considered the main data format for physics analysis in ATLAS. However, its total size is still very large, which in most cases makes it impractical to process frequently. Thus, further size reduction is necessary. Derived AODs (DAOD) are produced from the AOD and distributed on the grid with only the events and content of interest to a specific class of physics analysis. These DAODs contain the full object structure of the AOD. Today, the most commonly used data format for physics analysis is a type of ROOT file known as a D3PD, which contains only simple types and vectors of simple types for selected events, i.e. no ATLAS-specific software is needed to process it. Both DAODs and D3PDs can be produced centrally for individual analyses or whole analysis groups.
They are stored on the grid using dedicated group-specific space quotas, but are fully available to the whole collaboration. In all cases, the selection criteria, luminosity bookkeeping, and other relevant information are stored inside the resulting files as metadata.
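
To make the size argument concrete, the following is a minimal back-of-the-envelope sketch in Python. Only the event count, O(10^9), and the AOD size of roughly 150 kByte/event are taken from the abstract. The function names, the fraction of events kept, and the fraction of per-event content kept are hypothetical values chosen purely for illustration and are not ATLAS numbers.

```python
# Rough size estimate for the full AOD and for a derived dataset.
# From the abstract: O(10^9) events and about 150 kByte/event for the AOD.
# Everything else (function names, fractions) is a hypothetical illustration.

N_EVENTS = 10**9            # order of magnitude of recorded events
AOD_KB_PER_EVENT = 150      # approximate AOD size per event, in kByte


def total_size_tb(n_events, kb_per_event):
    """Total dataset size in terabytes (decimal units: 1 TB = 10^9 kByte)."""
    return n_events * kb_per_event / 1e9


def derived_size_tb(n_events, kb_per_event, event_fraction, content_fraction):
    """Size after keeping a fraction of the events and of the per-event content."""
    return total_size_tb(n_events * event_fraction,
                         kb_per_event * content_fraction)


aod_tb = total_size_tb(N_EVENTS, AOD_KB_PER_EVENT)

# Hypothetical derivation: keep 5% of the events and 20% of the content.
daod_tb = derived_size_tb(N_EVENTS, AOD_KB_PER_EVENT, 0.05, 0.20)

print(f"full AOD: {aod_tb:.0f} TB, derived dataset: {daod_tb:.1f} TB")
```

With the figures from the abstract, the full AOD comes to about 150 TB, which illustrates why repeated processing is costly. Reducing both the number of events and the per-event content shrinks the size multiplicatively, which is the motivation for derived formats such as DAODs and D3PDs.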