Distributed Data Collection for the ATLAS EventIndex.

Apr 13, 2015, 3:45 PM
C209



Oral presentation. Track 3: Data store and access (Track 3 Session)


Javier Sanchez (Instituto de Fisica Corpuscular (ES))


The ATLAS EventIndex contains records of all events processed by ATLAS, in all processing stages. These records include references to the files containing each event (the GUID of the file) and the internal “pointer” to each event in the file. This information is collected by all jobs that run at Tier-0 or on the Grid and process ATLAS events. Each job produces a snippet of information for each permanent output file. This information is packed and transferred to a central broker at CERN using an ActiveMQ messaging system, and is then unpacked, sorted and reformatted so that it can be stored and catalogued in a central Hadoop server. This talk describes in detail the Producer/Consumer architecture that conveys this information from the running jobs through the messaging system to the Hadoop server.
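
The producer/consumer flow described above can be illustrated with a minimal sketch. It is not the EventIndex implementation: the record fields (`run`, `event`, `pointer`), the JSON-plus-zlib packing, and the function names are all assumptions chosen for illustration, and an in-process `queue.Queue` stands in for the ActiveMQ broker so that the example runs without any messaging infrastructure.

```python
import json
import queue
import zlib


def pack_snippet(guid, events):
    """Producer side: pack the event records of one output file.

    Hypothetical format: a JSON document compressed with zlib.
    """
    payload = {"guid": guid, "events": events}
    return zlib.compress(json.dumps(payload).encode("utf-8"))


def unpack_snippet(message):
    """Consumer side: reverse of pack_snippet."""
    return json.loads(zlib.decompress(message).decode("utf-8"))


def produce(broker, guid, events):
    """Send one packed snippet per permanent output file to the broker."""
    broker.put(pack_snippet(guid, events))


def consume(broker):
    """Drain the broker, then unpack, flatten and sort the records.

    Returns one record per event, sorted by (run, event), ready to be
    written to whatever store sits behind the consumer.
    """
    records = []
    while True:
        try:
            message = broker.get_nowait()
        except queue.Empty:
            break
        snippet = unpack_snippet(message)
        for ev in snippet["events"]:
            records.append(
                {
                    "run": ev["run"],
                    "event": ev["event"],
                    "guid": snippet["guid"],      # file containing the event
                    "pointer": ev["pointer"],     # position inside that file
                }
            )
    records.sort(key=lambda r: (r["run"], r["event"]))
    return records
```

In this sketch each job would call `produce` once per output file, and a single consumer would call `consume` to obtain the sorted records. In a real deployment the queue would be replaced by a messaging client and the returned list by writes to the storage back end.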

Primary author

Javier Sanchez (Instituto de Fisica Corpuscular (ES))


Alvaro Fernandez Casani (Instituto de Fisica Corpuscular (ES))
Javier Sanchez (Universidad de Valencia (ES))
Dr Santiago Gonzalez De La Hoz (IFIC-Valencia)
