Speaker
Dr
Dario Barberis
(Università e INFN Genova (IT))
Description
Modern scientific experiments collect vast amounts of data that must be cataloged to serve multiple use cases and search criteria. In particular, the high-energy physics experiments currently in operation produce several billion events per year. Retrieving selected events from the data storage systems requires a database holding references to the files that contain each event at every stage of processing. The ATLAS EventIndex project is studying the best way to store this information using modern data storage technologies (Hadoop, HBase, etc.) that allow key-value pairs to be stored in memory, and is selecting the best tools to support this application in terms of performance, robustness and ease of use. At the end of this development, a new technology that is inherently independent of the type of data stored in the database, and therefore directly applicable to any scientific experiment with large amounts of data, will be available and demonstrated using the ATLAS experiment as an example. This paper describes the initial design and performance tests, and the project's evolution towards deployment and operation in 2014.
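As a rough illustration of the kind of catalog described above, the following minimal sketch keeps key-value pairs in memory, mapping each event to the files that contain it at each processing stage. It is not the EventIndex design: the class name `EventCatalog`, the `(run, event)` key layout, the stage labels and the file reference strings are all hypothetical, and a plain Python dictionary stands in for a store such as HBase.

```python
from collections import defaultdict


class EventCatalog:
    """Toy in-memory key-value catalog (hypothetical, for illustration only)."""

    def __init__(self):
        # Key: (run, event) tuple. Value: dict of processing stage -> file reference.
        self._index = defaultdict(dict)

    def add(self, run, event, stage, file_ref):
        """Record that the given event is stored in file_ref at this stage."""
        self._index[(run, event)][stage] = file_ref

    def lookup(self, run, event, stage=None):
        """Return the file reference for one stage, or all stages if stage is None."""
        refs = self._index.get((run, event), {})
        if stage is None:
            return dict(refs)  # copy, so callers cannot mutate the index
        return refs.get(stage)  # None if the stage is not recorded


# Usage with made-up run/event numbers and placeholder stage labels.
catalog = EventCatalog()
catalog.add(1001, 42, "raw", "file-raw-001")
catalog.add(1001, 42, "reconstructed", "file-reco-007")
print(catalog.lookup(1001, 42, "reconstructed"))
```

Keying on the event identifier makes the retrieval of a selected event a single lookup, whatever the type of data the referenced files hold.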
Author
Dr
Dario Barberis
(Università e INFN Genova (IT))
Co-authors
Alvaro Fernandez Casani
(Universidad de Valencia (ES))
Dr
David Malon
(Argonne National Laboratory (US))
Gancho Dimitrov
(CERN)
Dr
Jose Salt
(IFIC-VALENCIA)
Dr
Jack Cranshaw
(Argonne National Laboratory (US))
Javier Sanchez
(IFIC)
Dr
Julius Hrivnac
(Universite de Paris-Sud 11 (FR))
Marcin Nowak
(Brookhaven National Laboratory (US))
Qizhi Zhang
(Argonne National Laboratory (US))
Roman Sorokoletov
(University of Texas at Arlington (US))
Dr
Santiago Gonzalez De La Hoz
(IFIC-Valencia)