Evaluation of NoSQL database MongoDB for HEP analyses

Not scheduled
15m
OIST

1919-1 Tancha, Onna-son, Kunigami-gun Okinawa, Japan 904-0495
poster presentation Track3: Data store and access

Speaker

Christopher Jung (KIT - Karlsruhe Institute of Technology (DE))

Description

Most analyses in experimental high-energy physics (HEP) are based on the data analysis framework ROOT, so simulated and measured events alike are stored in ROOT trees. A typical analysis loops over the events in ROOT files and selects those that satisfy given selection criteria for further processing. The emergence of NoSQL databases provides a new means of large-scale data storage and analysis. Because NoSQL databases scale horizontally, they offer a new approach to HEP analyses. We present a study that stores simulated top-pair-production events in the document-based NoSQL database MongoDB. We compare analysis steps that use MongoDB as the input source with the traditional approach based on ROOT trees. We demonstrate that the database query language supports many analysis steps beyond event selection. The advantages and limitations of the NoSQL approach are discussed in terms of functionality, performance, storage efficiency, and usability.
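
To illustrate the contrast drawn above, the following minimal sketch expresses the same event selection twice: once as an explicit loop over events, and once as a MongoDB filter document that the database would evaluate. The event fields (`n_jets`, `lepton_pt`, `met`), the cut values, and the function names are hypothetical and are not taken from the study; `select_by_query` assumes a pymongo `Collection` object connected to a running server and is not executed here.

```python
# Hypothetical event documents. The field names and values are illustrative
# assumptions, not the schema or data used in the study.
events = [
    {"event_id": 1, "n_jets": 4, "lepton_pt": 35.2, "met": 48.0},
    {"event_id": 2, "n_jets": 2, "lepton_pt": 51.7, "met": 12.5},
    {"event_id": 3, "n_jets": 5, "lepton_pt": 22.1, "met": 60.3},
    {"event_id": 4, "n_jets": 4, "lepton_pt": 40.0, "met": 25.0},
]


def select_by_loop(events, min_jets=4, min_lepton_pt=30.0):
    """Loop-based selection: visit every event and apply the cuts in client code."""
    return [
        e for e in events
        if e["n_jets"] >= min_jets and e["lepton_pt"] > min_lepton_pt
    ]


def build_query(min_jets=4, min_lepton_pt=30.0):
    """The same cuts written as a MongoDB filter document."""
    return {
        "n_jets": {"$gte": min_jets},
        "lepton_pt": {"$gt": min_lepton_pt},
    }


def select_by_query(collection, query):
    """Let the database apply the filter.

    `collection` is assumed to be a pymongo Collection holding documents
    shaped like the hypothetical events above.
    """
    return list(collection.find(query))


selected_ids = [e["event_id"] for e in select_by_loop(events)]
```

In the loop version every event is read and tested by the analysis code, whereas the filter document hands the cuts to the database, which can then use indexes or distribute the work across shards.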

Primary author

Jörg Meyer (Karlsruhe Institute of Technology)

Co-authors

Achim Streit (KIT - Karlsruhe Institute of Technology (DE))

Christopher Jung (KIT - Karlsruhe Institute of Technology (DE))
