6–7 Jun 2011
CERN
Europe/Zurich timezone

ATLAS DDM/DQ2 & NoSQL databases: Use cases and experiences

7 Jun 2011, 12:00
30m
IT Auditorium (CERN)

Speaker

Dr Vincent Garonne (Conseil Europeen Recherche Nucl. (CERN))

Description

The Distributed Data Management system DQ2 is responsible for the global management of petabytes of ATLAS physics data. DQ2 has a critical dependency on Relational Database Management Systems (RDBMS), like Oracle, as RDBMS are well suited to enforcing data integrity in online transaction processing applications. Despite these advantages, concerns have recently been raised about the scalability of data-warehouse-like workloads against the transactional schema, in particular for the analysis of archived data or the aggregation of data for summary purposes. We have therefore considered new approaches to handling vast amounts of data. More specifically, we investigated a new class of database technologies commonly referred to as NoSQL databases. These include distributed filesystems like HDFS, which support parallel execution of computational tasks on distributed data, as well as schema-less approaches via key-value stores, like HBase, Cassandra or MongoDB. These databases provide solutions to particular types of problems: for example, NoSQL databases have demonstrated that they can scale horizontally, deliver high throughput, fail over automatically, and provide easy replication support over LAN and WAN. In this talk, we will describe our use cases in ATLAS and share our experiences with NoSQL databases in a comparative study with Oracle.
Proposed speaker: V. Garonne

Authors

Dr Angelos Molfetas (CERN-PH-ADP-CO)
Donal Zang (IHEP)
Mr Gancho Dimitrov (BNL)
Dr Luca Canali (CERN-IT-DB)
Mr Mario Lassnig (CERN-PH-ADP-CO)
Dr Vincent Garonne (Conseil Europeen Recherche Nucl. (CERN))
