QuerySpaces on Hadoop for the ATLAS EventIndex

Not scheduled
15m
OIST

1919-1 Tancha, Onna-son, Kunigami-gun Okinawa, Japan 904-0495
poster presentation Track3: Data store and access

Speaker

Dr Julius Hrivnac (Laboratoire de l'Accelerateur Lineaire (FR))

Description

The new ATLAS EventIndex catalogue uses a Hadoop cluster to store information on each event processed by ATLAS. Several tools from the Hadoop ecosystem are used to organise the data in HDFS, catalogue it internally, and provide the search functionality. This presentation will describe the Hadoop-based implementation of the adaptive query engine that serves as the back-end for the ATLAS EventIndex. The QuerySpaces implementation handles both original data and search results, providing fast and efficient mechanisms for new user queries by using already accumulated knowledge for optimisation. Detailed descriptions of user requests, together with statistics about them, are collected in HBase tables and HDFS files. Requests are associated with their results, and a graph of relations between them is built and used to find the most efficient way of answering new requests. The environment is completely transparent to users and is accessible through several command-line interfaces, a Web Service and a programming API.
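
The idea of answering a new request from already stored results can be illustrated with a minimal in-memory sketch. It is not the QuerySpaces implementation, which runs on HBase and HDFS and whose data model is not given in this abstract. Every name here (`QuerySpace`, `best_source`, `derived_from`, the representation of a query as a set of field/value pairs, and the sample events) is a hypothetical choice made for illustration. The assumption is that a stored result can serve a new request whenever the stored request's conditions are a subset of the new ones, so only the smaller stored result needs to be filtered.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple

Event = Dict[str, int]
Predicate = Tuple[str, int]        # hypothetical query form: (field name, required value)
Query = FrozenSet[Predicate]


def matches(event: Event, query: Query) -> bool:
    """True if the event satisfies every (field, value) condition of the query."""
    return all(event.get(name) == value for name, value in query)


@dataclass
class QuerySpace:
    """Toy store holding the original data, past results, and their relations."""
    original: List[Event]
    results: Dict[Query, List[Event]] = field(default_factory=dict)
    # Edge of the relation graph: request -> stored request it was derived from
    # (None means the request was answered from the original data).
    derived_from: Dict[Query, Optional[Query]] = field(default_factory=dict)

    def best_source(self, query: Query) -> Optional[Query]:
        """Pick the smallest stored result whose conditions are a subset of the query."""
        candidates = [q for q in self.results if q <= query]
        if not candidates:
            return None
        return min(candidates, key=lambda q: len(self.results[q]))

    def run(self, predicates: Iterable[Predicate]) -> List[Event]:
        query = frozenset(predicates)
        if query in self.results:              # identical request seen before
            return self.results[query]
        source = self.best_source(query)
        data = self.original if source is None else self.results[source]
        result = [event for event in data if matches(event, query)]
        self.results[query] = result
        self.derived_from[query] = source
        return result


# Made-up sample events, for illustration only.
events = [
    {"run": 1, "trigger": 1},
    {"run": 1, "trigger": 0},
    {"run": 2, "trigger": 1},
]
space = QuerySpace(events)
first = space.run([("run", 1)])                      # scans the original data
second = space.run([("run", 1), ("trigger", 1)])     # filters the stored result of `first`
```

In this sketch the second request never touches the original data: it is recorded in `derived_from` as a refinement of the first one, which is the kind of relation a graph of requests and results could capture.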

Primary author

Dr Julius Hrivnac (Laboratoire de l'Accelerateur Lineaire (FR))

Co-authors

Andrea Favareto (Università degli Studi e INFN Genova)
Claudia Glasman (Universidad Autonoma de Madrid)
Dr Jack Cranshaw (Argonne National Laboratory (US))
Rainer Toebbicke (CERN)
Ruijun Yuan (Laboratoire de l'Accelerateur Lineaire (FR))
