24–28 Apr 2017
Hungarian Academy of Sciences
Europe/Budapest timezone

CosmoHub on Hadoop: a web portal to analyze and distribute massive cosmological data

26 Apr 2017, 12:40
25m
Hungarian Academy of Sciences

Széchenyi István tér 9 1051 Budapest Hungary
Computing & Batch Services Computing and batch systems

Speaker

Jordi Casals Hernandez (University of Barcelona (ES))

Description

We present CosmoHub, a web platform for interactive analysis of massive cosmological data that requires no SQL knowledge. CosmoHub is built on top of Apache Hive, a component of the Apache Hadoop ecosystem that facilitates reading, writing, and managing large datasets.

CosmoHub is hosted at the Port d'Informació Científica (PIC) and currently supports several international cosmology projects, such as the ESA Euclid space mission, the Dark Energy Survey (DES), the Physics of the Accelerated Universe (PAU) and the Marenostrum Institut de Ciències de l'Espai Simulations (MICE). More than two billion objects, from public and private data as well as observed and simulated data, are available across all projects. In the last three and a half years, more than 400 users have produced about 1500 custom catalogs occupying 2 TB in compressed format.

CosmoHub allows users to access value-added data, to load and explore pre-built datasets and to create their own custom datasets through a guided process. All those datasets can be interactively explored using an integrated visualization tool which includes 1D histogram and 2D heatmap plots. In our current implementation, online analysis of datasets of a billion objects can be done in less than 25 seconds. Finally, all those datasets can be downloaded in three different formats: CSV.BZ2, FITS and ASDF.
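As an illustration of how a 1D histogram over a very large table can be computed on the server side, so that only bin counts reach the browser, the following minimal sketch builds a HiveQL aggregation query from a few form-style parameters. It is not CosmoHub's actual implementation: the function names, the table name `mice_catalog` and the column name `ra` are hypothetical and chosen only for this example.

```python
import math
import re

# Only plain identifiers are accepted, so user input cannot inject SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def histogram_query(table: str, column: str, lo: float, hi: float, bins: int) -> str:
    """Build a HiveQL query counting rows per equal-width bin of `column`.

    Hypothetical helper for illustration; not part of CosmoHub.
    """
    if not (_IDENTIFIER.match(table) and _IDENTIFIER.match(column)):
        raise ValueError("table and column must be plain identifiers")
    if bins <= 0 or hi <= lo:
        raise ValueError("need bins > 0 and hi > lo")
    width = (hi - lo) / bins
    # FLOOR maps each value to the index of the bin it falls in.
    bin_expr = f"CAST(FLOOR((`{column}` - {lo}) / {width}) AS INT)"
    return (
        f"SELECT {bin_expr} AS bin, COUNT(*) AS n "
        f"FROM `{table}` "
        f"WHERE `{column}` >= {lo} AND `{column}` < {hi} "
        f"GROUP BY {bin_expr} "
        f"ORDER BY bin"
    )


def bin_index(value: float, lo: float, hi: float, bins: int) -> int:
    """Compute locally the same bin index that the query computes in Hive."""
    return math.floor((value - lo) / ((hi - lo) / bins))


if __name__ == "__main__":
    # Hypothetical table and column names.
    print(histogram_query("mice_catalog", "ra", 0.0, 90.0, 45))
```

Because the grouping happens inside the query engine, the result has at most `bins` rows regardless of how many objects the table holds. A 2D heatmap follows the same pattern with two bin expressions in the `GROUP BY` clause.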

The components, integration and performance of the system will be reviewed in this contribution.

Scheduling constraints / preferences

We would like a wired (Ethernet) internet connection for a live demo.

Length of talk (minutes) 20

Primary authors

Dr Jorge Carretero
Pau Tallada Crespi (Universitat Autònoma de Barcelona (ES))
Jordi Casals Hernandez (University of Barcelona (ES))
Marc Caubet Serrabou (Universitat Autònoma de Barcelona (ES))

Co-authors

Manuel Delfino Reznicek (Universitat Autònoma de Barcelona (ES))
Mr Francesc Torradeflot
Nadia Tonello (PIC)
Dr Pablo Fosalba
Dr Jordi Delgado
Dr Santiago Serrano
Christian Neissner (PIC)
Carlos Acosta Silva (Universitat Autònoma de Barcelona (ES))
