HEPiX Spring 2017 Workshop

Name: HEPiX Spring 2017 Workshop
Start: 2017-04-24T08:00:00+02:00
End: 2017-04-28T14:30:00+02:00
Location: Hungarian Academy of Sciences

24–28 Apr 2017

Hungarian Academy of Sciences

Europe/Budapest timezone

Organisers

hepix-2017spring-support@hepix.org

CosmoHub on Hadoop: a web portal to analyze and distribute massive cosmological data

26 Apr 2017, 12:40

25m

Hungarian Academy of Sciences

Széchenyi István tér 9 1051 Budapest Hungary

Computing & Batch Services Computing and batch systems

Jordi Casals Hernandez (University of Barcelona (ES))

We present CosmoHub, a web platform to perform interactive analysis of massive cosmological data without any SQL knowledge. CosmoHub is built on top of Apache Hive, a component of the Apache Hadoop ecosystem that facilitates reading, writing, and managing large datasets.

CosmoHub is hosted at the Port d'Informació Científica (PIC) and currently supports several international cosmology projects, such as the ESA Euclid space mission, the Dark Energy Survey (DES), the Physics of the Accelerating Universe (PAU) survey and the MareNostrum Institut de Ciències de l'Espai Simulations (MICE). More than two billion objects, covering public and private data as well as observed and simulated data, are available across all projects. In the last three and a half years, more than 400 users have produced about 1500 custom catalogs occupying 2 TB in compressed format.

CosmoHub allows users to access value-added data, to load and explore pre-built datasets, and to create their own custom datasets through a guided process. All of these datasets can be explored interactively with an integrated visualization tool that offers 1D histogram and 2D heatmap plots. In our current implementation, online analysis of a dataset of a billion objects takes less than 25 seconds. Finally, all of these datasets can be downloaded in three formats: CSV.BZ2, FITS and ASDF.
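
As an illustration only, a 1D histogram over a very large table can be expressed as a single binned aggregation, which is the kind of query an engine such as Apache Hive can run in parallel. The sketch below is not taken from CosmoHub: the function names, the table name `mock_catalog` and the column name `mag_r` are hypothetical, and the query text is merely one plausible way to write equal-width binning in HiveQL. A small pure-Python function applies the same binning rule to a list of values so the logic can be checked locally.

```python
from collections import Counter
from math import floor


def histogram_query(table, column, lo, hi, nbins):
    """Build a HiveQL-style query that counts rows per equal-width bin.

    All identifiers passed in are hypothetical; nothing here reflects
    the actual CosmoHub schema or code.
    """
    width = (hi - lo) / nbins
    bin_expr = f"FLOOR(({column} - {lo}) / {width})"
    return (
        f"SELECT {bin_expr} AS bin, COUNT(*) AS n "
        f"FROM {table} "
        f"WHERE {column} >= {lo} AND {column} < {hi} "
        f"GROUP BY {bin_expr}"
    )


def histogram_local(values, lo, hi, nbins):
    """Apply the same binning rule in memory: values in [lo, hi) only."""
    width = (hi - lo) / nbins
    counts = Counter(floor((v - lo) / width) for v in values if lo <= v < hi)
    # Return one count per bin, including empty bins.
    return [counts.get(i, 0) for i in range(nbins)]


if __name__ == "__main__":
    print(histogram_query("mock_catalog", "mag_r", 18.0, 24.0, 12))
    print(histogram_local([0.1, 0.2, 0.6, 1.5, 1.9, 2.0], 0.0, 2.0, 4))
```

Because the aggregation returns only one row per bin, the result stays small regardless of how many objects the table holds, which is what makes this pattern suitable for interactive plots. A 2D heatmap follows the same idea with two bin expressions in the GROUP BY clause.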

The components, integration and performance of the system will be reviewed in this contribution.

Scheduling constraints / preferences

We would like a wired Ethernet connection for a live demo.

Length of talk (minutes): 20

Primary authors: Dr Jorge Carretero; Pau Tallada Crespi (Universitat Autònoma de Barcelona (ES)); Jordi Casals Hernandez (University of Barcelona (ES)); Marc Caubet Serrabou (Universitat Autònoma de Barcelona (ES))

Co-authors: Manuel Delfino Reznicek (Universitat Autònoma de Barcelona (ES)); Mr Francesc Torradeflot; Nadia Tonello (PIC); Dr Pablo Fosalba; Dr Jordi Delgado; Dr Santiago Serrano; Christian Neissner (PIC); Carlos Acosta Silva (Universitat Autònoma de Barcelona (ES))

CosmoHub_on_Hadoop_Hepix.pdf

Video Demo
