CHEP 2016 Conference, San Francisco, October 8-14, 2016

Name: CHEP 2016 Conference, San Francisco, October 8-14, 2016
Start: 2016-10-10T08:00:00-07:00
End: 2016-10-14T18:00:00-07:00
Location: San Francisco Marriott Marquis

America/Los_Angeles timezone

Hadoop and friends - first experience at CERN with a new platform for high throughput analysis steps

13 Oct 2016, 14:45

15m

GG A+B (San Francisco Marriott Marquis)

Oral, Track 5: Software Development

Speaker: Prasanth Kothuri (CERN)

The statistical analysis of infrastructure metrics comes with several specific challenges, including the fairly large volume of unstructured metrics from a large set of independent data sources. Hadoop and Spark provide an ideal environment, in particular for the first steps of skimming rapidly through hundreds of TB of low-relevance data to find and extract the much smaller data volume that is relevant for statistical analysis and modelling.
This presentation will describe the new Hadoop service at CERN and the use of several of its components for high throughput data aggregation and ad hoc pattern searches. We will describe the hardware setup used, the service structure with a small set of decoupled clusters, and the first experience with co-hosting different applications and performing software upgrades. We will further detail the common infrastructure used for data extraction and preparation from continuous monitoring and database input sources.

Primary Keyword (Mandatory)	Analysis tools and techniques
Secondary Keyword (Optional)	Monitoring

Author: Dirk Duellmann (CERN)

Co-authors: Kacper Surdy (CERN), Luca Menichetti (CERN), Prasanth Kothuri (CERN), Rainer Toebbicke (CERN), Vineet Menon (Bhabha Atomic Research Centre (IN))

Presentation materials: Oral-231.pdf
