Introduction to Hadoop, Spark and Big Data at CERN
→
Europe/Zurich
572 R-013 (CERN)
-
09:00
→
09:05
Welcome and Agenda 5m
Speaker: Pedro Andrade (CERN)
-
09:05
→
09:45
Introduction to Big Data and Hadoop 40m
The Big Data ecosystem today and the role Hadoop plays
Speaker: Emil Kleszcz (CERN) -
09:45
→
10:00
The Hadoop and Spark service at CERN 15m
An overview of how the service is organised and what it offers
Speaker: Pedro Andrade (CERN) -
10:00
→
10:15
Getting Started 15m
How to use the service and connect to the clusters
Speaker: Borja Aparicio Cotarelo (CERN) -
10:15
→
10:30
Coffee break 15m
-
10:30
→
11:15
Data Storage 45m
Durable, scalable storage for big data
Speaker: Luis Pigueiras (CERN) -
11:15
→
12:00
Data Ingestion & Formats 45m
Efficient data input and modern table formats
Speaker: Panagiotis Georgopoulos -
12:00
→
13:30
Lunch break 1h 30m
-
13:30
→
15:30
Data Processing 2h
Distributed computation for large-scale data workflows
Speaker: Luca Canali (CERN) -
15:30
→
15:45
Coffee break 15m
-
15:45
→
16:45
Data Access 1h
Real-time analytics and federated querying
Speaker: Emil Kleszcz (CERN) -
16:45
→
17:00
Q&A 15m
-