SWAN Spark: hosted Jupyter notebooks meet Spark jobs at scale

Name: SWAN Spark: hosted Jupyter notebooks meet Spark jobs at scale
Start: 2018-04-24T16:00:00+02:00
End: 2018-04-24T17:00:00+02:00
Location: CERN

Tuesday 24 Apr 2018, 16:00 → 17:00 Europe/Zurich

513/1-024 (CERN)

- 16:00 → 16:40
  
  SWAN Spark - a data analysis platform with hosted Jupyter notebooks and Spark on Hadoop service at CERN 40m
  
  This presentation aims to help users of the Hadoop and Spark service get started with the SWAN-Spark integration and its functionality for running Spark jobs at scale on SWAN. The session is also an opportunity for service managers to gather feedback for future improvements. The integration of SWAN hosted notebooks with the Spark and Hadoop service was recently deployed to production. SWAN Spark lets you run your data analysis at scale using the Spark Python API (PySpark) on YARN/Hadoop clusters at CERN IT. The presentation introduces the main components of the platform and walks through its functionality and some getting-started examples.
  See also: https://swan.cern.ch and https://swan.web.cern.ch
  
  Speaker: Prasanth Kothuri (CERN)
  
  HUF_SWAN_Spark_Integration.pdf
  
  HUF_SWAN_Spark_Integration.pptx
- 16:40 → 17:00
  
  Q&A 20m