SWAN Spark: hosted Jupyter notebooks meet Spark jobs at scale

Europe/Zurich
513/1-024 (CERN)

    • 16:00 16:40
      SWAN Spark - a data analysis platform with hosted Jupyter notebooks and Spark on Hadoop service at CERN 40m

      This presentation aims to help users of the Hadoop and Spark service get started with the SWAN-Spark integration and its functionality for running Spark jobs at scale on SWAN. The session is also an opportunity for service managers to gather feedback for future improvements. The integration of SWAN hosted notebooks with the Spark and Hadoop service has recently been deployed into production. SWAN Spark lets you run data analysis at scale using the Spark Python API (PySpark) on YARN/Hadoop clusters at CERN IT. The presentation introduces the main components of the platform and walks through its functionality and some getting-started examples.
      See also: https://swan.cern.ch and https://swan.web.cern.ch

      Speaker: Prasanth Kothuri (CERN)
    • 16:40 17:00
      Q&A 20m
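
The abstract mentions running PySpark analyses on YARN/Hadoop clusters. As a rough illustration of what such a job can look like, here is a minimal sketch using plain PySpark. It is not taken from the talk: the helper names (`yarn_session_options`, `build_session`, `count_even_numbers`) are hypothetical, the executor settings are illustrative values, and SWAN may provide its own mechanism for creating the Spark session, in which case only the last function would apply.

```python
def yarn_session_options(app_name, executors=4, executor_memory="2g"):
    """Return Spark options for a YARN-backed session.

    The executor count and memory are illustrative assumptions,
    not recommended values for any particular cluster.
    """
    return {
        "spark.app.name": app_name,
        "spark.master": "yarn",
        "spark.executor.instances": str(executors),
        "spark.executor.memory": executor_memory,
    }


def build_session(options):
    """Create (or reuse) a SparkSession from a dict of options."""
    # Imported lazily so the helpers above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    for key, value in options.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def count_even_numbers(spark, n):
    """Count the even values in 0..n-1 using a distributed DataFrame."""
    df = spark.range(n)  # single column named "id"
    return df.filter(df.id % 2 == 0).count()


# Example use on a machine with PySpark and access to a YARN cluster:
#   spark = build_session(yarn_session_options("swan-demo"))
#   print(count_even_numbers(spark, 1_000_000))
#   spark.stop()
```

Keeping the options in a plain dictionary makes it easy to inspect or adjust them before a session is created; the computation itself only needs an existing `SparkSession`, however it was obtained.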