Nov 4 – 8, 2019
Australia/Adelaide timezone

Evolution of web-based analysis for Machine Learning and LHC Experiments: power of integrating storage, interactivity and collaboration with JupyterLab in SWAN

Nov 4, 2019, 11:30 AM
Track 6 – Physics Analysis


Jakub Moscicki (CERN)


SWAN (Service for Web-based ANalysis) is a CERN service that allows users to perform interactive data analysis in the cloud, in a "software as a service" model. The service is the result of a collaboration between the IT Storage and Databases groups and the EP-SFT group at CERN. SWAN is built upon the widely used Jupyter notebooks, allowing users to write and run their data analysis using only a web browser. SWAN is a data analysis hub: users have immediate access to their CERNBox user storage, the entire LHC data repository on EOS, software (CVMFS) and computing resources, in a pre-configured, ready-to-use environment. Sharing of notebooks is fully integrated with CERNBox, and users can easily access their notebook projects on all devices supported by CERNBox.

In the first quarter of 2019 we recorded more than 1300 individual users of SWAN, the majority of them from the four LHC experiments. Integration of SWAN with CERN Spark clusters is at the core of the new controls data logging system for the LHC. Every month new users discover SWAN through tutorials on data analysis and machine learning.

The SWAN service evolves, driven by users' needs. In the future SWAN will provide access to GPUs, to the more powerful JupyterLab interface, which replaces Jupyter notebooks, and to a more configurable, easier-to-use and more shareable way of setting up the software environment of Projects and notebooks.

This presentation will update the HEP community on the status of this effort and its future direction, together with the general evolution of SWAN.

