Jan 29 – 31, 2018
AGH Computer Science Building D-17
Europe/Zurich timezone

SWAN: Service for Web-based Analysis

Jan 30, 2018, 10:00 AM
20m
AGH Computer Science Building D-17

AGH WIET, Department of Computer Science, Building D-17, Street Kawiory 21, Krakow

Speaker

Diogo Castro (CERN)

Description

SWAN (Service for Web-based ANalysis) is a CERN service that allows users to perform interactive data analysis in the cloud, in a "software as a service" model. It is built upon the widely used Jupyter notebooks, allowing users to write - and run - their data analysis using only a web browser. By connecting to SWAN, users have immediate access to the storage, software and computing resources that CERN provides and that they need for their analyses.

All these computing resources are isolated and provide users with a secure place to run their work. The software is centrally managed and delivered through a distributed file system called CVMFS, so users need not worry about the installation, configuration and compatibility of packages. Storage is provided by EOS, CERN’s mass storage solution, with a private area that can be synchronised through CERNBox, the cloud storage service.

Besides providing an easier way of producing scientific code and results, SWAN is also a great tool for creating shareable content. From results that need to be reproducible to tutorials and demonstrations for outreach and teaching, Jupyter notebooks are the ideal way of distributing this content. In a single file, users can pack their code, the results of the calculations and all the relevant textual information. Sharing a notebook allows others to visualise, modify, personalise or even re-run all the code.
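
To make the "one single file" point concrete, here is a minimal sketch of how a notebook keeps text, code and stored results together. It builds a tiny notebook by hand in the Jupyter notebook JSON layout (nbformat 4) using only the Python standard library. The cell contents and the `summarize` helper are illustrative assumptions and are not taken from SWAN.

```python
import json

# A hand-written notebook in the Jupyter nbformat 4 layout.
# The cell contents below are invented for illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            # Textual information lives in markdown cells.
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Example analysis\n", "Explanation next to the code."],
        },
        {
            # Code cells keep both the source and its stored outputs.
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["2 + 2"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 1,
                    "metadata": {},
                    "data": {"text/plain": ["4"]},
                }
            ],
        },
    ],
}


def summarize(nb):
    """Hypothetical helper: count cells per type and stored outputs."""
    kinds = {}
    for cell in nb["cells"]:
        kinds[cell["cell_type"]] = kinds.get(cell["cell_type"], 0) + 1
    n_outputs = sum(len(cell.get("outputs", [])) for cell in nb["cells"])
    return kinds, n_outputs


# Serialising gives the text that would be stored in an .ipynb file.
serialized = json.dumps(notebook, indent=1)
restored = json.loads(serialized)
print(summarize(restored))  # ({'markdown': 1, 'code': 1}, 1)
```

Because the outputs are stored alongside the source, a recipient can read the results without re-running anything, and can still re-execute the cells if they wish.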

Given the importance of sharing and collaboration in our scientific community, the interface of SWAN has been enhanced to ease this task as much as possible. Up until now, besides the manual options (like sending notebooks by email), users were able to use CERNBox to share their work. To do so, however, they had to leave SWAN, go to the CERNBox interface, and look for the notebook that they had been editing in SWAN.

This approach worked, but it was not optimal. We wanted to offer a simpler and more integrated model, one that worked with the fewest clicks possible. With this in mind, we brought CERNBox sharing directly inside SWAN. And with it, we also brought a new, redesigned interface that introduces the concept of a Project. A project is a special kind of folder that, besides the notebook(s), contains all other sorts of files, like input data or images. In order to simplify the process and keep it consistent, this is the only entity that can be shared among users from within SWAN. And it can be shared from wherever the users are, either from the files view or inside the notebook editor. Users just need to click on a button and write the names of those with whom they wish to share (single users or groups), using an autocomplete that searches CERN’s directory. When someone receives a shared project, they can clone it to their storage - in order to open and edit the files - just by clicking on a button. All without switching services. And since this cloned project now belongs to the user, they can modify it as they wish, and even share it again.
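
The share-then-clone flow described above can be sketched as a small data model. This is only an illustration under stated assumptions: the `Project` class, its fields and its methods are hypothetical and do not reflect SWAN's or CERNBox's actual implementation or API. Sharing with groups and the directory autocomplete are not modelled.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical model of a project: a folder of notebooks and other files."""
    name: str
    owner: str
    files: dict = field(default_factory=dict)       # path -> content
    shared_with: set = field(default_factory=set)   # user names only

    def share(self, *recipients):
        """Record the users this project is shared with."""
        self.shared_with.update(recipients)

    def clone_for(self, user):
        """Return an independent copy owned by `user`, if it was shared with them."""
        if user not in self.shared_with:
            raise PermissionError(f"{self.name} is not shared with {user}")
        # Deep copy so that edits to the clone never touch the original.
        return Project(name=self.name, owner=user, files=copy.deepcopy(self.files))


original = Project("analysis", "alice", {"fit.ipynb": "v1", "data/input.csv": "1,2,3"})
original.share("bob")

clone = original.clone_for("bob")
clone.files["fit.ipynb"] = "v2"      # bob edits his own copy
clone.share("carol")                 # and can share it again

print(original.files["fit.ipynb"])   # v1
print(clone.owner)                   # bob
```

In this sketch the clone starts with an empty share list, so re-sharing is an explicit choice of the new owner. That is a design assumption of the example, not something the abstract specifies.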

With this new approach, sharing is now a first-class citizen in SWAN: a functionality that is prominent throughout the new user interface, and one that our users need in order to collaborate more simply.
