Description
Format: oral presentations, 20 minutes + 5 minutes Q&A
Classical networked storage systems typically accepted science data in bulk uploads, often after processing; as a consequence, stored data usually wasn’t live in the sense of being fresh from the instrument. Similarly, efforts at building Virtual Research Environments (VREs; essentially cloud-based science toolchains) haven’t seen great uptake, again because the tools are only useful if they have fresh data to operate on, and users typically lack the discipline to upload their data-taking runs regularly.
In contrast, synced data stores hold what can be considered live data – thereby offering the possibility of performing first-line scientific munging / workflow / analytics on the cloud platform, rather than on researchers’ desktops. This opens up interesting possibilities for transparent compute scaling, GPU compute, science package management, etc., not normally available on researcher-managed (desktop) platforms. This stream is intended to showcase such novel opportunities.
Keywords:
- Virtual Research Environments
- Data Management and Workflows
- File Transfer & Distribution
- Virtualization: OpenStack, OpenNebula
- Containers and Orchestration: Kubernetes, Mesos
- Analytics: Hadoop, Spark
- Compute and Grid services
Guido Aben (AARNet) · 30/01/2018, 09:00 · Presentation
What is the DLCF?
The Data LifeCycle Framework (DLCF) is an Australian nationwide strategy to connect research resources and activities, predominantly those supported by national eInfrastructure funding.
The goal of the DLCF is to smooth over the complexity faced by ordinary researchers when they have to piece together their own digital workflow from all the bits and pieces made available...
Tibor Simko (CERN), Diego Rodriguez Rodriguez (Universidad de Oviedo (ES)) · 30/01/2018, 09:20 · Presentation
The revalidation, reuse and reinterpretation of data analyses requires access to the original virtual environments, datasets, software, instructions and workflow steps which the researcher used to produce the original scientific results. The CERN Analysis Preservation pilot project is developing a set of tools that assist the particle...
Mr Vladislav Makarenko (Max-Planck Digital Library) · 30/01/2018, 09:40 · Presentation
Keeper is a central service for scientists of the Max Planck Society and their project partners for storing and archiving all relevant data of scientific projects. Keeper facilitates the storage and distribution of project data among the project members during or after a particular project phase and seamlessly integrates into the everyday work of scientists. The main goal of the Keeper service...
Diogo Castro (CERN) · 30/01/2018, 10:00
SWAN (Service for Web-based ANalysis) is a CERN service that allows users to perform interactive data analysis in the cloud, in a "software as a service" model. It is built upon the widely used Jupyter notebooks, allowing users to write and run their data analysis using only a web browser. By connecting to SWAN, users have immediate access to storage, software and computing resources that...