29–31 Jan 2018
AGH Computer Science Building D-17
Europe/Zurich timezone

Reproducible high energy physics analyses

30 Jan 2018, 09:20
20m
AGH Computer Science Building D-17

AGH WIET, Department of Computer Science, Building D-17, Street Kawiory 21, Krakow

Speakers

Tibor Simko (CERN), Diego Rodriguez Rodriguez (Universidad de Oviedo (ES))

Description

The revalidation, reuse and reinterpretation of data analyses require access to the original virtual environments, datasets, software, instructions and workflow steps that the researcher used to produce the original scientific results. The CERN Analysis Preservation pilot project is developing a set of tools that help particle physics researchers structure their analyses, so that capturing and preserving the knowledge around an analysis leads to easier sharing, reuse and reinterpretation of data. Assuming full preservation of the original analysis environment, the user code and the computational workflow steps, the REANA Reusable Analysis platform enables one to launch container-based processes on the computing cloud (Docker, Kubernetes) and to rerun the analysis workflow jobs with new input. REANA aims to support several workflow engines (CWL, Yadage), several shared storage systems (Ceph, EOS) and several compute cloud infrastructures (OpenStack, HTCondor). REANA was developed with the particle physics use case in mind and profits from synergies with general research data analysis patterns in other scientific disciplines, such as life sciences.
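As a rough illustration of the idea of rerunning a preserved workflow with new input, the following minimal Python sketch models an analysis as a set of named inputs plus ordered, container-based steps. It is not REANA's actual API or file format: the `Step` and `Analysis` classes, the image name `example/analysis-env:1.0`, the `skim` and `fit` commands and the dataset file names are all hypothetical, chosen only to show the pattern.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Step:
    """One workflow step: a command run inside a container image (hypothetical model)."""
    name: str
    image: str    # container image capturing the software environment
    command: str  # command template; {placeholders} are filled from the inputs


@dataclass(frozen=True)
class Analysis:
    """Hypothetical record of what is preserved in order to rerun an analysis."""
    inputs: Dict[str, str]   # named input datasets and parameters
    steps: Tuple[Step, ...]  # ordered workflow steps

    def with_inputs(self, **overrides: str) -> "Analysis":
        """Return a copy of the analysis with some inputs replaced."""
        return replace(self, inputs={**self.inputs, **overrides})

    def render(self) -> List[Tuple[str, str]]:
        """List the (image, command) pairs a container backend would be asked to run."""
        return [(s.image, s.command.format(**self.inputs)) for s in self.steps]


# All names below are made up for illustration.
original = Analysis(
    inputs={"data": "run2012.root", "cut": "20"},
    steps=(
        Step("skim", "example/analysis-env:1.0", "skim --in {data} --pt-min {cut}"),
        Step("fit", "example/analysis-env:1.0", "fit --in skimmed.root"),
    ),
)

# Same environment and same steps, new input dataset.
rerun = original.with_inputs(data="run2016.root")
for image, command in rerun.render():
    print(image, command)
```

The point of the sketch is that the environment and the workflow steps stay fixed while only the inputs change, so the original analysis record is left untouched and the rerun is a separate, derived record.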
