19–21 Mar 2025
LMU
Europe/Zurich timezone

Infrastructure and Practices for Sharing and Disseminating In-Silico Medicine Research Data

21 Mar 2025, 12:30
15m
LMU

Presentation
CS3 federations and synergies with eResearch infrastructures. Data sharing infrastructures

Speaker

Taras Zhyhulin

Description

A significant amount of research data remains underutilized because it is unpublished or poorly described, which wastes funding and scientific potential. To recover such data and prevent further waste, researchers must be encouraged to use FAIR data sharing repositories and to adopt good publishing practices, such as providing descriptive metadata and reusing datasets from the community. Because these practices demand extra effort, the data retrieval and publication process should be simplified and researchers should be given additional incentives.
Our solution integrates the Model Execution Environment (MEE), a platform for in-silico model simulations on HPC resources, with open-source data sharing repositories. It also comprises the required infrastructure, including an instance of the Dataverse repository, and a set of practices for its effective use, enabling convenient collaboration within the community. The solution aims to facilitate data management for the execution of medical simulations, provide scientists with tools for cooperation, and engage the scientific community in advancing research through data contributions. Additionally, the MEE makes HPC resources accessible to scientists by providing a straightforward interface and structuring complex computational workflows.
These practices include rule-based data sharing built on an incentive-driven mechanism, which continues to fuel research even after the data becomes public. We demonstrate the approach on the Sano Dataverse instance, part of the RODBUK Krakow Open Research Data Repository, by publishing a dataset from the DPValid case study, conducted by UNIBO within the InSilicoWorld (ISW) project. The publication preparation process comprises data processing on HPC using the MEE platform, uploading to Sano Dataverse or Zenodo, configuring rule-based data sharing, and ensuring ongoing curation and support. It is presented on the basis of the requirements of the ISW scientific community, of which the authors are part.
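As an illustration of the "descriptive metadata" step in the publication process, the following minimal sketch assembles a dataset description in the JSON layout used by the Dataverse citation metadata block and checks that no required top-level field is empty before upload. It is not the authors' MEE integration: the helper names (`primitive`, `build_dataset_payload`, `missing_required`) and all sample values are hypothetical, and the set of required fields is an assumption that should be verified against the target Dataverse instance. In a typical Dataverse installation such a payload would be sent with an authenticated POST to the instance's `/api/dataverses/{alias}/datasets` endpoint; that call is deliberately left out here.

```python
import json


def primitive(type_name, value):
    """Wrap a single string value as a Dataverse 'primitive' field."""
    return {"typeName": type_name, "typeClass": "primitive",
            "multiple": False, "value": value}


def build_dataset_payload(title, author, affiliation, contact_email,
                          description, subject):
    """Build a dataset JSON document (hypothetical helper, citation block only)."""
    fields = [
        primitive("title", title),
        {"typeName": "author", "typeClass": "compound", "multiple": True,
         "value": [{"authorName": primitive("authorName", author),
                    "authorAffiliation": primitive("authorAffiliation", affiliation)}]},
        {"typeName": "datasetContact", "typeClass": "compound", "multiple": True,
         "value": [{"datasetContactEmail":
                    primitive("datasetContactEmail", contact_email)}]},
        {"typeName": "dsDescription", "typeClass": "compound", "multiple": True,
         "value": [{"dsDescriptionValue":
                    primitive("dsDescriptionValue", description)}]},
        {"typeName": "subject", "typeClass": "controlledVocabulary",
         "multiple": True, "value": [subject]},
    ]
    return {"datasetVersion": {"metadataBlocks": {"citation": {
        "displayName": "Citation Metadata", "fields": fields}}}}


def missing_required(payload):
    """Return required top-level fields that are absent or have an empty value.

    The required set below is an assumption, not taken from the abstract.
    """
    required = {"title", "author", "datasetContact", "dsDescription", "subject"}
    fields = payload["datasetVersion"]["metadataBlocks"]["citation"]["fields"]
    present = {f["typeName"] for f in fields if f["value"]}
    return sorted(required - present)


if __name__ == "__main__":
    # Placeholder values only; none of them describe a real dataset.
    payload = build_dataset_payload(
        title="Example simulation dataset",
        author="Doe, Jane",
        affiliation="Example Institute",
        contact_email="jane.doe@example.org",
        description="Placeholder description of the simulation outputs.",
        subject="Medicine, Health and Life Sciences",
    )
    print(missing_required(payload))
    print(json.dumps(payload, indent=2)[:200])
```

Keeping the metadata check separate from the upload call lets a workflow reject an incompletely described dataset before any data leaves the HPC environment.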
This publication is partly supported by the EU H2020 grants Sano (857533) and ISW (101016503), and by the Minister of Science and Higher Education "Support for the activity of Centers of Excellence established in Poland under Horizon 2020", number MEiN/2023/DIR/3796.
We gratefully acknowledge the Polish high-performance computing infrastructure PLGrid (HPC Center: ACK Cyfronet AGH) for providing computer facilities and support within computational grant no. PLG/2024/017022.

Authors

Mr Karol Zając (Sano Centre for Computational Medicine)
Taras Zhyhulin

Co-authors

Ms Francesca Bottin (Alma Mater Studiorum - University of Bologna)
Dr Giorgio Davico (Alma Mater Studiorum - University of Bologna)
Mr Goran Stanic (Alma Mater Studiorum - University of Bologna)
Jan Meizner (Sano Centre for Computational Medicine)
Maciej Malawski (AGH University of Science and Technology)
Mr Marek Kasztelnik (ACC Cyfronet AGH)
Marian Bubak (AGH Krakow)
Piotr Nowakowski (ACC Cyfronet AGH)
Mr Piotr Połeć (ACC Cyfronet AGH)
