Description
A significant amount of research data remains underutilized because it is unpublished or poorly described, which wastes funding and scientific potential. To recover lost data and prevent further waste, researchers must be encouraged to use FAIR data sharing repositories and to adopt good publishing practices, such as providing descriptive metadata and reusing datasets shared by the community. Because these practices demand extra effort, the data retrieval and publication process should be simplified and additional motivation provided.
Our solution integrates the Model Execution Environment (MEE), a platform for in-silico model simulations on HPC resources, with open-source data sharing repositories. It also covers the required infrastructure, including an instance of the Dataverse repository, and a set of practices for its effective use, so that the community can collaborate conveniently. The solution aims to facilitate data management for the execution of medical simulations, provide scientists with tools for cooperation, and engage the scientific community in advancing research through data contributions. Additionally, MEE gives scientists access to HPC resources through a straightforward interface and structures complex computational workflows.
These practices include rule-based data sharing built on an incentive-driven mechanism, which continues to fuel research even after the data becomes public. The approach is demonstrated on the Sano Dataverse instance, part of the RODBUK Krakow Open Research Data Repository, through the publication of a dataset from the DPValid case study, conducted by UNIBO within the InSilicoWorld (ISW) project. The publication preparation process comprises data processing on HPC using the MEE platform, uploading to Sano Dataverse or Zenodo, configuring rule-based data sharing, and ensuring ongoing curation and support. It is presented on the basis of the requirements of the ISW scientific community, of which the authors are members.
This publication is partly supported by the EU H2020 grants Sano (857533) and ISW (101016503), and by the Minister of Science and Higher Education programme "Support for the activity of Centers of Excellence established in Poland under Horizon 2020", number MEiN/2023/DIR/3796.
We gratefully acknowledge the Polish high-performance computing infrastructure PLGrid (HPC Center: ACK Cyfronet AGH) for providing computer facilities and support within computational grant no. PLG/2024/017022.