4th Meeting Open Science Practitioners Forum

Description

During this meeting, we will review the outcome of the Implementation Plan for the years 2023/2024.

Work in progress document. 

Introduction

  • News
    • Merten Dahlkemper is the new Open Science Community Engagement Manager
    • Increasing collaboration with OAPEN: Hosting OA Books
    • ICFA Data Lifecycle panel has published recommendations for best practices on data preservation
  • Events
    • OSPO event today at 3pm in Council Chamber about Open Source with Dawn Foster
    • Conference on Open Science in September: OS Fair on 15-17 September, organised by CERN and OpenAIRE, held in Science Gateway
  • Topic of the last OSPF was monitoring frameworks. We continue to work on that and also follow international efforts closely
  • Today’s agenda
    • We need to review Implementation Plan by assessing achievements and gaps to write Open Science Report and prepare for the next two years
    • Work in Progress on a Google Doc linked in the description of this event. Today: presentation on all the chapters by the “shepherds” of the chapters
    • Need to present review to the OSSB by March/April
    • Publish report publicly in May/June

Open Access

  • It’s the oldest Open Science chapter at CERN (policy has existed since 2014)
  • 4 different mechanisms to allow authors to comply with policy:
    • SCOAP3 for HEP
    • Favoured publishing model: collective models, where funding is provided by some consortia and everyone can publish
    • OA agreements where corresponding author has to be CERN-affiliated. CERN has 10 such agreements
    • Individual APCs for journals not covered by any agreements
  • Open Access in 2023: 96% of CERN publications were OA (half of them via SCOAP3, 7% only Green OA)
    • Aim is to get to 100%
    • Licenses are on track: 98% are CC-BY, which is demanded by policy (a more restrictive license could sign away the author’s rights)
      • We are doing campaigns to make authors aware of licenses
  • Continue support to researchers on OA
  • SCOAP3 went into the next phase to push publishers to include Open Science elements
  • Continued support to infrastructure (OAPEN, DOAJ,…)
  • Participation in Diamond OA, where journals are open to everyone
  • Future plans:
    • More education and outreach on good practices
    • Assess transformative agreements: how do we make them more transformative?
    • New phase for SCOAP3
    • Global Infrastructure for OA (diamond/collective OA)

Discussion

  • More comments can be added in the Review document
  • We should communicate more general rules or recommendations for how to license something. That will be part of the communication strategy to be developed by Merten

Open Data

  • History
    • Open Data Working Group started in 2020
    • The LHC experiments’ OD policy was very broad; the Implementation Plan goes into more detail. Implementation needed to start within 5 years; after this period, data should be released at the end of the run in which the data was taken
    • IT agreed to provide resources by the end of 2025; they employed a person to work on tape integration of Open Data
    • Big experiments status
      • CMS has released 3 big datasets already
      • LHCb has released one big dataset already; they have developed their own system to more efficiently store/use data
      • ATLAS has had two data releases
      • ALICE is still missing but expects to have their first release soon
    • Status is discussed at annual ODWG meetings
  • Small LHC experiments also need to be included. They endorsed the policy in 2023; the latency period is still ongoing (until 2028)
  • Other facilities: Policy might not be applicable to all experiments
    • ISOLDE has written their own policy
    • AD and nTOF experiments are still missing a policy
    • SPS experiments currently in discussion
    • Need to follow up
  • Discussion ongoing on monitoring
  • ATLAS will also release Monte Carlo generator information (useful for simulations)
  • Open Data Portal could be used also for older experiments (such as LEP experiments), storage size not a problem, but compatibility might be an issue
  • Lessons have been learned on data preservation; there is a synergy between data preservation and open data that all experiments can learn from
  • Future resource planning
    • Used resources were smaller than expected in the past
    • In Run 2/3, there will probably be 2.5 times as much data as expected
    • Ongoing discussion as to where to get the resources for this
    • Interface between open data portal and tape system will bring down the costs

Discussion

  • ESPP update
  • The discussion about the resource needs of open data is complicated
    • Resources for data sharing partly come out of the resources for running the experiment
    • Discussion with management is ongoing on what the priorities are

Open Source Software

  • OSPO was launched (2 years of preparation)
    • internal CERN community
    • broader open-source community
  • Focus on guidelines/best practices: e.g., choice of default licenses; process for signing CLAs (Contributor License Agreements)
  • CERN Open Source Software Catalogue
  • 9 open sourcing requests in 2024
  • Community event
  • Plans for the coming two years
    • Consolidation: KPI development (project with Software Heritage on measuring impact); Collaboration with European agencies; Contributing to non-CERN FOSS (theme of today’s OSPO seminar)
    • Tooling: Catalogue; Streamline open-sourcing processes; dependency tracking; license choosing workflow

Discussion

  • Good that the organisation dedicated resources to OSPO

Open Source Hardware

  • OS Hardware has a similar status to FOSS in the OSPO
  • Started work by focusing on creating a catalogue for OSHW
    • Software-Hardware asymmetry in policy: For software, the standard entry point is OSPO, for hardware it’s KT
  • Gateware (language to describe hardware) was deemed to be hardware rather than software -> default entry point is KT
  • Development of new OSHW Repository. Preview at https://ohwr.github.io/ohwr.org/
  • Include best practices for Gateware and Hardware
  • KPI development
    • CERN’s electronic drawing office: 15 / 200 designs were OSHW (2024)
    • Designs at CERN’s electronic drawing office done with FOSS tool KiCad: 8 / 200 (2024)
    • CERN OHL v2 (de facto standard license for Open Hardware) adoption on github.com: 1343 projects in total used one of the three licenses
  • New OSHW Repository to be released before summer
  • Improve guidelines in OSPO docs (including licence chooser)
  • Streamline the use of KiCad; add guidelines on its use in OSPO docs; work with CERN drawing office, designers and EDAC
  • By the end of the year, assess and increase awareness of OSHW
  • Future plans: also focus on mechanical designs (currently mostly electronics); include KiCad in Product Lifecycle Management plans
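
For comparison from year to year, the drawing-office KPIs above can be expressed as percentages. The following is a minimal sketch, assuming that “15 / 200” and “8 / 200” mean counts out of 200 designs in 2024; the function and variable names are illustrative and not part of any CERN tool.

```python
def share_percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Figures quoted in the minutes for 2024 (assumed: counts out of 200 designs).
TOTAL_DESIGNS = 200
kpis = {
    "OSHW designs": 15,
    "Designs made with KiCad": 8,
}

for label, count in kpis.items():
    pct = share_percent(count, TOTAL_DESIGNS)
    print(f"{label}: {count}/{TOTAL_DESIGNS} = {pct:.1f}%")
```

Under that reading, 7.5% of the designs were OSHW and 4.0% were made with KiCad.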

Discussion

  • Are there any insights into why people don’t share their designs under OSHW?
    • Awareness is lacking -> need for communication. The process is also complicated, hence the need for advice and for streamlining the process

Research Integrity

  • More complicated topic
  • About research output, not only data, but also metadata, auxiliary data, linked software, analysis workflows, documentation etc.
  • Report mostly done from the CMS perspective, as it’s very much about processes within the collaborations
  • Workshops on workflow languages
  • Trainings by HSF in 2023/24; most are online
  • Software development: REANA has been continuously developed; CMS released Combine software; statistical models can now be published on CDS with a workflow
  • Analysis code is not really covered by OSPO, as it is usually not standalone code and belongs to a specific analysis
  • For publishing analysis code, CDS might be more appropriate than Zenodo
  • A plan for research integrity has not been drafted, but “recommendations for best practices for open science” have been drafted by the ICFA Data Lifecycle panel, which could be used here
  • In CMS, a new group has been formed on common analysis tools that created templates for analysis reusability
  • Discussions around needs and priorities for analysis preservation tools have been started, but most advancements are thanks to grassroots movements
  • Trainings have been provided by HSF. There should be more trainings on CERN-specific tools, such as REANA. The frequency has to be defined
  • Automatic preservation is seen rather as a recommendation -> In any case it needs more trainings and documentation
  • Communication and coordination needed

Discussion

  • We are trying to promote the archival of software on repository.cern and are also working on a GitLab integration
  • OSPO is also looking into trainings for Open Source licensing
  • OS Office could take over coordination

Open Infrastructure

  • Plan was to make a list of services, identify sustainability measures and resource needs, develop a draft roadmap and to review these measures continuously
  • Little was achieved, mostly because of limited capacity. It needs a more coordinated approach with OS Office
  • Progress has been made on the various platforms
    • opendata.cern: 80k records, tenth anniversary
    • Inspire: New Data collection
    • REANA was used to reinterpret ATLAS Run 2 analyses
    • Zenodo: 11th anniversary, EU Open Research repository
    • Invenio RDM: COAR Notify
  • Suggestion for a reworked Implementation Plan:
    • Minimal set of services. For each service:
      • FAIR assessment
      • Key principles
      • Propose KPI
      • Identify missing integrations between services
    • Team meeting 2h/month, review achievements in 1 year
    • Short document to be provided
      • Potentially on the Open Science website

Discussion

  • OS Office will support coordination as well as the creation and publication of a document listing the infrastructure

Research Assessment

  • CoARA as the main instrument to work on Research Assessment
    • It’s a coalition which aims to jointly update assessment practices with ten main principles (4 of them being “hard criteria”)
      • Recognize more than just articles
      • Base research assessment on qualitative instead of quantitative indicators
      • Abandon journal- and publication-based metrics (JIF)
      • Avoid use of rankings of research
  • So far, we have released a preliminary action plan
    • We have started with interviews with colleagues in TH and EP to find out what the actual assessment principles look like
    • Develop educational material to create awareness
      • First policies on that topic have already been in place since 2003
    • Promote teaching and outreach
    • Implement, test, and revise changes
  • Also want to assess current practices in allocating beamtime
  • Currently Alex and Antonia from SIS are working on it, but everyone is encouraged to help

Discussion

  • None

Training, Education and Outreach

  • Lack of awareness for Implementation Plan
  • Training: Has not happened yet. There are very limited activities from SIS in Science Writing course and in a course for newcomers
  • Hands-on workshops reaching out to 30k people in 2023/24
  • Teacher and student programmes have reached several thousand teachers and students
  • Virtual tours and educational videos
  • Digital resources need to be licensed properly. TSP resources are licensed via Zenodo with CC-BY
  • Academic training also exists with recordings and slides made available
  • Outreach events: CERN-based masterclasses reach around 10k students
  • Exhibitions: We need to reconsider the wording in the Implementation Plan
    • Should the exhibitions be on the theme of Open Science?
  • Science Gateway inspiration book to inspire other museums and exhibitions
  • We need to
    • expand training programs
    • manage licensing of materials
    • distinguish between educational and outreach material. Not clear if distinction is really needed
    • consider rewording of Implementation Plan

Discussion

  • If we provide CERN material, it should be discussed whether CDS might be better than Zenodo as it’s the institutional repository.
  • Implementation Plan stresses distinction between education (conceptual understanding) and outreach (raising awareness). Not clear whether we really need this distinction.

Citizen Science

  • No coordinated effort has been made here over the past two years
  • LHC@Home has continued
  • In the past, other projects, mostly by ATLAS and CMS, were conducted over a limited time frame
  • Educational resources have been published, e.g., on the open data portal
  • There is great potential at CERN for doing citizen science

Discussion

  • We should leverage what already exists; we don’t want to take anything away, just promote things
  • Some people do things, but only on a voluntary basis with a limited fraction of their time
  • We can only try to highlight what is already happening
  • Maybe we could reach out to organisations that promote citizen science

Discussion and next steps

  • What happens next?
    • We continue working on review
    • Everyone gets two more weeks to work on the document. After that OS office will edit the document with the aim to have it ready by end of March
    • This review will be discussed with OSSB, while already working on the next Implementation Plan and preparing a document to be shared publicly
    • Also we will organize more focussed discussions on certain aspects of the plan