HEPiX Spring 2026 Workshop

Europe/Lisbon
Auditório J.J. Laginha (ISCTE Instituto Universitário de Lisboa)

Auditório J.J. Laginha

ISCTE Instituto Universitário de Lisboa

Av. das Forças Armadas, 1649-026 Lisboa, Portugal
Ofer Rind (Brookhaven National Laboratory), Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES)), Tomoaki Nakamura, Jorge Humberto Lucio Oliveira Gomes (LIP Laboratorio de Instrumentaco e Fisica Experimental de Particulas)
Description

HEPiX Spring 2026 at LIP in Lisbon

The HEPiX forum brings together worldwide information technology staff, including system administrators, system engineers, and managers from High Energy Physics and Nuclear Physics laboratories and institutes, to foster a learning and sharing experience between sites facing scientific computing and data challenges.

Participating sites include BNL, CERN, DESY, FNAL, IHEP, IN2P3, INFN, IRFU, JLAB, KEK, LBNL, LIP, NDGF, NIKHEF, PIC, RAL, SLAC, TRIUMF, and many other research labs and universities from all over the world.

More information about the HEPiX workshops, the working groups (who report regularly at the workshops) and other events is available on the HEPiX Web site.

This workshop will be hosted by LIP and CNCA and will be held at ISCTE in Lisbon.

Important dates:

  • Submission of abstracts: 22nd January to 5th April
  • Early bird registration: 22nd January to 16th March
  • Late registration: 17th March to 10th April
  • Event: 20th April to 24th April

 

SPONSORS

GOLD


 
 
 
 
   
   
 

ACADEMIC COLLABORATIONS


 
 
 
Participants
Zoom Meeting ID
69938449703
Host
Jose Flix Molina
Useful links
Join via phone
Zoom URL
    • Registration Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Welcome Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
      • 1
        Welcome talk
        Speaker: Jorge Humberto Lucio Oliveira Gomes (LIP)
      • 2
        Logistics talk
        Speakers: Hugo Miguel Da Silva Gomes (LIP), Joao Antonio Tomasio Pina (Laboratory of Instrumentation and Experimental Particle Physics (PT)), Jorge Humberto Lucio Oliveira Gomes (LIP)
    • Site Reports Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Dennis van Dok (Nikhef)
    • 10:30 AM
      Coffee Break Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Site Reports Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Sebastien Gadrat (CC-IN2P3 - Centre de Calcul (FR))
      • 6
        BNL SCDF Site Report

        This presentation will cover developments over the past year, as well as upcoming plans, at the Scientific Computing and Data Facilities (SCDF) at BNL.

        Speaker: Ofer Rind (Brookhaven National Laboratory)
      • 7
        IHEP Site Report

        To introduce the recent updates of IHEP site.

        Speaker: Mr Xiaowei Jiang (IHEP(中国科学院高能物理研究所))
      • 8
        GridKa Site Report

        We provide an overview of current activities, topics and challenges around the GridKa Tier-1 centre. Key experiences include very high capacity workernodes, the physical relocation of our entire tape library between campuses, and the transition of the entire compute centre to a new network layout. We also dive into current woes around Grid-size online storage and configuration management.

        Speaker: Dr Max Kühn (Karlsruhe Institute of Technology)
      • 9
        PIC site report

        PIC report to HEPIX Spring 2026.

        Speaker: Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
    • 12:30 PM
      Lunch Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Storage & data management Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Dr Andrew Pickford (Nikhef)
      • 10
        Report on the 2026 CS3 conference

        The CS3 (Conference on Sync & Share Services) [1] is a community-driven event whose history dates back to 2014.
        Its focus has shifted from the original exploration of various products for S&S services, site hardware configurations as well as the integration of applications towards the federation of various S&S instances in research and academic institutions, the support of scientific workflows and -most recently- the role of AI in further optimising and improving these functions.

        This article summarises the content of the recent three-day conference at the University of Oslo [2] and aims to convey the benefits of participating in this community and its annual conferences.

        [1] https://www.cs3community.org
        [2] https://indico.cern.ch/event/1560960

        Speaker: Peter van der Reest
      • 11
        Rearchitecting RAL's Ceph Storage for the needs of 2030

        Demand for object storage at RAL is growing. We already have Echo, a 130PB cluster for the WLCG and general physics, and we are expecting to support users with new AI/ML training frameworks that predominantly use S3.

        To support these use cases, we're building a new dense flash cluster, "Leo", using QLC storage. Leo will only offer an S3 interface, and will initially aim to support scientific AI workloads for UK users.

        Until now, our local and UK experiment support has been built on Deneb, a large shared CephFS filesystem. Deneb has delivered reliable performance, but as the filesystem has grown we have encountered scaling limits in the MDS tier and in our ability to back up the namespace. The hardware profile — 8–12 TB HDDs, chosen for performance over capacity — has also proved difficult to cost-optimise. Leo is designed to address all three: QLC NVMe delivers substantially better performance at an acceptable additional cost per terabyte, and an S3 object store removes the need for metadata servers entirely.

        This talk will cover the architecture of Leo and give a broader update on developments across RAL's Ceph infrastructure.

        Speaker: Robert Appleyard
      • 12
        Sync-Polen: A sync&share service for the Portuguese scientific community

        Polen is a set of services for Open Research Data provided for researchers by the national funding agency (FCCN). In this framework, one such service is a sync&share data platform based on Nextcloud. The implementation is done by LIP, whereby we present the architecture, the implementation decisions based on the requirements and additional development for the operation of the service.

        Speaker: Mario Jorge Moura David
    • 3:30 PM
      Coffee break Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Storage & data management Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Peter van der Reest (Deutsches Elektronen-Synchrotron DESY)
      • 13
        CosmoHub: Building the next generation multimessenger open science platform

        The adoption of Open Science in data-intensive fields such as astronomy and cosmology requires infrastructures capable of managing, distributing, and enabling the reproducible reuse of massive datasets across geographically distributed communities. Addressing these challenges, CosmoHub is a high-performance open science data platform developed at the Port d’Informació Científica (PIC) to support the access, exploration, and analysis of large-scale astronomical and cosmological data.

        This presentation will highlight ongoing developments, including a complete redesign of both the frontend and backend based on a modern technology stack, aimed at improving usability, performance, and overall user experience. In parallel, significant efforts are being made to enhance interoperability through the adoption of International Virtual Observatory Alliance (IVOA) standards (e.g., TAP, ADQL, UWS, and VOSpace), the integration of federated authentication mechanisms, and alignment with EOSC requirements and broader open science strategies. Recognized as a Research Data Repository and a key use case in the EOSC macro-roadmap, CosmoHub also contributes to training and education, fostering the adoption of open science practices in large-scale scientific data analysis.

        Speaker: Dr Pau Tallada-Crespí (PIC-CIEMAT)
      • 14
        HERD data management and workflow

        The High Energy cosmic Radiation Detector (HERD) is a major international space astronomy and particle astrophysics experiment planned for installation on the Chinese Space Station (CSS) around 2027. Its primary scientific objectives include precise measurement of high-energy cosmic rays up to the PeV range, indirect detection of dark matter, and observation of high-energy gamma rays.
        It is estimated that HERD experiment will generate over 70PB data from during 10 years operation, encompassing simulated, raw, reconstructed, calibration and engineering datasets. To address the challenges of data management, data processing and distribution, we designed and developed the HERD Data Management and Workflow system. This system is structured around three core functional modules to ensure streamlined operations.
        The Data Management module is responsible for the metadata catalog, data distribution across infrastructure, and data services. The Data Workflow module automates the entire data processing pipeline, including data simulation, reconstruction, and transfer. The workflow is designed to minimize manual intervention, ensuring consistency and reliability from data acquisition to final data analysis. The Payload Operation Monitoring module monitors the telemetry and engineering data from the HERD payloads in real-time. It is designed to detect anomalies and trigger alerts, thereby ensuring the instrument's health and operational continuity.
        This paper firstly introduces the HERD experiment and data challenges, and then details the system's architecture, elaborates on the implementation of these three modules, and reports on the current development progress. Finally, it concludes with a summary and outlines the future plan for the system's deployment and refinement.

        Speaker: Hao Hu (Institute of High Energy of Physics)
    • 5:30 PM
      Welcome Drink Sponsors Atrium

      Sponsors Atrium

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Environmental sustainability, business continuity, and Facility improvement Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Conveners: Dr Dwayne Spiteri (DESY), Henryk Giemza (National Centre for Nuclear Research (PL))
      • 15
        Power and Efficiency Monitoring for WLCG Sites

        Monitoring power consumption at the level of grid job slots remains a missing component of current Workload Management Systems for HEP experiments. While individual computing centres can monitor power consumption locally, maintaining a consistent view across heterogeneous clusters and re-benchmarking systems after each configuration change is time-consuming and often impractical for sites.

        At the WLCG Workshop last year, we proposed a lightweight approach to address this gap by integrating power measurements into existing WLCG benchmarking and workload infrastructures. Over the past year, this approach has been used in practice through adoption at new sites and systematic data collection, enabling iterative refinement and validation based on real production data. In this contribution, we present the current status of the implementation together with the first results obtained from production environments.

        Two power collectors are currently recommended to cover the majority of WLCG sites: a systemd-based collector and an implementation designed for sites already using Prometheus. Both solutions are straightforward to deploy and require minimal effort from site administrators, lowering the barrier for adoption.

        Over the past year, the work has evolved from a proposed concept to a validated implementation supported by real production data and cross-site analysis. Initial results allow us to explore performance-per-watt characterisation, anomaly detection, and consistency of measurements across heterogeneous environments. Broader site adoption will enable more robust cross-site comparisons and improved modeling. In turn, this will support carbon footprint estimation per job, representative HS23/Watt values even for sites without direct power measurements, and more informed operational and hardware decisions across the WLCG infrastructure.

        Speaker: Natalia Diana Szczepanek (CERN)
      • 16
        Technical aspects of RF2.0 and what you can learn about your cluster by running benchmarks

        I have been working at DESY for a bit over a year as part of Research Facilities 2.0 (RF2.0), with the aim of making the compute infrastructure more ressource efficient.

        In this talk I will present how we rolled out benchmarks to our clusters, how the results helped us finding misconfigurations, and some of the configuration changes we made to our infrastructure.
        Also, i will present our concept for POCCET, a small demo compute cluster we are setting up to explore different ways to shape the power usage of a datacentre by following an external input curve.

        Speaker: Jan Hartmann
      • 17
        Dynamic Power Scaling at PIC: Optimizing AMD EPYC Processors for Carbon and Cost Reduction

        High Energy Physics (HEP) computing centers face increasing pressure to optimize energy consumption due to volatile electricity markets and the urgent need to reduce carbon footprints. The Port d'Informació Científica (PIC) is actively investigating strategies to implement dynamic, intra-day power scaling, aiming to reduce power draw during periods of peak electricity prices or high CO2 emissions by lowering CPU clock frequencies.

        To determine the most effective hardware and configuration strategies for this initiative, we conducted a comprehensive performance and energy evaluation across PIC’s core computing fleet. This study focuses on AMD EPYC architectures, specifically the 7513, 7452, 7502, and 9825 models, which collectively represent 75% of the center's total computing capacity.

        We utilized the HEPScore benchmark to quantify the impact of various system-level configurations on both compute throughput and power consumption. Our testing matrix evaluates the effects of toggling Simultaneous Multithreading (SMT) on and off, manipulating numad daemon states alongside different NUMA node configurations, and applying targeted clock frequency limits. Furthermore, to ensure the accuracy and reliability of our power tracking in a production environment, we present a comparative analysis of power measurements extracted via IPMI versus physical readings from Power Distribution Units (PDUs).

        The results of this study detail the performance-per-watt degradation curves for each CPU model under HEP workloads. These findings provide the foundational data required to build an automated, intra-day frequency scaling policy at PIC, enabling the center to make informed decisions on hardware life-cycles and dynamically trade compute throughput for power savings when external grid conditions demand it.

        Speaker: Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
    • 10:10 AM
      Group Photograph Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • 10:30 AM
      Coffee Break Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Site Reports Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Sebastien Gadrat (CC-IN2P3 - Centre de Calcul (FR))
      • 18
        US ATLAS SWT2 Site Report

        Updates on the US ATLAS SouthWest Tier2 Center since the last HEPiX we attended.

        Speaker: Horst Severini (University of Oklahoma (US))
      • 19
        DESY Site Report

        News from the lab

        Speaker: Andreas Haupt (Deutsches Elektronen-Synchrotron (DE))
      • 20
        GSI Site Report

        Site report for GSI Helmholtzzentrum

        Speaker: Christopher Huhn
      • 21
        NT1 site report

        Site report from the Nordic Tier-1

        Speaker: Mattias Wadenstein (University of Umeå (SE))
    • 12:30 PM
      Lunch Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Follow-up on mid-long term evolution of facilities (Topical Session with WLCG OTF) Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Conveners: Alessandro Di Girolamo (CERN), James Letts (Univ. of California San Diego (US)), Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
      • 22
        Introduction
        Speakers: Alessandro Di Girolamo (CERN), James Letts (Univ. of California San Diego (US))
      • 23
        Operational Effort at Sites
        Speaker: Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
      • 24
        Evolution of tape (archival) storage incl. performance requirements (report from OTF#8)
        Speaker: Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
      • 25
        Storage Capacity vs. Throughput
        Speaker: Thomas Birkett
      • 26
        U.S. CMS Facility Evolution Plans

        The HL-LHC era and associated increase in data volume and processing requirements has led to many re-evaluations and discussions on how the computing systems supporting high energy physics should be architected. This presentation summarizes work done by U.S. CMS in recent years to develop its mid- to long-term infrastructure strategy based on task force work conducted in 2024-2025.

        This talk describes a transition from the current proportionally scaled Tier1/2 resource model to one in which stateful storage is increasingly centralized at the Tier-1 while Tier-2 sites primarily deploy CPU and GPU resources. This shift in approach addresses current facility constraints at Fermilab, improves overall storage availability, and increases the flexibility of how university sites can contribute resources.

        Also discussed will be the continued development and refinement of Analysis Facility resources and their role in reducing latency and improving throughput for HL-LHC scale analysis. Impacts of AI/ML workflows on infrastructure design and operational costs are additionally considered. The evolving role of HPC centers and possible integration of large-scale DOE computing resources will round out the strategy discussed.

        Speaker: Garhan Attebury (University of Nebraska Lincoln (US))
    • 4:00 PM
      Coffee break Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Follow-up on mid-long term evolution of facilities (Topical Session with WLCG OTF) Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Conveners: Alessandro Di Girolamo (CERN), James Letts (Univ. of California San Diego (US)), Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
    • Software and Services for Operation Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Mary Catherine Hester (Nikhef National institute for subatomic physics (NL))
      • 29
        WLCG Operations Coordination towards HL-LHC: Current Focus Areas and Near-Term Plans

        As WLCG prepares for the HL-LHC era, Operations Coordination continues to steer the infrastructure through major technical transitions while ensuring stable Run 3 operations.
        The current focus is on four key cross-cutting projects: the migration to token-based authentication and authorization, the evolution and sustainability of WLCG accounting, the implementation of XRootD monitoring, and the structured incorporation of heterogeneous resources and architectures. Each of these areas requires close coordination between experiments, sites, EGI, OSG, and middleware providers to align operational practices with future scalability, security, and sustainability requirements.
        This contribution provides a concise overview of ongoing activities, coordination mechanisms, and concrete near-term priorities as WLCG progressively adapts its operational model to meet HL-LHC demands while maintaining reliability and service continuity.

        Speaker: Panos Paparrigopoulos (CERN)
      • 30
        EGI Software Provisioning Infrastructure

        The Unified Middleware Distribution (UMD) is a software distribution provided by EGI that integrates a collection of software components (middleware) selected from various technology providers for deployment on the EGI/WLCG production infrastructure. The software repository for UMD (repository.egi.eu) is developed, maintained, and operated by LIP under contract with EGI.

        The repositories are organised by operating system and validation status, which includes categories such as release, testing, contrib, and the EGI Trust Anchors (CA certificate packages). All artefacts available in the UMD repositories undergo a software provisioning process. This process is managed through a highly automated pipeline using GitHub and Jenkins to ensure that all software meets established quality criteria and passes a series of tests before being added to the main repository.

        The EGI Software Repositories and support services have been assisting the WLCG community since 2012. The current major version is UMD-5, which provides support for EL9. In this presentation, we will provide an overview of software repository architecture, current status, quality assurance and plans for improvement.

        Speaker: Joao Antonio Tomasio Pina (Laboratory of Instrumentation and Experimental Particle Physics (PT))
      • 31
        Modernizing the CNCA Helpdesk from RT to Zammad

        We present the migration of the CNCA Helpdesk from Request Tracker (RT) to Zammad, driven by the need for improved usability, automation, and integration capabilities.

        The talk focuses on the migration approach, including data extraction from RT and transformation/import into Zammad via APIs. Existing open-source tooling was extended to support CNCA specific requirements, enabling the accurate migration of ~2,500 tickets across multiple queues while preserving history, attachments, and user mappings.

        We also describe the resulting architecture and operational practices, including access control and automation workflows.

        Finally, we share lessons learned, common pitfalls, and practical recommendations for similar migrations in production environments.

        Speaker: João Machado
      • 32
        Joblens: A Lightweight Observability Collector for Cluster Job

        Joblens, a lightweight and observability collector designed to achieve fine-grained monitoring of cluster jobs. Leveraging eBPF-based kernel instrumentation, Joblens enables dynamic tracking of process creation and system calls with zero overhead and no need for kernel modifications. Its modular and highly configurable plugin system, built on an asynchronous double-buffer pipeline, exports metrics to multiple backends such as Elasticsearch and Prometheus while maintaining less than 5% CPU overhead. Additionally, Joblens incorporates a Lua script rule engine that dynamically registers monitoring policies, allowing automatic detection and tracking of specific jobs. Integrated into the Interacitve aNalysis worKbench(INK), Joblens provides users with real-time web-based monitoring of job execution status, significantly enhancing cluster observability and debuggability.

        Speaker: Jingyan Shi (IHEP)
    • 10:30 AM
      Coffee Break Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Cloud Technologies, Virtualization & Orchestration, Operating Systems Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Dr Michele Michelotto (Universita e INFN, Padova (IT))
      • 33
        BNL experience with RHEV to OpenShift Migration

        The Scientific Computing and Data Facilities (SCDF) at Brookhaven Lab began
        in 1997 when the Relativistic Heavy Ion Collider (RHIC) and ATLAS Computing Facility
        was established. The full-service scientific computing facility has since supported
        a diverse range of scientific collaborations, by providing dedicated data processing,
        storage, and analysis resources for these expansive experiments with general
        computing capabilities and support for users.
        We are currently in the process of migrating our virtual environment from RHEV to Red Hat
        OpenShift. I will present the successes and pitfalls of this migration, as well as why we
        decided on OpenShift as a solution. I will also discuss our increasing e􀆯orts to migrate
        some of our VM-based services to OpenShift containers.

        Speaker: Joseph Frith
      • 34
        Kubernetes Adoption: Best Practices for Deployment, Architecture, and Integration

        Kubernetes (K8s) is a game-changing container orchestration platform that revolutionizes the way we deploy and manage applications. In this talk, we will delve into the intricacies of K8s deployment, optimal architecture design, and seamless integration with essential tools like ArgoCD and Gitlab CI. Join us as we share our journey, best practices, and key insights for successful Kubernetes adoption.

        Speaker: Samuel Bernardo
      • 35
        Event-Driven Group Storage Provisioning in dCache via the Helmholtz Cloud Portal: A Reference Architecture

        Provisioning research group storage at large-scale scientific computing facilities typically requires manual intervention from storage administrators: a user submits a request, an administrator logs into the storage management console, creates a namespace directory, sets quotas, and assigns ownership. This process does not scale as the number of virtual organisations (VOs) and research groups grows.

        We present an open-source, cloud-native agent that automates this workflow end-to-end for dCache, one of the most widely deployed distributed storage systems in the high-energy physics and photon science communities. The agent integrates the Helmholtz Cloud Portal, the dCache HTTP namespace API, the dCache SSH administrative interface, and LDAP-based group identity resolution into a cohesive, event-driven provisioning pipeline. A companion shell script running as a cron job on the dCache POSIX frontend completes the ownership assignment step, which cannot be performed through the namespace API alone.

        The architecture is idempotent, stateless, and deployable on Kubernetes via Helm. All components are open source. We describe the design decisions, the integration challenges encountered, and the deployment model in production at DESY, where the system serves as the storage solution for the Helmholtz Federated IT Services (HIFIS) community.

        Speaker: Franz Rhee (DESY)
      • 36
        Migrating from Vmware to XCP-ng with XOA

        After Vmware was bought out by Broadcom the price of the products increased and the service decreased. Nikhef decided to move our Vmware cluster to XPC-ng with XOA.
        I will be talking about why we decided to do this, but most of the talks will be about the migration and our experiences of the last year.

        Speaker: Bart van der Wal (NIkhef)
    • 12:30 PM
      Lunch Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Computing and Batch Services Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Matthias Schnepf
      • 37
        HEPiX Benchmarking Working Group Report

        The HEPiX Benchmarking Working Group develops and maintains benchmarking tools to measure computing resources across the Worldwide LHC Computing Grid (WLCG). Since the adoption of HEPScore23 in April 2023, the WG has been enhancing the benchmark suite to address evolving community needs. Currently, the WG is focused on two main developments: extending the benchmark suite with modules to measure server utilization metrics (load, frequency, I/O, and power consumption) for compute sustainability and advancing the integration of GPU workloads into the benchmark catalog. These enhancements enable more comprehensive evaluation of both performance and power efficiency across diverse computing environments. In this report, we provide an overview of the WG’s recent activities and current development status, highlighting the impact on the HEP computing community and the path forward for sustainable, high-performance computing in WLCG.

        Speaker: Domenico Giordano (CERN)
    • Techwatch (Topical Session) Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Conveners: Mr Andrea Chierici (Universita e INFN, Bologna (IT)), Dr Andrea Sciabà (CERN)
      • 38
        Fujifilm’s Evolution of Magnetic Tape: BaFe and SrFe Innovations for Next-Generation HPC Storage

        High-Performance Computing (HPC) has become a critical tool in scientific research, engineering, AI training, and more. Some of the main challenges of HPC in data storage are handling massive data volumes, ensuring long-term data integrity and security, reducing the floor space and the carbon footprint.

        HPC applications generate petabytes of data, requiring high-capacity storage solutions. HPC data centres consume significant power, creating a need for storage solutions that are energy-efficient to reduce costs and cooling requirements.

        Fujifilm as worldwide leading manufacturer of magnetic tapes, used primarily for data archiving, works tirelessly to constantly improve the intrinsic characteristics of tape storage solutions, such as file lifespan, high security levels achieved by disconnecting stored data from the network, high data integrity, low cost per TB, higher recording densities, reduced environmental footprint, and much more.

        Fujifilm continues to contribute actively to the development of an unprecedented wave of innovations applied to tape storage solutions, such as new tape coating processes using Nanocubic technology combined with Barium Ferrite (BaFe) and Strontium Ferrite (SrFe) particles, major advances in the field of electromagnetic power, and new read/write heads capable of absorbing potential variations in the dimensions of the magnetic tape developed by IBM, to name but a few.
        At the end of 2023, the launch of IBM's Enterprise 3592JF tape with the TS1170 drive, marked a new milestone by being able to store a native capacity of 50 terabytes on a single data cartridge. This was the first tape to combine Strontium Ferrite (SrFe) technology with Barium Ferrite (BaFe) technology in its manufacture.
        The latest generation of Fujifilm LTO tape launched on the market last June, LTO-10, incorporates these “fine hybrid magnetic particles” developed by combining nanoparticle design technologies used in both “Strontium Ferrite (SrFe) magnetic particles” a next-generation magnetic material, and “Barium Ferrite (BaFe) particles” currently used in high-capacity data storage tapes, and has adopted this material for the first time in the “LTO Ultrium” series. By further reducing the particle size of the magnetic material and improving its magnetic properties, Fujifilm has increased the areal recording density of magnetic tape by approximately 1.7 times compared to current products. Achieving a native capacity from 30 terabytes per LTO-10 tape.

        Strontium Ferrite and its integration with Barium Ferrite in fine hybrid magnetic particles, underpins a new era of density, stability, and longevity. The combination of these Fujifilm tape coating technologies allows this level of high recording density to be achieved while maintaining the stability of the magnetic properties of the particles to ensure that the written data can be read for more than three decades.

        The continuous evolution of tape is made possible by innovations in core materials. Fujifilm's commitment to finding alternative materials that can increase tape recording density while also offering higher performance led to the use of Barium Ferrite (BaFe) particles with superior magnetic properties to the Metal Particles (MP) technology used previously, which reached its capacity limit at 2.5TB native per tape.

        First-generation BaFe particles, with a particle size 50% smaller than the smallest MP particles, were sufficient to manufacture 6TB native data cartridges, while second-generation BaFe particles, which are 10-15% smaller, enable the manufacture of tapes with native capacities of 18TB.

        The use of Strontium Ferrite (SrFe) particles, which are up to 60% smaller than BaFe particles but have superior magnetic properties, allows the storage of 580TB native data on a single data cartridge. As demonstrated at the end of 2020 in the latest tape-recording record, the fifth since 2006, which enabled the roadmap for LTO technology to be extended to its fourteenth generation.

        Increasing the recording density on magnetic tapes poses major challenges for manufacturers. We will analyse the challenges we must overcome to ensure recording stability and long-term data readability, such as greater precision of the read/write head on increasingly narrow data tracks. We will review the advanced features of Enterprise tape technology that have been incorporated into the latest generations of LTO tape technology, such as oRAO, which reduces access time to multiple files recorded on the same tape by defining the optimal file access path, or hrRAO, which divides the tape into 128 segments using HRTD (High Resolution Recommended Access Order) to locate and retrieve files more quickly and efficiently, thereby speeding up data access.

        These breakthroughs unlock higher storage densities on a single tape, dramatically extend the lifespan of the drives, and ensure that the most critical information stays preserved and accessible—even under demanding environmental conditions. Offering great peace of mind to users and demonstrates the great development potential of tape technology, which already has prototypes capable of meeting the storage needs into the next decades.

        Speaker: Ms Elisabeth Gameiro (FUJIFILM Recording Media France)
      • 39
        Tape Technology evolution update from the CERN perspective

        This talk is a short update on the recent evolution of the tape technology from the CERN user perspective.

        Speaker: Vladimir Bahyl (CERN)
      • 40
        HEPiX Technology Watch Working Group Report

        The Technology Watch Working Group, established in 2018 to take a close look at the evolution of the technology relevant to HEP computing, has resumed its activities after a long pause. In this report, we provide an overview of the hardware technology landscape and some recent developments, highlighting the impact on the HEP computing community, with a special focus on resource price evolution.

        Speaker: Dr Andrea Sciabà (CERN)
      • 41
        Technology Watch: A broad view on cooling

        With CPU’s and Accelerators that want to have more and more watt compared with the previous generations, cooling will become getting harder to keep future clusters running efficiently.
        At every conference, you’ll see more companies telling you that they have the real golden egg for this problem. The big question is, do they really have that golden egg for you?
        We will show the different type of solutions that are on the market and how they work. What are the up and downsides of all of them.
        What are the big cloud providers doing in this field and what can we learn from it?
        Which questions you should ask to your favorite vendor, to be able to determine how well their options are usable for you.

        Speaker: Tristan Suerink
    • 4:00 PM
      Coffee Break Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Techwatch (Topical Session) Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Conveners: Mr Andrea Chierici (Universita e INFN, Bologna (IT)), Dr Andrea Sciabà (CERN)
      • 42
        The future of cooling @ Nikhef

        The problem of efficient and effective cooling has been haunting us for nearly a decade: since 2018 Nikhef has been aware of the issue and has explored several possibilities: ones that fir our environment, use the data centre infrastructure we have, and the infrastructure we have for getting rid of residual heat.
        But what is the best technology out there today? And which technologies that looked promising have miserably failed us? In this contribution we explain why certain technologies aren’t suitable for us and why we don’t believe that other solutions will keep working in the long run.
        By testing and comparing the best candidate that we have found against what we use today, we like to share our experience during this journey.

        Speaker: Tristan Suerink
      • 43
        WLCG Workshop on Heterogeneous Architectures Summary

        The WLCG Workshop on Heterogeneous Architectures (CERN, Dec 2025) reviewed the readiness of GPU‑enabled workflows across LHC experiments and the requirements for heterogeneous resources ahead of Run 4. While GPU acceleration is advancing in simulation, reconstruction and ML workflows, experiments are not yet ready to request formal GPU pledges, pending further benchmarking, workflow integration, and software maturation. The workshop identified key R&D priorities—including HEPScore GPU benchmarking, testing of GPU‑enabled simulation frameworks, improved workflow management for mixed CPU/GPU resources, and enhanced monitoring and accounting—and discussed challenges in advertising and operating heterogeneous resources across sites. With ATLAS and CMS expecting limited (up to ~15% for ATLAS, between 15 and 40% for CMS) GPU offloading by Run 4, the workshop emphasised the need for facilities to prepare sustainable heterogeneous deployments during LS3. This contribution will summarise the workshop findings and their relevance for HEPiX communities managing evolving scientific computing infrastructures.

        Speaker: Oxana Smirnova (Lund University)
      • 44
        GPU deployment at KIT

        GPUs are energy-efficient hardware for several High Energy Physics (HEP) applications.
        However, servers with GPUs are expensive and special in cooling, operation, and software support compared to CPU-only servers.
        We present what we have learned during the operations of GPU servers regarding these aspects at KIT, as well as some experiences from other sites.

        Speaker: Matthias Jochen Schnepf
      • 45
        CERN IT Data Centre Compute & Storage Procurement Towards LHC Run 4: Challenges and Community Perspectives

        As CERN prepares its central IT Data Centres for the High‑Luminosity LHC era (Run 4), the compute and storage infrastructure operated by CERN IT must evolve to meet significantly higher demands in throughput, capacity, and efficiency while remaining within strict constraints on budget, power consumption, and operational simplicity. With LS3 expected to begin in July 2026 and the HL‑LHC start currently aligned with mid‑2030, procurement planning for the CERN IT facilities must carefully balance long hardware lead times with fast‑moving market trends and rapid changes in CPU, GPU, memory and storage technologies.
        This presentation will describe the upcoming procurement activities specifically for servers and storage managed by the CERN IT department, without covering experiment‑side facilities or attempting to represent the full WLCG perspective. Building on the principles used successfully in Run 3 (maximizing performance‑per‑CHF and per‑watt for compute, and capacity‑per‑CHF and per‑watt for storage) we will outline how these guidelines translate into concrete technical specifications, benchmarking requirements, energy‑efficiency considerations, reliability and serviceability requirements, and expected tendering cycles for Run 4.

        In addition to presenting CERN IT’s current planning and the challenges we face, particularly in light of unstable component markets, this talk explicitly aims to foster an open discussion within the HEPiX and WLCG community. The goal is to explore how we, as a distributed ecosystem of data centres, might collectively adapt procurement strategies, operational models, and technology choices to better cope with such market instabilities. While CERN IT will share its current outlook, this session encourages input from other sites and stakeholders, recognising that community experience and alignment will be essential as we move toward the HL‑LHC era.

        Speaker: Luca Atzori (CERN)
    • HEPIX Board (closed) Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Storage & data management Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      • 46
        Scientific Data Transfer System for High Energy Physics Experiments

        The Institute of High Energy Physics has constructed multiple large-scale scientific facilities, including BSRF, HEPS, LHAASO, JUNO, AliCPT, which generate a large amount of data requiring high-performance data transfer services. To make full use of the computing resources of the remote sites of IHEP, data needs to be transmitted between multiple computing sites. The National High Energy Physics Scientific Data Center receives data from various research projects and requires data submission and long-term preservation, also requiring high-performance data transfer services. To meet the data transfer needs of different experiments, a High Energy Physics Scientific Data transfer System has been designed. As an important module of DOMAS (Data Organization Management Access Software), this system adopts a cluster-based design and management, with the transfer cluster consisting of a control master node and transfer sub-nodes. The control master node implements functions such as transfer task discovery, message queues, and web service support. The transfer sub-nodes provide scientific data transfer services and metadata interactions. The system has been deployed in multiple experiments and has achieved stable operation and good performance. This report will provide a detailed description of the various functional modules of the transfer system, as well as the deployment and application scenarios in different experiments.

        Speaker: Bo Zhuang (中国科学院高能物理研究所)
    • Computing and Batch Services Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Dr Max Kühn (Karlsruhe Institute of Technology)
      • 47
        Summary of the european HTCondor Workshop 2025

        The yearly autumn, european HTCondor workshop is wrapped up and some of the highlights are presented in more detail

        Speaker: Christoph Beyer
      • 48
        AUDITOR: Accounting challenges of future HC-LHC and CO2 accounting

        Distributed computing infrastructures are typically shared by multiple research communities, where precise and transparent resource accounting is essential. To meet these demands, we developed AUDITOR (AccoUnting DatahandlIng Toolbox for Opportunistic Resources), a flexible, modular, and extensible accounting ecosystem designed for heterogeneous computing environments.

        AUDITOR captures and processes individual job information through specialized collectors that interface with systems such as HTCondor, Kubernetes and Slurm, and storing the job data in JSON format in a PostgreSQL database. Its plugin-based architecture allows integration with external tools with Rust and Python clients which allows for the flexible building of accounting pipelines specific to your site. AUDITOR is already deployed at major HEP sites such as CERN, and various German tier-1 and tier-2 sites.

        Existing plugins include the APEL plugin, which publishes accounting data to the European Grid Initiative (EGI). Recent development includes a new utilization report plugin that provides summaries of job counts, HEPScore performance, power consumption, and estimated CO2 footprint. Moreover, AUDITOR is currently in development to accommodate embedded CO2 emissions of each job based on site-specific hardware information.

        In this talk, we will present our new features and highlight how to meet future challenges of HL-LHC accounting and CO2 accounting.

        Speaker: Raghuvar Vijayakumar (University of Freiburg (DE))
      • 49
        Misc Linux Things happening at DESY

        A little bit of DESY Linux tools, all at once.
        In the past, we have developed lots of smaller and larger tools to help in various aspects of Linux administration at DESY.
        We present some of them in this talk.

        • Which packages do applications stem from that run on a Linux system? Security status of packages and applications?
        • Jump hosts: Making administration more secure, and isolating networks
        • Investigating job efficiencies on the Maxwell HPC system
        Speaker: Jan Hartmann
      • 50
        Slurm Appliance: An open source HPC batch cluster defined as code for on-premises clouds (SLIDES ONLY)

        HPC clusters are often seen as relatively static systems that do not need the flexibility provided by cloud environments. We describe the StackHPC Slurm Appliance, a Slurm batch scheduler deployment which uses OpenStack to combine bare metal performance with the operational convenience and ease of testing provided by virtualisation. The resulting HPC system is suitable for uses ranging from a site's primary multi-user HPC cluster, to "self-service" clusters backing a web UI, and disposable isolated development clusters.

        We describe the overall architecture, monitoring stack and the integration of projects such as Open Ondemand and EESSI to make HPC more accessible for scientists. We explore how the appliance aims to require zero configuration even for complex integrations, while still allowing extensive customisation for fundamental site-to-site differences such as parallel filesystems, IAM and scheduler configuration.

        The design of the appliance is heavily driven by a need for repeatable deployments. We explain how that is achieved via image-based updates and the trade-offs it introduces, and how we can use Slurm to schedule node upgrades to minimise disruption to user workloads.

        Speaker: Steve Brasier (StackHPC)
    • 10:30 AM
      Coffee Break Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Computing and Batch Services Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Matthias Schnepf
      • 51
        News from the IDAF @ DESY

        The IDAF (Interdisciplinary Data and Analysis Facility) is an Helmholtz LK II facility. It is located at DESY, and serves computational needs for communities in the MATTER program.

        Speaker: Christoph Beyer
    • Software and Services for Operation Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Jingyan Shi (IHEP)
      • 52
        KEK IdP: Production Deployment and Future SSO Strategy for CRC Services

        KEK has deployed an Identity Provider (IdP) and joined both GakuNin and eduGAIN to enable federated authentication with domestic and international institutions. This presentation describes the technical architecture and current status of the IdP, and outlines plans to extend this federated authentication infrastructure across multiple services provided by the KEK Computing Research Center. It also briefly discusses next-generation federated authentication frameworks aimed at supporting services that require higher levels of identity and authentication assurance, such as IAL2 and AAL2.

        Speaker: Konomi Omori (KEK)
      • 53
        Design and Progress of IHEP Identity Authentication System Upgrade

        The Institute of High Energy Physics (IHEP), Chinese Academy of Sciences,is a leading research institution in China dedicated to high-energy physics, advanced accelerator technology development, and nuclear technology applications. IHEP undertakes several major national science infrastructure projects, the most prominent of which is the High Energy Photon Source (HEPS). With an electron beam energy of 6 GeV and 14 user beamlines planned for its first phase, HEPS will deliver synchrotron radiation with high energy (up to 300 keV), high brightness, and high coherence. In addition, IHEP operates multiple large-scale facilities such as the Beijing Synchrotron Radiation Facility (BSRF) and the China Spallation Neutron Source (CSNS) across different campuses, all of which require interconnection. This distributed, multi-source environment presents significant challenges to identity management, data security, and system integration.
        To address cross-site, multi-source authentication, the HEPS authentication system integrates user information from the CAS Large Scientific Facilities Sharing Platform (LSSF), CSNS, IHEP, and HEPS itself. This enables researchers to use a single set of credentials across different campuses to perform experiment applications, computations, reconstructions, and data retrieval.
        IHEP also maintains multiple remote computing clusters with complex storage architectures closely tied to experimental data. To facilitate seamless data access and collaborative computation across these clusters, we have developed a unified computing environment interconnection solution. This solution uses Active Directory (AD) as the central identity repository and employs customized SSSD templates to accurately map consistent user identities onto each cluster. By overcoming barriers caused by physical separation and divergent storage systems, this architecture ensures uniformity in user attributes and provides robust, efficient, and scalable identity and permission support for interdisciplinary collaboration.
        To meet growing demands for domestic and international cooperation, the HEPS authentication system has joined both the CARSI (China Academic Research and Collaboration Infrastructure) federation and the international EduGAIN network. These memberships enable researchers from universities, institutes, and organizations worldwide to log in conveniently and securely, laying the foundation for smooth experimental workflows.
        In summary, to address these multifaceted requirements, IHEP’s authentication system has undergone comprehensive upgrades and functional expansions. These enhancements provide strong technical support for stable facility operations and greatly improve convenience and reliability for scientists conducting cutting-edge experiments.

        Speaker: qi luo (中科院高能物理所计算中心)
      • 54
        From In-House Tools to HashiCorp Vault: CERN's Transition to Modern Scalable Secrets Management

        CERN's computing infrastructure manages thousands of services across a complex distributed environment, requiring robust secret management for application credentials, root accounts, certificates, and service tokens. This talk explores CERN's transition from puppet-oriented, in-house secrets management solutions to HashiCorp Vault as a centralized, enterprise-level secret management platform.

        For over twelve years, CERN relied on its own Python-based tools using RIAK and database stores for secret management, oriented towards in the Puppet-managed world. These approaches lacked functionality as CERN's infrastructure evolved toward K8S clusters, OKD orchestration, and token-based workflows.

        The migration to HashiCorp Vault addressed these limitations through a phased approach. We implemented a high-availability Vault cluster with integrated RAFT storage, onboarding new projects first, then migrating existing ones. Custom authentication backends integrated with CERN's existing systems, while purpose-built tooling automated legacy migration and established standardized deployment workflows.

        Key milestones included onboarding the HTVault solution, managed with Puppet in CERN's central infrastructure, and application secrets provisioning for OpenShift Projects. Based on GitLab project descriptions, we developed automated per-project secrets management synchronized with CERN OIDC and LDAP.

        Migration from RIAK presented challenges: re-modeling existing structures, projecting ACLs from the legacy system, and adapting CLI tools to work transparently with Vault. We are exploring additional Vault benefits, including CERN certificate deployment and renewal, replacing another legacy tool.

        This presentation covers the technical architecture, migration challenges, lessons learned, and provides a roadmap for organizations considering similar transitions in high-energy physics computing environments.

        Speaker: Zhechka Toteva (CERN)
    • 12:30 PM
      Lunch Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Network & Security Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: David Kelsey (Science and Technology Facilities Council STFC (GB))
      • 55
        Evolution and Development of PIC’s Authentication & Authorization Infrastructure

        The Port d’Informació Científica (PIC) serves as a critical data and computing hub for numerous scientific experiments, with the vast majority of its resources dedicated to the Worldwide LHC Computing Grid (WLCG). While standard LHC grid operations rely heavily on traditional batch submissions, the growing demand for interactive, Tier-3 analysis facilities requires a shift toward scalable, federated, and user-friendly access models. Furthermore, modernizing the general infrastructure is essential to secure the environment that underpins these massive computing efforts.

        This work presents an Authentication and Authorization (A\&A) architecture that bridges modern web standards with the strict POSIX and Kerberos requirements of PIC’s interactive computing resources (e.g., Jupyter, local Batch systems used by LHC researchers). The core challenge was not merely integrating Keycloak with FreeIPA, but automating the provisioning of POSIX attributes (Home directories, Shells, UIDs) during self-registration—a feature absent in standard OIDC flows. To address this, we developed custom Keycloak Service Provider Interfaces (SPIs) that intercept the registration process to correctly "materialize" users in FreeIPA and trigger necessary attribute injections.

        A key innovation of this system is its robust handling of federated identities. External experiments and affiliated LHC researchers can now integrate their own Identity Providers (IdPs), allowing their users to seamlessly access PIC resources. The system relies on a set of idempotent Bash automation scripts that act as a central state engine. These scripts are triggered by every lifecycle event—user registration, IdP linking/unlinking, token expiration, or manual approval. They dynamically enforce state convergence, ensuring that privileges (such as SSH access or file permissions) are granted, updated, or revoked automatically without human intervention.

        This architecture allows PIC to scale its support for diverse scientific communities, transforming it from a traditional data repository into a user-ready analysis platform. Crucially, by fortifying the facility's overall A\&A infrastructure, it secures the computing ecosystem that hosts LHC operations while enabling researchers to seamlessly "bring their own identities" for advanced interactive analysis.

        Speaker: Marc Santamaria Riba (INSTITUT DE FÍSICA D´ALTES ENERGIES)
      • 56
        HEPiX IPv6 Working Group Update

        The HEPiX IPv6 Working Group has been encouraging the deployment and use of IPv6 in WLCG for many years. At the last HEPiX meeting in China we reported that more than 70% of all WLCG sites have worker nodes and compute services that are IPv6-capable. That campaign continues. We also presented news that the USA Tier1 sites had successfully removed IPv4 peering from their LHCOPN connections. We will report on work to remove IPv4 from more LHCOPN circuits. There are now a number of WLCG sites who are actively moving towards IPv6-only and who wish to turn off IPv4 peering on their LHCONE connections. The IPv6 working group continues to work towards all wide-area data transfers being on IPv6 before the WLCG Data Challenge in 2029 and before the start of HL-LHC Run4 data taking.

        This talk will present the activities of the IPv6 working group during the last year and our future plans.

        Speaker: David Kelsey (Science and Technology Facilities Council STFC (GB))
      • 57
        LHCOPN IPv6-only progress at IHEP

        At the LHCOPN/ONE meeting in October 2025, IHEP decided to act as a pioneer and volunteer to pilot the LHCOPN IPv6-only initiative. This report presents the LHCOPN IPv6-only progress at IHEP.

        Speaker: 曾珊 zengshan
      • 58
        A phased phase-out plan for IPv4 at a WLCG Tier-1 site

        Universities and research institutes have been early adopters of IPv4, which
        have served scientific research infrastructure well in the past. But now the
        time has come to let go of the legacy protocol with awkward limits, and phase
        it out in favour of IPv6.

        The World-wide LHC Computing Grid (WLCG) is half-way through the transition
        from IPv4 to IPv6, with almost all services now being dual-stack with both
        IPv4 and IPv6. Now the time has come to plan for the rest, where we discard
        the complexity of dual stack in favor of IPv6-only operations.

        The driver for doing this in the Nordic Tier-1 site (NT1) sooner rather
        than later is that we forsee a significant risk of running out of IPv4
        addresses when scaling storage servers horizontally in order to handle the High
        Luminosity LHC (HL-LHC) data rates. We expect to have a data rate of 10-20
        times when HL-LHC comes online in 2030, and the most cost-effective way to
        serve this is to have a larger number storage servers than today. And in order
        to prove that we are ready for HL-LHC data taking in 2030, it would be good to
        finish the bulk of the phase-out of IPv4 by Data Challenge 2027

        This move comes with lots of constraints though. Since it is only "almost all"
        services that understand IPv6 today, we cannot completely shut IPv4 down
        without considerations on how the legacy systems can access data. There
        might also be unknown dependencies on IPv4 in access or management of
        services, that we will only detect in testing or production. Individual
        scientists might want to access the data outside of the grid, for instance
        from their own laptop which might not have IPv6 yet. There are even reasons
        that the physics experiments might want to run legacy software for
        reproducability, some of it too old for IPv6 support.

        Together this indicates a phased approach, and this talk will concentrate
        on the planning and current status of this effort, with steps towards the
        end goal and tentative timing of them.

        Speaker: Mattias Wadenstein (University of Umeå (SE))
    • 3:30 PM
      Coffee Break Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Network & Security Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Garhan Attebury (University of Nebraska Lincoln (US))
      • 59
        CZ-Tier-2 Network Tests

        The Czech WLCG Tier-2 consistently meets its computing and storage commitments to the LHC experiments through a geographically distributed infrastructure. CZ-Tier-2 resources are spread across three sites, connected by high-bandwidth links operated by the Czech NREN, CESNET. Additionally, substantial CPU resources from the Czech national supercomputing center IT4I are incorporated into WLCG operations via the primary CZ-Tier-2 hub at FZU, which serves as the central point for distribution and data exchange.
        A high-performance and reliable network connection is essential for operating a distributed Tier-2 center that is part of the WLCG mesh. Currently, our site is connected via a 400 Gbps link to LHCONE and an additional 100 Gbps link to the general internet. We will report on the results of our network tests of the LHCONE connection, conducted shortly after the 2025 upgrade. We will also present our efforts to tune a new perfSONAR server equipped with a 400 Gbps NIC.

        Speaker: Jiri Chudoba (Czech Academy of Sciences (CZ))
      • 60
        Worldwide Computer Security Situation

        This presentation aims to give an update on the global security landscape from the past year. The global political situation has introduced a novel challenge for security teams everywhere. What’s more, the worrying trend of data leaks, password dumps, ransomware attacks and new security vulnerabilities does not seem to slow down. We present some interesting cases that CERN and the wider HEP community dealt with in the last year, mitigations to prevent possible attacks in the future and preparations for when inevitably an attacker breaks in.

        Speaker: Dawid Kulikowski
    • Show me Your Toolbox Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Conveners: Mattias Wadenstein (University of Umeå (SE)), Peter van der Reest
      • 61
        Dennis van Dok - gitlab pipelines for building sw packages

        demo of how we currently run Gitlab pipelines to build deb and rpm packages for software that we maintain. This in container images that themselves are poduced by gitlab pipelines.

        Speaker: Dennis van Dok (Nikhef)
      • 62
        Chris Brew - introducing the slowest NFS server in the world

        showing off the RAL PPD multi-layerered NFS home file service

        Speakers: Chris Brew (Science and Technology Facilities Council STFC (GB)), Chris Brew (Department of Physics), Chris Brew (Particle Physics-Rutherford Appleton Laboratory-STFC - Science &), Chris Brew (CCLRC - RAL)
      • 63
        Garhan Attebury - Updates to his HEPiX Fall 2024 toolbox

        Tools mentioned will be:

        Cobbler
        Netbox
        k9s / Flux
        Puppet
        Akvorado

        Speaker: Garhan Attebury (University of Nebraska Lincoln (US))
      • 64
        Mattias Wadenstein - CLI Tape Management Utility
    • 7:30 PM
      Social dinner Tacho do Pescador

      Tacho do Pescador

      Rua da Pimenta, 17A 1990-254 Parque das Nações - Lisboa https://tachodopescador.pt/

      Restaurante: Tacho do Pescador

      Rua da Pimenta, 17A
      1990-254 Parque das Nações - Lisboa

      Google Maps: https://maps.app.goo.gl/wPpQrwJNZ5F6V9hL7

    • Computing and Batch Services Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Matthias Schnepf
      • 65
        Experience with HTCondor job disk and memory limits at CZ WLCG sites

        We present operational experience from two CZ WLCG sites deploying per-job LVM-enforced disk quotas and CGroup2-enforced memory limits within HTCondor, with a 10% overhead allowance and swap disabled for all batch jobs.
        We cover bugs and unexpected behaviours encountered in production, pitfalls to anticipate, configuration choices to avoid.

        Speaker: Alexandr Mikula (Czech Academy of Sciences (CZ))
    • Miscellaneous Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Ofer Rind (Brookhaven National Laboratory)
      • 66
        SPECTRUM: A Strategic Framework and Technical Blueprint for European Exascale Research Data and Compute Infrastructure

        The SPECTRUM project (https://spectrumproject.eu/), funded under Horizon Europe, presents its final deliverables: the Strategic Research, Innovation and Deployment Agenda (SRIDA) and the Technical Blueprint for a European compute and data continuum serving data-intensive science communities.

        The SRIDA is structured around four pillars encompassing 13 strategic priorities spanning technical enablement, scientific operations, and strategic governance. Each priority includes implementation pathways with short-, medium-, and long-term milestones aligned with European research infrastructure strategies.

        The Technical Blueprint presents a capability map defining eight areas for the compute and data continuum (compute resources, data resources, software distribution and execution, orchestration and workflows, AI/ML and HPC applications, resource federation, monitoring and observability, and security and trust) and identifies key technical challenges with recommended actions to address them.

        Together, the SRIDA and Technical Blueprint provide a coherent strategic and technical foundation for coordinated infrastructure development across European research communities.

        Both documents respond to the unprecedented data processing demands facing High Energy Physics and Radio Astronomy as they enter the Exascale era with next-generation instruments including HL-LHC upgrades and SKAO. The contribution discusses how these strategic and technical frameworks can guide European infrastructure investments and inform future funding programmes targeting the 2030s research landscape.

        Speaker: Dr Jeff Wagg (OCA)
      • 67
        Enhancing Accessibility and Discoverability of CERN Media

        As part of CERN’s transcription and translation service, high-quality captions were produced for approximately 40,000 hours of media using models trained on CERN-specific/HEP terminology, covering CERN’s official languages. Beyond the immediate accessibility benefits of captioning, the service also explored ways to improve the discoverability of media content. Two proof-of-concept systems were developed to address this challenge, based on OpenSearch and PostgreSQL used as a vector database. This talk presents these approaches and discusses how they can enhance search, reuse, and, ultimately, the reach and impact of CERN’s media content.

        Speaker: Ruben Domingo Gaspar Aparicio (CERN)
    • Applied AI in Computing Center Infrastructures Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
      • 68
        New AI Infrastructure for Dr.Sai Agents at IHEP

        Interdisciplinary teams at IHEP have developed several AI agents for scientific research, including Dr.Sai BESIII, Dr.Sai Rongzai, and Dr.Sai DORA, which require new AI infrastructure.

        This talk will cover recent progress in these agent systems and introduce the OpenDr.Sai framework (as a harness).

        The implemented solutions cover: connecting agents with experimental data, integrating specialized computing environments like BOSS, enabling seamless model switching, supporting agent collaboration, and implementing mechanisms for human-AI interaction, long-running tasks, data feedback, and multi-protocol support.

        Speaker: Zhengde Zhang (中国科学院高能物理研究所)
    • 10:30 AM
      Coffee Break Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
    • Applied AI in Computing Center Infrastructures Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
      • 69
        A Centralized LLM and Agentic AI Infrastructure for CERN

        We are developing a new Machine Learning (ML) service at CERN to support the use of Large Language Models (LLMs) and Agentic AI. Our goal is to provide a reliable and secure foundation for researchers and developers. In this presentation, we will describe the architecture and plans for this new service. It will include several key components: an LLM Proxy that works with OpenAI-compatible APIs, a Model Catalog for on-premises and cloud-based models, and an Agent Hosting and Orchestration platform. We will show some initial applications that will be used with this service.

        These examples will demonstrate the potential benefits of the service, such as improved productivity and faster research workflows. We will also discuss our future plans, including a shared AI Agent evaluation service and incorporating new use cases from the CERN community.

        Our aim is to create a scalable and sustainable ecosystem for LLMs and Agentic AI at CERN, balancing innovation with security and operational needs.

        Speaker: Juan Manuel Guijarro (CERN)
      • 70
        Integrating Spack, MLflow, and AgenticAI in Maxwell Cluster at DESY: Advancing Reproducible and Intelligent Scientific Workflows

        The increasing complexity of computational research demands HPC and AI systems that are not only powerful but also reproducible, adaptable, and easier to manage. This presentation details the integration of Spack, MLflow, and AgenticAI within the Maxwell Cluster at DESY to enhance software management, experiment tracking, and workflow automation. Spack introduces a flexible package management framework enabling consistent, version-controlled software environments across heterogeneous compute nodes. MLflow provides a platform for experiment tracking and reproducibility, bridging machine learning, GenAI pipelines and traditional simulation workloads.
        AgenticAI acts as an intelligent interface for interacting with the HPC environment—simplifying complex operations such as preparing and submitting SLURM jobs, monitoring execution, and dynamically adjusting workflows based on real-time results. Together, these tools form a cohesive ecosystem that improves performance, reproducibility, and usability of the Maxwell infrastructure.

        Speaker: Shore Salle Chota
      • 71
        NSSDC's data management and application practices towards AI4S

        The growing trend of AI for Science (AI4S) has placed new demands on scientific data management and application. As a bridge connecting raw data to data-driven applications, a data repository is expected to enhance its data service capabilities to facilitate AI4S applications.

        The Chinese National Space Science Data Center (NSSDC) is responsible for the archiving, curation, long-term preservation, and open sharing and application of space science data. This presentation will introduce data management and application practices. NSSDC has carried out a series of data management and curation activities to promote data application. In terms of data archiving, the NSSDC collects data from major projects and academic papers in the field of space science. During the data curation process, NSSDC has formulated dataset according to the organization model of dataset, stored data in standardized file formats, and established a metadata standard that contains rich information. In the preservation process, NSSDC implements a hierarchical and multi-replica storage strategy. For the data service, NSSDC customizes specialized data service systems for different projects, and then specially develops an integrated data retrieval platform to provide cross-platform and cross-system data discovery services. Towards AI4S, NSSDC now developing an open research platform which consists of four key components: a domain-specific foundation model, some space science intelligent agent matrix, some domain-specific intelligent models and toolchains, and AI-Ready space science datasets. The domain-specific foundation model has also been developed to serve as the hub of the open research platform. It orchestrates various intelligent agents, models, toolchains, and AI-Ready datasets, providing users with question-answering-based services such as literature review, resource retrieval, and data analysis.

        Speaker: QI XU (National Space Science Center, Chinese Academy of Sciences)
    • Wrap-up Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal
      Convener: Jose Flix Molina (CIEMAT - Centro de Investigaciones Energéticas Medioambientales y Tec. (ES))
    • 12:30 PM
      Lunch Auditório J.J. Laginha

      Auditório J.J. Laginha

      ISCTE Instituto Universitário de Lisboa

      Av. das Forças Armadas, 1649-026 Lisboa, Portugal